[Paper Reading] AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction
Table of Contents

- 1. What
- 2. Why
- 3. How
  - 3.1 Input
  - 3.2 Background Reconstruction
  - 3.3 Foreground Reconstruction
    - 3.3.1 Constructing Template Gaussians
    - 3.3.2 Reflected Gaussian Consistency
    - 3.3.3 Dynamic Appearance Modeling
- 4. Experiment
  - 4.1 Experimental Setup
  - 4.2 Main Results
  - 4.3 Ablation Studies
1. What
For scene reconstruction and novel view synthesis, this paper imposes geometric constraints on the Gaussians representing the road and sky regions, leverages 3D templates to initialize foreground points, and introduces a reflected Gaussian consistency constraint to supervise the unseen sides of foreground objects. Moreover, it uses residual spherical harmonics to model the dynamic appearance of foreground objects. It achieves state-of-the-art results on the PandaSet and KITTI datasets, both for reconstruction and for novel view synthesis with lateral ego-vehicle trajectory adjustments.
2. Why
Limitation of PVG: that method does not tackle the simulation of novel scenarios, such as ego-vehicle lane changes or adjusted object trajectories (i.e., it cannot edit scenes).
3. How

3.1 Input
A series of N images I_i taken by a camera, with corresponding intrinsic (K_i) and extrinsic (E_i) matrices, along with 3D LiDAR point clouds L_i and the trajectories T_i of the dynamic objects.
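To make this input concrete, here is a minimal sketch of a per-frame container (all names, such as `Frame`, are illustrative and not from the paper's code):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Frame:
    """One time step of AutoSplat's input (illustrative container, not the authors' API)."""
    image: np.ndarray   # I_i, (H, W, 3) RGB image
    K: np.ndarray       # K_i, (3, 3) camera intrinsic matrix
    E: np.ndarray       # E_i, (4, 4) world-to-camera extrinsic matrix
    lidar: np.ndarray   # L_i, (N, 3) LiDAR points in world coordinates
    trajectories: dict  # T_i, object id -> (4, 4) SE(3) pose of each dynamic object
```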
3.2 Background Reconstruction
The road and sky regions are decomposed from the rest of the background using semantic masks.
By projecting the LiDAR points onto the image plane at each time step i, each Gaussian is assigned to one of the road, sky, or other classes (see the sketch after this paragraph).
When splatting the road and sky Gaussians, they are constrained to be flat by minimizing their roll and pitch angles as well as their vertical scale.
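A minimal sketch of this class assignment, assuming world-frame LiDAR points, a world-to-camera extrinsic, a pinhole intrinsic, and an integer-labeled semantic mask (names and label ids are illustrative):

```python
import numpy as np

ROAD, SKY, OTHER = 0, 1, 2  # assumed label ids in the semantic mask

def assign_classes(points, K, E, sem_mask):
    """Project world-frame LiDAR points into the image and read their semantic class."""
    N = points.shape[0]
    pts_h = np.concatenate([points, np.ones((N, 1))], axis=1)  # (N, 4) homogeneous
    cam = (E @ pts_h.T).T[:, :3]                               # world -> camera frame
    in_front = cam[:, 2] > 0                                   # keep points in front of camera
    uv = (K @ cam.T).T                                         # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    H, W = sem_mask.shape
    valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    labels = np.full(N, OTHER)                 # points without a valid projection default to "other"
    labels[valid] = sem_mask[v[valid], u[valid]]
    return labels
```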
Finally, the loss is defined as:
\mathcal{L}_{BG}=(1-\lambda)\mathcal{L}_{1}(I_{g},\hat{I}_{g})+\lambda\mathcal{L}_{DSSIM}(I_{g},\hat{I}_{g})+\beta\,\mathcal{C}_{g},\quad g\in\{road,sky,other\}\\ \mathcal{C}_{g}=\begin{cases}\frac{1}{N_g}\sum_{i=1}^{N_g}\left(|\phi_i|+|\theta_i|+|s_{z_i}|\right)&\text{if } g\in\{road,sky\}\\ 0&\text{otherwise}\end{cases}
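A sketch of the flatness term \mathcal{C}_g, assuming Gaussian rotations are stored as unit quaternions (w, x, y, z), with φ and θ the roll and pitch obtained from the standard quaternion-to-Euler conversion, and scales stored per axis:

```python
import torch

def flatness_constraint(quats, scales):
    """C_g: mean of |roll| + |pitch| + |s_z| over the road/sky Gaussians.

    quats:  (N, 4) unit quaternions (w, x, y, z)
    scales: (N, 3) per-axis scales (s_x, s_y, s_z)
    """
    w, x, y, z = quats.unbind(dim=-1)
    roll = torch.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))      # phi
    pitch = torch.asin(torch.clamp(2 * (w * y - x * z), -1.0, 1.0))      # theta
    return (roll.abs() + pitch.abs() + scales[:, 2].abs()).mean()
```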
In the second phase of background reconstruction, all Gaussians are splatted together and supervised on the whole image using \mathcal{L}_{BG} with g\in\{road\cup sky\cup other\}.
3.3 Foreground Reconstruction
3.3.1 Constructing Template Gaussians
Notably, the authors employ [23], which generates the 3D shape of an object such as a vehicle from a single image. Then, given a sequence of frames with K foreground objects, the template is copied K times and placed into the scene according to the object trajectories.
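A sketch of the placement step, assuming each trajectory provides an SE(3) pose per frame and treating the shape-prior model [23] as a black box that outputs template Gaussian centers:

```python
import numpy as np

def place_template(template_xyz, pose):
    """Transform template Gaussian centers (M, 3) into the scene by an SE(3) pose (4, 4)."""
    M = template_xyz.shape[0]
    pts_h = np.concatenate([template_xyz, np.ones((M, 1))], axis=1)
    return (pose @ pts_h.T).T[:, :3]

# One copy per object, posed by its trajectory at frame t (T[k][t] is a hypothetical lookup):
# objects_xyz = [place_template(template_xyz, T[k][t]) for k in range(K)]
```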
3.3.2 Reflected Gaussian Consistency
Foreground objects exhibit symmetry in their structure. Leveraging this assumption helps to improve the reconstruction quality, especially in scenarios with limited views [39].
The reflection matrix M for the Gaussians can be defined as:
M=I-2\frac{aa^T}{\|a\|^2}
where a represents the reflection axis (the normal of the symmetry plane) and I is the identity matrix. The attributes of a Gaussian are then reflected by:
\begin{aligned}\tilde{x}&=Mx\\ \tilde{R}&=MR\\ \tilde{f}_{SH}&=D_{M}f_{SH}\end{aligned}
where D_M is the Wigner D-matrix, which applies the corresponding transform to the spherical harmonics coefficients.
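A sketch of the reflection in code, following the three equations above; the SH transform by the Wigner D-matrix is omitted for brevity, and the axis a is assumed to be the lateral axis of a left-right symmetric vehicle:

```python
import torch

def reflect_gaussians(xyz, rotmats, a):
    """Reflect Gaussian centers and orientations across the plane with normal a.

    xyz:     (N, 3) centers in the object's canonical frame
    rotmats: (N, 3, 3) rotation matrices
    a:       (3,) reflection axis (plane normal)
    """
    a = a / a.norm()                             # normalize so aa^T / ||a||^2 = outer(a, a)
    M = torch.eye(3) - 2.0 * torch.outer(a, a)   # Householder reflection matrix
    xyz_ref = xyz @ M.T                          # x~ = M x
    rot_ref = M @ rotmats                        # R~ = M R, as in the paper
    return xyz_ref, rot_ref
```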
This reflected consistency constraint enforces that the renderings of the Gaussians on the two symmetric sides of an object are similar, as illustrated in the training-phase pipeline figure of the paper.

3.3.3 Dynamic Appearance Modeling
Similar to StreetGaussian, to simulate light signals such as indicators, headlights, and taillights, this paper models the dynamic appearance of foreground objects by:
\begin{aligned}\Delta f_{SH,t}&=\mathrm{MLP}(E_{t},x,f_{SH})\\ f_{SH,t}&=f_{SH}+\Delta f_{SH,t}\end{aligned}
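A minimal sketch of this appearance model, assuming E_t is a learned per-frame embedding and f_SH is a flattened vector of SH coefficients (all dimensions are illustrative):

```python
import torch
import torch.nn as nn

class AppearanceMLP(nn.Module):
    """Predicts a residual Delta f_SH,t from the frame embedding, position, and base SH."""
    def __init__(self, embed_dim=32, sh_dim=48, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + 3 + sh_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, sh_dim),
        )

    def forward(self, e_t, x, f_sh):
        delta = self.net(torch.cat([e_t, x, f_sh], dim=-1))  # Delta f_SH,t
        return f_sh + delta, delta                           # f_SH,t and the residual
```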
Finally, the overall loss for foreground objects is:
\begin{aligned}\mathcal{L}_{FG}=&(1-\lambda)\mathcal{L}_{1}(I_{g},\hat{I}_{g})+\lambda\mathcal{L}_{DSSIM}(I_{g},\hat{I}_{g})+(1-\lambda)\mathcal{L}_{1}(I_{g},\tilde{I}_{g})\\ &+\lambda\mathcal{L}_{DSSIM}(I_{g},\tilde{I}_{g})+\gamma\mathcal{L}_{1}(\Delta f_{SH,t}),\quad g\in\{fg_{1},fg_{2},\ldots,fg_{K}\}\end{aligned}
where K is the number of foreground objects and fg_k is the set of Gaussians representing the k-th object. I_g, \hat{I}_g, and \tilde{I}_g denote the masked ground-truth image, the rendering of the foreground Gaussians, and the rendering of the reflected foreground Gaussians, respectively.
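A sketch of assembling \mathcal{L}_{FG} for a single object, with a simplified global-statistics DSSIM standing in for the windowed SSIM usually used; the λ and γ values are assumed, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def dssim(a, b):
    """Simplified structural dissimilarity (1 - SSIM) / 2 using global image statistics."""
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
    return (1 - ssim) / 2

def fg_loss(I, I_hat, I_tilde, delta_sh, lam=0.2, gamma=0.01):
    """L_FG for one object: photometric + reflected consistency + SH-residual sparsity."""
    photo = (1 - lam) * F.l1_loss(I_hat, I) + lam * dssim(I_hat, I)
    refl  = (1 - lam) * F.l1_loss(I_tilde, I) + lam * dssim(I_tilde, I)
    sparsity = gamma * delta_sh.abs().mean()   # gamma * L1(Delta f_SH,t)
    return photo + refl + sparsity
```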
4. Experiment
4.1 Experimental Setup
Dataset: KITTI and PandaSet. PandaSet includes 103 urban driving scenes recorded in San Francisco, each with 80 image frames and corresponding LiDAR point clouds.
Evaluation Metric: FID (Fréchet Inception Distance) measures the quality of generated images by comparing the distributions of generated and real images in the feature space of the Inception network.
Implementation Details:
- Background: 15K + 15K iterations (two phases), with the positions of the road and sky Gaussians kept fixed.
- Foreground: 5K + 10K iterations, where the final 10K are for scene fusion, in which foreground and background Gaussians are fine-tuned together.
4.2 Main Results
- PandaSet experiments: state-of-the-art quantitative results plus qualitative comparisons.
- Lateral shift of the ego-vehicle trajectory: state-of-the-art FID.
- KITTI: state-of-the-art overall, though LPIPS is not the best; the flat fitting of the sky and road may leave high-frequency textures, such as clouds, under-modeled.
4.3 Ablation Studies
- Background Geometry Constraints → FID
- Foreground Initialization → FID + qualitative results
- Reflected Gaussian Consistency Constraint → qualitative results
- Effect of Dynamic Appearance Modeling → PSNR + high-frequency details
- Novel Scenario Simulation
- Different types of templates for initializing foreground objects
