SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constraint
Abstract
With the emergence of Gaussian Splatting, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial partitioning, neglecting information in the semantic space.
In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats.
Specifically, we leverage prior information stored in large vision models (such as SAM and DINO) to generate semantic masks.
We then introduce a geometric complexity measurement function to serve as soft regularization, guiding the shape of each Gaussian Splat within specific semantic areas.
Additionally, we present a method that estimates the expected number of Gaussian Splats in different semantic areas, effectively providing a lower bound for Gaussian Splats in these areas.
Subsequently, we extract the point cloud using a novel probability density-based extraction method, transforming Gaussian Splats into a point cloud crucial for downstream tasks.
Our method also offers the potential for detailed semantic queries while maintaining high-quality image-based reconstruction results.
We provide extensive experiments on publicly available large-scale scene reconstruction datasets with highly accurate point clouds as ground truth, as well as on our novel dataset. Our results demonstrate that our method outperforms current state-of-the-art Gaussian Splatting reconstruction methods by a significant margin in terms of geometry-based metrics.
Figure 1

Qualitative comparison between our method and other 3DGS-based methods.
We propose shape constraints, alpha constraints, and point cloud extraction in the current study.
Figure 2

The blue section illustrates common methods for reconstructing geometrically aligned Gaussian Splats.
The input for all Gaussian Splatting methods includes a COLMAP initialization consisting of images, camera positions, and SfM sparse point clouds.
The output will be a traditional representation such as a mesh or point cloud.
During training, in addition to the common image rendering loss, most methods encourage all 3D Gaussians to form a disk-like shape.
After several training iterations, or at the end of the training process, other methods select a hard threshold for the alpha value and use the remaining Gaussians for geometric reconstruction.
However, these hard constraints often result in poorer reconstruction.
Instead of encouraging all Gaussians to adopt the same shape, our method uses semantic information to control the shape in detail.
① produce semantic masks for each input image,
② extract shape information for each semantic group, and use this information to locally control the shape of each Gaussian.
③ provide an opacity field sampling method that can dynamically allocate the desired number of points and ignore defective reconstruction parts.
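The local shape control in ② can be pictured as a soft, one-sided penalty on each Gaussian's scale anisotropy. The sketch below is an assumed form of such a loss, not the paper's exact implementation; `target_flatness` is a hypothetical per-Gaussian value looked up from its semantic group.

```python
import numpy as np

def shape_regularization(scales, target_flatness):
    """Soft shape regularization (illustrative sketch, not the paper's code).

    scales          : (N, 3) array of per-Gaussian scale parameters.
    target_flatness : (N,) desired ratio of smallest to middle scale,
                      looked up from each Gaussian's semantic group
                      (0 -> perfectly flat disk, 1 -> unconstrained).
    Returns a scalar penalty that is zero when each Gaussian is at
    least as flat as its semantic group requires.
    """
    s = np.sort(scales, axis=1)            # s[:, 0] smallest, s[:, 2] largest
    flatness = s[:, 0] / (s[:, 1] + 1e-8)  # 0 = disk-like, 1 = isotropic
    # One-sided penalty: only punish Gaussians that are *less* flat than
    # their semantic target, so this acts as a soft, not hard, constraint.
    excess = np.maximum(flatness - target_flatness, 0.0)
    return float(np.mean(excess ** 2))
```

Flat semantic groups (e.g. roads) would receive a small `target_flatness`, while geometrically complex groups (e.g. vegetation) would receive a value near 1, effectively leaving those Gaussians unconstrained.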
Figure 3

The results of reconstructing the Campus and College scenes from GauUsceneV2.
Using SuGaR , many surfaces incorrectly model the lighting conditions due to complex effects, such as how glass reflects sunlight at different angles and how clouds block sunlight.
These imaginary surfaces, which do not represent the true surface, are regarded as fantasy surfaces.
SA-GS largely alleviates this problem.
Another major source of geometric error occurs at the edges of unbounded scenes.
However, this issue is common to all methods due to the sparsity of images at the edges and is not the focus of our current work.
Figure 4

Inconsistency problem.
The semantic segmentation results are sometimes inconsistent with previous judgments.
(a-b) Two tunnels are regarded as ground using GroundingSAM.
(c-d) However, in the images captured from a camera position immediately adjacent to them, the left tunnel is not regarded as ground.
This inconsistency between consecutive images is the primary cause of failure in naive reconstruction methods.
Figure 5

Our method pipeline consists of three main stages.
Firstly, we utilize the same input as 3DGS, but enhance it with semantic information extracted via Grounding SAM.
Next, we assess the geometric complexity of each semantic group by calculating high-frequency power.
Our geometric constraint is implemented through a soft regularization, facilitated by a semantic loss function. This guides the Gaussian shapes to match the expected shapes determined earlier.
The rendering loss further refines the shape and attributes of the 3DGS, while the shape constraint, indicated by a negative sign, ensures alignment between rendered and real images.
Controlling the shapes of different 3DGS is achieved by mapping their projected pixels onto the semantic map obtained earlier.
Additionally, by reducing the number of low-opacity Gaussian splats to the expected count, we minimize GPU memory consumption during training.
Finally, we offer a user-friendly point cloud extraction method via hierarchical probability density sampling.
① we create a multinomial distribution using the opacity values stored in each 3DGS.
② based on user inputs and the multinomial distribution, we determine the number of points to sample from each Gaussian distribution.
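Steps ① and ② can be sketched as a two-stage sampler. This is a minimal illustration under assumed data layouts (arrays of means, covariances, and opacities); `alpha_min` is a hypothetical parameter standing in for the dynamic alpha threshold that discards low-confidence Gaussians.

```python
import numpy as np

def sample_point_cloud(means, covs, opacities, n_points, alpha_min=0.1, seed=0):
    """Hierarchical probability-density sampling (illustrative sketch).

    means     : (N, 3) Gaussian centers.
    covs      : (N, 3, 3) Gaussian covariance matrices.
    opacities : (N,) alpha values in [0, 1].
    n_points  : total number of points requested by the user.
    alpha_min : Gaussians below this opacity are ignored, dropping
                low-confidence surfaces (hypothetical parameter name).
    """
    rng = np.random.default_rng(seed)
    keep = opacities >= alpha_min
    w = opacities * keep
    probs = w / w.sum()                      # multinomial over opacities (step 1)
    # Stage 1: allocate the user's point budget across Gaussians.
    counts = rng.multinomial(n_points, probs)
    # Stage 2: draw that many samples from each 3D Gaussian (step 2).
    pts = [rng.multivariate_normal(means[i], covs[i], size=c)
           for i, c in enumerate(counts) if c > 0]
    return np.concatenate(pts, axis=0)
```

Because the budget is allocated by a single multinomial draw, the user can request any total point count and high-opacity Gaussians automatically receive proportionally more samples.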
Figure 6

The comparison between ours and 3DGS. Ours largely sharpens the edges of the image.
(a) The tower merges together and is sharpened in ours.
(b) The noise around the tall building is eliminated in ours.
(c) shows that our steady alpha-decreasing strategy is successful.
Conclusion
We propose a semantic-aware geometric constraint algorithm that dynamically assigns expected shapes to Gaussian splats projected into different semantic groups.
We present an algorithm capable of computing the geometric complexity of Gaussian splats based on spectrum analysis.
Furthermore, we utilize geometric complexity measurement to determine the number of Gaussian splats.
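One plausible realization of the spectrum-based complexity measure is the fraction of a semantic region's image power above a frequency cutoff. The sketch below assumes grayscale input and a hypothetical `cutoff` value; the paper's exact formulation may differ.

```python
import numpy as np

def geometric_complexity(image, mask, cutoff=0.25):
    """High-frequency power of one semantic region (illustrative sketch).

    image  : (H, W) grayscale image as a float array.
    mask   : (H, W) boolean mask of the semantic group.
    cutoff : normalized frequency radius separating "low" from "high";
             the 0.25 default is an assumption, not from the paper.
    Returns the fraction of spectral power above the cutoff, so flat
    regions (e.g. ground) score lower than intricate ones (e.g. trees).
    """
    region = np.where(mask, image, 0.0)
    region = region - region[mask].mean() * mask  # remove DC inside the mask
    spec = np.abs(np.fft.fftshift(np.fft.fft2(region))) ** 2
    h, w = spec.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)  # normalized radius
    total = spec.sum() + 1e-12
    return float(spec[r > cutoff].sum() / total)
```

Such a score could then set both the expected flatness of a semantic group and the lower bound on its Gaussian count, as described above.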
Subsequently, we introduce a hierarchical probability density sampling method that can extract as many points as desired by users while maintaining a dynamic alpha value to mitigate the fantasy surface problem.
Limitations
① during training, we constrain the shape of all Gaussians that project onto the same pixel without explicitly ignoring Gaussians blocked by those with high opacity values in front of them. This may result in all Gaussians conforming to the shape of the semantic group that occupies the largest region in the scene.
② our algorithm relies on key semantics provided by users, which may sometimes be absent.
③ while the inconsistency between consecutive images can be addressed by our robust loss, the direct resolution of inconsistency in the 3D world itself has not been achieved.
