MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
Abstract
We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS) that can efficiently reconstruct unseen scenes.
- We leverage MVS to encode geometry-aware Gaussian representations and decode them into Gaussian parameters.
- To further enhance performance, we propose a hybrid Gaussian rendering that integrates an efficient volume rendering design for novel view synthesis.
- To support fast fine-tuning for specific scenes, we introduce a multi-view geometric consistent aggregation strategy to effectively aggregate the point clouds generated by the generalizable model, serving as the initialization for per-scene optimization.
Compared with previous generalizable NeRF-based methods, which typically require minutes of fine-tuning and seconds of rendering per image, MVSGaussian achieves real-time rendering with better synthesis quality for each scene.
Compared with the vanilla 3DGS, MVSGaussian achieves better view synthesis with less training computational cost. Extensive experiments on DTU, Real Forward-facing, NeRF Synthetic, and Tanks and Temples datasets validate that MVSGaussian attains state-of-the-art performance with convincing generalizability, real-time rendering speed, and fast per-scene optimization.
Figure 1

Comparison with existing methods.
(a) We present generalization results on the Real Forward-facing dataset. Compared with other competitors, our method achieves better performance at a faster inference speed.
(b) The results after per-scene optimization, where circle size represents optimization time. Our method achieves optimal performance in just 45 seconds.
(c) We illustrate a scene (“room”), showcasing the (PSNR/optimization time) of synthesized views, with “-” indicating results from direct inference using the generalizable model.
Figure 2

Overview of MVSGaussian. We first extract features from the input source views using an FPN. These features are aggregated into a cost volume, which is regularized by 3D CNNs to produce a depth map. Then, for each 3D point at the estimated depth, a pooling network aggregates the warped source features into an aggregated feature, which is further enhanced by a 2D UNet. The enhanced feature is decoded into Gaussian parameters for splatting, and also into volume density and radiance for depth-aware volume rendering. Finally, the two rendered images are averaged to produce the final rendered result.
Figure 3

Consistent aggregation. With the depth maps and point clouds produced by the generalizable model, we first conduct geometric consistency checks on the depths to derive masks that filter out unreliable points. The filtered point clouds are then concatenated into a single point cloud, which serves as the initialization for per-scene optimization.
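As a concrete illustration of this step, here is a hedged NumPy sketch of a per-view consistency check followed by concatenation of the surviving points. The function names, thresholds, and the simplified one-pass depth comparison are assumptions for exposition; the paper's exact check (e.g., a full forward-backward reprojection test) may differ.

```python
import numpy as np


def consistency_mask(depth_ref, K_ref, c2w_ref, depth_src, K_src, c2w_src,
                     rel_depth_thresh=0.01):
    """Mark reference pixels whose depth agrees with a source view's depth.

    Simplified check: lift each reference pixel to 3D, project it into the
    source view, and compare the projected depth with the source view's
    predicted depth at that location. The threshold is illustrative.
    """
    h, w = depth_ref.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)

    # Back-project reference pixels to world space (c2w_* are 4x4 camera-to-world).
    cam_ref = (pix * depth_ref.reshape(-1, 1)) @ np.linalg.inv(K_ref).T
    cam_ref_h = np.concatenate([cam_ref, np.ones((cam_ref.shape[0], 1))], axis=-1)
    world = (cam_ref_h @ c2w_ref.T)[:, :3]

    # Project world points into the source view.
    world_h = np.concatenate([world, np.ones((world.shape[0], 1))], axis=-1)
    cam_src = (world_h @ np.linalg.inv(c2w_src).T)[:, :3]
    proj = cam_src @ K_src.T
    z = proj[:, 2]
    u_src = np.round(proj[:, 0] / np.clip(z, 1e-6, None)).astype(int)
    v_src = np.round(proj[:, 1] / np.clip(z, 1e-6, None)).astype(int)

    in_bounds = (z > 0) & (u_src >= 0) & (u_src < w) & (v_src >= 0) & (v_src < h)
    d_src = np.zeros_like(z)
    d_src[in_bounds] = depth_src[v_src[in_bounds], u_src[in_bounds]]

    rel_err = np.abs(z - d_src) / np.clip(d_src, 1e-6, None)
    mask = in_bounds & (rel_err < rel_depth_thresh)
    return world.reshape(h, w, 3), mask.reshape(h, w)


def aggregate_point_clouds(per_view_points, per_view_masks):
    """Concatenate the reliable points from every view into one cloud that
    initializes per-scene optimization."""
    kept = [pts.reshape(-1, 3)[m.reshape(-1)]
            for pts, m in zip(per_view_points, per_view_masks)]
    return np.concatenate(kept, axis=0)
```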
Figure 4

Qualitative comparison of rendering quality under generalization and 3-view settings with state-of-the-art methods.
Figure 5

Qualitative comparison of rendering quality with state-of-the-art methods after per-scene optimization.
Figure 6

Analysis of the optimization process.
(a) The evolution of view quality (PSNR) on the Real Forward-facing dataset during the first 2000 iterations of our method and 3DGS.
(b) Qualitative comparison of our method and 3DGS on the “trex” scene, where (PSNR/iteration number) is shown.
Figure 7

Visualization of camera calibration and point cloud reconstruction by COLMAP.
Figure 8

Depth map visualization. We visualize the depth maps predicted by our method on different datasets.
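These depth maps come from the cost-volume pipeline shown in the overview (Figure 2). The sketch below shows the standard soft-argmin depth regression commonly used in MVS networks, assuming a regularized cost volume and a set of depth hypotheses; the exact regression used by MVSGaussian may differ.

```python
import torch


def soft_argmin_depth(cost_volume: torch.Tensor, depth_hypotheses: torch.Tensor) -> torch.Tensor:
    """Regress a depth map as the probability-weighted average of depth hypotheses.

    cost_volume:      (B, D, H, W) matching costs after 3D CNN regularization
    depth_hypotheses: (D,) depth values sampled within the scene's depth range
    """
    prob = torch.softmax(-cost_volume, dim=1)  # lower cost -> higher probability
    depth = (prob * depth_hypotheses.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W)
    return depth


# Example usage with 64 hypotheses uniformly sampled between near and far planes.
cost = torch.randn(1, 64, 128, 160)
hyps = torch.linspace(0.5, 10.0, 64)
depth_map = soft_argmin_depth(cost, hyps)  # (1, 128, 160)
```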
Figure 9

Qualitative comparison of rendering quality with state-of-the-art methods under the generalization and 3-view settings.
Figure 10

Point cloud visualization under different aggregation strategies.
Figure 11

Qualitative comparison of rendering quality with state-of-the-art methods after per-scene optimization.
Conclusion and Limitations
We present MVSGaussian, an efficient generalizable Gaussian Splatting approach. Specifically, we leverage MVS to infer depth, establishing a pixel-aligned Gaussian representation. To enhance generalization, we propose a hybrid rendering approach that integrates depth-aware volume rendering. Besides, thanks to high-quality initialization, our models can be fine-tuned quickly for specific scenes.
Compared with generalizable NeRFs, which typically require minutes of fine-tuning and seconds of rendering per image, MVSGaussian achieves real-time rendering with superior synthesis quality.
Compared with 3DGS, MVSGaussian achieves better view synthesis with reduced training time.
As our method relies on MVS for depth estimation, it inherits limitations from MVS, such as decreased depth accuracy in areas with weak textures or specular reflections, resulting in degraded view quality.
