Recent Advances in 3D Gaussian Splatting

Abstract

3D Gaussian Splatting (3DGS) has significantly improved the rendering performance of novel view synthesis.

Whereas neural implicit representations such as NeRF model a 3D scene with position- and viewpoint-conditioned neural networks, 3DGS models the scene with a collection of Gaussian ellipsoids, which can be rendered efficiently by rasterizing them into images.

Apart from the fast rendering speed, the explicit representation of 3DGS facilitates downstream tasks like dynamic reconstruction, geometry editing, and physical simulation.

Figure 1

Structure of the literature review and taxonomy of 3DGS.

Figure 2

A brief timeline of representative works with the 3DGS representation.

Figure 3

Overview of GaussianPro. The consistency of the normal directions among neighboring views is recognized as a critical factor in achieving accurate 3D surface reconstruction.

Figure 4

Pipeline of EAGLES. Vector Quantization (VQ) is applied to compress the per-Gaussian attributes and reduce storage.

Figure 5

Results of Deformable3DGS. Given a set of monocular images captured from different viewpoints, it reconstructs the scene across both space and time and surpasses existing methods such as HyperNeRF in rendering quality.

Figure 6

Results of GaussianObject. Using only four views, GaussianObject achieves high-quality 3D reconstruction and improves on both earlier NeRF-based methods and other recent approaches.

Figure 7

Pipeline of Mesh-based Gaussian Splatting for Real-time Large-scale Deformation. Binding 3D Gaussians to a mesh enables large-scale geometric editing.

Figure 8

By decomposing materials and lighting, GS-IR supports appearance editing such as relighting and material-level modification.

Figure 9

Pipeline of PhysGaussian. By treating the 3D Gaussians as a continuum, the method produces physically plausible simulation results.

Figure 10

Geometry reconstruction results by SuGaR.

Figure 11

Results of GPS-Gaussian, MonoGaussianAvatar, and MANUS, which explore 3DGS-based approaches to full-body, head, and hand modeling, respectively.

Figure 12

Text-to-3D generation results of LucidDreamer, which distills generative priors from pre-trained 2D diffusion models and uses the Interval Score Matching (ISM) objective to generate 3D content from a given text prompt.

Figure 13

4D generation results of AYG [264], showing diverse 4D motion sequences. Dotted lines visualize the dynamics of the deformation field.

Figure 14

Representative works based on the 3DGS representation across multiple tasks.

3D representations

Mesh

A mesh is composed of a collection of vertices, edges, and faces, and can represent complex geometric detail at a relatively low storage cost.
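As a minimal illustration of this structure, the sketch below stores a triangle mesh as a vertex list plus a face list and derives the edges from the faces. The class name `TriangleMesh` and its layout are assumptions made for this example, not the format of any particular library.

```python
from dataclasses import dataclass


@dataclass
class TriangleMesh:
    """Hypothetical minimal mesh container used only for illustration."""
    vertices: list  # list of (x, y, z) tuples
    faces: list     # list of (i, j, k) vertex-index triples

    def edges(self):
        """Derive the unique undirected edges from the face list."""
        unique = set()
        for i, j, k in self.faces:
            for a, b in ((i, j), (j, k), (k, i)):
                unique.add((min(a, b), max(a, b)))
        return sorted(unique)


# A tetrahedron: 4 vertices shared by 4 triangular faces.
tetra = TriangleMesh(
    vertices=[(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)],
    faces=[(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)],
)
```

Because faces index into a shared vertex list, each vertex is stored once no matter how many faces use it, which is one reason the representation is compact.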

As one of the most widely used 3D representations in industry, it delivers high-fidelity results in real time when paired with physically based materials.

However, most meshes are created by artists or other creators, which requires considerable effort. Although neural networks have been used to automate mesh generation, the generative ability of such systems remains constrained by the limited amount of available data.

SDF and NeRF

Both SDF and NeRF are built on implicit neural fields, which can be learned automatically from multi-view images. Explicit geometry, such as a mesh, can then be extracted from an SDF or NeRF with the marching cubes algorithm. Consequently, SDF and NeRF offer significant advantages in tasks such as inverse rendering, where accurate surface representation is crucial.
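To make the surface-extraction idea concrete, the sketch below evaluates an analytic signed distance function and locates the zero crossing along one grid edge by linear interpolation, which is the basic per-edge operation inside marching cubes. The analytic sphere stands in for a learned field, and the function names are assumptions for this example; a full marching cubes implementation also needs the per-cell triangulation table, which is omitted here.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius


def edge_crossing(p0, p1, sdf):
    """Linearly interpolate the zero crossing of sdf along the edge p0 -> p1.

    Returns None when both endpoints lie strictly on the same side of the surface.
    """
    d0, d1 = sdf(p0), sdf(p1)
    if d0 * d1 > 0.0:
        return None
    if d0 == d1:  # both values are exactly zero: the edge lies on the surface
        return p0
    t = d0 / (d0 - d1)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

For example, `edge_crossing((0.5, 0.0, 0.0), (1.5, 0.0, 0.0), sphere_sdf)` places the surface vertex halfway along the edge, where the unit sphere intersects it.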

However, because rendering requires dense sampling in 3D space, it is comparatively slow, which can severely limit the practicality of these methods on consumer-grade hardware. SDF and NeRF are also difficult to apply to dynamic scene reconstruction because of their implicit representation.
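The cost of dense sampling can be seen in a small sketch of the standard volume rendering quadrature for a single ray. Here `toy_field` is a hypothetical stand-in for a trained network, and the sample count and near/far bounds are illustrative assumptions rather than settings from any specific method.

```python
import math


def render_ray(origin, direction, field, near=0.0, far=4.0, num_samples=64):
    """Numerical quadrature of the volume rendering integral along one ray.

    field(point) returns (density, (r, g, b)); in NeRF this is a network query.
    """
    delta = (far - near) / num_samples
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for i in range(num_samples):
        t = near + (i + 0.5) * delta  # midpoint of the i-th interval
        point = tuple(o + t * d for o, d in zip(origin, direction))
        density, rgb = field(point)
        alpha = 1.0 - math.exp(-density * delta)
        for k in range(3):
            color[k] += transmittance * alpha * rgb[k]
        transmittance *= 1.0 - alpha
    return color, transmittance


def toy_field(point):
    """Hypothetical stand-in for a trained network: a dense red unit sphere."""
    inside = math.dist(point, (0.0, 0.0, 0.0)) < 1.0
    return (50.0 if inside else 0.0), (1.0, 0.0, 0.0)
```

Every pixel needs one such loop, so an 800 x 800 image at 64 samples per ray already requires about 41 million field queries, each of which is a network evaluation in a neural implicit method.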

3DGS

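Unlike a traditional mesh, 3DGS is also an explicit geometric representation but uses no edges or faces to connect its Gaussians. To compensate for the missing connectivity, each Gaussian carries anisotropic scale parameters, so that neighboring Gaussians fill the gaps between them and produce realistic novel views.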
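The anisotropic scale of a Gaussian is commonly expressed through a covariance matrix factored as Sigma = R S S^T R^T, where S is a diagonal scale matrix and R is a rotation. The sketch below builds such a covariance from three per-axis scales and a unit quaternion; the function names and the (w, x, y, z) quaternion order are assumptions for this example.

```python
import math


def quat_to_rotation(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scale, quat):
    """Build Sigma = R S S^T R^T from per-axis scales and a rotation."""
    R = quat_to_rotation(quat)
    # M = R @ diag(scale): column j of R is multiplied by scale[j].
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T, which is symmetric positive semi-definite by construction.
    return [
        [sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

Parameterizing the covariance this way keeps it a valid covariance matrix for any choice of scales and quaternion, which is convenient when the parameters are optimized by gradient descent.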

With rasterization-based rendering, 3DGS enables real-time 3D scene visualization on consumer-grade devices, which facilitates applications including large-scale scene reconstruction, simultaneous localization and mapping (SLAM), and generative tasks requiring computational efficiency.
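As a rough sketch of the rasterization step, the snippet below composites depth-sorted Gaussians at a single pixel with front-to-back alpha blending. It assumes the Gaussians have already been projected to 2D, and the dictionary layout, the function names, and the 0.99 alpha cap are illustrative assumptions rather than the reference CUDA rasterizer.

```python
import math


def gaussian_weight(px, py, mean, inv_cov):
    """Evaluate an unnormalized 2D Gaussian at pixel (px, py).

    inv_cov holds the inverse 2x2 covariance as (a, b, c) for [[a, b], [b, c]].
    """
    dx, dy = px - mean[0], py - mean[1]
    a, b, c = inv_cov
    return math.exp(-0.5 * (a * dx * dx + 2.0 * b * dx * dy + c * dy * dy))


def composite_pixel(px, py, splats):
    """Front-to-back alpha blending of projected Gaussians at one pixel.

    Each splat is a dict with keys: mean, inv_cov, opacity, color, depth.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for s in sorted(splats, key=lambda s: s["depth"]):  # nearest first
        weight = gaussian_weight(px, py, s["mean"], s["inv_cov"])
        alpha = min(0.99, s["opacity"] * weight)  # cap assumed for stability
        for k in range(3):
            color[k] += transmittance * alpha * s["color"][k]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # stop once the pixel is effectively opaque
            break
    return color, transmittance
```

Only the Gaussians that overlap a pixel contribute to it, and the loop can stop early once the pixel is opaque, which contrasts with sampling a field densely along every ray.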

Additionally, because its geometry is explicit, points can be reprojected flexibly between viewpoints. This simplifies tasks such as joint geometry reconstruction and camera pose optimization, and makes dynamic reconstruction, SLAM, and large-scale scene reconstruction more tractable.
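Point reprojection between viewpoints can be sketched with a simple pinhole camera model. The code below assumes the convention x_cam = R x_world + t and cameras described by a dictionary with keys `R`, `t`, `fx`, `fy`, `cx`, `cy`; these names and conventions are assumptions for this example only.

```python
def project(point_world, R, t, fx, fy, cx, cy):
    """Project a world-space point into a pinhole camera (x_cam = R x + t)."""
    xc = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i] for i in range(3)]
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return (u, v), xc[2]


def unproject(pixel, depth, R, t, fx, fy, cx, cy):
    """Lift a pixel with known depth back to world space (x = R^T (x_cam - t))."""
    xc = [(pixel[0] - cx) / fx * depth, (pixel[1] - cy) / fy * depth, depth]
    d = [xc[i] - t[i] for i in range(3)]
    return tuple(sum(R[j][i] * d[j] for j in range(3)) for i in range(3))


def reproject(pixel, depth, cam_a, cam_b):
    """Map a pixel seen in camera A (with depth) to its pixel and depth in camera B."""
    return project(unproject(pixel, depth, **cam_a), **cam_b)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cam_a = {"R": identity, "t": (0.0, 0.0, 0.0), "fx": 100.0, "fy": 100.0, "cx": 50.0, "cy": 50.0}
# Camera B sits one unit to the right of camera A, so points shift by -1 in x.
cam_b = {"R": identity, "t": (-1.0, 0.0, 0.0), "fx": 100.0, "fy": 100.0, "cx": 50.0, "cy": 50.0}
```

With an explicit representation, each Gaussian center already has a 3D position, so the unprojection step is unnecessary and only the projection into each view is needed.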

However, owing to the discrete geometric representation, the geometry reconstruction quality of current 3DGS-based methods is only on par with earlier SDF-based approaches such as NeuS. There is potential in integrating complementary 3D representations with the 3DGS framework, which could benefit applications such as autonomous driving and animation workflows.

Challenges

Robust and generalizable novel view synthesis

Although 3DGS achieves realistic novel view synthesis, its reconstruction quality tends to degrade significantly in difficult cases such as sparse input views, complex lighting, and large-scale scenes.

Although many methods have attempted to address these cases, there is still room for improvement, and making reconstruction robust across diverse inputs remains a critical challenge.

A generalizable reconstruction pipeline, with or without data priors, would be a valuable development, since it could substantially reduce training cost.

Geometry reconstruction

While significant effort has gone into improving rendering quality, fewer approaches address geometry and surface reconstruction with the 3DGS representation. The limited geometry quality of 3DGS stems from its discrete nature, which remains a disadvantage compared with continuous implicit representations such as NeRF and SDF.

Independent and efficient 3D editing

A few methods have explored the area of editing 3DGS's geometry, texture, and lighting.

However, these methods either cannot decompose geometry, texture, and lighting precisely or require re-optimizing the Gaussians' attributes. Consequently, they still lack independent editing capabilities or are inefficient during editing.

A promising direction is to use more advanced rendering techniques to decompose geometry, texture, and lighting for independent editing, and to establish a link between 3DGS and mesh-based representations so that editing becomes efficient.

Realistic 4D generation

With the help of Score Distillation Sampling (SDS) losses and Stable Diffusion (SD), generative models that use 3DGS as their representation can produce faithful 3D results.

However, the 4D generation results achieved by current methods do not fully capture realistic geometric features, visual characteristics, and physics-based movement.

Incorporating priors, such as those from video generative models and physical laws, could potentially enhance the quality of generated 4D content.

Platform

Most current 3DGS methods and frameworks, such as GauStudio, are implemented in Python on top of CUDA-enabled PyTorch, which may limit their usability on other platforms.

Reimplementing them in other deep learning frameworks, such as TensorFlow and Jittor, could make deployment on additional platforms easier.
