
3D Gaussian Splatting for Real-Time Radiance Field Rendering


Abstract

Radiance field methods have recently revolutionized novel-view synthesis for scenes captured with multiple photos or videos. However, achieving high visual quality still requires neural networks that are costly to train and render, and recent faster methods inevitably trade speed for quality. As of today, no existing solution delivers real-time display rates at 1080p resolution for unbounded, complete scenes.

We introduce three key elements that allow us to achieve state-of-the-art visual quality while maintaining competitive training times. Crucially, these elements also enable high-quality, real-time novel-view synthesis (at least 30 fps) at 1080p resolution.

First, starting from the sparse points produced during camera calibration, we represent the scene with 3D Gaussians that preserve the desirable properties of continuous volumetric radiance fields for scene optimization, while avoiding unnecessary computation in empty space.
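As an illustration, the following minimal PyTorch sketch shows how such an initialization from an SfM point cloud might look. The parameter layout (position, scale, rotation, opacity, color) and the nearest-neighbour scale heuristic follow the paper's description, but the function and tensor names are our own assumptions rather than the authors' code.

```python
import torch

def init_gaussians_from_sfm(points, colors):
    """Create initial 3D Gaussian parameters from a sparse SfM point cloud.

    Each point becomes an isotropic Gaussian whose initial scale is derived
    from the mean distance to its three nearest neighbours (illustrative).
    """
    n = points.shape[0]

    # Distances to the 3 nearest neighbours give a per-point isotropic scale.
    dists = torch.cdist(points, points)                    # (n, n)
    knn = dists.topk(4, largest=False).values[:, 1:]        # drop the self-distance
    scale = knn.mean(dim=1, keepdim=True).clamp(min=1e-7)   # (n, 1)

    return {
        "xyz":       torch.nn.Parameter(points.clone()),                        # Gaussian centers
        "log_scale": torch.nn.Parameter(scale.log().repeat(1, 3)),              # 3 per-axis scales
        "rotation":  torch.nn.Parameter(torch.cat(                              # unit quaternions (w, x, y, z)
            [torch.ones(n, 1), torch.zeros(n, 3)], dim=1)),
        "opacity":   torch.nn.Parameter(torch.logit(torch.full((n, 1), 0.1))),  # pre-sigmoid opacity
        "color":     torch.nn.Parameter(colors.clone()),                        # per-point RGB (SH DC term)
    }
```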

Second, we perform interleaved optimization and density control of the 3D Gaussians, notably optimizing an anisotropic covariance per Gaussian. This yields an accurate, compact representation of the scene.
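The covariance is not optimized as a raw 3x3 matrix; the paper factors it as Σ = R S Sᵀ Rᵀ from a per-Gaussian scaling vector and rotation quaternion, so that it stays positive semi-definite throughout gradient descent. A minimal PyTorch sketch of that parameterization (function names are ours):

```python
import torch
import torch.nn.functional as F

def quat_to_rotmat(q):
    """Convert unit quaternions (w, x, y, z) into 3x3 rotation matrices."""
    w, x, y, z = q.unbind(-1)
    rot = torch.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y),
        2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y),
    ], dim=-1)
    return rot.reshape(*q.shape[:-1], 3, 3)

def build_covariance(log_scale, rotation):
    """Compose the anisotropic covariance Sigma = R S S^T R^T from a scaling
    vector and a rotation quaternion, keeping Sigma positive semi-definite."""
    R = quat_to_rotmat(F.normalize(rotation, dim=-1))  # (n, 3, 3)
    S = torch.diag_embed(log_scale.exp())              # (n, 3, 3) diagonal scaling
    M = R @ S
    return M @ M.transpose(-1, -2)                      # (n, 3, 3) covariance
```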

Third, we develop a fast visibility-aware rendering algorithm that supports anisotropic splatting, which both accelerates training and enables real-time rendering. We achieve state-of-the-art visual quality and real-time performance on several established benchmark datasets.
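At the heart of that renderer, the projected, depth-sorted 2D splats are blended front to back per pixel; the fast implementation evaluates this per 16x16 pixel tile inside a CUDA kernel. A single-pixel PyTorch reference of the blending rule, for illustration only:

```python
import torch

def composite_pixel(colors, alphas):
    """Front-to-back alpha blending of depth-sorted splats covering one pixel:
    C = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j).

    The real renderer performs this per 16x16 tile in CUDA after sorting the
    projected anisotropic 2D Gaussians by depth; this is a reference version.
    """
    # Transmittance before each splat: product of (1 - alpha) over closer splats.
    transmittance = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alphas[:-1]]), dim=0)
    weights = alphas * transmittance                    # per-splat contribution
    return (weights.unsqueeze(-1) * colors).sum(dim=0)  # final pixel colour
```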

Figure

Figure 1

Our method achieves real-time rendering of radiance fields at a quality matching the state-of-the-art Mip-NeRF360 approach, while requiring only optimization times competitive with the fastest existing methods, Plenoxels and InstantNGP.

Central to this performance is the novel 3D Gaussian scene representation paired with a real-time differentiable renderer, which accelerates both scene optimization and novel-view synthesis.

With training times comparable to InstantNGP, our approach reaches similar quality; when trained for 51 minutes, it attains state-of-the-art quality, even slightly better than Mip-NeRF360.

Figure 2

Optimization starts from the sparse SfM point cloud, which is converted into an initial set of 3D Gaussians; we then optimize these Gaussians and adaptively control their density. During optimization we use our fast tile-based renderer, allowing training times competitive with state-of-the-art fast radiance-field methods. Once trained, our renderer supports real-time navigation for a wide variety of scenes.
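The loop in Figure 2 alternates rendering a training view, back-propagating a photometric loss, and adaptive density control. A structural PyTorch sketch is shown below; render_fn, ssim_fn, and densify_fn stand in for the paper's tile-based rasterizer, D-SSIM, and clone/split/prune routines, and the schedule constants are illustrative (the paper combines an L1 loss with a D-SSIM term weighted by 0.2).

```python
import torch

def train(gaussian_params, views, render_fn, ssim_fn, densify_fn, num_iters=30_000):
    """Sketch of the optimization loop: gaussian_params is a dict of tensors
    (positions, scales, rotations, opacities, colors), views is a list of
    (camera, ground_truth_image) pairs, and the callables are placeholders."""
    optimizer = torch.optim.Adam(gaussian_params.values(), lr=1e-3)
    for it in range(1, num_iters + 1):
        camera, gt_image = views[it % len(views)]      # pick a training photo
        image = render_fn(gaussian_params, camera)     # differentiable tile-based rasterizer
        # Photometric loss: L1 plus a D-SSIM term (weight 0.2 in the paper).
        loss = 0.8 * (image - gt_image).abs().mean() + 0.2 * (1.0 - ssim_fn(image, gt_image))
        loss.backward()
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
        # Interleaved adaptive density control: clone, split, and prune Gaussians.
        if it % 100 == 0:
            densify_fn(gaussian_params, optimizer)
    return gaussian_params
```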

Figure 3

We visualize the 3D Gaussians after optimization by shrinking them to 60% of their size (far right). This clearly shows the anisotropic shapes of the 3D Gaussians, which compactly represent complex geometry after optimization. The actual rendered image is shown on the left for reference.

Figure 4

Our adaptive Gaussian densification scheme.

Under-reconstruction: when small-scale geometry (black outline) is insufficiently covered, the respective Gaussian is cloned.

Over-reconstruction: when small-scale geometry is represented by one large splat, that Gaussian is split in two (a code sketch of this rule follows below).
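A minimal PyTorch sketch of the clone/split decision, assuming a per-Gaussian accumulated view-space positional gradient is available; the threshold values and the way split children are placed are illustrative simplifications, not the paper's exact procedure.

```python
import torch

def densify(xyz, log_scale, grad_accum, grad_thresh=2e-4, scale_thresh=0.01):
    """Clone/split rule from Figure 4: Gaussians with a large accumulated positional
    gradient are densified; small ones are cloned (under-reconstruction) and large
    ones are split into two smaller ones (over-reconstruction)."""
    needs_densify = grad_accum > grad_thresh                      # (n,) bool
    is_small = log_scale.exp().max(dim=-1).values < scale_thresh  # (n,) bool

    clone_mask = needs_densify & is_small
    split_mask = needs_densify & ~is_small
    keep = ~split_mask                                            # split originals are removed

    # Split: replace each large Gaussian by two children placed inside it,
    # with their scale reduced by a constant factor (1.6 in the paper).
    offset = torch.randn_like(xyz[split_mask]) * log_scale[split_mask].exp()
    shrunk = log_scale[split_mask] - torch.log(torch.tensor(1.6))

    new_xyz = torch.cat([
        xyz[keep],                 # untouched Gaussians (and the clone originals)
        xyz[clone_mask],           # clones, placed on top of the originals for now
        xyz[split_mask] + offset,  # first child of each split Gaussian
        xyz[split_mask] - offset,  # second child
    ])
    new_scale = torch.cat([log_scale[keep], log_scale[clone_mask], shrunk, shrunk])
    return new_xyz, new_scale
```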

Figure 5

We show comparisons of our method against previous approaches, together with the corresponding ground-truth images from held-out test views.

From top to bottom, the scenes consist of Bicycle, Garden, and Stump. Arrows or insets are used to highlight non-obvious differences in quality.

Figure 6

(a) Even after just 7K iterations (roughly 5 minutes for this scene), our method already produces a clear depiction of the train. At 30K iterations (roughly 35 minutes), the background artifacts are significantly reduced.

(b) For this scene, the difference is barely perceptible: 7K iterations (∼8 min) already reach very high quality.

Figure 7

Initialization with SfM points helps.

(a) initialization with a random point cloud.

(b) initialization using SfM points.

Figure 8

Ablation of the densification strategy for the two cases "clone" and "split".

Figure 9

Limiting the number of points that receive gradients has a significant effect on visual quality.

(a) limit of 10 Gaussians that receive gradients.

(b) our full method.

Figure 10

We trained with Gaussian anisotropy disabled and enabled. The use of anisotropic volumetric splats enables the modeling of fine structures and significantly improves visual fidelity. Note that for illustration purposes we limited both configurations to at most 5K Gaussians on the Ficus scene.

Figure 11

Comparison of failure artifacts on Train scene.

(a) Mip-NeRF360 has “floaters” and grainy appearance.

(b)

Figure 12

In views that have little overlap with the viewpoints observed during training (DrJohnson scene):

(a) our method may produce artifacts.

(b) Mip-NeRF360 also has artifacts in these cases.

DISCUSSION AND CONCLUSIONS

We have presented the first approach that enables real-time, high-quality radiance field rendering across a wide variety of scene types and capture styles, while keeping training times competitive with the fastest previous methods.

Our choice of a 3D Gaussian primitive preserves key properties of volumetric rendering for optimization, while directly enabling fast splat-based rasterization. Our results show that, contrary to commonly held belief, a discrete representation can support rapid and high-fidelity radiance field training.

Approximately 80% of our training time is spent executing Python code, since we built our solution in PyTorch to make our method easily accessible to other researchers. Only the rasterization routine is implemented as optimized CUDA kernels. We expect that porting the remaining optimization entirely to CUDA, as done e.g. in InstantNGP, could enable significant further speedups for applications where performance is essential.

We also highlighted the importance of adopting real-time rendering principles, exploiting the power of the GPU and the speed of the software rasterization pipeline architecture. These design choices are key to the performance of both training and real-time rendering, and give our method a competitive edge over previous volumetric ray-marching approaches.

Whether our Gaussians could be used to perform mesh reconstruction of the captured scene remains an interesting question. Beyond the practical implications this would have, given the widespread use of meshes, it would also help clarify where our method sits in the continuum between volumetric and surface-based representations.

In summary, we have introduced a pioneering real-time rendering approach for radiance fields, matching the rendering capabilities of the most expensive previous methods and achieving training efficiency on par with the fastest existing solutions.
