CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians

Abstract

Marked progress has been made in real-time 3D scene reconstruction and novel view synthesis, largely due to the development of 3D Gaussian Splatting (3DGS). Yet efficiently training a large-scale 3DGS model and rendering it in real time at different scales remains a significant challenge.

This study presents CityGaussian (CityGS), which combines a divide-and-conquer training approach with a Level-of-Detail (LoD) strategy for efficient large-scale 3DGS training and rendering.

Specifically, our method uses a global scene prior and adaptive training data selection, which enable efficient training and seamless fusion of the separately trained parts. Detail levels are generated by compressing the Gaussian primitives and, combined with the proposed block-wise detail level selection and merging strategy, allow content at different scales to be rendered efficiently and with high fidelity.

Extensive experiments on large-scale scenes demonstrate that our approach achieves state-of-the-art rendering quality and enables consistent real-time rendering across vastly different spatial scales.

project page

Figure

Figure 1

CityGS achieves state-of-the-art (SOTA) rendering fidelity on the small city scene of the MatrixCity dataset, which comprises 5620 training images and 740 test images.

(d) Here, Z denotes the camera height. Without Level-of-Detail (LoD), CityGS would have to render more than 20 million points, incurring significant VRAM and time costs.

(e)
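To put the VRAM cost mentioned in panel (d) into rough numbers, the following sketch estimates the memory needed just to hold the parameters of 20 million Gaussians. It assumes the standard 3DGS parameter layout with degree-3 spherical harmonics stored as 32-bit floats; whether CityGS uses exactly this layout is an assumption, and rasterizer buffers and optimizer state are not counted.

```python
# Rough parameter-memory estimate for a 3DGS scene stored as float32.
# Assumption: the standard 3DGS layout with degree-3 spherical harmonics.
FLOATS_PER_GAUSSIAN = (
    3      # position
    + 3    # scale
    + 4    # rotation quaternion
    + 1    # opacity
    + 48   # SH colour coefficients: 16 per channel x 3 channels
)
BYTES_PER_FLOAT = 4


def gaussian_memory_bytes(num_gaussians: int) -> int:
    """Bytes needed to hold the raw parameters of `num_gaussians` Gaussians."""
    return num_gaussians * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT


if __name__ == "__main__":
    n = 20_000_000
    print(f"{gaussian_memory_bytes(n) / 1e9:.2f} GB for {n:,} Gaussians")
```

Under these assumptions the parameters alone take about 4.7 GB, before any per-frame working memory, which illustrates why rendering every point at full detail is costly.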

Figure 2

Training of CityGS. The foreground region is framed with a pink square; this facilitates the subsequent contraction and Gaussian partitioning and makes the partition more uniform. Valid poses are then selected according to the contribution of the projected content of a specific block. The Gaussians trained in parallel are finally merged to represent the complete scene.
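The contraction and partitioning steps in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the foreground box is mapped linearly to [-1, 1] on each axis, that background points are squashed into the shell between 1 and 2 with a mip-NeRF-360-style contraction, and that blocks are cells of a uniform grid over the contracted space. The function names, the grid size, and the exact contraction formula are all hypothetical.

```python
def contract(p, fg_min, fg_max):
    """Map a 3D point into contracted space.

    Foreground points land linearly in [-1, 1]^3. Background points are
    squashed so that their infinity-norm stays below 2. The exact formula
    is an assumption made for this sketch.
    """
    # Normalise so the foreground box maps to [-1, 1] on each axis.
    q = [2.0 * (x - lo) / (hi - lo) - 1.0
         for x, lo, hi in zip(p, fg_min, fg_max)]
    norm = max(abs(c) for c in q)  # infinity-norm
    if norm <= 1.0:
        return q  # foreground: left unchanged
    scale = (2.0 - 1.0 / norm) / norm
    return [c * scale for c in q]


def block_index(p, fg_min, fg_max, grid=(2, 2, 1)):
    """Return the (i, j, k) grid cell of a point in contracted space [-2, 2]^3."""
    q = contract(p, fg_min, fg_max)
    idx = []
    for c, n in zip(q, grid):
        cell = int((c + 2.0) / 4.0 * n)
        idx.append(min(max(cell, 0), n - 1))  # clamp against rounding at the edges
    return tuple(idx)


if __name__ == "__main__":
    fg_min, fg_max = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)
    print(contract((5.0, 5.0, 5.0), fg_min, fg_max))   # centre of the foreground
    print(contract((25.0, 5.0, 5.0), fg_min, fg_max))  # far background point
    print(block_index((25.0, 5.0, 5.0), fg_min, fg_max, grid=(4, 4, 1)))
```

Because the contraction bounds the whole scene, a fixed grid covers arbitrarily distant background while keeping most cells in the foreground, which is one plausible reason a contraction step would make the partition more uniform.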

Figure 3

Rendering of CityGS. Based upon optimized dense Gaussians, we produce detail levels employing various compression rates $r_{1}$, $r_{2}$, and $r_{3}$.

During rendering, all Gaussians within the same block share one detail level, which is determined by the block's distance from the camera.

Because blocks are partitioned in contracted space, some of them have irregular shapes. Their bounding boxes are therefore estimated after removing floaters. Whether the camera's view frustum intersects a block's estimated bounding box determines whether that block is fed into the rasterizer.
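The per-block selection described in this caption can be sketched as below. This is an illustrative sketch under stated assumptions, not the CityGS code: it assumes the visibility test is between the camera's view frustum and each block's axis-aligned bounding box, that the distance is measured to the centre of that box, and that detail levels switch at fixed distance thresholds. The function names, the plane convention, and the threshold values are hypothetical.

```python
import math


def select_lod(block_center, cam_pos, thresholds=(50.0, 150.0)):
    """Pick a detail level from the block-to-camera distance.

    Level 0 is the finest; each threshold exceeded moves one level coarser.
    """
    d = math.dist(block_center, cam_pos)
    return sum(1 for t in thresholds if d > t)


def aabb_intersects_frustum(box_min, box_max, planes):
    """Conservative box-vs-frustum test.

    Each plane is (a, b, c, d) with the inside satisfying a*x + b*y + c*z + d >= 0.
    A box is rejected only if it lies entirely outside at least one plane, so
    a few boxes near frustum corners may be kept unnecessarily.
    """
    for a, b, c, d in planes:
        # Corner of the box that lies farthest along the plane normal.
        px = box_max[0] if a >= 0 else box_min[0]
        py = box_max[1] if b >= 0 else box_min[1]
        pz = box_max[2] if c >= 0 else box_min[2]
        if a * px + b * py + c * pz + d < 0:
            return False
    return True


def gather_blocks(blocks, cam_pos, planes, thresholds=(50.0, 150.0)):
    """Return (block name, detail level) for every block that passes culling."""
    selected = []
    for blk in blocks:
        if not aabb_intersects_frustum(blk["min"], blk["max"], planes):
            continue
        center = [(lo + hi) / 2.0 for lo, hi in zip(blk["min"], blk["max"])]
        selected.append((blk["name"], select_lod(center, cam_pos, thresholds)))
    return selected
```

A renderer built this way would then concatenate, for each selected block, the Gaussians stored at the chosen detail level and pass the result to the rasterizer.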

Figure 4

Qualitative comparison with SOTA methods on real-scene datasets.

Figure 5

Qualitative comparison with SOTA methods on MatrixCity dataset.

Figure 6

Validation of the LoD strategy. We evaluate rendering performance under varying camera heights.

The left plot shows the average FPS and the right plot shows the minimum FPS.

The shaded area below the 25 FPS line marks the non-real-time zone.
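The two metrics in this figure can be computed from per-frame render times as in the sketch below. The source does not state how the average is defined, so this sketch assumes average FPS is the total frame count divided by the total render time, and minimum FPS is the reciprocal of the slowest frame. The function name is hypothetical; the 25 FPS threshold is the one named in the caption.

```python
def fps_stats(frame_times_s, realtime_fps=25.0):
    """Summarise per-frame render times (in seconds).

    Returns (average FPS, minimum FPS, whether the minimum stays real-time).
    """
    if not frame_times_s:
        raise ValueError("need at least one frame time")
    avg_fps = len(frame_times_s) / sum(frame_times_s)
    min_fps = 1.0 / max(frame_times_s)  # the slowest frame sets the minimum
    return avg_fps, min_fps, min_fps >= realtime_fps


if __name__ == "__main__":
    avg_fps, min_fps, realtime = fps_stats([0.01, 0.02, 0.05])
    print(f"average {avg_fps:.1f} FPS, minimum {min_fps:.1f} FPS, real-time: {realtime}")
```

Reporting the minimum alongside the average matters because a trajectory can have a comfortable average while individual frames still fall into the non-real-time zone.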

Figure 7

Demonstration of city scene manipulation enabled by the explicit representation of CityGS.

(a) The dome of the original building in the first row is repainted, as shown in the second row.

(b) The structure in the first row is removed and replaced, as shown in the second row.

(c) The vehicles in the left images are moved to new positions, shown in the right images, to simulate desired traffic scenarios. Such vehicle repositioning is difficult for NeRF-based approaches.

Figure S1

Additional qualitative comparisons with SOTA methods on the MatrixCity dataset.

Figure S2

Qualitative assessment of boundary consistency on both synthetic and real datasets when the visible Gaussians span multiple blocks.

The point distributions in the contracted space are shown alongside the corresponding rendered images.

In the point distributions, blue points denote all Gaussians and red points denote the visible Gaussians. The grey grid illustrates the block partition in the contracted space.

Conclusions

In this study, we introduce CityGS, which achieves real-time large-scale scene reconstruction with high fidelity. With a partitioning and Level-of-Detail (LoD) strategy designed specifically for Gaussian primitives, we achieve SOTA rendering fidelity across standard benchmarks while notably reducing the time cost of rendering scenes at vastly different scales.

However, the method implicitly assumes a static scene, which significantly constrains its generalization capability. It also cannot improve performance by combining different viewpoints, such as aerial and street views. These issues call for further investigation.
