Reducing the Memory Footprint of 3D Gaussian Splatting

Abstract

3D Gaussian splatting (3DGS) offers outstanding visual quality for novel view synthesis, with fast training and real-time rendering; unfortunately, the memory this method requires for storage and transmission is unreasonably high. We first analyze the reasons for this and identify three main areas where storage can be reduced:

① the number of 3D Gaussian primitives used to represent a scene.

② the number of spherical harmonics (SH) coefficients used to represent directional radiance.

③ the precision required to store Gaussian primitive attributes.

We present a solution to each of these issues.

First, we introduce an efficient, resolution-aware primitive pruning technique that halves the number of primitives.

Second, we propose an adaptive adjustment method that chooses the number of coefficients used to represent directional radiance for each Gaussian primitive.

Finally, we apply a codebook-based quantization method, together with a half-float representation, for further memory reduction.

Taken together, the three components achieve a 27× reduction in overall disk size on the standard datasets we tested, along with a 1.7× speedup in rendering.

We evaluate our approach on standard benchmark datasets and show that it substantially reduces download times, particularly on a mobile device.

Figure 1

Left: screen capture from a smartphone running a modified version of gsplat.js on the bicycle scene.

Right: the same scene processed with our method, with a significantly lower memory footprint and shorter download time.

Figure 2

3DGS creates high-quality 3D representations from input images and a sparse point cloud.

Rendering the Gaussians as ellipsoids rather than splats shows that each scene is composed of millions of primitives.

Each primitive stores a significant amount of information: position p, rotation quaternion q, scale s, opacity α, color channels, and three spherical harmonics bands. Together, these lead to a substantial increase in memory demand for 3DGS scenes.
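The per-primitive storage described in this caption can be tallied with a short calculation. The sketch below is illustrative only: it assumes every attribute is stored as a 32-bit float and that color uses the base RGB term plus three SH bands (16 coefficients per channel). The names `FLOATS`, `bytes_per_primitive`, and `scene_megabytes` are hypothetical and do not come from the paper or any 3DGS codebase.

```python
# Assumption: all attributes are stored as 32-bit floats.
BYTES_PER_FLOAT = 4
# Three SH bands on top of the base RGB (band 0) term.
SH_BANDS = 3

# Number of floats per attribute of one Gaussian primitive.
FLOATS = {
    "position": 3,
    "rotation_quaternion": 4,
    "scale": 3,
    "opacity": 1,
    # (bands + 1)^2 coefficients per color channel, 3 channels.
    "color_sh": 3 * (SH_BANDS + 1) ** 2,
}


def bytes_per_primitive() -> int:
    """Uncompressed size of a single primitive in bytes."""
    return sum(FLOATS.values()) * BYTES_PER_FLOAT


def scene_megabytes(num_primitives: int) -> float:
    """Uncompressed size of a scene in megabytes (10^6 bytes)."""
    return num_primitives * bytes_per_primitive() / 1e6


if __name__ == "__main__":
    print(bytes_per_primitive())       # bytes for one primitive
    print(scene_megabytes(1_000_000))  # size of a one-million-primitive scene
```

Under these assumptions the SH coefficients account for 48 of the 59 floats per primitive, which is why the number of SH bands is one of the three targets for reduction.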

Figure 3

Left: During training, every 1K iterations, our method computes a redundancy score in space. This score is then projected onto the primitives, and redundant primitives are identified and removed at each of these steps.

Middle: At 15,000 iterations, once densification has ended, our method evaluates the SH coefficients to identify which primitives can be represented with 0 (just RGB), 1, 2, or 3 SH bands. This avoids storing unnecessary SH coefficients.

Right: After training finishes, we apply codebook quantization to all primitive attributes except position.

The figure shows the relative reduction achieved by each stage; together they give a 27× reduction in memory consumption with an average PSNR decrease of 0.21 dB across all our datasets.
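As a rough illustration of the codebook step, the sketch below quantizes a single scalar attribute with a plain 1-D k-means and stores the codebook entries in half precision. This is a minimal sketch under stated assumptions, not the authors' implementation: the choice of k-means, the per-attribute scalar codebook, and the names `to_half`, `assign`, `build_codebook`, and `dequantize` are all hypothetical. It also assumes `num_codes` does not exceed the number of distinct values.

```python
import random
import struct


def to_half(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision (16 bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def assign(values, codebook):
    """Index of the nearest codebook entry for every value."""
    return [
        min(range(len(codebook)), key=lambda k: abs(v - codebook[k]))
        for v in values
    ]


def build_codebook(values, num_codes, iterations=20, seed=0):
    """1-D k-means; returns (codebook, indices) approximating `values`."""
    rng = random.Random(seed)
    # Initialize the codes from distinct input values.
    codebook = rng.sample(sorted(set(values)), num_codes)
    for _ in range(iterations):
        indices = assign(values, codebook)
        # Move each code to the mean of the values assigned to it.
        for k in range(num_codes):
            members = [v for v, i in zip(values, indices) if i == k]
            if members:
                codebook[k] = sum(members) / len(members)
    # Assumption: codebook entries are stored as half floats.
    codebook = [to_half(c) for c in codebook]
    return codebook, assign(values, codebook)


def dequantize(codebook, indices):
    """Reconstruct approximate values from the codebook and indices."""
    return [codebook[i] for i in indices]
```

After quantization, each primitive stores only a small integer index per attribute, and the shared codebook is stored once per scene.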

Figure 4

Our resolution- and scale-aware redundancy metric assesses how important a Gaussian is for representing the scene.

a) Each camera can capture details at a certain resolution. Moving further away from the camera, the spatial resolution decreases.

b) With several cameras, multiple resolution levels are available for a given primitive in the scene.

c) For each Gaussian g_{i}, we consider the highest resolution a_{min}^{i} available from the input cameras, count the number of Gaussians in that region, and remove a fraction of the Gaussians that intersect regions influenced by more than K other Gaussians.

In this example, Gaussian g_{0} will not be pruned because there is at least one region, a_{min}^{0}, influenced by no other Gaussian. Gaussian g_{2}, in contrast, intersects regions a_{min}^{1,2,3}. All of these regions are influenced by many Gaussians, so g_{2} is a good candidate for pruning.
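The pruning rule in this caption can be sketched in a simplified 1-D setting. The code below is an illustration under strong assumptions, not the paper's algorithm: each Gaussian is reduced to a center, an influence radius (`extent`), and the radius of its finest-resolution region (`a_min`), and the score of a Gaussian is taken as the smallest number of other Gaussians over all regions it touches. All names are hypothetical, and scores are not recomputed after removals.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    center: float  # 1-D position, for illustration only
    extent: float  # radius within which the Gaussian has influence
    a_min: float   # radius of its finest-resolution region


def influences(g: Gaussian, owner: Gaussian) -> bool:
    """True if g's extent overlaps the a_min region around `owner`."""
    return abs(g.center - owner.center) <= g.extent + owner.a_min


def redundancy_scores(gaussians):
    """Per Gaussian: minimum count of other Gaussians over touched regions."""
    scores = []
    for i, g in enumerate(gaussians):
        counts = []
        for owner in gaussians:
            if not influences(g, owner):
                continue
            # Count the Gaussians other than g that influence this region.
            others = sum(
                1
                for j, h in enumerate(gaussians)
                if j != i and influences(h, owner)
            )
            counts.append(others)
        # A Gaussian always touches its own region, so counts is non-empty.
        scores.append(min(counts))
    return scores


def prune(gaussians, k, fraction=0.5):
    """Remove a fraction of the Gaussians whose score exceeds k."""
    scores = redundancy_scores(gaussians)
    candidates = sorted(
        (i for i, s in enumerate(scores) if s > k),
        key=lambda i: scores[i],
        reverse=True,
    )
    removed = set(candidates[: int(len(candidates) * fraction)])
    return [g for i, g in enumerate(gaussians) if i not in removed]
```

A Gaussian like g_{0}, which touches a region that no other Gaussian influences, gets a score of 0 and is never a candidate, while a Gaussian surrounded only by crowded regions scores high and may be removed.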

Figure 5

(a) Primitive reduction only.

(b) Primitive reduction and adaptive SH.

(c) Full method (ours).

(d) Error image between ours and the baseline.

(e) The baseline (3DGS).

Figure 6

Visual comparison between INGP, MeRF, 3DGS and ours.

Conclusion

In this paper, we have presented a complete and effective memory reduction method for 3DGS.

It combines three components: a resolution-aware primitive reduction approach that halves the number of primitives, an adaptive adjustment approach that selects the number of SH bands needed per primitive, and a codebook-based quantization approach that lowers the storage precision of primitive attributes.
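The effect of choosing the number of SH bands per primitive can be illustrated with a small calculation. This is a sketch under assumptions: it counts only SH color coefficients, uses (bands + 1)^2 coefficients per color channel, and the band distribution in the usage line is made up for illustration rather than taken from the paper. The function names are hypothetical.

```python
def sh_floats(bands: int) -> int:
    """SH color coefficients for bands 0..bands across 3 color channels."""
    return 3 * (bands + 1) ** 2


def sh_storage_ratio(primitives_per_band: dict) -> float:
    """Ratio of full 3-band SH storage to adaptive SH storage.

    `primitives_per_band` maps a band count (0, 1, 2, or 3) to the number
    of primitives assigned that many bands.
    """
    total = sum(primitives_per_band.values())
    full = total * sh_floats(3)
    adaptive = sum(n * sh_floats(b) for b, n in primitives_per_band.items())
    return full / adaptive


if __name__ == "__main__":
    # Hypothetical distribution: most primitives need few bands.
    print(sh_storage_ratio({0: 500, 1: 300, 2: 150, 3: 50}))
```

If every primitive kept all three bands the ratio would be 1, and if every primitive were reduced to plain RGB it would be 16, so the achievable saving depends on how view-dependent the scene's appearance is.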

Our method achieves a 27-fold reduction in memory usage together with a 1.7-fold increase in rendering speed. We demonstrate our results in a streaming setting with WebGL rendering, where download time drops by as much as 20-30 times and rendering is roughly three times faster. Our solution is a first streaming/mobile 3DGS approach that maintains high visual quality. The scale of the memory reduction removes one of the main limitations of the original 3DGS method; as a result, our method emerges as the most suitable novel view synthesis (NVS) solution across all three key metrics: speed, image quality, and memory consumption.

In future work, it would be interesting to further reduce the number of primitives required and, more importantly, to avoid over-densification in the first place. Initial experiments suggest that this is a hard problem; one promising direction is to incorporate data-driven priors, for example supervision on depth.
