[Paper Reading] Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle
Table of Contents
- 1. What
- 2. Why
  - 2.1 Introduction
  - 2.2 Related work
- 3. How
  - 3.1 Dual-Domain Deformation Model
  - 3.2 Adaptive Timestamp Scaling
  - 3.3 Regularizations
  - 3.4 Experiments
1. What
A point-based approach with a novel Dual-Domain Deformation Model for dynamic scene reconstruction.
Contributions:
- Gaussian-Flow, a novel point-based differentiable rendering approach for dynamic 3D scene reconstruction, which sets a new state of the art in training speed, rendering FPS, and novel view synthesis quality for 4D scene reconstruction.
- A Dual-Domain Deformation Model for efficient 4D scene training and rendering, which keeps the running speed on par with the original 3DGS with minimal overhead.
- The resulting representation can be used for downstream tasks.
2. Why
2.1 Introduction
- High-fidelity real-time rendering remains a challenge for NeRF.
- 3DGS has been extended to 4D tasks, but these extensions significantly lower the rendering speed compared with the original 3DGS.
2.2 Related work
Notable work:
- Dynamic Neural Radiance Field: dynamic neural scene flow methods have been proposed [27, 30].
- Accelerated Neural Radiance Field
- Differentiable Point-based Rendering: PointRF [41], DSS [39], and 3D Gaussian Splatting (3DGS) [13].
3. How

3.1 Dual-Domain Deformation Model
Assume that only the rotation q, radiance c, and position \mu of a 3D Gaussian particle change over time, while the scaling s and opacity \alpha remain constant.
Then, we use a time-dependent attribute residual D(t) to model the difference between the base attribute S_{0}\in\{\mu_{0},c_{0},q_{0}\} and the attribute at time t. That is:
S(t)=S_{0}+D(t),
where D(t)=P_{N}(t)+F_{L}(t) combines a polynomial P_{N}(t) with coefficients a=\{a_{n}\}_{n=0}^{N} and a Fourier series F_{L}(t) with coefficients f=\{f_{sin}^{l},f_{cos}^{l}\}_{l=1}^{L}. They are defined as:
P_{N}(t)=\sum_{n=0}^{N}a_{n}t^{n},
F_{L}(t)=\sum_{l=1}^{L}\left(f_{sin}^{l}\sin(lt)+f_{cos}^{l}\cos(lt)\right).
Note that the different dimensions of an attribute are assumed to change independently over time. For instance, we use \{D_{\mu_{i}}(t)\}_{i=1}^{3} to describe the motion of a 3D position \mu.
The polynomial and the Fourier series play different roles. The Fourier series excels at capturing the variations associated with violent motion, while a low-order polynomial already fits smooth motion well.
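To make the residual concrete, here is a minimal NumPy sketch that evaluates S(t)=S_{0}+P_{N}(t)+F_{L}(t) for a single scalar attribute dimension. The function names and array layouts are assumptions made for illustration and are not taken from the paper's code. Which coefficient multiplies the sine and which the cosine is only a naming choice, since both are free parameters.

```python
import numpy as np

def deformation_residual(t, poly_coef, sin_coef, cos_coef):
    """Evaluate D(t) = P_N(t) + F_L(t) for one scalar attribute dimension.

    poly_coef: shape (N + 1,), coefficients a_0 .. a_N (hypothetical layout).
    sin_coef, cos_coef: shape (L,), Fourier coefficients for l = 1 .. L.
    """
    poly_coef = np.asarray(poly_coef, dtype=float)
    sin_coef = np.asarray(sin_coef, dtype=float)
    cos_coef = np.asarray(cos_coef, dtype=float)

    # Polynomial part: sum over n of a_n * t^n.
    powers = float(t) ** np.arange(poly_coef.shape[0])
    poly = np.dot(poly_coef, powers)

    # Fourier part: sum over l = 1..L of sine and cosine terms at frequency l.
    freqs = np.arange(1, sin_coef.shape[0] + 1)
    fourier = np.dot(sin_coef, np.sin(freqs * t)) + np.dot(cos_coef, np.cos(freqs * t))
    return poly + fourier

def attribute_at(t, base, poly_coef, sin_coef, cos_coef):
    """S(t) = S_0 + D(t) for one scalar attribute dimension."""
    return base + deformation_residual(t, poly_coef, sin_coef, cos_coef)

# One coordinate of a position, with N = 2 and L = 1 (arbitrary example values).
x_t = attribute_at(0.3, base=1.0, poly_coef=[0.0, 0.5, -0.2], sin_coef=[0.1], cos_coef=[0.05])
```

Because each dimension is modeled independently, a full 3D position would simply call this once per coordinate with its own coefficient set.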
3.2 Adaptive Timestamp Scaling
In practice, we use the normalized time t, ranging from 0 to 1. With this uniform time scale, however, fitting highly intense movements within a very short time frame would require exceedingly large coefficients. So each Gaussian point scales its temporal input with a dilation factor \lambda_s and an offset \lambda_b:
t_s=\lambda_s\cdot t+\lambda_b
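As a small illustration, the sketch below applies this affine rescaling for several Gaussians at once. The function name `scale_time` and the one-value-per-Gaussian array layout are assumptions for this example.

```python
import numpy as np

def scale_time(t, lam_s, lam_b):
    """Per-Gaussian time t_s = lam_s * t + lam_b.

    lam_s, lam_b: shape (num_gaussians,), one scale and one offset per Gaussian
    (hypothetical layout).
    """
    return np.asarray(lam_s, dtype=float) * t + np.asarray(lam_b, dtype=float)

# Two Gaussians at normalized time t = 0.5: the second one sees time pass
# four times faster, so the same coefficients can describe faster motion.
t_s = scale_time(0.5, lam_s=[1.0, 4.0], lam_b=[0.0, -1.0])
print(t_s)  # [0.5 1. ]
```

The scaled time t_s then replaces t as the input of the deformation residual for that Gaussian.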
To summarize, in our dynamic scene setting, a Gaussian particle contains multiple attributes to be optimized: the base attributes \{\mu_0,q_0,s_0,c_0,\alpha_0\} at the reference frame t_0, and the polynomial and Fourier coefficients in \{D_{\boldsymbol{\mu}}(t),D_{\boldsymbol{q}}(t),D_{\boldsymbol{c}}(t)\}.
3.3 Regularizations
Time Smoothness Loss: To encourage attributes to change smoothly over time, the time smoothness term is defined as
\mathcal{L}_t=\|D(t)-D(t+\epsilon)\|_2
where \epsilon = 0.1/frames (frames presumably being the number of frames in the sequence).
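The term can be sketched in a few lines of NumPy. This sketch assumes that "frames" means the number of frames in the sequence and that the residual is available as a callable; `residual_fn` and `num_frames` are hypothetical names introduced here.

```python
import numpy as np

def time_smoothness_loss(residual_fn, t, num_frames):
    """L_t = || D(t) - D(t + eps) ||_2 with eps = 0.1 / num_frames.

    residual_fn: callable mapping a time t to an array of residuals D(t)
    (hypothetical interface, any shape).
    """
    eps = 0.1 / num_frames  # assumption: "frames" is the frame count
    diff = np.asarray(residual_fn(t)) - np.asarray(residual_fn(t + eps))
    # Flatten so the L2 norm is taken over all entries.
    return np.linalg.norm(diff.ravel())

# Toy residual that moves linearly in time.
loss = time_smoothness_loss(lambda t: np.array([3.0 * t, 4.0 * t]), t=0.5, num_frames=10)
```

For this linear toy residual the loss is just the speed times \epsilon, so a residual that changes quickly between nearby timestamps is penalized more.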
KNN Rigid Loss: The local rigidity constraint is incorporated in the later training stages, and it is defined as:
\mathcal{L}_s=\sum_{j\in\mathcal{N}_i}\|D_i(t)-D_j(t)\|_2
where \mathcal{N}_i denotes the K nearest neighbors of the i-th Gaussian.
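A brute-force NumPy sketch of this term follows. Two points are assumptions rather than statements from the paper: the neighbors are found from the Gaussians' positions, and the per-Gaussian term is summed over all Gaussians i. The helper names `knn_indices` and `knn_rigid_loss` are hypothetical.

```python
import numpy as np

def knn_indices(positions, k):
    """Indices of the k nearest neighbors of each point, excluding itself.

    positions: shape (G, 3). Brute force, O(G^2) memory, for illustration only.
    """
    positions = np.asarray(positions, dtype=float)
    d2 = np.sum((positions[:, None, :] - positions[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def knn_rigid_loss(residuals, neighbors):
    """Sum over i and j in N_i of || D_i(t) - D_j(t) ||_2.

    residuals: shape (G, C), residual D_i(t) of each Gaussian at one time t.
    neighbors: shape (G, K), output of knn_indices.
    """
    residuals = np.asarray(residuals, dtype=float)
    diff = residuals[:, None, :] - residuals[neighbors]  # (G, K, C)
    return np.sum(np.linalg.norm(diff, axis=-1))

positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
residuals = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [3.0, 4.0, 0.0]]
loss = knn_rigid_loss(residuals, knn_indices(positions, k=1))
```

The loss is zero when neighboring Gaussians share the same residual, which is what pushes nearby particles to move together.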
3.4 Experiments
- Two datasets: Plenoptic Video dataset and HyperNeRF dataset
- Ablation study
  - Deformation Models (Fourier or polynomial)
  - Regularizations (Time or KNN)
- Quantitative Comparisons
- Qualitative Comparisons
