
[Paper Reading] Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle

Table of Contents

  • 1. What

  • 2. Why

    • 2.1 Introduction
    • 2.2 Related work
  • 3. How

    • 3.1 Dual-Domain Deformation Model
    • 3.2 Adaptive Timestamp Scaling
    • 3.3 Regularizations
    • 3.4 Experiments

1. What

Gaussian-Flow is a point-based approach that uses a novel Dual-Domain Deformation Model for dynamic scene reconstruction.

Contributions:

  1. Gaussian-Flow, a novel point-based differentiable rendering approach for dynamic 3D scene reconstruction, sets a new state of the art in training speed, rendering FPS, and novel view synthesis quality for 4D scene reconstruction.
  2. A Dual-Domain Deformation Model for efficient 4D scene training and rendering, which keeps the running speed on par with the original 3DGS at minimal overhead.
  3. Gaussian-Flow can be used for downstream tasks.

2. Why

2.1 Introduction

  1. High-fidelity real-time rendering remains a challenge for NeRF-based methods.
  2. 3DGS has been extended to 4D tasks, but those extensions significantly lower the rendering speed compared with the original 3DGS.

2.2 Related work

  1. Dynamic Neural Radiance Fields: dynamic neural scene flow methods have been proposed [27, 30].
  2. Accelerated Neural Radiance Fields.
  3. Differentiable Point-based Rendering: PointRF [41], DSS [39], and 3D Gaussian Splatting (3DGS) [13].

3. How

3.1 Dual-Domain Deformation Model

Assume that only the rotation q, radiance c, and position \mu of a 3D Gaussian particle change over time, while the scaling s and opacity \alpha remain constant.

We then use a time-dependent attribute residual D(t) to model the difference between the base attribute S_{0}\in\{\mu_{0},c_{0},q_{0}\} and the attribute at time t:

S(t)=S_{0}+D(t),

where D(t)=P_{N}(t)+F_{L}(t) combines a polynomial P_{N}(t) with coefficients a=\{a_{n}\}_{n=0}^{N} and a Fourier series F_{L}(t) with coefficients f=\{f_{sin}^{l},f_{cos}^{l}\}_{l=1}^{L}. These are defined as:

P_{N}(t)=\sum_{n=0}^{N}a_{n}t^{n},

F_{L}(t)=\sum_{l=1}^{L}\left(f_{sin}^{l}\cos(lt)+f_{cos}^{l}\sin(lt)\right).

Note that the different dimensions of an attribute are assumed to change independently over time. For instance, we use \{D_{\mu_{i}}(t)\}_{i=1}^{3} to describe the motion of a 3D position \mu, one residual function per coordinate.
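As a minimal sketch of how the residual could be evaluated for one scalar dimension of one attribute, the snippet below computes S(t)=S_{0}+P_{N}(t)+F_{L}(t) with NumPy. The function names (`polynomial_term`, `fourier_term`, `deformed_attribute`) and the array layout are my own assumptions for illustration, not the authors' implementation. The Fourier term follows the formula exactly as written in these notes.

```python
import numpy as np

def polynomial_term(a, t):
    """P_N(t) = sum_{n=0}^{N} a_n * t^n, with a of shape (N + 1,)."""
    powers = t ** np.arange(len(a))  # [t^0, t^1, ..., t^N]
    return float(np.dot(a, powers))

def fourier_term(f_sin, f_cos, t):
    """F_L(t) as written in the notes: f_sin^l multiplies cos(l t),
    f_cos^l multiplies sin(l t), for l = 1..L."""
    l = np.arange(1, len(f_sin) + 1)
    return float(np.sum(f_sin * np.cos(l * t) + f_cos * np.sin(l * t)))

def deformed_attribute(s0, a, f_sin, f_cos, t):
    """S(t) = S_0 + D(t) for a single scalar dimension (hypothetical helper)."""
    return s0 + polynomial_term(a, t) + fourier_term(f_sin, f_cos, t)

# Example: a linear polynomial and no Fourier component.
a = np.array([0.5, 2.0])   # P(t) = 0.5 + 2 t
f_sin = np.zeros(2)
f_cos = np.zeros(2)
value = deformed_attribute(1.0, a, f_sin, f_cos, t=0.25)
```

Because each dimension is treated independently, a 3D position would simply call `deformed_attribute` three times with three separate coefficient sets.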

The polynomial and the Fourier series play different roles. The Fourier series excels at capturing the variations associated with violent motions, while a low-order polynomial is enough to fit smooth motion well.

3.2 Adaptive Timestamp Scaling

In practice, we use the normalized time t, ranging from 0 to 1. With this standard time range, however, fitting highly intense movements within a very short time frame would require exceedingly large coefficients. So, for each Gaussian point, we scale the temporal input with a dilation factor \lambda_s and shift it with an offset \lambda_b:

t_s=\lambda_s\cdot t+\lambda_b
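A small sketch of the per-Gaussian time scaling, assuming the factors are stored as one NumPy array entry per Gaussian. The name `scale_time` and the example values are hypothetical and chosen only for illustration.

```python
import numpy as np

def scale_time(t, lambda_s, lambda_b):
    """t_s = lambda_s * t + lambda_b, evaluated for every Gaussian at once."""
    return lambda_s * t + lambda_b

# Two Gaussians: the first keeps the normalized time unchanged,
# the second stretches it by 8 and shifts it by -2.
lambda_s = np.array([1.0, 8.0])
lambda_b = np.array([0.0, -2.0])
t_s = scale_time(0.5, lambda_s, lambda_b)
```

The scaled time t_s would then replace t as the input to the deformation functions, so a Gaussian with a large \lambda_s can represent fast motion without large polynomial or Fourier coefficients.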

To summarize, in our dynamic scene setting, a Gaussian particle contains multiple attributes to be optimized: the base attributes \{\mu_0,q_0,s_0,c_0,\alpha_0\} at the reference frame t_0, and the polynomial and Fourier coefficients in \{D_{\boldsymbol{\mu}}(t),D_{\boldsymbol{q}}(t),D_{\boldsymbol{c}}(t)\}.

3.3 Regularizations

Time Smoothness Loss: To encourage deformations that vary smoothly over time, the time smoothness term is defined as

\mathcal{L}_t=\|D(t)-D(t+\epsilon)\|_2

where \epsilon = 0.1/frames.
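The time smoothness term can be sketched as follows. This assumes that "frames" in the definition of \epsilon means the number of frames in the sequence, and that the residual D(t) is available as a callable returning a vector; both `residual_fn` and `num_frames` are names introduced here for illustration.

```python
import numpy as np

def time_smoothness_loss(residual_fn, t, num_frames):
    """L_t = || D(t) - D(t + eps) ||_2 with eps = 0.1 / num_frames."""
    eps = 0.1 / num_frames  # assumption: "frames" is the number of frames
    return float(np.linalg.norm(residual_fn(t) - residual_fn(t + eps)))

# Toy residual: linear motion along x only.
def toy_residual(t):
    return np.array([2.0 * t, 0.0, 0.0])

loss_t = time_smoothness_loss(toy_residual, t=0.5, num_frames=10)
```

For this linear toy residual the loss equals the slope times \epsilon, which is why a rapidly changing D(t) is penalized more strongly.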

KNN Rigid Loss: A local rigidity constraint is incorporated in the later stages of training, and it is defined as:

\mathcal{L}_s=\sum_{j\in\mathcal{N}_i}\|D(t)_i-D(t)_j\|_2

where \mathcal{N}_i represents the K nearest neighbors of the i-th Gaussian.
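A brute-force sketch of the KNN rigid term, using plain NumPy. The neighbor search on base positions, the helper names (`knn_indices`, `knn_rigid_loss`), and returning one loss value per Gaussian are my assumptions; a real implementation would likely use an accelerated KNN structure.

```python
import numpy as np

def knn_indices(positions, k):
    """Indices of the k nearest neighbors of each point, excluding itself.
    positions: (M, 3) array. Brute force, O(M^2) memory."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)        # (M, M) squared distances
    np.fill_diagonal(dist2, np.inf)           # a point is not its own neighbor
    return np.argsort(dist2, axis=1)[:, :k]   # (M, k)

def knn_rigid_loss(positions, residuals, k):
    """L_s for every Gaussian i: sum over j in N_i of ||D(t)_i - D(t)_j||_2.
    residuals: (M, C) array holding D(t) for each Gaussian at one time t."""
    idx = knn_indices(positions, k)                   # (M, k)
    diffs = residuals[:, None, :] - residuals[idx]    # (M, k, C)
    return np.sum(np.linalg.norm(diffs, axis=-1), axis=1)  # (M,)

# Three Gaussians on a line; the first two move together, the third is static.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
residuals = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
per_gaussian = knn_rigid_loss(positions, residuals, k=1)
```

In this toy case only the third Gaussian is penalized, because its residual differs from that of its nearest neighbor.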

3.4 Experiments

  1. Two datasets: Plenoptic Video dataset and HyperNeRF dataset

  2. Ablation study

    • Deformation Models (Fourier or polynomial)
    • Regularizations (Time or KNN)
  3. Quantitative Comparisons

  4. Qualitative Comparisons
