VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality
Abstract
As 3D content becomes increasingly prevalent across platforms, growing emphasis is placed on how users interact with 3D virtual content. Traditional methods for creating, editing, and interacting with such content, however, remain technically demanding and skill-intensive, making the manipulation of virtual objects frustrating and inefficient.
Our proposed VR-GS system marks a significant advance toward smooth, intuitive, and user-centered interaction with 3D content.
By constructing a physical dynamics-aware interactive Gaussian Splatting system in a Virtual Reality (VR) environment, together with a two-level embedding framework and deformable body simulation, VR-GS achieves real-time performance with accurate dynamic simulation results.
The components of our system are designed for high performance and robustness, beginning with high-fidelity scene reconstruction and object segmentation, proceeding through multi-view image inpainting, and culminating in interactive physics-based editing.
The system also integrates real-time deformation embedding and dynamic shadow mapping, offering an immersive and interactive virtual experience.
Figure 1

Using our system, users can engage in intuitive, physics-driven interactions with deformable virtual animals and realistic environments represented with 3DGS.
Figure 2

VR-GS is an interactive platform that integrates 3DGS and eXtended Position-based Dynamics (XPBD) to deliver a real-time interactive experience.
Starting from multi-view images, the pipeline performs Gaussian Splatting scene reconstruction, segmentation, and inpainting.
The resulting Gaussian kernels are used to construct a sparse volumetric VDB structure, from which bounding meshes are extracted and tetrahedralized.
VR-GS further combines two-level Gaussian embedding, XPBD simulation, collision detection, and shadow casting to deliver an immersive user experience.
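To make the preprocessing step concrete, the sketch below voxelizes Gaussian kernel centers into a sparse occupancy grid with NumPy. This is only a conceptual stand-in for the actual VDB construction; the function and parameter names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def sparse_occupancy_from_gaussians(centers, voxel_size=0.05):
    """Map Gaussian kernel centers to a sparse set of occupied voxels.

    Conceptual stand-in for the VDB construction step: in the full pipeline,
    occupied voxels would be dilated, meshed into a bounding cage, and
    tetrahedralized for simulation.
    """
    # Quantize each center to integer voxel coordinates.
    ijk = np.floor(centers / voxel_size).astype(np.int64)
    # Keep only unique voxels -> sparse occupancy.
    occupied = np.unique(ijk, axis=0)
    return {tuple(v): True for v in occupied}

# Toy example: 1,000 random Gaussian centers inside a unit cube.
centers = np.random.rand(1000, 3)
grid = sparse_occupancy_from_gaussians(centers, voxel_size=0.1)
print(f"{len(grid)} occupied voxels out of {10**3} possible")
```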
Figure 3

Our two-level embedding effectively resolves spiky artifacts.
Each Gaussian kernel is embedded in its own local tetrahedron, and the nodes of the local tetrahedron are in turn embedded in the global mesh.
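A minimal NumPy sketch of the two-level embedding idea, assuming standard barycentric interpolation; the function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def barycentric_weights(p, tet):
    """Barycentric coordinates of point p w.r.t. a tetrahedron (4x3 array)."""
    T = np.column_stack([tet[1] - tet[0], tet[2] - tet[0], tet[3] - tet[0]])
    w123 = np.linalg.solve(T, p - tet[0])
    return np.concatenate([[1.0 - w123.sum()], w123])

def deform_gaussian_center(center, local_tet_rest, local_to_global_weights,
                           global_nodes_deformed):
    """Two-level embedding: local tet nodes follow the global mesh,
    and the Gaussian center follows its local tetrahedron."""
    # Level 1: center -> local tetrahedron (weights fixed at the rest state).
    w_local = barycentric_weights(center, local_tet_rest)
    # Level 2: each local tet node -> global mesh (precomputed weights,
    # shape 4 x n_global_nodes, rows summing to 1).
    local_tet_deformed = local_to_global_weights @ global_nodes_deformed
    # Interpolate the deformed center.
    return w_local @ local_tet_deformed

# Toy check: an identity deformation leaves the center unchanged.
rest_tet = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
center = np.array([0.25, 0.25, 0.25])
w_l2g = np.eye(4)                      # local nodes coincide with global nodes here
print(deform_gaussian_center(center, rest_tet, w_l2g, rest_tet))  # ~[0.25 0.25 0.25]
```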
Figure 4

Two-level Embedding Evaluation.
This approach mitigates the spiky artifacts commonly observed in deformed GS kernels.
Figure 5

Inpainting Evaluation.
GS struggles to generate hidden or occluded surfaces. Using LaMa, we generate inpainted images that help reconstruct a complete, more realistic 3D scene.
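The sketch below illustrates the role of the masked inpainting step on a single view; since LaMa requires a pretrained network, OpenCV's classical inpainting is used here purely as a lightweight stand-in, and the mask conventions are assumptions.

```python
import cv2
import numpy as np

def inpaint_occluded_view(image_bgr, occlusion_mask):
    """Fill pixels revealed after removing a segmented object.

    In VR-GS this role is played by LaMa; cv2.inpaint serves only as a
    classical stand-in for demonstration.
    """
    # occlusion_mask: uint8, 255 where the removed object used to be, 0 elsewhere.
    return cv2.inpaint(image_bgr, occlusion_mask, 5, cv2.INPAINT_TELEA)

# Toy example: a gray image with a square "hole" left by a removed object.
img = np.full((128, 128, 3), 127, dtype=np.uint8)
mask = np.zeros((128, 128), dtype=np.uint8)
mask[40:80, 40:80] = 255
filled = inpaint_occluded_view(img, mask)
```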
Figure 6

Shadow Map Evaluation.
GS traditionally bakes shadows into static surface textures. By employing shadow mapping, our method produces dynamic and realistic shadows.
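To illustrate the shadow-mapping idea, here is a minimal light-space depth-test sketch in NumPy; the resolution, bias, and naming are illustrative assumptions, not the paper's renderer.

```python
import numpy as np

def build_shadow_map(points_light_space, resolution=256):
    """Rasterize scene points into a light-space depth buffer.

    points_light_space: (N, 3) array with x, y in [0, 1) on the light's
    image plane and z = distance from the light.
    """
    depth = np.full((resolution, resolution), np.inf)
    ij = (points_light_space[:, :2] * resolution).astype(int)
    for (i, j), z in zip(ij, points_light_space[:, 2]):
        depth[j, i] = min(depth[j, i], z)   # keep the closest occluder
    return depth

def in_shadow(p_light_space, depth, bias=1e-3):
    """A point is shadowed if a closer occluder covers its texel."""
    res = depth.shape[0]
    i, j = (p_light_space[:2] * res).astype(int)
    return p_light_space[2] > depth[j, i] + bias

# Toy scene: an occluder at depth 1.0 above a ground point at depth 2.0.
pts = np.array([[0.5, 0.5, 1.0]])
shadow_map = build_shadow_map(pts)
print(in_shadow(np.array([0.5, 0.5, 2.0]), shadow_map))   # True: the ground point is shadowed
```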
Figure 7

Visual Quality Comparison.
Our method achieves visual quality competitive with PhysGaussian and substantially surpasses PAC-NeRF.
Figure 8

Just Dance.
VR-GS generates high-fidelity dance given a reconstructed human body.
Figure 9

VR-GS allows users to interact with 3DGS content in real time, producing physically plausible responses.
Figure 10

Users play interactive ring-toss and table brick-breakout games within realistically simulated environments.
Figure 11

The user first places all the toys on the ground, then transfers them into the basket.
Figure 12

Trade-offs between Quality and Performance.
(a) cage meshes at varying resolutions.
(b) the corresponding simulation dynamics.
Low-resolution meshes fail to capture fine dynamic details, while high-resolution meshes hinder real-time performance and may introduce excessive-softness artifacts in the simulated object due to non-convergent simulations.
We use intermediate-resolution meshes in practice, aiming to strike a balance between high frame rates and realistic physical dynamics.
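The excessive-softness artifact stems from constraints that remain unconverged when the iteration budget is too small for a dense mesh. The minimal 1D distance-constraint sketch below (with purely illustrative parameters) shows how the residual stretch of a position-based chain shrinks as the iteration count grows.

```python
import numpy as np

def project_chain(positions, rest_len, iterations, fixed=0):
    """Gauss-Seidel projection of distance constraints along a particle chain
    (a stripped-down PBD/XPBD inner loop with effectively infinite stiffness)."""
    x = positions.copy()
    for _ in range(iterations):
        for i in range(len(x) - 1):
            d = x[i + 1] - x[i]
            corr = 0.5 * (np.abs(d) - rest_len) * np.sign(d)
            if i == fixed:
                x[i + 1] -= 2 * corr     # the pinned end absorbs no correction
            else:
                x[i] += corr
                x[i + 1] -= corr
    return x

# A 20-particle chain stretched to 1.5x its rest length, one end pinned.
n, rest = 20, 1.0
chain = np.arange(n) * 1.5 * rest
for iters in (1, 5, 50):
    x = project_chain(chain, rest, iters)
    stretch = np.max(np.diff(x)) / rest
    print(f"{iters:3d} iterations -> max stretch {stretch:.3f} (1.0 = converged)")
```

With only a few iterations the chain stays visibly stretched, which is exactly the "too soft" behavior seen when a high-resolution cage exceeds the real-time iteration budget.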
Figure 13

Timing Breakdown of Demos
Figure 14

Study Results.
(a) Physicality (M1-M5: object manipulation in various scenes, and lighting effects driven by motion within the physical framework) and system response (M6-M9: intuitiveness of operation across scenarios, with minimal processing delay across a wide range of operations, yielding a sense of satisfaction from interaction).
(b) Physics-based interaction enhances immersion and realism in editing, whereas transform-based interaction leads to less realistic outcomes, for instance a toy that does not deform and penetrates other objects.
CONCLUSION AND FUTURE WORK
We developed a physical dynamics-aware, user-friendly GS system that addresses the challenges of editing high-fidelity virtual content in real time.
By utilizing advancements in 3DGS, VR-GS reduces the quality gap traditionally observed between machine-generated and manually-made 3D content. The system's benefits include significantly enhancing both realism and immersion through physically-based dynamics, while also providing fine-grained interaction and manipulation control.
Although all study participants acknowledged the efficacy of our system, there is still room for improvement.
First, rendering high-quality Gaussian kernels in VR demands significant computational resources; rendering generative dynamics in a large scene at 2K resolution can therefore introduce latency in our system.
In addition, the physical parameters in our system are specified manually. Estimating them from videos, as done in approaches such as PAC-NeRF, or leveraging large vision models, could offer richer ways to extract them.
In future work, we aim to incorporate a broader range of materials (such as fluids and cloth) to enhance the system's capabilities.
It is also promising to investigate how large multimodal models could be leveraged to evaluate the accuracy of generated dynamics.
