A Survey on 3D Gaussian Splatting (2)
Structure

Problem Formulation
Radiance Field
A radiance field is a representation of light distribution in a three-dimensional space, which captures how light interacts with surfaces and materials in the environment.
Mathematically, a radiance field can be described as a function L: R^3 × S^2 → R+, where L(x, y, z, θ, φ) maps a point in space (x, y, z) and a direction specified by spherical coordinates (θ, φ) to a non-negative radiance value.
Radiance fields can be encapsulated through implicit or explicit representations, each with specific advantages for scene representation and rendering.
Implicit Radiance Field
An implicit radiance field represents light distribution in a scene without explicitly defining the geometry of the scene.
In the deep learning era, it often uses neural networks to learn a continuous volumetric scene representation. The most prominent example is NeRF.
In NeRF, an MLP network is used to map a set of spatial coordinates (x, y, z) and viewing directions (θ, φ) to color and density values.
The radiance at any point is not stored explicitly but is computed on-the-fly by querying the neural network.
Hence, the function can be written as:

L_implicit(x, y, z, θ, φ) = MLP(x, y, z, θ, φ)
This format allows for a differentiable and compact representation of complex scenes, albeit often at the cost of high computational load during rendering due to volumetric ray marching.
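To make the implicit formulation concrete, here is a minimal sketch of querying such a field: a tiny MLP (with random, untrained weights, purely for illustration) maps the 5D input (x, y, z, θ, φ) to an RGB color and a density, mirroring the signature above. The layer sizes and activations are assumptions, not NeRF's actual architecture.

```python
import numpy as np

# Tiny illustrative MLP: 5D input -> 64 hidden units -> (r, g, b, sigma).
# Weights are random; a real NeRF would train these by backpropagation.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 64))
W2 = rng.normal(size=(64, 4))

def query_radiance(x, y, z, theta, phi):
    h = np.maximum(0.0, np.array([x, y, z, theta, phi]) @ W1)  # ReLU
    out = h @ W2
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))   # sigmoid keeps color in [0, 1]
    sigma = np.maximum(0.0, out[3])        # density must be non-negative
    return rgb, sigma

rgb, sigma = query_radiance(0.1, 0.2, 0.3, 0.0, 1.57)
```

Note that nothing about the scene is stored explicitly: every query runs the network, which is exactly why rendering requires many network evaluations per ray.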
Explicit Radiance Field
In contrast, an explicit radiance field directly represents the distribution of light in a discrete spatial structure, such as a voxel grid or a set of points.
Each element in this structure stores the radiance information for its respective location in space.
This approach allows for more direct and often faster access to radiance data but at the cost of higher memory usage and potentially lower resolution.
A generic form for an explicit radiance field representation can be written as:

L_explicit(x, y, z, θ, φ) = DataStructure[(x, y, z)] · f(θ, φ)

where DataStructure could be a grid or point cloud, and f(θ, φ) is a function that modifies the radiance based on the viewing direction.
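A minimal sketch of this explicit form, using a dense voxel grid as the DataStructure: the grid resolution, scene bounds, and the cosine-based view modulation f(θ, φ) below are illustrative assumptions, not values from any particular method.

```python
import numpy as np

# Dense voxel grid storing RGB radiance per cell (illustrative resolution).
res = 32
grid = np.zeros((res, res, res, 3))
grid[16, 16, 16] = [1.0, 0.5, 0.25]   # example: one lit voxel

def query_explicit(x, y, z, theta=0.0, phi=0.0):
    # Map continuous coordinates in [0, 1)^3 to the nearest voxel index.
    i, j, k = int(x * res), int(y * res), int(z * res)
    base = grid[i, j, k]
    # f(theta, phi): placeholder view-dependent modulation (assumed form).
    return base * max(0.0, np.cos(theta))

color = query_explicit(0.5, 0.5, 0.5)
```

The lookup is a constant-time array access, which illustrates the speed/memory trade-off: queries are cheap, but memory grows cubically with resolution.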
3D Gaussian Splatting: The Best of Both Worlds
3DGS represents a shift from implicit to explicit radiance fields. It leverages the strengths of both approaches by utilizing 3D Gaussians as a flexible and efficient representation.
These Gaussians are optimized to represent the scene accurately, combining the benefits of neural network-based optimization and explicit, structured data storage.
This hybrid approach aims to achieve high-quality rendering with faster training and real-time performance, particularly for complex scenes and high-resolution outputs.
The 3D Gaussian representation is formulated as:

L_3DGS(x, y, z, θ, φ) = Σ_i G(x, y, z; μ_i, Σ_i) · c_i(θ, φ)

where G is the Gaussian function with mean μ_i and covariance Σ_i, and c_i represents the view-dependent color.
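The formula above can be sketched directly: radiance at a point is a sum over Gaussians of G(p; μ_i, Σ_i) times a color. The means, covariances, and colors below are illustrative assumptions; the view-dependent color c_i(θ, φ), normally encoded with spherical harmonics, is reduced here to a constant per Gaussian, and the covariances are isotropic even though 3DGS allows anisotropic ones.

```python
import numpy as np

def gaussian_3d(p, mu, cov):
    # Unnormalized 3D Gaussian: exp(-0.5 * (p - mu)^T Sigma^-1 (p - mu)).
    d = p - mu
    return np.exp(-0.5 * d @ np.linalg.inv(cov) @ d)

# Two illustrative Gaussians with constant (view-independent) colors.
mus = [np.zeros(3), np.array([1.0, 0.0, 0.0])]
covs = [np.eye(3) * 0.1, np.eye(3) * 0.2]
colors = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])]

def radiance(p):
    return sum(gaussian_3d(p, mu, cov) * c
               for mu, cov, c in zip(mus, covs, colors))

r = radiance(np.zeros(3))
```

At the origin the first Gaussian contributes its full red color while the second, centered one unit away, contributes only a faint blue tail; in 3DGS the parameters of every Gaussian are optimized by gradient descent rather than set by hand.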
Context and Terminology
Scene Reconstruction and Rendering
Scene reconstruction involves creating a 3D model of a scene from a collection of images or other data.
Rendering is a more specific term that focuses on transforming computer-readable information (e.g., 3D objects in the scene) into pixel-based images.
Early techniques generated realistic images based on light fields.
The structure-from-motion (SfM) and multi-view stereo (MVS) algorithms further advanced this field by estimating 3D structures from image sequences.
These historical methods set the stage for more complex scene reconstruction and rendering techniques.
Neural Rendering and Radiance Fields
Neural rendering integrates deep learning with traditional graphics techniques to create photorealistic images.
Early attempts utilized convolutional neural networks (CNNs) for estimating blending weights or texture-space solutions.
The radiance field represents a function that describes the amount of light traveling in every direction through every point in space.
NeRFs use neural networks to model the radiance fields, enabling detailed and realistic scene rendering.
Volumetric Representations and Ray-Marching
Volumetric representations model objects and scenes not just as surfaces but as volumes filled with materials or empty space.
This approach allows for more accurate rendering of phenomena like fog, smoke, or translucent materials.
Ray-marching is a technique used with volumetric representations to render images by incrementally tracing the path of light through a volume.
NeRF shares the spirit of volumetric ray-marching and introduces importance sampling and positional encoding to improve the quality of synthesized images.
While providing high-quality results, volumetric ray-marching is computationally expensive, motivating the search for more efficient methods like 3DGS.
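A minimal sketch of volumetric ray-marching: step along a ray, query density and color at each sample, and alpha-composite front to back (the quadrature NeRF also uses). The constant-density "fog slab" scene here is an illustrative assumption, not a learned field.

```python
import numpy as np

def field(p):
    # Toy scene: uniform white fog with density 2 between z = 1 and z = 2.
    sigma = 2.0 if 1.0 <= p[2] <= 2.0 else 0.0
    return sigma, np.array([1.0, 1.0, 1.0])

def march(origin, direction, near=0.0, far=3.0, n_samples=64):
    ts = np.linspace(near, far, n_samples)
    dt = ts[1] - ts[0]
    color = np.zeros(3)
    transmittance = 1.0                        # fraction of light surviving so far
    for t in ts:
        sigma, c = field(origin + t * direction)
        alpha = 1.0 - np.exp(-sigma * dt)      # opacity of this segment
        color += transmittance * alpha * c     # front-to-back compositing
        transmittance *= 1.0 - alpha
    return color

pixel = march(np.zeros(3), np.array([0.0, 0.0, 1.0]))
```

Every pixel requires dozens of field evaluations along its ray, which is the computational cost that 3DGS sidesteps by rasterizing Gaussians instead of marching rays.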
Point-based Rendering
Point-based rendering is a technique for visualizing 3D scenes using points rather than traditional polygons.
This method is particularly effective for rendering complex, unstructured, or sparse geometric data.
Points can be augmented with additional properties like learnable neural descriptors, and rendered efficiently, but this approach can suffer from issues like holes in the rendering or aliasing effects.
3DGS extends this concept by using anisotropic Gaussians for a more continuous and cohesive representation of the scene.
