🗒️ 3d-research index lesson 01 EN · IT

Lesson 01

The primitive: the anisotropic 3D Gaussian

The building block of everything else: an orientable blob with ~59 degrees of freedom.

Both traditional splatting and 3DGS use the same primitive: an anisotropic Gaussian in 3D,

$$G(\mathbf{x}) \;=\; \exp\!\Big(-\tfrac12\,(\mathbf{x}-\boldsymbol{\mu})^{\!\top}\,\Sigma^{-1}\,(\mathbf{x}-\boldsymbol{\mu})\Big)$$

where \(\boldsymbol\mu \in \mathbb{R}^3\) is the center and \(\Sigma\) the covariance, which controls its shape, size, and orientation. 3DGS does not optimize \(\Sigma\) directly — an arbitrary matrix would not stay positive semidefinite under gradient descent — but factorizes it:

$$\Sigma \;=\; R\, S\, S^{\!\top} R^{\!\top}$$

with \(S=\mathrm{diag}(s_x,s_y,s_z)\) (scales, 3 numbers) and \(R\) a rotation parametrized by a quaternion \(q\) (4 numbers). This way \(\Sigma \succeq 0\) by construction, whatever values the parameters take: a constraint turned into a parametrization — a classic optimization trick.

ParameterDimRole
mean \(\boldsymbol\mu\)3position in the world
scale \(s\)3extent along its own axes
quaternion \(q\)4orientation of its own axes
opacity \(\alpha\)1how much it absorbs in compositing (lesson 03)
spherical harmonics (degree 3)48view-direction-dependent color

Total ≈ 59 degrees of freedom per Gaussian — and a real scene has millions of them. The scene is this parameter vector.

Fig. 1 — interactive. 6000 samples drawn from \(\mathcal N(\mu,\Sigma)\) with the 1σ and 2σ shells. The sliders move exactly the parameters that 3DGS optimizes: scale \(s\), rotation (→ quaternion), opacity. Drag to orbit, scroll to zoom.
Intuition — anisotropy is not a flourish: a flat surface is well represented by squashed Gaussians (a very small s along the normal, large ones along the tangent). Use the slider to bring \(s_z\) to its minimum: you get a "surfel", the oriented disc of classic splatting. The Gaussian generalizes the surfel.

Color: spherical harmonics

The color of each Gaussian is not a fixed RGB but a function of the view direction \(\mathbf d\), expanded in spherical harmonics up to degree 3: \(c(\mathbf d) = \sum_{\ell=0}^{3} \sum_{m=-\ell}^{\ell} c_{\ell m} Y_{\ell m}(\mathbf d)\) — 16 coefficients × 3 channels = 48 numbers. Degree 0 is the mean color; the higher degrees capture low-frequency view-dependent effects (soft reflections, iridescence). It is an inductive bias: sharp speculars and refractions lie outside the representable space, and that is exactly where 3DGS fails.