License: CC BY 4.0
arXiv:2403.03641v1 [cs.GR] 06 Mar 2024

Online Photon Guiding with 3D Gaussians for Caustics Rendering

Jiawei Huang (Chuzhou University, China; Void Dimensions, China), Hajime Tanaka (Tohoku University, Japan), Taku Komura (The University of Hong Kong, Hong Kong; Tohoku University, Japan), and Yoshifumi Kitamura (Tohoku University, Japan)
Abstract.

In production rendering systems, caustics are typically rendered via photon mapping and gathering, a process often hindered by insufficient photon density. In this paper, we propose a novel photon guiding method to improve the photon density and overall quality for caustic rendering. The key insight of our approach is the application of a global 3D Gaussian mixture model, used in conjunction with an adaptive light sampler. This combination effectively guides photon emission in expansive 3D scenes with multiple light sources. By employing a global 3D Gaussian mixture, our method precisely models the distribution of the points of interest. To sample emission directions from the distribution at any observation point, we introduce a novel directional transform of the 3D Gaussian, which ensures accurate photon emission guiding. Furthermore, our method integrates a global light cluster tree, which models the contribution distribution of light sources to the image, facilitating effective light source selection. We conduct experiments demonstrating that our approach robustly outperforms existing photon guiding techniques across a variety of scenarios, significantly advancing the quality of caustic rendering.

[Figure 1 panels: Reference (with an "Ours" inset); (a) G3D (Ours); (b) H2D; (c) MLT]

                      (a) G3D (Ours)   (b) H2D   (c) MLT
SSIM:                 0.9505           0.9089    0.9499
MSE $\times 10^{3}$:  2.88             5.42      2.91
Time:                 77s              80s       141s
Figure 1. Comparative rendering results of the Ring scene, featuring intricate and aesthetically appealing caustics cast by metallic rings and diamonds, with various photon guiding techniques under a fixed budget of 1024 samples. (a) Our method leverages a 3D Gaussian mixture (G3D) to guide photon emission and improve the photon density in the visible area, achieving accurate photon density estimation with minimal overhead. (b) The 2D histogram (H2D) approach fails to fit an accurate distribution for area lights due to the parallax issue, yielding a noisy result. (c) Despite its higher computational overhead, the state-of-the-art Metropolis light transport (MLT) approach suffers from non-uniform convergence, leading to more outliers and a slightly higher overall error than our method. The reference is produced with the same integrated rendering system using 64K iterations.

1. Introduction

In computer graphics, the rendering of caustics stands out for its visual appeal. In production rendering systems, caustics are usually sampled separately with photon mapping / photon tracing, while other components are path-traced. However, photon mapping requires a high photon density to render caustics accurately and efficiently. This becomes a challenge when only a small portion of the scene is rendered, or when only a selection of objects cast caustics. Many rendering systems therefore rely on photon guiding techniques to achieve robust photon mapping.

There has been significant research on developing effective photon guiding techniques. Lightweight Photon Mapping (2018) utilizes 2D histograms to learn distributions and guide emission for each light source. Hachisuka et al. (2011) leverage Metropolis Light Transport (MLT) for more efficient photon emission sampling. However, these approaches have limitations. Lightweight Photon Mapping has to build a distribution for each individual light source, which leads to inaccurate fitting and a large overhead as the number of light sources increases; in addition, the accuracy of a 2D guiding map is inherently limited by the parallax issue with non-delta light sources (e.g., a rectangular light with non-negligible area, or an environment map with multiple peaks). MLT occasionally suffers from non-uniform convergence and requires considerable computational overhead. The limitations of existing methods drive us to seek a different approach to the photon guiding problem.

In this paper we propose a novel method to learn a global distribution of photon emission that maximizes the photon density of the rendered image. Our key insight is to leverage the 3D Gaussian, a flexible and powerful distribution whose application in Monte Carlo rendering is less explored. With 3D Gaussians, we are able to model the global distribution of “points of interest” effectively. We derive a novel equation and method to sample directions from a 3D Gaussian at any observation point, directly and without bias, to guide photon emission. Along with it, we introduce a novel adaptive light source sampler to form a complete photon guiding framework. Our method is very compact and runs fast on a conventional GPU. We integrate our method with a unidirectional GPU path tracer to achieve very efficient rendering of caustics.

In summary, the contributions of this paper include:

  • An algorithm for sampling directions from a 3D Gaussian at any observation point, complete with a closed-form probability density function (PDF).

  • A workflow to learn the distribution of points of interest for photon emission, represented as a 3D Gaussian mixture, using online photon samples as training data.

  • A novel, fast photon guiding framework that progressively learns a global distribution to guide photon emission for high photon density.

2. Related work

2.1. Photon mapping and photon guiding

Photon mapping is an efficient approach proposed by Jensen (1996) to calculate global illumination. Later research proposed several modifications to the scheme for more accurate density estimation and adaptation to progressive rendering (Hachisuka and Jensen, 2009; Hachisuka et al., 2008). Recently, Zhu et al. (2020) proposed to improve the quality of density estimation with machine learning, and Misso et al. (2022) showed that it is possible to obtain unbiased results from photon mapping. Photon mapping is considered an effective approach to render caustics from specular surfaces such as glass and mirrors. Even in hybrid methods that utilize multiple sampling algorithms for robust rendering (e.g., (Georgiev et al., 2012)), photon mapping is the common choice for caustics paths. Traditionally in photon mapping, photon emission is sampled uniformly. However, in practical scenarios, it is a common challenge that only a small portion of a large scene can benefit from rendering caustics with photon mapping (the “caustic in a stadium” problem), in which case the photon density obtained with uniform emission sampling is not sufficient for accurate estimation. To guide photon emission for higher photon density, Hachisuka et al. (2011) adapt Metropolis light transport (MLT) (Veach and Guibas, 1997), more specifically primary sample space MLT (Kelemen et al., 2002), and Grittmann et al. (2018) adapt 2D histogramming. These ideas are implemented in production renderers (e.g., (Šik and Krivánek, 2019; Burley et al., 2018; Droske et al., 2023)). However, we found these approaches to exhibit certain limitations: MLT requires an intricate implementation and suffers from non-uniform convergence when the sample space is complex, while the 2D histogram suffers from an inherent parallax issue since it projects high-dimensional distributions to 2D, and its overhead increases linearly with the number of light sources.
Our method innovatively fits a global 3D distribution, using a compact parametric model, to guide photon emission for all light sources. The fitting process is efficient, and the model only uses several tens of scalars, providing a robust solution for the photon guiding problem.

2.2. Distributions and sampling methods

Our research seeks to importance sample photon emission with an explicit global distribution. In Monte Carlo rendering, it has become common practice to importance sample different distributions for variance reduction. As a typical instance, almost every rendering system importance samples the local bidirectional scattering distribution function (BSDF) of surfaces (e.g., (Trowbridge and Reitz, 1975; Oren and Nayar, 1994)). However, sampling the local BSDF may not be optimal when indirect lighting is a major contribution to the result. Path guiding research leverages more general distributions to model indirect lighting. Jensen (1995) utilizes 2D histograms to model and importance sample the spherical incident radiance distribution. Many distribution models have been explored subsequently. Vorba et al. (2014) utilize 2D Gaussian mixtures for a similar purpose, with training data from a photon-emitting pass, to achieve a more practical method. Mueller et al. (2017) propose an SD-tree to achieve a more compact histogram-based representation. Ruppert et al. (2020) propose to use the von Mises-Fisher distribution along with a parallax-aware fitting algorithm to remove the parallax issue caused by spatial partitioning. Dodik et al. (2022) propose to use 5D Gaussians to model the incident radiance distribution over space. Huang et al. (2023) propose an anisotropic spherical distribution that can be directly evaluated and sampled for a more accurate and compact representation. Despite their different algorithms, these methods share a common insight: they build and sample local directional distributions represented by a suitable distribution model.

Previous photon guiding methods adapt this idea to photon emission straightforwardly. For example, Grittmann et al. (2018), Burley et al. (2018), and Droske et al. (2023) build a local distribution for each light source using 2D histograms; however, as mentioned above, this suffers from the parallax issue because actual light sources have non-zero area. Our research learns a global distribution of the surfaces that photons should be emitted towards, and we find that 3D Gaussians serve this purpose well. However, guiding photon emission from a light source with 3D Gaussians requires sampling directions from the distribution. To the best of our knowledge, no existing algorithm addresses this. In this paper we derive a novel directional transform for 3D Gaussians, which enables sampling directions from them to guide photon emission at any location.

2.3. Adaptive light source sampling

In addition to emission guiding, it is also important to importance sample light sources. Traditionally, light sources are sampled based on their flux; however, an adaptive sampling strategy is preferable, as reported in (Šik and Krivánek, 2019), since the actual contribution of each light source does not (fully) depend on its flux when only a portion of the full scene is rendered. Walter et al. (2005) propose Lightcuts, which utilizes a binary tree to cluster light sources for efficient rendering. Yuksel (2020) samples Lightcuts stochastically in Monte Carlo rendering to remove bias. Wang et al. (2021) propose to adaptively refine the importance of each node of the binary tree. Our adaptive light sampler can be seen as an adaptation of the idea of Wang et al. to photon guiding, with the difference that we progressively build the binary tree instead of starting from complete Lightcuts, and we approximate the importance of each node based on the sum of radiance from gathered photons.

3. Overview

Background

Our technique improves the photon density in the visible region to achieve higher-quality photon mapping results. To better explain our technique with an actual implementation, we first briefly describe how we integrate photon mapping into a production renderer for caustics rendering. Since the target production renderer is a GPU unidirectional path tracer, we employ photon mapping to render caustics in a stochastic approach (Hachisuka and Jensen, 2009), leaving the rest to path tracing. In the target path tracer, the specular surfaces that cast caustics are explicitly marked as “casters”, while surfaces that receive caustics are marked as “receivers”. Only photons that travel through casters and reach receivers are recorded to construct the photon map. This is a typical approach in production rendering systems; however, under this setup, uniform emission cannot achieve sufficient photon density for proper density estimation, leading to poor results. Therefore, we propose to guide photon emission with a novel, online-learned distribution.

Integration

Fig. 2 outlines the components and workflow of our framework. The main rendering pipeline is a two-pass process. In the first pass (the photon tracing pass), we randomly trace a batch of photons, guided by our progressively refined distributions, and build a photon map with a KD-tree. The second pass is path tracing: in addition to regular path tracing, we gather nearby photons and estimate indirect lighting at diffuse surfaces. During gathering, we also record the data required for distribution learning. After each rendering iteration, the distribution is refined based on the recorded data. Over the rendering iterations, our distribution converges to one that maximizes the number of gathered photons. The distribution consists of two parts: a 3D Gaussian mixture that represents the spatial distribution of points of interest, which is used for guiding later photon emission from light sources (see § 4); and a tree-based 1D distribution that is used for light source sampling (see § 5).
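The control flow of this two-pass loop can be sketched as follows. This is only an illustrative skeleton: `GuidingState`, `render`, and the four callables are hypothetical names introduced here, not part of the renderer described in the paper, and the actual passes run on the GPU.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class GuidingState:
    """Hypothetical container for the two learned distributions."""
    gaussian_mixture: Any = None  # 3D Gaussian mixture used for emission guiding
    light_tree: Any = None        # tree-based 1D distribution over light sources
    iteration: int = 0

def render(state, num_iterations, trace_photons, build_photon_map,
           path_trace_and_gather, refine):
    """Run the two-pass loop; the four callables are supplied by the renderer."""
    images = []
    for _ in range(num_iterations):
        photons = trace_photons(state)           # pass 1: guided photon tracing
        photon_map = build_photon_map(photons)   # e.g. a KD-tree over photon hits
        image, records = path_trace_and_gather(photon_map)  # pass 2: path tracing
        refine(state, records)                   # update both distributions
        state.iteration += 1
        images.append(image)
    return images
```

Passing the passes in as callables keeps the sketch independent of any particular photon map or path tracer implementation.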

Figure 2. Our method is integrated in a two-pass rendering process. After the photon pass and the path tracing pass, the distributions are refined using the collected photon data recorded in path tracing pass. With guiding of the learned distributions, the photon density gradually increases over iterations.

4. Learning to guide photon emission with 3D Gaussians

An (unnormalized) isotropic 3D Gaussian is given by the equation:

(1) $G(\mathbf{x})=\exp\left(-\frac{\|\mathbf{x}-\mu\|^{2}}{2\sigma^{2}}\right),$

where $\mathbf{x}$ is the position vector in 3D space, $\mu$ is the mean vector representing the center of the Gaussian distribution, and $\sigma$ is the standard deviation. A 3D Gaussian mixture with $N$ isotropic 3D Gaussians can effectively model distributions over 3D space:

(2) $M(\mathbf{x})=\sum_{i=1}^{N}\frac{w_{i}}{\sqrt{(2\pi\sigma^{2})^{3}}}G_{i}(\mathbf{x}),$

where $G_{i}$ are the 3D Gaussian components, and $w_{i}$ are the corresponding weights, which sum to 1. The normalizing constant is the integral of each $G_{i}$. As shown in Fig. 3, for any set of light sources, our method fits a global 3D Gaussian mixture, which is used to guide the later photon emission.
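As a small illustration of Eqs. (1) and (2), the mixture density can be evaluated as below. This is a minimal sketch: the function names are ours, and we assume each component carries its own standard deviation $\sigma_{i}$ in the normalizer, which Eq. (2) does not state explicitly.

```python
import numpy as np

def gaussian_unnormalized(x, mu, sigma):
    """Eq. (1): exp(-||x - mu||^2 / (2 sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma**2)))

def mixture_density(x, weights, means, sigmas):
    """Eq. (2), using a per-component sigma_i (an assumption of this sketch)."""
    total = 0.0
    for w, mu, s in zip(weights, means, sigmas):
        norm = np.sqrt((2.0 * np.pi * s**2) ** 3)  # integral of G_i over R^3
        total += w / norm * gaussian_unnormalized(x, mu, s)
    return total
```

With weights that sum to 1, the returned value is a proper probability density over 3D space.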

In the rest of this section we describe the workflow to fit the 3D Gaussian mixture using gathered photons, and the method we derived to sample emission direction from a 3D Gaussian.

Figure 3. The workflow of our approach: we fit a 3D Gaussian mixture with gathered photons and use it to guide photon emission. In (a), the light source uniformly emits multiple photons; however, only the yellow photon is actually gathered from the camera view. We use the first bounce location of the yellow photon to fit a 3D Gaussian, as shown in (b). Later, when emitting photons, we use the 3D Gaussian to sample emission directions, so that more photons can be gathered, achieving higher density.

4.1. Learning with gathered photons

Our goal is to model a distribution that achieves maximum photon density in the visible region and eventually improves the accuracy of density estimation. Since caustics photons are usually gathered at a diffuse surface after a sequence of specular bounces, we only need to fit a distribution at the location of the first bounce, which we define as the point of interest. To learn the distribution of points of interest, when emitting photons, we record the location of the first bounce and the emission pdf (directional pdf $\cdot$ positional pdf), along with the original photon data. In addition, every photon has a counter of how many times it is gathered.
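The per-photon training data described above could be stored in a record like the following. The field and method names are hypothetical and only mirror the quantities named in the text; the original photon data (position, power, and so on) is omitted.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class PhotonRecord:
    first_bounce: Tuple[float, float, float]  # point of interest
    emission_pdf: float                       # directional pdf * positional pdf
    gather_count: int = 0                     # times gathered in the current pass

    def mark_gathered(self) -> None:
        """Called by the gathering pass each time this photon is used."""
        self.gather_count += 1

    def training_weight(self) -> float:
        """Gather count divided by the emission pdf; zero if never gathered."""
        return self.gather_count / self.emission_pdf
```

A photon that is never gathered has weight zero, so it does not influence the fit.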

We learn the distribution with gradient descent on the Kullback-Leibler divergence (KL divergence), adapting the approach proposed in (Müller et al., 2019). First, we define the learned 3D Gaussian mixture as $q(\mathbf{x};\gamma)$, where $\gamma$ is the vector of parameters. The target distribution $p(\mathbf{x})$, which we are trying to fit, is unknown. However, for each point of interest, the basic rule is that the more times the photons are gathered, the higher the density should be; thus we define:

(3) $p(\mathbf{x})=g\cdot T(\mathbf{x}),$

where $T(\mathbf{x})$ denotes how many times a photon bounced from $\mathbf{x}$ is gathered in the current pass, and $g$ is an unknown global scaling factor. The fitting can then be achieved by minimizing the KL divergence:

(4) $D_{KL}(p(\mathbf{x})\,\|\,q(\mathbf{x};\gamma))=\int_{\mathbb{R}^{3}}p(\mathbf{x})\left(\log[p(\mathbf{x})]-\log[q(\mathbf{x};\gamma)]\right)d\mathbf{x}.$

We can minimize it by optimizing $\gamma$ with gradient descent. Notice that $\log[p(\mathbf{x})]$ does not depend on $\gamma$ and therefore does not contribute to the gradient; thus we have

(5) $\nabla_{\gamma}D_{KL}(p(\mathbf{x})\,\|\,q(\mathbf{x};\gamma))=-\nabla_{\gamma}\int_{\mathbb{R}^{3}}p(\mathbf{x})\log[q(\mathbf{x};\gamma)]\,d\mathbf{x}.$

Although this requires evaluating an integral over 3D space, we can use a one-sample estimate:

(6) $\nabla_{\gamma}D_{KL}(p(\mathbf{x})\,\|\,q(\mathbf{x};\gamma))=-\,\mathbb{E}\left[\frac{p(\mathbf{x})}{\hat{q}(\mathbf{x})}\nabla_{\gamma}\log[q(\mathbf{x};\gamma)]\right].$
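The step from Eq. (5) to Eq. (6) can be read as importance sampling. A brief sketch, assuming the gradient may be moved inside the integral and that $\hat{q}(\mathbf{x})$ denotes the density from which the sample $\mathbf{x}$ is actually drawn (non-zero wherever $p$ is non-zero):

```latex
\begin{aligned}
-\nabla_{\gamma}\int_{\mathbb{R}^{3}} p(\mathbf{x})\log[q(\mathbf{x};\gamma)]\,d\mathbf{x}
  &= -\int_{\mathbb{R}^{3}} \hat{q}(\mathbf{x})\,
     \frac{p(\mathbf{x})}{\hat{q}(\mathbf{x})}\,
     \nabla_{\gamma}\log[q(\mathbf{x};\gamma)]\,d\mathbf{x} \\
  &= -\,\mathbb{E}_{\mathbf{x}\sim\hat{q}}\!\left[
     \frac{p(\mathbf{x})}{\hat{q}(\mathbf{x})}\,
     \nabla_{\gamma}\log[q(\mathbf{x};\gamma)]\right].
\end{aligned}
```

A single sample $\mathbf{x}\sim\hat{q}$ therefore yields the unbiased gradient estimate $-\frac{p(\mathbf{x})}{\hat{q}(\mathbf{x})}\nabla_{\gamma}\log[q(\mathbf{x};\gamma)]$.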

In our rendering process, each time we sample an emitted photon, the emission pdf is $\hat{q}(\mathbf{x})$. We then replace $p(\mathbf{x})$ with $g\cdot T(\mathbf{x})$, and by using a moment-based optimizer, the global scaling factor $g$ is effectively cancelled.
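The one-sample update can be sketched in NumPy as follows. Several choices here are our assumptions rather than details given in the text: the parameter vector $\gamma$ is split into means, log standard deviations, and softmax logits for the weights; each component has its own $\sigma_{i}$; and Adam stands in for the moment-based optimizer. The factor $g$ is simply dropped, because Adam's update is (up to its epsilon) invariant to a constant scaling of the gradients.

```python
import numpy as np

def log_q_grads(x, mu, log_sigma, logits):
    """log q(x) and its gradients w.r.t. means, log-sigmas and weight logits."""
    sigma = np.exp(log_sigma)                        # (N,)
    w = np.exp(logits - logits.max())
    w /= w.sum()                                     # softmax weights, sum to 1
    diff = x[None, :] - mu                           # (N, 3)
    sq = np.sum(diff**2, axis=1)                     # squared distances, (N,)
    log_comp = (np.log(w) - 1.5 * np.log(2.0 * np.pi)
                - 3.0 * log_sigma - sq / (2.0 * sigma**2))
    m = log_comp.max()
    log_q = m + np.log(np.sum(np.exp(log_comp - m)))  # log-sum-exp
    r = np.exp(log_comp - log_q)                      # responsibilities
    g_mu = r[:, None] * diff / sigma[:, None]**2
    g_log_sigma = r * (sq / sigma**2 - 3.0)
    g_logits = r - w
    return log_q, g_mu, g_log_sigma, g_logits

def kl_gradient_sample(x, gather_count, emission_pdf, mu, log_sigma, logits):
    """One-sample estimate of Eq. (6) with p replaced by T(x) (g dropped)."""
    _, g_mu, g_ls, g_lg = log_q_grads(x, mu, log_sigma, logits)
    scale = -gather_count / emission_pdf
    return scale * g_mu, scale * g_ls, scale * g_lg

class Adam:
    """Minimal Adam optimizer for a single parameter array."""
    def __init__(self, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m, self.v, self.t = None, None, 0

    def step(self, param, grad):
        if self.m is None:
            self.m = np.zeros_like(param)
            self.v = np.zeros_like(param)
        self.t += 1
        self.m = self.b1 * self.m + (1.0 - self.b1) * grad
        self.v = self.b2 * self.v + (1.0 - self.b2) * grad**2
        m_hat = self.m / (1.0 - self.b1**self.t)
        v_hat = self.v / (1.0 - self.b2**self.t)
        return param - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

In use, one `Adam` instance per parameter array would be stepped with the gradients averaged over all gathered photons of an iteration.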

4.2. Sampling directions from 3D Gaussian

In this section, we describe a novel approach to transform a global 3D Gaussian mixture that represents the distribution of photons into a local directional distribution representing the photon emission from a given light source position.

Given an observation point $\mathbf{x}_{0}$, we can define the north pole $\mathbf{z}$ pointing from $\mathbf{x}_{0}$ to $\mu$. Then, we are able to represent any 3D point $\mathbf{x}$ using the distance $r=\|\mathbf{x}_{0}-\mathbf{x}\|$ and the angle $\theta$ between $(\mathbf{x}-\mathbf{x}_{0})$ and $\mathbf{z}$:

(7) $d=\|\mathbf{x}_{0}-\mu\|$
(8) $\mathbf{z}=\frac{\mu-\mathbf{x}_{0}}{d}$

Then, $\|\mathbf{x}-\mu\|^{2}$ in Eq. (1) can be expressed in terms of $r$ and $\theta$:

(9) $\|\mathbf{x}-\mu\|^{2}=\|(\mathbf{x}-\mathbf{x}_{0})-(\mu-\mathbf{x}_{0})\|^{2}=(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta.$
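For completeness, the last equality in Eq. (9) follows from the law of cosines, using $(\mathbf{x}-\mathbf{x}_{0})\cdot(\mu-\mathbf{x}_{0})=rd\cos\theta$ and completing the square in $r$:

```latex
\begin{aligned}
\|(\mathbf{x}-\mathbf{x}_{0})-(\mu-\mathbf{x}_{0})\|^{2}
  &= r^{2}-2rd\cos\theta+d^{2} \\
  &= \left(r^{2}-2rd\cos\theta+d^{2}\cos^{2}\theta\right)+d^{2}\left(1-\cos^{2}\theta\right) \\
  &= (r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta .
\end{aligned}
```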

With these parameters, we transform a 3D Gaussian to polar coordinates:

(10) $G(\omega,r)=\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right).$

The integral of the 3D Gaussian can then be transformed to the spherical form:

(11) $\int_{\mathbb{R}^{3}}G(\mathbf{x})\,d\mathbf{x}=\int_{\Omega}\int_{0}^{\infty}\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right)r^{2}\,dr\,d\omega.$

By matching Eq. (11) with the definition of a directional distribution, we derive the directional distribution of a 3D Gaussian at observation point $\mathbf{x}_{0}$ as:

(12) $f_{o}(\omega)=\int_{0}^{\infty}\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right)r^{2}\,dr.$

Actually, this integral has a closed-form solution:

(13) $\begin{aligned}f_{o}(\omega)=\;&\sigma^{2}e^{-\frac{d^{2}}{2\sigma^{2}}}d\cos\theta\\&+\sqrt{\frac{\pi}{2}}\,\sigma e^{-\frac{d^{2}\sin^{2}\theta}{2\sigma^{2}}}\left(\sigma^{2}+d^{2}\cos^{2}\theta\right)\\&\times\left(1+\operatorname{erf}\left(\frac{d\cos\theta}{\sqrt{2}\sigma}\right)\right).\end{aligned}$

This is our unnormalized directional PDF of the 3D Gaussian; readers are referred to the appendix for a complete derivation. Since the normalizing term is the integral of the original 3D Gaussian over real space, the normalized form is given as:

(14) $F_{o}(\omega)=\frac{1}{\sqrt{(2\pi\sigma^{2})^{3}}}f_{o}(\omega).$
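The closed form can be checked numerically. The sketch below evaluates Eqs. (13) and (14), compares Eq. (13) with a midpoint-rule quadrature of the radial integral in Eq. (12), and integrates $F_{o}$ over the sphere. The function names are ours, and we use the identity $1+\operatorname{erf}(t)=\operatorname{erfc}(-t)$ from Python's standard library to limit cancellation for directions pointing away from the Gaussian; this is a numerical convenience, not something prescribed by the text.

```python
import math

def f_o(cos_theta, d, sigma):
    """Unnormalized directional density, Eq. (13)."""
    a = d * cos_theta
    sin2 = max(0.0, 1.0 - cos_theta**2)
    s2 = sigma * sigma
    term1 = s2 * a * math.exp(-d * d / (2.0 * s2))
    # 1 + erf(t) == erfc(-t); erfc is more accurate when t is very negative.
    term2 = (math.sqrt(math.pi / 2.0) * sigma
             * math.exp(-d * d * sin2 / (2.0 * s2))
             * (s2 + a * a) * math.erfc(-a / (math.sqrt(2.0) * sigma)))
    return term1 + term2

def F_o(cos_theta, d, sigma):
    """Normalized directional pdf, Eq. (14)."""
    return f_o(cos_theta, d, sigma) / math.sqrt((2.0 * math.pi * sigma**2) ** 3)

def f_o_quadrature(cos_theta, d, sigma, n=20000):
    """Midpoint-rule evaluation of the radial integral in Eq. (12)."""
    a = d * cos_theta
    b2 = d * d * max(0.0, 1.0 - cos_theta**2)
    r_max = max(a, 0.0) + 12.0 * sigma   # integrand is negligible beyond this
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += math.exp(-((r - a)**2 + b2) / (2.0 * sigma**2)) * r * r
    return total * h

def sphere_integral(d, sigma, n=20000):
    """Integral of F_o over all directions: 2*pi * int_{-1}^{1} F_o(t) dt."""
    h = 2.0 / n
    return 2.0 * math.pi * h * sum(
        F_o(-1.0 + (i + 0.5) * h, d, sigma) for i in range(n))
```

For moderate ratios $d/\sigma$, `f_o` and `f_o_quadrature` should agree closely and `sphere_integral` should be close to 1, consistent with the normalization in Eq. (14).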

The distribution can be evaluated efficiently: it involves only a few elementary transcendental functions and the error function, which can be accurately approximated with the Abramowitz and Stegun approximation (1964). Despite the complex form, we find this distribution rather simple to sample as well: if we sample a 3D point from the original 3D Gaussian, the direction pointing to it from the observation point follows Eq. (14). The appendix provides the complete proof of this sampling algorithm. Together, these properties give efficient, unbiased sampling of the spherical transform of a 3D Gaussian.
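The sampling procedure described above can be sketched in a few lines of NumPy. The function names are illustrative; `directional_pdf` re-implements Eqs. (13) and (14) for a unit direction so that each sample can be paired with its pdf, and it uses `math.erfc` rather than a polynomial approximation of the error function.

```python
import math
import numpy as np

def sample_directions(x0, mu, sigma, rng, n):
    """Draw n points from N(mu, sigma^2 I); return unit directions from x0."""
    points = mu + sigma * rng.standard_normal((n, 3))
    v = points - x0
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def directional_pdf(omega, x0, mu, sigma):
    """F_o(omega) of Eq. (14) for a unit direction omega seen from x0."""
    to_mu = mu - x0
    d2 = float(np.dot(to_mu, to_mu))
    a = float(np.dot(omega, to_mu))          # equals d * cos(theta)
    b2 = max(0.0, d2 - a * a)                # equals d^2 * sin^2(theta)
    s2 = sigma * sigma
    term1 = s2 * a * math.exp(-d2 / (2.0 * s2))
    term2 = (math.sqrt(math.pi / 2.0) * sigma * math.exp(-b2 / (2.0 * s2))
             * (s2 + a * a) * math.erfc(-a / (math.sqrt(2.0) * sigma)))
    return (term1 + term2) / math.sqrt((2.0 * math.pi * s2) ** 3)
```

A sample drawn exactly at the observation point has no defined direction; this has probability zero and is ignored in the sketch.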

Our transformation of the 3D Gaussian essentially builds a connection between a global distribution and local directional distributions; this can potentially lead to a shift from using multiple local directional distributions (e.g., von Mises-Fisher (vMF) mixtures) to one global 3D Gaussian mixture, which could significantly reduce redundancy and improve learning efficiency. Furthermore, it could eliminate the parallax issue introduced by local discrete distributions.

Similarity with the von Mises-Fisher distribution

$d$ and $\sigma$ determine the shape of $F_{o}(\omega)$. Interestingly, we find the shape of this spherical distribution to be very similar to that of the 3D von Mises-Fisher (vMF) distribution:

(15) $F_{v}(\omega\,|\,\kappa)=\frac{\kappa}{4\pi\sinh(\kappa)}e^{\kappa\cos\theta},$

where θ𝜃\thetaitalic_θ is the angle between ω𝜔\omegaitalic_ω and vMF’s mean vector ν𝜈\nuitalic_ν. Actually, the north pole 𝐳𝐳\mathbf{z}bold_z used in Eq. (14) works the same way as the mean vector ν𝜈\nuitalic_ν of vMF (thus, in our discussion, we always assume 𝐳𝐳\mathbf{z}bold_z and ν𝜈\mathbf{\nu}italic_ν are the same). Fig. 4 presents a comparative visualization: it displays Fo(ω)subscript𝐹𝑜𝜔F_{o}(\omega)italic_F start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT ( italic_ω ) under two distinct parameter sets alongside a vMF distribution. The parameters of the Fvsubscript𝐹𝑣F_{v}italic_F start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT are meticulously chosen to achieve a close match with the shape of Fo(ω)subscript𝐹𝑜𝜔F_{o}(\omega)italic_F start_POSTSUBSCRIPT italic_o end_POSTSUBSCRIPT ( italic_ω ).

Figure 4. Visualization of distributions for our directional transform of the 3D Gaussian ($F_o$) and the vMF ($F_v$) with different parameters. With carefully chosen parameters, the two distributions achieve very similar shapes.

4.3. Sampling Photon Emission with 3D Gaussian Mixture

Photon emission sampling comprises two principal stages: determining the emission location $\mathbf{x}_0$, and selecting the emission direction $\omega_0$. Using 3D Gaussians, we can guide photon emission from different types of light sources, each handled with its own approach:

Area and Point Lights:

For both area and point lights, we first sample the emission location $\hat{\mathbf{x}}_0$. For area lights, we sample uniformly over the surface. For point lights, the location is the fixed position of the light. Subsequently, we determine the emission direction using the method described in § 4.2.

Infinite Light:

We first sample the direction $\hat{\omega}_0$ from the infinite light’s distribution. Then, we define a plane that passes through the center of the scene and is perpendicular to $\hat{\omega}_0$. Projecting our 3D Gaussian mixture onto this plane yields the corresponding 2D Gaussian mixture. We sample a point $\hat{\mathbf{x}}_0'$ from this 2D distribution on the plane. The actual emission location $\hat{\mathbf{x}}_0$ is then calculated as $\hat{\mathbf{x}}_0 = \hat{\mathbf{x}}_0' - b\hat{\omega}_0$, where $b$ is the radius of the scene’s bounding sphere.
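The infinite-light case can be sketched as follows. This is a minimal illustration rather than the renderer's implementation: it assumes isotropic Gaussian components (one $\sigma$ per component), so that the orthogonal projection of a component onto the plane is a 2D isotropic Gaussian with the same $\sigma$. All function and parameter names are hypothetical.

```python
import math
import random

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def plane_basis(w):
    """Two unit vectors spanning the plane perpendicular to the unit vector w."""
    # Pick a helper axis that is not (nearly) parallel to w.
    helper = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(helper, w)
    norm = math.sqrt(dot(u, u))
    u = (u[0] / norm, u[1] / norm, u[2] / norm)
    return u, cross(w, u)

def sample_infinite_light_origin(w, center, b, weights, means, sigmas, rng):
    """Sample an emission origin for a photon travelling along unit direction w.

    center: scene center; b: bounding-sphere radius;
    weights/means/sigmas: parameters of the (assumed isotropic) 3D mixture.
    """
    u, v = plane_basis(w)
    # Choose a mixture component in proportion to its weight.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    # Project the component mean onto the plane through the scene center.
    rel = tuple(m - c for m, c in zip(means[k], center))
    # Assumption: an isotropic 3D Gaussian projects to an isotropic 2D
    # Gaussian with the same sigma.
    pu = rng.gauss(dot(rel, u), sigmas[k])
    pv = rng.gauss(dot(rel, v), sigmas[k])
    # x0 = x0' - b * w, with x0' = center + pu * u + pv * v on the plane.
    return tuple(c + pu*ui + pv*vi - b*wi
                 for c, ui, vi, wi in zip(center, u, v, w))
```

Offsetting the sampled point by $-b\hat{\omega}_0$ places the origin on or outside the bounding sphere along the incoming direction, so the photon enters the scene travelling along $\hat{\omega}_0$.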

5. Adaptive Light Sampler

When only a portion of the objects cast caustics, an independent light source sampler helps to reduce variance and improve photon density. Šik and Krivánek (2019) mention a light source sampler for photon mapping, but do not provide details of the method. In this section we introduce our adaptive light sampler, which progressively learns an appropriate distribution of light sources for photon emission.

Our light sampler is inspired by (Wang et al., 2021) and (Müller et al., 2017): we store a binary tree whose nodes record the photon count gathered in the last iteration. Each node represents a range of light sources according to the node’s position and depth. For example, the root node represents all light sources, while its left child represents the first half. Starting from a single root node, in each sampling iteration we record how many photons are gathered for the range a node represents. When the photon count of a leaf node exceeds a threshold, we split the leaf into two children. To sample the light tree, we descend from the root, choosing a child at each node with its recorded count as importance. After several iterations we obtain a close approximation of the importance distribution of the light sources.
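The tree described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the renderer's code: light sources are indexed `0..n-1`, a leaf picks uniformly among the lights it covers, and a small probability floor (`floor`, our addition) keeps children with zero recorded count reachable. All names are hypothetical.

```python
import random

class LightNode:
    """Covers light indices [lo, hi) and stores the photon count of the last iteration."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.count = 0
        self.left = None
        self.right = None

    def is_leaf(self):
        return self.left is None

def reset_counts(node):
    """Clear all counts at the start of an iteration."""
    node.count = 0
    if not node.is_leaf():
        reset_counts(node.left)
        reset_counts(node.right)

def record(node, light, n=1):
    """Add n gathered photons of `light` to every node on its root-to-leaf path."""
    while True:
        node.count += n
        if node.is_leaf():
            return
        node = node.left if light < node.left.hi else node.right

def refine(node, threshold):
    """Split each leaf whose count exceeds the threshold (if it covers more than one light)."""
    if node.is_leaf():
        if node.count > threshold and node.hi - node.lo > 1:
            mid = (node.lo + node.hi) // 2
            node.left, node.right = LightNode(node.lo, mid), LightNode(mid, node.hi)
        return
    refine(node.left, threshold)
    refine(node.right, threshold)

def left_probability(node, floor):
    """Probability of descending left, proportional to the recorded counts."""
    l, r = node.left.count, node.right.count
    p = 0.5 if l + r == 0 else l / (l + r)
    return min(max(p, floor), 1.0 - floor)  # assumption: defensive floor

def sample_light(root, rng, floor=0.05):
    """Descend the tree and return a light index."""
    node = root
    while not node.is_leaf():
        node = node.left if rng.random() < left_probability(node, floor) else node.right
    return node.lo + rng.randrange(node.hi - node.lo)

def light_pdf(root, light, floor=0.05):
    """Discrete probability with which sample_light returns `light`."""
    node, pdf = root, 1.0
    while not node.is_leaf():
        p = left_probability(node, floor)
        if light < node.left.hi:
            node, pdf = node.left, pdf * p
        else:
            node, pdf = node.right, pdf * (1.0 - p)
    return pdf / (node.hi - node.lo)
```

A typical iteration would call `reset_counts`, emit photons with `sample_light`, call `record` for each gathered photon, and finish with `refine`. Photon power would be divided by `light_pdf` to keep the estimate unbiased.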

6. Implementation Details

We implement our method in a GPU production renderer. In this section we provide implementation details in addition to the basic algorithms.

Progressive learning of 3D Gaussian mixture

Although we use a gradient-based learning approach to fit the global 3D Gaussian mixture, we do not use existing learning frameworks. Instead, we implement a high-performance learning kernel using the automatic differentiation feature of LuisaRender (Zheng et al., 2022), which integrates seamlessly with our rendering pipeline. The learning process requires a moment-based optimizer, for which we use Adam (Kingma and Ba, 2014). The learning rate is initially set to 0.1 and then progressively scaled down to 0.001.
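For illustration, the optimizer setup can be sketched on the CPU as below. The shape of the learning-rate decay is not specified above, so the exponential interpolation between 0.1 and 0.001 is an assumption, as are the function names and the default Adam hyperparameters (taken from Kingma and Ba, 2014).

```python
import math

def learning_rate(step, total_steps, lr_start=0.1, lr_end=0.001):
    """Assumed schedule: exponential interpolation from lr_start to lr_end."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return lr_start * (lr_end / lr_start) ** t

def adam_step(params, grads, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One in-place Adam update; t is the 1-based step index."""
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1.0 - beta1) * g        # first-moment estimate
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m[i] / (1.0 - beta1 ** t)              # bias correction
        v_hat = v[i] / (1.0 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
```

In a GPU setting the same update would run per parameter in a kernel, with the gradients supplied by automatic differentiation.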

Searching radius

The searching radius is an essential parameter for gathering. We employ an adaptive searching radius approach, where a relatively large maximum radius is initially set. For each shading point, we gather the four nearest photons, and the distance to the farthest photon among these dictates the actual searching radius. This allows for dynamic adjustment of the radius in response to the increasing photon density achieved through photon guiding.
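A brute-force sketch of the adaptive radius follows. A real implementation would query the KD-tree instead of sorting all photons, and the fallback to the maximum radius when fewer than four photons are in range is our assumption. The function name is hypothetical.

```python
import math

def adaptive_radius(shading_point, photons, max_radius, k=4):
    """Distance to the k-th nearest photon within max_radius.

    Assumption: if fewer than k photons lie within max_radius,
    the maximum radius itself is used.
    """
    dists = sorted(math.dist(shading_point, p) for p in photons)
    within = [d for d in dists if d <= max_radius]
    if len(within) < k:
        return max_radius
    return within[k - 1]
```

As guiding concentrates photons on caustic receivers, the fourth-nearest photon moves closer and the returned radius shrinks accordingly.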

Initialization Process

In line with standard practice in distribution fitting, our 3D Gaussian mixture model needs a good initial guess for the fitting process to be effective. Fortunately, this requirement can be satisfied with k-means clustering. Specifically, for a 3D Gaussian mixture comprising $N$ components, we use k-means to determine $N$ mean positions from the position data collected in the first photon pass. These mean positions then serve as the initial guesses for the Gaussian components. For the standard deviation parameter ($\sigma$), we use the median value of our encoding. Additionally, each component of the mixture is assigned an equal weight ($\frac{1}{N}$).
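The initialization can be sketched with a plain Lloyd-style k-means. This is a hedged illustration: the random choice of initial centers, the fixed iteration count, and the `sigma_init` parameter (standing in for the median value of the encoding) are assumptions, and the names are hypothetical.

```python
import math
import random

def kmeans(points, n, iters, rng):
    """Lloyd's algorithm on 3D points; returns n cluster centers."""
    centers = rng.sample(points, n)  # assumption: random initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(n)]
        for p in points:
            nearest = min(range(n), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers

def init_mixture(points, n, sigma_init, rng, iters=10):
    """Initial mixture: k-means means, a shared sigma, and equal weights 1/N."""
    means = kmeans(points, n, iters, rng)
    return [{"mean": mu, "sigma": sigma_init, "weight": 1.0 / n} for mu in means]
```

Here `points` would be the positions recorded in the first photon pass.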

Multiple importance sampling with uniform emission

Following existing path guiding techniques, we combine sampling from the 3D Gaussian mixture with uniform sampling. To blend the two techniques, we set a selection probability $\beta$ for the 3D Gaussian mixture. Starting from 0, $\beta$ gradually increases as the fitting proceeds, and stops at $S$. In our implementation, we set $S = 0.75$.
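The blend can be sketched as one-sample mixture sampling, where the combined density is $\beta\,p_{\text{guided}} + (1-\beta)\,p_{\text{uniform}}$. The linear ramp of $\beta$ over a warm-up period is an assumption (only the start value 0 and the cap $S$ are given above), uniform emission over the full sphere is used as the example, and all names are hypothetical.

```python
import math
import random

S = 0.75  # cap on the selection probability of the guided technique

def selection_probability(iteration, warmup, s_max=S):
    """Assumed schedule: linear ramp from 0 to s_max over `warmup` iterations."""
    return s_max * min(iteration / warmup, 1.0)

def sample_uniform_sphere(rng):
    z = 1.0 - 2.0 * rng.random()
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def pdf_uniform_sphere(w):
    return 1.0 / (4.0 * math.pi)

def mixture_pdf(beta, pdf_guided, pdf_uniform):
    return beta * pdf_guided + (1.0 - beta) * pdf_uniform

def sample_direction(beta, sample_guided, pdf_guided, rng):
    """Pick a technique with probability beta, then evaluate the mixture density."""
    if rng.random() < beta:
        w = sample_guided(rng)
    else:
        w = sample_uniform_sphere(rng)
    return w, mixture_pdf(beta, pdf_guided(w), pdf_uniform_sphere(w))
```

Dividing the photon's contribution by the returned mixture density keeps the estimate unbiased for any $\beta < 1$, because the uniform term guarantees a non-zero density in every direction.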

7. Evaluation

We evaluate our method by rendering scenes with different configurations and techniques and comparing the quality. All images in this section are rendered at 1920×1080 resolution, and the full-sized images are included in the supplemental material. All renderings run on the same conventional PC with an Intel Core 9700K CPU and an Nvidia RTX 4070 GPU. In all the experiments, our method uses an 8-component 3D Gaussian mixture.

7.1. Guiding components

Our method importance-samples photon emission and light source selection, using the 3D Gaussian mixture and the proposed adaptive light sampler respectively, to improve photon density. We evaluate the improvement from each component by rendering several representative scenes with a fixed 1-minute time budget, with and without the two components. Our reference images are also rendered using SPPM, since noiseless caustics are difficult for path tracing to produce. To produce the reference images, we render for 64K iterations with a very small searching radius and 1M photons per iteration: 64 billion photons in total. We evaluate the results by their difference from the reference, measured by the structural similarity index measure (SSIM, higher is better) and mean squared error (MSE, lower is better). The results are shown in Fig. 5.

We observe significant improvement with our method in all three scenes. Even though fewer samples are completed within the fixed time budget, the error is greatly reduced. It is worth noting that the lower sample count is mainly due to the increased photon density in each iteration, since a larger KD-tree is built and traversed. The Pool scene (Fig. 5 (b)) demonstrates a typical situation where only a portion of the scene casts or receives caustics, and photon guiding greatly improves the photon density: compared to uniform emission, our method achieves sharper caustics, and the lower error indicates a more accurate approximation. The Ring scene (Fig. 5 (c)) demonstrates where our adaptive light source sampler makes a difference: the “Light Sampler” result shows clear improvement over the “Uniform” one.

7.2. Learned result

Fig. 6 (a) shows a visualization of the learned 3D Gaussian mixture, transformed to a directional distribution at two arbitrary observation points (i.e., the position of the camera). In Fig. 6 (b) we visualize the photon density from one rendering iteration of the same scene. In this iteration of the Tabletop scene, with our photon guiding method, the photon count in the visible area increases from 20,858 to 211,034: $9.12\times$ more photons (about $10.12\times$ as many), which leads to an overall higher density and more accurate estimation.

7.3. State-of-the-art Comparison

The 2D histogram approach (H2D) used in (Grittmann et al., 2018) and the Metropolis light transport approach (MLT) proposed in (Hachisuka and Jensen, 2011) serve a purpose similar to our photon guiding method. We compare our tailored implementations of H2D and MLT with our method in the same production renderer. In this comparison, we use the same maximum searching radius for all methods (while keeping the adaptive strategy described above). While adhering closely to the fundamental concepts of the original methods, we had to make the modifications described below to adapt them to our renderer:

H2D: For each light source, we construct a 256 × 256 distribution map. For area and point lights, we map directions from the light origin to points of interest onto cylindrical coordinates to obtain 2D distributions. For infinite directional lights, we project the points of interest onto a plane that passes through the scene’s center and is perpendicular to the light direction to obtain their 2D distributions. At the end of every two rendering iterations, we blur the 2D maps, which record how many times each cell was gathered, and use them to build the 2D distributions, represented as alias tables. We use the resulting 2D distributions to guide directions for area and point lights, and positions for directional lights. However, H2D is not applicable to dome lights (environment maps), because dome lights can contain multiple dominant directions. We use the light sampler proposed in § 5 along with H2D.

MLT: We adapt MLT for GPU implementation, following the principles described in (Hachisuka and Jensen, 2011). Our importance function is binary: for any photon $\mathbf{v}$, $F_i(\mathbf{v}) = 1$ if $\mathbf{v}$ arrives at a receiver surface after several specular bounces on caster surfaces, and $F_i(\mathbf{v}) = 0$ otherwise. This is a robust approach for our integration scheme, where we explicitly mark caustic casters and receivers, while photon emission and gathering are in different passes. Our GPU integration emits $2^{20}$ photons every iteration in parallel, and each thread maintains its own MLT sampler. For the bootstrap phase, we emit $2^{30}$ photons ($1024\times$ the number of MLT samplers) with random states to explore the sample space comprehensively, and the accepted states form an initial state pool. After this, each MLT sampler randomly selects a state from this pool, ensuring a diverse yet representative starting point for exploration. In the rendering phase, when the importance function returns 1 for a photon, the photon is recorded and the state is accepted; otherwise, the previously accepted photon is recorded and the MLT sampler reverts to the previous state. We employ a normal distribution with a manually selected variance for mutation, because the adaptive mutation from (Hachisuka and Jensen, 2011) requires a per-sample update of the mutation scale, and a simple adaptation to our GPU-based SPPM scheme gives poor results. We set a probability of 0.3 for large-step mutation.
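The per-sampler logic of this MLT adaptation can be sketched on the CPU as follows. It is a simplified illustration, not the GPU implementation: the state is a vector of primary-space random numbers in the unit interval, `toy_importance` is a stand-in for tracing a photon and testing whether it reaches a receiver, and the mutation scale `sigma` is a hypothetical value.

```python
import random

def toy_importance(state):
    """Stand-in for F_i: 1 if the photon built from `state` reaches a receiver."""
    return 1 if state[0] < 0.1 else 0

def bootstrap(importance, dim, n_candidates, rng):
    """Collect random states with importance 1 as the initial state pool."""
    pool = []
    for _ in range(n_candidates):
        state = [rng.random() for _ in range(dim)]
        if importance(state) == 1:
            pool.append(state)
    return pool

def mutate(state, rng, large_step_prob=0.3, sigma=0.01):
    """Large step: fresh uniform state. Small step: normal perturbation, wrapped."""
    if rng.random() < large_step_prob:
        return [rng.random() for _ in state]
    return [(u + rng.gauss(0.0, sigma)) % 1.0 for u in state]

def mlt_step(state, importance, rng):
    """Accept the proposal if its importance is 1, otherwise keep the current state."""
    proposal = mutate(state, rng)
    return proposal if importance(proposal) == 1 else state
```

With a binary importance function and symmetric mutations, the Metropolis acceptance probability reduces to the importance of the proposal, which is why no explicit acceptance ratio appears in `mlt_step`. The photon associated with the returned state is the one recorded in each step.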

The performance of various methods can vary significantly based on their implementation specifics, particularly in the context of a GPU rendering system. We thus compare them based on a consistent sample budget for a direct assessment of rendering quality, independent of time constraints. However, we also report execution times for completeness. The test scenes and results from different methods are shown in Fig. 1 and Fig. 7. Our method demonstrates more robust results across all test scenes, outperforming H2D and MLT in four scenes, and closely matching MLT in the Pool scene (Fig. 7 (c)) with about half the time cost.

The Ring scene (Fig. 1) is a challenging scene that features a complex lighting setup and intricate caustics. Our global 3D Gaussian mixture approach robustly guides photon emission from different light sources at arbitrary locations, while the adaptive light sampler helps to select light sources accurately according to their importance.

With H2D, the transition from 3D to 2D distributions introduces parallax issues, which are particularly problematic for light sources with considerable area. For instance, the lights illuminating the diamonds in the Ring scene (Fig. 1) are rectangles, and H2D struggles to guide such lights due to the parallax issue. In the Torus scene (Fig. 7 (b)), a complex dome light with multiple peaks poses a challenge for H2D. Similarly, in the Shelf scene (Fig. 7 (a)), the result contains more outliers due to the small angular size of the directional light (5°). In contrast, our 3D Gaussian method accurately provides directional distributions for any position and corrects perspective for any direction, ensuring superior guiding quality.

MLT achieves quality very close to that of our method; however, MLT by nature struggles to explore a wide sample space. In the Ring scene (Fig. 1), MLT fails to distribute samples evenly across the light sources: the caustics from the diamond are well rendered, but the ring caustics are noisy. In the Tabletop scene (Fig. 7 (d)), MLT achieves high-quality photon guiding, but the result at the bottom-right is far off. We attribute this to MLT failing to assign enough samples to that region of the sample space. Note that we invested a large number of samples in the bootstrap phase and applied large-step mutation during sampling, yet MLT still cannot cover the sample space uniformly.

8. Discussion

Ease of Implementation

While implementing the different techniques, we found our method simple to implement, mostly due to its single global distribution scheme. The model’s independence from the number of light sources simplifies integration, making it less susceptible to the complexities that arise with varying lighting setups. In contrast, the 2D histogram method requires frequent and substantial updates to its maps, which becomes computationally demanding as the number of light sources increases. Meanwhile, MLT, despite its effectiveness, is known for its intricate implementation and requires more careful handling. MLT also incurs greater computational and memory overhead. By comparison, with GPU-based gradient descent, our learning process is very efficient and has a minimal memory footprint.

Cosine-weighted guiding

Theoretically, our method can be extended to cosine-weighted guiding of the emission direction. We have shown that the spherical distribution in Eq. (14), obtained using our novel directional transform of the 3D Gaussian, has a shape similar to the vMF distribution $F_v$ (although the two are fundamentally different). For an area light, if we first approximate $F_o$ with an $F_v$, then, with another $F_v$ resembling the cosine-weighted distribution of directions with respect to the surface normal, we can calculate the product of the two, which is still a vMF distribution. By sampling the product vMF we can achieve cosine-weighted guiding for diffuse area lights. However, finding an $F_v$ that approximates $F_o$ is expensive, since it requires solving a transcendental equation that connects the parameters of $F_o(\omega \mid d, \sigma)$ and $F_v(\omega \mid \kappa)$. We leave this as future work.
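The closure of vMF lobes under multiplication follows from their exponents adding: $e^{\kappa_1 \nu_1\cdot\omega}\, e^{\kappa_2 \nu_2\cdot\omega} = e^{(\kappa_1\nu_1 + \kappa_2\nu_2)\cdot\omega}$, so the product is, up to normalization, a vMF with concentration $\lVert\kappa_1\nu_1 + \kappa_2\nu_2\rVert$ and mean direction along $\kappa_1\nu_1 + \kappa_2\nu_2$. A small sketch of this standard identity, with a hypothetical function name:

```python
import math

def vmf_product(kappa1, nu1, kappa2, nu2):
    """Concentration and mean direction of the (unnormalized) product of two vMF lobes.

    nu1 and nu2 are unit vectors; returns (kappa, nu).
    """
    r = tuple(kappa1 * a + kappa2 * b for a, b in zip(nu1, nu2))
    kappa = math.sqrt(sum(c * c for c in r))
    if kappa == 0.0:
        # The lobes cancel exactly: the product is uniform, direction arbitrary.
        return 0.0, nu1
    return kappa, tuple(c / kappa for c in r)
```

In the cosine-weighted setting, one lobe would be the vMF approximating $F_o$ and the other a vMF fitted to the cosine distribution around the surface normal. The concentration of that second lobe is not specified here and would have to be chosen or fitted separately.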

Limitations

Our global distribution approach requires only minimal overhead. However, it struggles when the actual distribution differs drastically from one light source to another, although this seldom occurs in practice. In most caustics rendering scenarios, the efficiency is already greatly increased by guiding photons towards specular surfaces. This limitation can be overcome by assigning and training a separate 3D Gaussian mixture for each light source, at the cost of additional computational overhead.

In caustics rendering, many photon paths involve multi-lobe BSDF sampling (e.g., glass has both a reflection lobe and a refraction lobe). For such paths, we currently sample lobes according to their BSDF weights. However, the variance could be further reduced if we could select lobes based on their actual contribution to the image. In fact, in the comparison on the Pool scene (Fig. 7 (c)), MLT achieves the best result mainly because it invests more samples in the reflection off the water surface. Adopting a variance-aware distribution, as suggested by Rath et al. (2020), is a conceivable solution. However, the practical implementation of such a concept within our gradient-based learning remains unexplored, and we leave it as future work.

9. Conclusion and Future Work

In this paper, we have introduced a novel photon guiding method designed for production rendering systems, focusing on efficiently rendering caustics. Our approach combines the use of a global 3D Gaussian mixture for emission guiding and an adaptive light source sampler, both of which contribute to achieving significantly higher photon density compared to existing photon guiding methods. The cornerstone of our approach is the innovative utilization of 3D Gaussian distributions, from which we have derived a novel directional distribution.

We consider 3D Gaussians to not only serve as the foundation of our method, but also to hold immense potential for a wide range of applications in computer graphics research. For example, the versatility of 3D Gaussians in connecting local and global distributions is also promising in a broader path guiding context, such as modeling indirect lighting and view-dependent visual effects. We will explore such directions in our future work.

References

  • Abramowitz and Stegun (1964) Milton Abramowitz and Irene A. Stegun. 1964. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York.
  • Burley et al. (2018) Brent Burley, David Adler, Matt Jen-Yuan Chiang, Hank Driskill, Ralf Habel, Patrick Kelly, Peter Kutz, Yining Karl Li, and Daniel Teece. 2018. The Design and Evolution of Disney’s Hyperion Renderer. ACM Trans. Graph. 37, 3, Article 33 (jul 2018), 22 pages. https://doi.org/10.1145/3182159
  • Dodik et al. (2022) Ana Dodik, Marios Papas, Cengiz Öztireli, and Thomas Müller. 2022. Path Guiding Using Spatio‐Directional Mixture Models. Computer Graphics Forum (2022). https://doi.org/10.1111/cgf.14428
  • Droske et al. (2023) Marc Droske, Johannes Hanika, Jiří Vorba, Andrea Weidlich, and Manuele Sabbadin. 2023. Path Tracing in Production: The Path of Water. In ACM SIGGRAPH 2023 Courses (SIGGRAPH ’23). Association for Computing Machinery, New York, NY, USA, Article 12, 66 pages. https://doi.org/10.1145/3587423.3595519
  • Georgiev et al. (2012) Iliyan Georgiev, Jaroslav Křivánek, Tomáš Davidovič, and Philipp Slusallek. 2012. Light Transport Simulation with Vertex Connection and Merging. ACM Trans. Graph. 31, 6, Article 192 (nov 2012), 10 pages. https://doi.org/10.1145/2366145.2366211
  • Grittmann et al. (2018) Pascal Grittmann, Arsène Pérard-Gayot, Philipp Slusallek, and Jaroslav Křivánek. 2018. Efficient Caustic Rendering with Lightweight Photon Mapping. Computer Graphics Forum 37, 4 (2018), 133–142. https://doi.org/10.1111/cgf.13481
  • Hachisuka and Jensen (2009) Toshiya Hachisuka and Henrik Wann Jensen. 2009. Stochastic Progressive Photon Mapping. In ACM SIGGRAPH Asia 2009 Papers (Yokohama, Japan) (SIGGRAPH Asia ’09). Association for Computing Machinery, New York, NY, USA, Article 141, 8 pages. https://doi.org/10.1145/1661412.1618487
  • Hachisuka and Jensen (2011) Toshiya Hachisuka and Henrik Wann Jensen. 2011. Robust Adaptive Photon Tracing Using Photon Path Visibility. ACM Trans. Graph. 30, 5, Article 114 (oct 2011), 11 pages. https://doi.org/10.1145/2019627.2019633
  • Hachisuka et al. (2008) Toshiya Hachisuka, Shinji Ogaki, and Henrik Wann Jensen. 2008. Progressive photon mapping. ACM Trans. Graph. 27, 5, Article 130 (dec 2008), 8 pages. https://doi.org/10.1145/1409060.1409083
  • Huang et al. (2023) Jiawei Huang, Akito Iizuka, Hajime Tanaka, Taku Komura, and Yoshifumi Kitamura. 2023. Online Neural Path Guiding with Normalized Anisotropic Spherical Gaussians. arXiv:2303.08064 [cs.CV]
  • Jensen (1995) Henrik Wann Jensen. 1995. Importance Driven Path Tracing using the Photon Map. In Rendering Techniques ’95, Patrick M. Hanrahan and Werner Purgathofer (Eds.). Springer Vienna, Vienna, 326–335.
  • Jensen (1996) Henrik Wann Jensen. 1996. Global illumination using photon maps. In Rendering Techniques’ 96: Proceedings of the Eurographics Workshop in Porto, Portugal, June 17–19, 1996 7. Springer, 21–30.
  • Kelemen et al. (2002) Csaba Kelemen, László Szirmay-Kalos, György Antal, and Ferenc Csonka. 2002. A Simple and Robust Mutation Strategy for the Metropolis Light Transport Algorithm. Computer Graphics Forum 21, 3 (2002), 531–540. https://doi.org/10.1111/1467-8659.t01-1-00703
  • Kingma and Ba (2014) Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
  • Misso et al. (2022) Zackary Misso, Benedikt Bitterli, Iliyan Georgiev, and Wojciech Jarosz. 2022. Unbiased and Consistent Rendering Using Biased Estimators. ACM Trans. Graph. 41, 4, Article 48 (jul 2022), 13 pages. https://doi.org/10.1145/3528223.3530160
  • Müller et al. (2017) Thomas Müller, Markus Gross, and Jan Novák. 2017. Practical Path Guiding for Efficient Light-Transport Simulation. Comput. Graph. Forum 36, 4 (jul 2017), 91–100. https://doi.org/10.1111/cgf.13227
  • Müller et al. (2019) Thomas Müller, Brian Mcwilliams, Fabrice Rousselle, Markus Gross, and Jan Novák. 2019. Neural Importance Sampling. ACM Trans. Graph. 38, 5, Article 145 (oct 2019), 19 pages. https://doi.org/10.1145/3341156
  • Oren and Nayar (1994) Michael Oren and Shree K. Nayar. 1994. Generalization of Lambert’s reflectance model. In Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’94). Association for Computing Machinery, New York, NY, USA, 239–246. https://doi.org/10.1145/192161.192213
  • Rath et al. (2020) Alexander Rath, Pascal Grittmann, Sebastian Herholz, Petr Vévoda, Philipp Slusallek, and Jaroslav Křivánek. 2020. Variance-Aware Path Guiding. ACM Trans. Graph. 39, 4, Article 151 (jul 2020), 12 pages. https://doi.org/10.1145/3386569.3392441
  • Ruppert et al. (2020) Lukas Ruppert, Sebastian Herholz, and Hendrik P. A. Lensch. 2020. Robust Fitting of Parallax-Aware Mixtures for Path Guiding. ACM Trans. Graph. 39, 4, Article 147 (aug 2020), 15 pages. https://doi.org/10.1145/3386569.3392421
  • Trowbridge and Reitz (1975) T. S. Trowbridge and K. P. Reitz. 1975. Average irregularity representation of a rough surface for ray reflection. J. Opt. Soc. Am. 65, 5 (May 1975), 531–536. https://doi.org/10.1364/JOSA.65.000531
  • Veach and Guibas (1997) Eric Veach and Leonidas J. Guibas. 1997. Metropolis light transport. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’97). ACM Press/Addison-Wesley Publishing Co., USA, 65–76. https://doi.org/10.1145/258734.258775
  • Vorba et al. (2014) Jiří Vorba, Ondřej Karlík, Martin Šik, Tobias Ritschel, and Jaroslav Křivánek. 2014. On-Line Learning of Parametric Mixture Models for Light Transport Simulation. ACM Trans. Graph. 33, 4, Article 101 (jul 2014), 11 pages. https://doi.org/10.1145/2601097.2601203
  • Walter et al. (2005) Bruce Walter, Sebastian Fernandez, Adam Arbree, Kavita Bala, Michael Donikian, and Donald P. Greenberg. 2005. Lightcuts: a scalable approach to illumination. ACM Trans. Graph. 24, 3 (jul 2005), 1098–1107. https://doi.org/10.1145/1073204.1073318
  • Wang et al. (2021) Yu-Chen Wang, Yu-Ting Wu, Tzu-Mao Li, and Yung-Yu Chuang. 2021. Learning to Cluster for Rendering with Many Lights. ACM Trans. Graph. 40, 6, Article 277 (dec 2021), 10 pages. https://doi.org/10.1145/3478513.3480561
  • Yuksel (2020) Cem Yuksel. 2020. Stochastic Lightcuts for Sampling Many Lights. IEEE Transactions on Visualization and Computer Graphics (2020), 11 pages. https://doi.org/10.1109/TVCG.2020.3001271
  • Zheng et al. (2022) Shaokun Zheng, Zhiqian Zhou, Xin Chen, Difei Yan, Chuyan Zhang, Yuefeng Geng, Yan Gu, and Kun Xu. 2022. LuisaRender: A High-Performance Rendering Framework with Layered and Unified Interfaces on Stream Architectures. ACM Trans. Graph. 41, 6, Article 232 (nov 2022), 19 pages. https://doi.org/10.1145/3550454.3555463
  • Zhu et al. (2020) Shilin Zhu, Zexiang Xu, Henrik Wann Jensen, Hao Su, and Ravi Ramamoorthi. 2020. Deep Kernel Density Estimation for Photon Mapping. Computer Graphics Forum 39, 4 (2020), 35–45. https://doi.org/10.1111/cgf.14052
  • Šik and Krivánek (2019) Martin Šik and Jaroslav Krivánek. 2019. Implementing One-Click Caustics in Corona Renderer. In Eurographics Symposium on Rendering - DL-only and Industry Track, Tamy Boubekeur and Pradeep Sen (Eds.). The Eurographics Association. https://doi.org/10.2312/sr.20191221
[Figure 5 image panels for each scene: Reference, Full, Gaussians, Light Sampler, Uniform]

| Scene | Metric | Full | Gaussians | Light Sampler | Uniform |
|---|---|---|---|---|---|
| (a) Tabletop | SSIM | 0.9842 | 0.9792 | 0.9816 | 0.9743 |
| | MSE ($\times 10^{5}$) | 7.41 | 9.96 | 8.73 | 12.5 |
| | Sample count | 712 | 716 | 941 | 953 |
| (b) Pool | SSIM | 0.9583 | 0.9579 | 0.9407 | 0.9419 |
| | MSE ($\times 10^{4}$) | 4.18 | 4.22 | 10.7 | 10.6 |
| | Sample count | 737 | 757 | 1165 | 1168 |
| (c) Ring | SSIM | 0.9587 | 0.9577 | 0.9421 | 0.9416 |
| | MSE ($\times 10^{3}$) | 2.33 | 2.39 | 4.47 | 5.10 |
| | Sample count | 718 | 840 | 1028 | 1046 |

Figure 5. We compare the rendering results of our method (Full) with those of uniform photon mapping within a fixed 60-second time budget. Additionally, we contrast the guiding results using only 3D Gaussians and only our adaptive light sampler. Under the same time budget, the sample count varies, so we focus solely on comparing the contribution of photon mapping to eliminate differences caused by the path tracing pass. A higher photon density, achieved with our guiding method, results in a larger KD-tree and longer build time, leading to a lower sample count. Our adaptive light sampler enhances quality in scenes with unbalanced light distribution: in the Tabletop and Ring scenes, the error is significantly reduced with the light sampler enabled. However, in the Pool scene, such an improvement is not observed.
[Figure 6 image panels: (a) and (b)]

Figure 6. Visualization of the learned distribution and photon density. (a) Visualization of the 3D Gaussian mixture after the directional transformation. Left: the scene setup, with a point light at the top-left. Right: the spherical distribution, transformed from the global 3D Gaussian mixture, from the point light’s perspective. Red indicates high density and green low density. The learned distribution accurately covers the surfaces that cast caustics. (b) Visualization of the searching radius of each pixel that successfully gathers photons. Red indicates a small radius, which results from higher photon density at the surface. Top-left is the reference and top-right isolates the contribution of photon mapping. Compared to uniform photon emission (bottom-left), our method (bottom-right) achieves higher photon density in the visible region, hence more accurate density estimation.
[Figure 7 image panels for each scene: Reference, PT-64K, G3D (Ours), H2D, MLT, PM]

| Scene | Metric | PT-64K | G3D (Ours) | H2D | MLT | PM |
|---|---|---|---|---|---|---|
| (a) Shelf | SSIM | (0.9908) | **0.9820** | 0.9703 | 0.9771 | 0.9697 |
| | MSE ($\times 10^{4}$) | (1.32) | **2.18** | 7.92 | 3.56 | 8.29 |
| | Time | | 49s | 65s | 116s | 37s |
| (b) Torus | SSIM | (0.9967) | **0.9906** | - | 0.9901 | 0.9797 |
| | MSE ($\times 10^{5}$) | (2.49) | **8.08** | - | 8.52 | 29.2 |
| | Time | | 57s | - | 113s | 34s |
| (c) Pool | SSIM | (0.9462) | 0.9755 | 0.9674 | **0.9783** | 0.9642 |
| | MSE ($\times 10^{4}$) | (10.9) | 5.70 | 7.80 | **5.49** | 11.8 |
| | Time | | 82s | 1s | 132s | 1s |
| (d) Tabletop | SSIM | (0.9763) | **0.9810** | 0.9700 | 0.9794 | 0.9524 |
| | MSE ($\times 10^{4}$) | (3.19) | **1.88** | 3.01 | 2.56 | 5.25 |
| | Time | | 87s | 92s | 175s | 65s |

Figure 7. We compare rendering results of our method (G3D) with existing techniques, including the 2D histogram (H2D) used in (Grittmann et al., 2018), Metropolis light transport (MLT) used in (Hachisuka and Jensen, 2011), and uniform photon mapping (PM). All results are rendered with 1024 iterations, each with 1M photons emitted. We report SSIM and MSE, with the best result highlighted in bold. The reference is produced with SPPM after 64K iterations. In addition, we show the result of path tracing with 64K samples per pixel (PT-64K) to validate the reference (for the Pool scene we increased the sample count to 512K due to the difficulty of rendering reflected caustics from the water surface). Readers are also referred to the supplemental material for full-sized results for better comparison. Our 3D Gaussian mixture achieves satisfactory guiding quality, as indicated by the accuracy metrics, and is more robust across all test scenes. H2D’s parallax issue limits its accuracy when fitting light sources with area: in (a) it produces more outliers, and it is unable to learn an appropriate distribution for a dome light with complex lighting peaks, as shown in (b). MLT achieves very good results when only a small portion of the whole distribution of photon emission contributes to the final image, as shown in (a) and (b). However, the non-uniform convergence issue can be observed when the space to explore is larger (i.e., a larger set of paths contributes to caustics). In (c), MLT produces smooth underwater caustics, but the caustics on the wall show bumpy artifacts, which indicates that even on the same surface, the distribution of photon density is not balanced. In (d), which contains 42 sphere light sources, each with a drastically different distribution, MLT fails to effectively explore the sample space, and the bottom-right caustics appear biased because too few photon samples are invested there.