Online Photon Guiding with 3D Gaussians for Caustics Rendering

Jiawei Huang (Chuzhou University, China; Void Dimensions, China), Hajime Tanaka (Tohoku University, Japan), Taku Komura (The University of Hong Kong, Hong Kong; Tohoku University, Japan), and Yoshifumi Kitamura (Tohoku University, Japan)

(2018)

Abstract.

In production rendering systems, caustics are typically rendered via photon mapping and gathering, a process often hindered by insufficient photon density. In this paper, we propose a novel photon guiding method that improves the photon density and the overall quality of caustic rendering. The key insight of our approach is the application of a global 3D Gaussian mixture model, used in conjunction with an adaptive light sampler. This combination effectively guides photon emission in expansive 3D scenes with multiple light sources. By employing a global 3D Gaussian mixture, our method models the distribution of the points of interest. To sample emission directions from the distribution at any observation point, we introduce a novel directional transform of the 3D Gaussian, which enables accurate photon emission guiding. Furthermore, our method integrates a global light cluster tree, which models the contribution distribution of light sources to the image and facilitates effective light source selection. Our experiments demonstrate that our approach robustly outperforms existing photon guiding techniques across a variety of scenarios, improving the quality of caustic rendering.


Figure 1. Comparative rendering results of the Ring scene, featuring intricate and aesthetically appealing caustics cast by metallic rings and diamonds, with various photon guiding techniques under a fixed budget of 1024 samples. (a) Our method leverages a 3D Gaussian mixture (G3D) to guide photon emission and improve the photon density in the visible area, achieving accurate photon density estimation with minimal overhead. (b) The 2D histogram (H2D) approach fails to fit an accurate distribution for area lights due to the parallax issue, yielding a noisy result. (c) Despite its higher computational overhead, the state-of-the-art Metropolis light transport (MLT) approach suffers from non-uniform convergence, leading to more outliers and a slightly higher overall error than our method. The reference is produced with the same integrated rendering system using 64K iterations.

	Reference	(a) G3D (Ours)	(b) H2D	(c) MLT
SSIM:		0.9505	0.9089	0.9499
MSE $\times 10^{3}$ :		2.88	5.42	2.91
Time:		77s	80s	141s

1. Introduction

In the realm of computer graphics, the rendering of caustics stands out for its visual appeal. In production rendering systems, caustics are usually sampled separately with photon mapping / photon tracing, while other components are path-traced. However, photon mapping requires a high photon density to render caustics accurately. This becomes a challenge when only a small portion of the scene is rendered, or when only a selection of objects cast caustics. Many rendering systems thus rely on photon guiding techniques to achieve robust photon mapping.

There has been significant research on developing effective photon guiding techniques. Lightweight Photon Mapping (Grittmann et al., 2018) utilizes 2D histograms to learn distributions and guide emission for each light source. Hachisuka and Jensen (2011) leverage Metropolis Light Transport (MLT) for more efficient photon emission sampling. However, these approaches have limitations. Lightweight Photon Mapping has to build a distribution for each individual light source, which leads to inaccurate fitting and growing overhead as the number of light sources increases; in addition, the accuracy of a 2D guiding map is inherently limited by the parallax issue with non-delta light sources (e.g., a rectangular light with non-negligible area, or an environment map with multiple peaks). MLT occasionally suffers from non-uniform convergence and requires considerable computational overhead. These limitations drive us to seek a different approach to the photon guiding problem.

In this paper, we propose a novel method that learns a global distribution for photon emission to maximize the photon density in the rendered image. Our key insight is to leverage the 3D Gaussian, a flexible and powerful distribution whose application in Monte Carlo rendering is less explored. With 3D Gaussians, we are able to model the global distribution of “points of interest” effectively. We derive a novel equation and method to sample directions from a 3D Gaussian at any observation point, directly and without bias, to guide photon emission. Alongside it, we introduce a novel adaptive light source sampler to form a complete photon guiding framework. Our method is compact and runs fast on a conventional GPU. We integrate our method with a unidirectional GPU path tracer to achieve efficient rendering of caustics.

In summary, the contributions of this paper include:

• An algorithm for sampling directions from a 3D Gaussian at any observation point, complete with a closed-form probability density function (PDF).
• A workflow to learn the distribution of points of interest for photon emission, represented with a 3D Gaussian mixture, utilizing online photon samples as training data.
• A novel, fast photon guiding framework that progressively learns a global distribution to guide photon emission for high photon density.

2. Related work

2.1. Photon mapping and photon guiding

Photon mapping is an efficient approach proposed by Jensen (1996) to calculate global illumination. Later research proposed several modifications to the scheme for more accurate density estimation and for adaptation to progressive rendering (Hachisuka and Jensen, 2009; Hachisuka et al., 2008). Recently, Zhu et al. (2020) proposed to improve the quality of density estimation with machine learning, and Misso et al. (2022) showed that it is possible to obtain unbiased results from photon mapping. Photon mapping is considered an effective approach to render caustics from specular surfaces such as glass and mirrors. Even in hybrid methods that combine multiple sampling algorithms for robust rendering (e.g., Georgiev et al., 2012), photon mapping is the common choice for caustic paths. Traditionally, photon emission in photon mapping is sampled uniformly. In practical scenarios, however, often only a small portion of a large scene benefits from rendering caustics with photon mapping (the “caustic in a stadium” problem), and the photon density obtained with uniform emission sampling is not sufficient for accurate estimation. To guide photon emission toward higher photon density, Hachisuka and Jensen (2011) adapt Metropolis light transport (MLT; Veach and Guibas, 1997), more specifically primary sample space MLT (Kelemen et al., 2002), and Grittmann et al. (2018) adapt 2D histograms. These ideas are implemented in production renderers (e.g., Šik and Krivánek, 2019; Burley et al., 2018; Droske et al., 2023). However, we found these approaches to exhibit certain limitations: MLT requires an intricate implementation and suffers from non-uniform convergence when the sample space is complex, while the 2D histogram suffers from an inherent parallax issue, since it projects high-dimensional distributions to 2D, and its overhead increases linearly with the number of light sources.
Our method innovatively fits a global 3D distribution, using a compact parametric model, to guide photon emission for all light sources. The fitting process is efficient, and the model only uses several tens of scalars, providing a robust solution for the photon guiding problem.

2.2. Distributions and sampling methods

Our research seeks to importance sample photon emission with an explicit global distribution. In Monte Carlo rendering, it has become common practice to importance sample different distributions for variance reduction. As a typical instance, almost every rendering system importance samples the local bidirectional scattering distribution function (BSDF) of surfaces (e.g., Trowbridge and Reitz, 1975; Oren and Nayar, 1994). However, sampling the local BSDF may not be optimal when indirect lighting is a major contribution to the result. Path guiding research leverages more general distributions to model indirect lighting. Jensen (1995) utilizes 2D histograms to model and importance sample the spherical incident radiance distribution. Many distribution models have been explored subsequently. Vorba et al. (2014) utilize 2D Gaussian mixtures for a similar purpose, with training data from a photon-emitting pass, to achieve a more practical method. Müller et al. (2017) propose an SD-tree to achieve a more compact histogram-based representation. Ruppert et al. (2020) propose to use the von Mises-Fisher distribution along with a parallax-aware fitting algorithm to remove the parallax issue caused by spatial partitioning. Dodik et al. (2022) propose to use 5D Gaussians to model the incident radiance distribution over space. Huang et al. (2023) propose an anisotropic spherical distribution that can be directly evaluated and sampled for a more accurate and compact representation. Despite their different algorithms, these methods share a common insight: they build and sample local directional distributions that are represented by a suitable distribution model.

Previous photon guiding methods adapt this idea to photon emission in a straightforward way. For example, Grittmann et al. (2018), Burley et al. (2018), and Droske et al. (2023) build a local distribution for each light source using 2D histograms; however, as mentioned above, this suffers from the parallax issue because actual light sources have area. Our research learns a global distribution of the surfaces that photons should be emitted towards, and we find that the 3D Gaussian serves this purpose well. However, guiding photon emission from a light source with 3D Gaussians requires sampling directions from the distribution. To the best of our knowledge, no existing algorithm addresses this. In this paper, we derive a novel directional transform for 3D Gaussians that enables sampling directions from it, to guide photon emission at any location.

2.3. Adaptive light source sampling

In addition to emission guiding, it is also important to importance sample light sources. Traditionally, light sources are sampled in proportion to their flux; however, an adaptive sampling strategy is preferable, as reported by Šik and Krivánek (2019), since the actual contribution of each light source does not fully depend on its flux when only a portion of the scene is rendered. Walter et al. (2005) propose Lightcuts, which utilizes a binary tree to cluster light sources for efficient rendering. Yuksel (2020) samples Lightcuts stochastically in Monte Carlo rendering to remove bias. Wang et al. (2021) propose to adaptively refine the importance of each node of the binary tree. Our adaptive light sampler can be seen as an adaptation of the idea of Wang et al. to photon guiding, with the difference that we progressively build the binary tree instead of starting from complete Lightcuts, and we approximate the importance of each node based on the sum of radiance from gathered photons.

3. Overview

Background

Our technique improves the photon density in the visible region to achieve a higher-quality photon mapping result. To explain our technique in the context of an actual implementation, we first briefly describe how we integrate photon mapping into a production renderer for caustics rendering. Since the target production renderer is a GPU unidirectional path tracer, we employ photon mapping to render caustics with a stochastic approach (Hachisuka and Jensen, 2009), leaving the rest to path tracing. In the target path tracer, the specular surfaces that cast caustics are explicitly marked as “casters”, while surfaces that receive caustics are marked as “receivers”. Only photons that travel through casters and reach receivers are recorded to construct the photon map. This is a typical approach in production rendering systems; however, under this setup, uniform emission cannot achieve sufficient photon density for proper density estimation, leading to poor results. Therefore, we propose to guide photon emission with a novel, online-learned distribution.

Integration

Fig. 2 outlines the components and workflow of our framework. The main rendering pipeline is a two-pass process. In the first pass (the photon tracing pass), we randomly trace a batch of photons, guided by our progressively refined distributions, and build a photon map with a KD-tree. The second pass is path tracing: in addition to regular path tracing, we gather nearby photons and estimate indirect lighting at diffuse surfaces. During gathering, we also record the data required for distribution learning. After each rendering iteration, the distribution is refined based on the recorded data. Over the rendering iterations, our distribution converges to one that maximizes the number of gathered photons. The distribution consists of two parts: a 3D Gaussian mixture that represents the spatial distribution of points of interest, which is used to guide later photon emission from light sources (see § 4); and a tree-based 1D distribution that is used for light source sampling (see § 5).

4. Learning to guide photon emission with 3D Gaussians

An (unnormalized) isotropic 3D Gaussian is given by the equation:

(1)

G(\mathbf{x})=\exp\left(-\frac{\|\mathbf{x}-\mathbf{\mu}\|^{2}}{2\sigma^{2}}\right),

where $\mathbf{x}$ is the position vector in 3D space, $\mathbf{\mu}$ is the mean vector representing the center of the Gaussian distribution and $\sigma$ is the standard deviation. A 3D Gaussian mixture with $N$ isotropic 3D Gaussians can effectively model distributions over the 3D space:

(2)

M(\mathbf{x})=\sum_{i=1}^{N}\frac{w_{i}}{\sqrt{(2\pi\sigma^{2})^{3}}}G_{i}(\mathbf{x}),

where $G_{i}$ are the 3D Gaussian components, and $w_{i}$ are the corresponding weights which sum to 1. The normalizing constant is the integral of each $G_{i}$ . As shown in Fig. 3, for any set of light sources, our method fits a global 3D Gaussian mixture, which is used to guide the later photon emission.
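As a minimal sketch of Eqs. (1) and (2), the mixture can be evaluated as follows. The function names and the `(weight, mean, sigma)` tuple layout are illustrative assumptions, not part of the paper's renderer, and each component is assumed to carry its own standard deviation.

```python
import math

def gaussian3d(x, mu, sigma):
    """Unnormalized isotropic 3D Gaussian G(x) of Eq. (1)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    return math.exp(-dist2 / (2.0 * sigma * sigma))

def mixture_pdf(x, components):
    """Normalized mixture M(x) of Eq. (2).

    components: list of (weight, mean, sigma) tuples (assumed layout);
    the weights are assumed to sum to 1.
    """
    total = 0.0
    for weight, mu, sigma in components:
        # Integral of the unnormalized Gaussian over R^3.
        norm = math.sqrt((2.0 * math.pi * sigma * sigma) ** 3)
        total += weight / norm * gaussian3d(x, mu, sigma)
    return total
```

With eight components, as used in the evaluation, such a model stores only a few tens of scalars.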

In the rest of this section we describe the workflow to fit the 3D Gaussian mixture using gathered photons, and the method we derived to sample emission direction from a 3D Gaussian.

4.1. Learning with gathered photons

Our goal is to model a distribution to achieve maximum photon density in the visible region, and eventually improve the accuracy of density estimation. Based on the fact that caustics photons are usually gathered at a diffuse surface after a sequence of specular bounce, we only need to fit a distribution at the location of the first bounce, which we define as the point of interest. To learn the distribution of points of interest, when emitting photons, we record the location of the first bounce and the emission pdf (directional pdf $\cdot$ positional pdf), along with the original photon data. In addition, every photon has an indicator of how many times it is gathered.

We learn the distribution with gradient descent using Kullback-Leibler divergence (KL divergence), adapting the approach proposed in (Müller et al., 2019). First, we define the learned 3D Gaussian mixture as $q(\mathbf{x};\gamma)$ , where $\gamma$ is the vector of parameters. The target distribution $p(\mathbf{x})$ , which we are trying to fit, is unknown. However, for each point of interest, the basic rule is that the more times the photons are gathered the higher the density should be, thus we define:

(3)

p(\mathbf{x})=g\cdot T(\mathbf{x}),

where $T(\mathbf{x})$ denotes how many times a photon bounced from $\mathbf{x}$ is gathered in the current pass, and $g$ is an unknown global scaling factor. The fitting can then be achieved by minimizing a KL divergence:

(4)

D_{KL}(p(\mathbf{x})\|q(\mathbf{x};\gamma))=\int_{\mathbb{R}^{3}}p(\mathbf{x})(\log[p(\mathbf{x})]-\log[q(\mathbf{x};\gamma)])\,d\mathbf{x}.

We minimize it with gradient descent over $\gamma$ . Notice that $\log[p(\mathbf{x})]$ does not depend on $\gamma$ and therefore vanishes from the gradient, so we have

(5)

\nabla_{\gamma}D_{KL}(p(\mathbf{x})\|q(\mathbf{x};\gamma))=-\nabla_{\gamma}\int_{\mathbb{R}^{3}}p(\mathbf{x})\log[q(\mathbf{x};\gamma)]\,d\mathbf{x}.

Although this requires an integral over 3D space, we can use a one-sample Monte Carlo estimate, where $\mathbf{x}$ is drawn with sampling density $\hat{q}(\mathbf{x})$ :

(6)

\nabla_{\gamma}D_{KL}(p(\mathbf{x})\|q(\mathbf{x};\gamma))=-\,\mathbb{E}\left[\frac{p(\mathbf{x})}{\hat{q}(\mathbf{x})}\nabla_{\gamma}\log[q(\mathbf{x};\gamma)]\right].

In our rendering process, $\hat{q}(\mathbf{x})$ is the emission pdf with which each emitted photon is sampled. We then replace $p(\mathbf{x})$ with $g\cdot T(\mathbf{x})$ ; by using a moment-based optimizer, the global scaling factor $g$ is effectively cancelled.
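To make the one-sample estimate of Eq. (6) concrete, the sketch below computes the gradient contribution of a single photon with respect to the component means only. It is an illustration under stated assumptions: the paper relies on automatic differentiation over all parameters, whereas this version uses the analytic gradient of an isotropic mixture, drops the unknown factor $g$, and uses hypothetical names and a `(weight, mean, sigma)` tuple layout.

```python
import math

def component_pdf(x, mu, sigma):
    """Normalized isotropic 3D Gaussian density."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    norm = math.sqrt((2.0 * math.pi * sigma * sigma) ** 3)
    return math.exp(-dist2 / (2.0 * sigma * sigma)) / norm

def mixture_pdf(x, comps):
    return sum(w * component_pdf(x, mu, s) for w, mu, s in comps)

def grad_log_q_means(x, comps):
    """Gradient of log q(x; gamma) with respect to each component mean."""
    q = mixture_pdf(x, comps)
    grads = []
    for w, mu, s in comps:
        # Responsibility of this component for the sample x.
        resp = w * component_pdf(x, mu, s) / q
        grads.append(tuple(resp * (a - b) / (s * s) for a, b in zip(x, mu)))
    return grads

def kl_grad_sample(x, gathered, q_hat, comps):
    """One-sample estimate of Eq. (6), restricted to the means.

    gathered plays the role of T(x) and q_hat is the emission pdf of the
    photon; the global factor g is omitted because the text leaves its
    cancellation to the moment-based optimizer.
    """
    scale = -gathered / q_hat
    return [tuple(scale * g for g in gk) for gk in grad_log_q_means(x, comps)]
```

A photon that is never gathered has $T(\mathbf{x})=0$ and contributes a zero gradient, so only useful first-bounce locations pull the mixture toward them.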

4.2. Sampling directions from 3D Gaussian

In this section, we describe a novel approach to transform a global 3D Gaussian mixture, which represents the distribution of photons, into a local directional distribution that represents the photon emission from a given light source position.

Given an observation point $\mathbf{x}_{0}$ , we can define the north pole $\mathbf{z}$ pointing from $\mathbf{x}_{0}$ to $\mu$ . Then, we are able to represent any 3D point $\mathbf{x}$ using distance $r=\|\mathbf{x}_{0}-\mathbf{x}\|$ and the angle $\theta$ between $(\mathbf{x}-\mathbf{x}_{0})$ and $\mathbf{z}$ :

(7)		$\displaystyle d=\|\mathbf{x}_{0}-\mathbf{\mu}\|$
(8)		$\displaystyle\mathbf{z}=\frac{\mu-\mathbf{x}_{0}}{d}$

Then, $\|\mathbf{x}-\mu\|^{2}$ in Eq. (1) can be represented with distance $r=\|\mathbf{x}_{0}-\mathbf{x}\|$ and $\theta$ :

(9)

\lVert\mathbf{x}-\mu\rVert^{2}=\lVert(\mathbf{x}-\mathbf{x}_{0})-(\mu-\mathbf{x}_{0})\rVert^{2}=(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta.

With these parameters, we transform a 3D Gaussian to spherical coordinates centered at $\mathbf{x}_{0}$ :

(10)

G(\omega,r)=\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right).

The integral of 3D Gaussian can be then transformed to the spherical form:

(11)

\int_{\mathbb{R}^{3}}G(\mathbf{x})\,d\mathbf{x}=\int_{\Omega}\int_{0}^{\infty}\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right)r^{2}\,dr\,d\omega.

By matching Eq. (11) with the definition of a directional distribution, we derive the directional distribution of a 3D Gaussian at observation point $\mathbf{x}_{0}$ as:

(12)

f_{o}(\omega)=\int_{0}^{\infty}\exp\left(-\frac{(r-d\cos\theta)^{2}+d^{2}\sin^{2}\theta}{2\sigma^{2}}\right)r^{2}\,dr.

Actually, this integral has a closed-form solution:

(13)

\begin{split}f_{o}(\omega)=&\,\sigma^{2}e^{-\frac{d^{2}}{2\sigma^{2}}}d\cos\theta\\ &+\sqrt{\frac{\pi}{2}}\sigma e^{-\frac{d^{2}\sin^{2}\theta}{2\sigma^{2}}}\left(\sigma^{2}+d^{2}\cos^{2}\theta\right)\\ &\times\left(1+\text{erf}\left(\frac{d\cos\theta}{\sqrt{2}\sigma}\right)\right).\end{split}

This is our unnormalized directional PDF of the 3D Gaussian; readers are referred to the appendix for a complete derivation. Since the normalizing term is the integral of the original 3D Gaussian over real space, the normalized form is given as:

(14)

F_{o}(\omega)=\frac{1}{\sqrt{(2\pi\sigma^{2})^{3}}}f_{o}(\omega).

The distribution can be evaluated efficiently: it involves only a few elementary transcendental functions and the error function, and the error function can be accurately approximated with the Abramowitz and Stegun approximation (1964). Despite the complex form, sampling this distribution is also simple: if we sample a 3D point from the original 3D Gaussian, the direction pointing to it from the observation point follows Eq. (14). The appendix provides the complete proof of this sampling algorithm. Together, these properties give efficient, unbiased sampling of the spherical transform of a 3D Gaussian.
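The closed form of Eqs. (13) and (14) and the sampling procedure can be sketched as follows. This is a plain CPU illustration with hypothetical function names; it uses the standard library `math.erf` rather than the Abramowitz and Stegun approximation, and it makes no attempt to handle the cancellation in $1+\text{erf}(\cdot)$ for directions pointing far away from a distant, narrow Gaussian.

```python
import math
import random

def directional_pdf(cos_theta, d, sigma):
    """Normalized directional density F_o(omega) of Eqs. (13)-(14).

    cos_theta: cosine of the angle between omega and the pole z,
    d: distance from the observation point to the Gaussian mean.
    """
    a = d * cos_theta
    sin2 = max(0.0, 1.0 - cos_theta * cos_theta)
    s2 = sigma * sigma
    term1 = s2 * math.exp(-d * d / (2.0 * s2)) * a
    term2 = (math.sqrt(math.pi / 2.0) * sigma
             * math.exp(-d * d * sin2 / (2.0 * s2))
             * (s2 + a * a)
             * (1.0 + math.erf(a / (math.sqrt(2.0) * sigma))))
    return (term1 + term2) / math.sqrt((2.0 * math.pi * s2) ** 3)

def sample_direction(x0, mu, sigma, rng=random):
    """Sample a 3D point from the Gaussian and return the unit direction
    from the observation point x0 towards it."""
    point = [rng.gauss(m, sigma) for m in mu]
    v = [p - o for p, o in zip(point, x0)]
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)
```

Two quick sanity checks follow from the formula: for $d=0$ the density reduces to the uniform value $1/(4\pi)$, and integrating it over the sphere gives 1 for any $d$ and $\sigma$.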

Our transformation of the 3D Gaussian essentially builds a connection between a global distribution and local directional distributions. This can potentially lead to a shift from using multiple local directional distributions (e.g., von Mises-Fisher (vMF) mixtures) to one global 3D Gaussian mixture, which could significantly reduce redundancy and improve learning efficiency. Furthermore, it could eliminate the parallax issue introduced by local discrete distributions.

Similarity with von Mises-Fisher distribution

$d$ and $\sigma$ determine the shape of $F_{o}(\omega)$ . Interestingly, we find the shape of this spherical distribution to be very similar to that of the 3D von Mises-Fisher distribution (vMF):

(15)

F_{v}(\omega~{}|~{}\kappa)=\frac{\kappa}{4\pi\sinh(\kappa)}e^{\kappa\cos\theta},

where $\theta$ is the angle between $\omega$ and vMF’s mean vector $\nu$ . Actually, the north pole $\mathbf{z}$ used in Eq. (14) works the same way as the mean vector $\nu$ of vMF (thus, in our discussion, we always assume $\mathbf{z}$ and $\mathbf{\nu}$ are the same). Fig. 4 presents a comparative visualization: it displays $F_{o}(\omega)$ under two distinct parameter sets alongside a vMF distribution. The parameters of the $F_{v}$ are meticulously chosen to achieve a close match with the shape of $F_{o}(\omega)$ .

4.3. Sampling Photon Emission with 3D Gaussian Mixture

Photon emission sampling comprises two principal stages: determining the emission location $\mathbf{x}_{0}$ , and selecting the emission direction $\omega_{0}$ . Utilizing 3D Gaussians, we can precisely guide photon emissions from different types of light sources, each following a unique approach:

Area and Point Lights: For both area and point lights, we first sample the emission location $\hat{\mathbf{x}}_{0}$ . In the case of area lights, we perform this sampling uniformly across the surface. For point lights, the location is a fixed point pre-defined in our setup. Subsequently, we determine the emission direction using the method we describe in § 4.2.
Infinite Light: We first sample the direction $\hat{\omega}_{0}$ from the infinite light’s distribution. Then, we define a plane that passes through the center of the scene and is perpendicular to $\hat{\omega}_{0}$ . Upon projecting our 3D Gaussian mixture onto this plane, we obtain the corresponding 2D Gaussian mixture. We sample a point $\hat{\mathbf{x}}_{0}^{\prime}$ from this 2D distribution on the plane. The actual emission location $\hat{\mathbf{x}}_{0}$ is then calculated as $\hat{\mathbf{x}}_{0}=\hat{\mathbf{x}}_{0}^{\prime}-b\hat{\omega}_{0}$ , where $b$ is the radius of the scene’s bounding sphere.
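The infinite-light case can be sketched as below. The sketch relies on the fact that projecting a sample of an isotropic 3D Gaussian onto a plane yields a sample of the corresponding 2D Gaussian in that plane, so no explicit 2D mixture is constructed. The function name, argument layout, and the `(weight, mean, sigma)` tuples are assumptions made for illustration; `omega` is assumed to be a unit vector.

```python
import random

def sample_infinite_light_origin(omega, comps, center, b, rng=random):
    """Emission position for an infinite light with travel direction omega.

    comps: list of (weight, mean, sigma); center: scene center;
    b: radius of the scene's bounding sphere.
    """
    # Select one mixture component in proportion to its weight.
    u, acc = rng.random(), 0.0
    for weight, mu, sigma in comps:
        acc += weight
        if u <= acc:
            break
    # Sample a 3D point, then project it onto the plane through `center`
    # that is perpendicular to omega.
    point = [rng.gauss(m, sigma) for m in mu]
    t = sum((p - c) * o for p, c, o in zip(point, center, omega))
    on_plane = [p - t * o for p, o in zip(point, omega)]
    # Move the origin back by the bounding-sphere radius.
    return tuple(q - b * o for q, o in zip(on_plane, omega))
```

The returned position always lies at signed distance $-b$ from the plane along $\hat{\omega}_{0}$, so the photon starts outside the scene and travels through it.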

5. Adaptive Light Sampler

When only a portion of the objects cast caustics, an independent light source sampler helps to reduce variance and improve photon density. Šik and Krivánek (2019) mention that a light source sampler is used for photon mapping, but they do not provide details of the method. In this section, we introduce our adaptive light sampler, which progressively learns an appropriate distribution of light sources for photon emission.

Our light sampler is inspired by (Wang et al., 2021) and (Müller et al., 2017): we store a binary tree whose nodes hold the photon count gathered in the last iteration. Each node represents a range of light sources according to the node’s position and depth. For example, the root node represents the full set of light sources, while its left child represents the first half. Starting from a single root node, in each sampling iteration we record how many gathered photons originate from the range of lights a node represents. When the photon count of a leaf node exceeds a threshold, we split the leaf node into two children. To sample this light tree, we descend from the root, choosing a child at each node with probability proportional to its recorded count. After several iterations, we obtain a close approximation of the importance distribution of the light sources.
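The tree described above could look roughly like the following sketch. Several details are assumptions made only to keep the example small and unbiased: counts accumulate over iterations instead of being reset, a freshly split leaf hands half of its count to each child, child probabilities are clamped by a hypothetical `min_prob` so that every light keeps a non-zero chance, and lights inside a leaf are chosen uniformly.

```python
import random

class LightNode:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi      # covers light indices [lo, hi)
        self.count = 0.0               # gathered-photon count
        self.left = None
        self.right = None

class LightTree:
    def __init__(self, num_lights, split_threshold=64.0, min_prob=0.05):
        self.root = LightNode(0, num_lights)
        self.split_threshold = split_threshold
        self.min_prob = min_prob

    def record(self, light, gathered=1.0):
        """Add gathered photons to every node whose range contains `light`."""
        node = self.root
        while True:
            node.count += gathered
            if node.left is None:
                return
            node = node.left if light < node.left.hi else node.right

    def refine(self):
        """Split each leaf whose count exceeds the threshold (one level per call)."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node.left is not None:
                stack.extend((node.left, node.right))
            elif node.count > self.split_threshold and node.hi - node.lo > 1:
                mid = (node.lo + node.hi) // 2
                node.left = LightNode(node.lo, mid)
                node.right = LightNode(mid, node.hi)
                node.left.count = node.right.count = 0.5 * node.count

    def sample(self, rng=random):
        """Descend the tree; return (light index, selection probability)."""
        node, pdf = self.root, 1.0
        while node.left is not None:
            total = node.left.count + node.right.count
            p_left = node.left.count / total if total > 0.0 else 0.5
            p_left = min(max(p_left, self.min_prob), 1.0 - self.min_prob)
            if rng.random() < p_left:
                node, pdf = node.left, pdf * p_left
            else:
                node, pdf = node.right, pdf * (1.0 - p_left)
        size = node.hi - node.lo
        return node.lo + rng.randrange(size), pdf / size
```

The returned probability is what the photon's flux would be divided by, in the same way as for any other discrete light selection scheme.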

6. Implementation Details

We implement our method in a GPU production renderer. In this section we provide implementation details in addition to the basic algorithms.

Progressive learning of 3D Gaussian mixture

Although we use a gradient-based learning approach to fit the global 3D Gaussian mixture, we do not use existing learning frameworks. Instead, we implement a high-performance learning kernel utilizing the automatic differentiation feature of LuisaRender (2022), which integrates seamlessly with our rendering system’s pipeline. The learning process requires a moment-based optimizer, for which we utilize Adam (2014). The learning rate is initially set to 0.1 and then progressively scaled down to 0.001.

Searching radius

The searching radius is an essential parameter for gathering. We employ an adaptive searching radius approach, where a relatively large maximum radius is initially set. For each shading point, we gather the four nearest photons, and the distance to the farthest photon among these dictates the actual searching radius. This allows for dynamic adjustment of the radius in response to the increasing photon density achieved through photon guiding.
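The radius rule can be illustrated with a brute-force sketch. A real implementation would query the KD-tree instead of scanning all photons; the function name and the fallback to the maximum radius when fewer than four photons are in range are assumptions.

```python
import heapq
import math

def adaptive_radius(shading_point, photons, max_radius, k=4):
    """Distance to the k-th nearest photon within max_radius.

    photons: iterable of 3D positions. Falls back to max_radius when
    fewer than k photons lie inside the maximum radius (assumed behavior).
    """
    in_range = (d for d in (math.dist(shading_point, p) for p in photons)
                if d <= max_radius)
    nearest = heapq.nsmallest(k, in_range)   # sorted ascending
    if len(nearest) < k:
        return max_radius
    return nearest[-1]
```

As guiding concentrates photons in the visible region, the fourth-nearest photon moves closer and the effective radius shrinks without any manual schedule.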

Initialization Process

In line with standard practice in distribution-fitting applications, our 3D Gaussian mixture model needs a reasonable initial guess for the fitting process to start effectively. Fortunately, this requirement can be satisfied with the k-means clustering method. Specifically, for a 3D Gaussian mixture comprising $N$ components, we employ the k-means method to obtain $N$ mean positions from the initial position data in the first photon pass. These mean positions then serve as the initial guesses for the Gaussian components. Regarding the standard deviation parameter ( $\sigma$ ), we opt for the median value of our encoding. Additionally, each component of the mixture is assigned an equal weight ( $\frac{1}{N}$ ).
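A minimal sketch of this initialization, using plain Lloyd iterations, is given below. The iteration count, the random choice of starting centers, and the names are assumptions; the initial standard deviation is passed in as `sigma0` because the encoding that determines it is not detailed here.

```python
import random

def kmeans_centers(points, n, iters=10, rng=random):
    """Lloyd's k-means on 3D points; returns n cluster centers."""
    centers = rng.sample(points, n)
    for _ in range(iters):
        sums = [[0.0, 0.0, 0.0] for _ in range(n)]
        counts = [0] * n
        for p in points:
            # Assign the point to its nearest center.
            j = min(range(n),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            counts[j] += 1
            for axis in range(3):
                sums[j][axis] += p[axis]
        # Keep the old center when a cluster received no points.
        centers = [tuple(s / counts[j] for s in sums[j]) if counts[j] else centers[j]
                   for j in range(n)]
    return centers

def init_mixture(points, n, sigma0, rng=random):
    """Initial mixture: k-means means, a shared sigma0, equal weights 1/n."""
    return [(1.0 / n, mu, sigma0) for mu in kmeans_centers(points, n, rng=rng)]
```

Here `points` would be the first-bounce positions recorded in the first photon pass.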

Multiple importance sampling with uniform emission

Following existing path guiding techniques, we sample from both the 3D Gaussian mixture and the uniform sampling techniques. To blend these two techniques, we set a selection probability $\beta$ for 3D Gaussian mixture. Starting from 0, $\beta$ gradually increases as the fitting continues, and stops at $S$ . In our implementation, we set $S=0.75$ .
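The blending can be written as a one-sample mixture of the two strategies, as sketched below. The linear ramp in `beta_schedule` and its `ramp_iters` parameter are assumptions (the text only states that $\beta$ grows gradually from 0 to $S$), and the sampler and pdf callables are placeholders for the guided and uniform emission routines.

```python
import random

def beta_schedule(iteration, ramp_iters, s=0.75):
    """Selection probability for the guided strategy; assumed linear ramp to S."""
    return s * min(1.0, iteration / ramp_iters)

def sample_emission(beta, sample_guided, pdf_guided,
                    sample_uniform, pdf_uniform, rng=random):
    """Pick one strategy, then evaluate the pdf of the blended distribution."""
    if rng.random() < beta:
        direction = sample_guided()
    else:
        direction = sample_uniform()
    # The effective density is the beta-weighted mixture of both strategies,
    # regardless of which one produced the sample.
    pdf = beta * pdf_guided(direction) + (1.0 - beta) * pdf_uniform(direction)
    return direction, pdf
```

Because the uniform strategy always keeps probability at least $1-S$, the blended pdf stays positive wherever uniform emission is, which keeps regions the mixture has not yet learned reachable.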

7. Evaluation

We evaluate our method by rendering scenes with different configurations and techniques and comparing the quality. All the images in this section are rendered at 1920 $\times$ 1080 resolution, and the full-sized images are included in the supplemental materials. All renderings are run on the same conventional PC with an Intel Core i7-9700K CPU and an Nvidia RTX 4070 GPU. In all the experiments, our method uses an 8-component 3D Gaussian mixture.

7.1. Guiding components

Our method importance samples photon emission and light source, with 3D Gaussian mixture and proposed adaptive light sampler respectively, to improve photon density. We evaluate the improvement of both components by rendering several representative scenes with a fixed 1-minute time budget, with and without the two components. Our reference images are rendered using SPPM too, since noiseless caustics are difficult for path tracing to produce. To produce the reference images, we render for 64K iterations with very small searching radius and 1M photons: 64 billion photons in total. We evaluate the results by comparing their difference from the reference, measured in structural similarity index measure (SSIM, higher is better) and mean squared error (MSE, lower is better). The results are shown in Fig. 5.

We observe significant improvement with our method in all three scenes. Although fewer samples are completed within the fixed time budget, the error is greatly reduced. The lower sample count is mainly due to the increased photon density in each iteration, since a larger KD-tree is built and traversed. The Pool scene (Fig. 5 (b)) demonstrates a typical situation where only a portion of the whole scene casts or receives caustics, and photon guiding greatly improves the photon density: compared to uniform emission, our method achieves sharper caustics, and the lower error indicates a more accurate approximation. The Ring scene (Fig. 5 (c)) demonstrates where our adaptive light source sampler makes a difference: the “Light Sampler” result shows a clear improvement over the “Uniform” one.

7.2. Learned result

Fig. 6 (a) shows a visualization of the learned 3D Gaussian mixture, transformed to directional distributions at two arbitrary observation points (i.e., the position of the camera). In Fig. 6 (b) we visualize the photon density from one rendering iteration of the same scene. In this iteration of the Tabletop scene, with our photon guiding method, the photon count in the visible area increases from 20,858 to 211,034: $9.12\times$ more photons ( $10.12\times$ the original count), which leads to an overall higher density and more accurate estimation.

7.3. State-of-the-art Comparison

The 2D histogram approach (H2D) used in (Grittmann et al., 2018) and the Metropolis light transport approach (MLT) proposed in (Hachisuka and Jensen, 2011) serve a purpose similar to that of our photon guiding method. We compare our tailored implementations of H2D and MLT with our method in the same production renderer. In this comparison, we use the same maximum searching radius for all methods (while keeping the adaptive strategy described above). While adhering closely to the fundamental concepts of the original methods, we had to make some modifications in our implementations for adaptation purposes, as described below:

H2D: For each light source, we construct a $256\times 256$ distribution map. We map directions from the light origin to points of interest onto cylindrical coordinates to obtain 2D distributions for area and point lights. For infinite directional lights, we project the points of interest onto a plane that passes through the scene’s center and is perpendicular to the light direction to obtain their 2D distributions. At the end of every two rendering iterations, we blur the 2D maps that record how many times photons from each cell are gathered, and use them to build the 2D distributions, represented as alias tables. We use the resulting 2D distributions to guide directions for area and point lights, and positions for directional lights. However, H2D is not applicable to dome lights (environment maps), because dome lights can contain multiple dominant directions. We utilize the light sampler proposed in § 5 along with H2D.

MLT: We adapt MLT for GPU implementation, following the principles described in (Hachisuka and Jensen, 2011). Our importance function is binary: for any photon $\mathbf{v}$ , $F_{i}(\mathbf{v})=1$ if $\mathbf{v}$ arrives at a receiver surface after several specular bounces on caster surfaces, and $F_{i}(\mathbf{v})=0$ otherwise. This is a robust choice for our integration scheme, where we explicitly mark caustic casters and receivers and where photon emission and gathering happen in different passes. Our GPU integration emits $2^{20}$ photons per iteration in parallel, and each thread maintains its own MLT sampler. For the bootstrap phase, we emit $2^{30}$ photons ( $1024\times$ the number of MLT samplers) with random states to explore the sample space comprehensively, and the accepted states form an initial state pool. After this, each MLT sampler randomly selects a state from this pool, ensuring a diverse yet representative starting point for exploration. In the rendering phase, when the importance function returns 1 for a photon, the photon is recorded and the state is accepted; otherwise, the previously accepted photon is recorded and the MLT sampler reverts to the previous state. We employ a normal distribution with a manually selected variance for mutation, because the adaptive mutation from (Hachisuka and Jensen, 2011) requires a per-sample update of the mutation scale, and a simple adaptation of it to our GPU-based SPPM scheme gives poor results. We set a probability of 0.3 for large-step mutation.
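The accept/reject rule of this MLT baseline can be sketched per sampler as follows. With a binary importance function and a symmetric mutation, the Metropolis acceptance probability reduces to the importance of the proposal whenever the current state has importance 1. The state layout (a list of primary-sample-space numbers in the unit interval), the wrap-around of small steps, and the function names are assumptions for illustration.

```python
import random

def mutate(state, sigma, p_large=0.3, rng=random):
    """Propose a new primary-sample-space state.

    With probability p_large take a large step (fresh uniform numbers);
    otherwise perturb each coordinate with a normal step of standard
    deviation sigma, wrapped back into the unit interval.
    """
    if rng.random() < p_large:
        return [rng.random() for _ in state]
    return [(u + rng.gauss(0.0, sigma)) % 1.0 for u in state]

def mlt_step(state, propose, importance):
    """One Metropolis step with a binary importance function.

    Returns (next_state, accepted). On rejection the previous state is
    kept, so its photon would be recorded again.
    """
    proposal = propose(state)
    if importance(proposal) == 1:
        return proposal, True
    return state, False
```

In the setup described above, `importance` would trace the photon generated from the state and report whether it reached a receiver via casters, and the initial `state` would come from the bootstrap pool.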

The performance of the various methods can vary significantly with implementation specifics, particularly in a GPU rendering system. We therefore compare them under a consistent sample budget, which gives a direct assessment of rendering quality independent of time constraints, and also report execution times for completeness. The test scenes and the results of the different methods are shown in Fig. 1 and Fig. 7. Our method produces more robust results across all test scenes, outperforming H2D and MLT in four scenes and closely matching MLT in the Pool scene (Fig. 7 (c)) at about 62% of its time cost (82s versus 132s).

The Ring scene (Fig. 1) is a challenging scene that features a complex lighting setup and intricate caustics. Our global 3D Gaussian mixture robustly guides photon emission from different light sources at arbitrary locations, while the adaptive light sampler selects light sources accurately according to their importance.

With H2D, the transition from 3D to 2D distributions introduces parallax issues, which are particularly problematic for light sources with considerable area. For instance, the lights illuminating the diamonds in the Ring scene (Fig. 1) are rectangles, and H2D struggles to guide such lights because of parallax. In the Torus scene (Fig. 7 (b)), the complex dome light with multiple peaks is beyond H2D's capability, so no H2D result is reported. Similarly, in the Shelf scene (Fig. 7 (a)), the H2D result contains more outliers due to the small angular size of the directional light ($5^{\circ}$). In contrast, our 3D Gaussian method provides accurate directional distributions for any position and corrects perspective for any direction, ensuring superior guiding quality.
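The parallax issue can be made concrete with a small numeric sketch. The geometry below is hypothetical (a rectangular light two units above a point of interest) and the helper names are ours; the point is only that a direction stored relative to the light origin is wrong for emission points elsewhere on the light.

```python
import math

def direction(origin, target):
    """Unit vector pointing from origin to target."""
    d = [t - o for o, t in zip(origin, target)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]

def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))

# Hypothetical setup: light centered at (0, 2, 0), point of interest at origin.
poi = (0.0, 0.0, 0.0)
light_center = (0.0, 2.0, 0.0)
light_edge = (1.0, 2.0, 0.0)   # emission point one unit from the center

stored = direction(light_center, poi)   # what a per-light 2D map records
needed = direction(light_edge, poi)     # what the edge point actually needs
error = angle_deg(stored, needed)       # atan(1/2), roughly 26.57 degrees
```

A photon emitted from the edge point along the stored direction lands one unit away from the point of interest, and the error grows with the size of the light relative to its distance from the caster.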

MLT achieves quality very close to that of our method; however, by its nature it struggles to explore a wide range of the sample space. In the Ring scene (Fig. 1), MLT fails to distribute samples uniformly across the light sources: the caustics from the diamonds are well rendered, but the ring caustics are noisy. In the Tabletop scene (Fig. 7 (d)), MLT achieves high-quality photon guiding overall, but the result in the bottom-right region is far off. We attribute this to MLT assigning too few samples to the corresponding region of the sample space. Note that we invested a large number of samples in the bootstrap phase and applied large-step mutation during sampling, yet MLT still cannot cover the sample space uniformly.

8. Discussion

Ease of Implementation

While implementing the different techniques, we found our method simple to implement, mostly because of its single global distribution. The model is independent of the number of light sources, which simplifies integration and makes it less susceptible to the complexities that arise with varying lighting setups. In contrast, the 2D histogram method requires frequent and substantial updates to its maps, which become computationally demanding as the number of light sources increases. MLT, despite its effectiveness, is known for its intricate implementation and requires more careful handling; it also incurs larger computational and memory overhead. With GPU-based gradient descent, our learning process is very efficient and has a minimal memory footprint.

Cosine-weighted guiding

Theoretically, our method can be extended to cosine-weighted guiding of the emission direction. We have shown that the spherical distribution in Eq. (14), obtained using our novel directional transform of the 3D Gaussian, is similar in shape to the vMF distribution $F_{v}$ (although the two are essentially different). For an area light, if we first approximate $F_{o}$ with $F_{v}$, and use another $F_{v}$ that resembles the cosine-weighted distribution of directions about the surface normal, we can compute the product of the two, which is still a vMF distribution. Sampling the product vMF then yields cosine-weighted guiding for diffuse area lights. However, finding an $F_{v}$ that approximates $F_{o}$ is expensive, since it requires solving a transcendental equation that connects the parameters of $F_{o}(\omega \mid d,\sigma)$ and $F_{v}(\omega \mid \kappa)$. We leave this as future work.
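For reference, the closure of vMF distributions under multiplication follows directly from their exponential form. The derivation below writes the mean direction $\mu$ explicitly, which is our notation for this sketch and is suppressed in $F_{v}(\omega \mid \kappa)$ above. With

$$
F_{v}(\omega \mid \mu, \kappa) = \frac{\kappa}{4\pi \sinh\kappa}\, \exp\!\left(\kappa\, \mu \cdot \omega\right),
$$

the product of two lobes satisfies

$$
F_{v}(\omega \mid \mu_{1}, \kappa_{1})\, F_{v}(\omega \mid \mu_{2}, \kappa_{2}) \propto \exp\!\left((\kappa_{1}\mu_{1} + \kappa_{2}\mu_{2}) \cdot \omega\right) \propto F_{v}(\omega \mid \mu_{12}, \kappa_{12}),
$$

$$
\kappa_{12} = \lVert \kappa_{1}\mu_{1} + \kappa_{2}\mu_{2} \rVert, \qquad \mu_{12} = \frac{\kappa_{1}\mu_{1} + \kappa_{2}\mu_{2}}{\kappa_{12}}.
$$

The product is therefore a vMF lobe up to a normalization constant, with the degenerate case $\kappa_{12} = 0$ corresponding to a uniform distribution on the sphere. Note that representing the cosine-weighted distribution by a vMF lobe is itself an approximation.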

Limitations

Our global distribution approach requires only minimal overhead, but it struggles when the actual distribution differs drastically from one light source to another, although this seldom occurs in practice. In most caustics rendering scenarios, efficiency is already greatly increased by guiding photons towards specular surfaces. This limitation can be overcome by assigning and training a unique 3D Gaussian mixture for each light source, at the cost of additional computational overhead.

In caustics rendering, many photon paths involve multi-lobe BSDF sampling (e.g., glass has both a reflection lobe and a refraction lobe). For such paths, we currently sample lobes according to their BSDF weights; the variance could be reduced further if the lobe were selected based on its actual contribution to the image. In fact, in the Pool scene comparison (Fig. 7 (c)), MLT achieves the best result mainly because it invests more samples in the reflection off the water surface. Adopting a variance-aware distribution, as suggested by Rath et al. (2020), is a conceivable solution. However, the practical implementation of such a concept within our gradient-based learning remains unexplored, and we leave it as future work.
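Lobe selection by BSDF weight can be sketched as a discrete sampling step that also returns the selection probability, which the path throughput must be divided by. The function name `select_lobe` and the example weights (a Fresnel reflectance of 0.04) are hypothetical; the same routine could take learned contribution estimates in place of BSDF weights.

```python
def select_lobe(weights, u):
    """Pick a lobe index with probability proportional to its weight.

    weights: non-negative per-lobe weights (e.g. Fresnel reflectance and
             transmittance); u: uniform random number in [0, 1).
    Returns (index, probability of that index).
    """
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("at least one lobe must have positive weight")
    cdf = 0.0
    for i, w in enumerate(weights):
        cdf += w / total
        if u < cdf:
            return i, w / total
    # Floating-point guard: fall back to the last lobe with positive weight.
    last = max(i for i, w in enumerate(weights) if w > 0.0)
    return last, weights[last] / total

# Hypothetical glass interface: 4% reflection, 96% refraction.
lobe, pmf = select_lobe([0.04, 0.96], 0.5)
# Dividing the lobe's weight by pmf keeps the estimator unbiased no matter
# which distribution was used for selection.
throughput_scale = 0.96 / pmf
```

With weight-proportional selection the throughput scale is constant across lobes, which is why this is the common default. It also means that a weak lobe such as the reflection off a water surface receives few samples even when it dominates the visible caustics.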

9. Conclusion and Future Work

In this paper, we have introduced a novel photon guiding method designed for production rendering systems, focusing on efficiently rendering caustics. Our approach combines the use of a global 3D Gaussian mixture for emission guiding and an adaptive light source sampler, both of which contribute to achieving significantly higher photon density compared to existing photon guiding methods. The cornerstone of our approach is the innovative utilization of 3D Gaussian distributions, from which we have derived a novel directional distribution.

We consider 3D Gaussians to not only serve as the foundation of our method, but also to hold immense potential for a wide range of applications in computer graphics research. For example, the versatility of 3D Gaussians in connecting local and global distributions is also promising in a broader path guiding context, such as modeling indirect lighting and view-dependent visual effects. We will explore such directions in our future work.

References

Abramowitz and Stegun (1964) Milton Abramowitz and Irene A. Stegun. 1964. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York.
Burley et al. (2018) Brent Burley, David Adler, Matt Jen-Yuan Chiang, Hank Driskill, Ralf Habel, Patrick Kelly, Peter Kutz, Yining Karl Li, and Daniel Teece. 2018. The Design and Evolution of Disney’s Hyperion Renderer. ACM Trans. Graph. 37, 3, Article 33 (jul 2018), 22 pages. https://doi.org/10.1145/3182159
Dodik et al. (2022) Ana Dodik, Marios Papas, Cengiz Öztireli, and Thomas Müller. 2022. Path Guiding Using Spatio‐Directional Mixture Models. Computer Graphics Forum (2022). https://doi.org/10.1111/cgf.14428
Droske et al. (2023) Marc Droske, Johannes Hanika, Jiří Vorba, Andrea Weidlich, and Manuele Sabbadin. 2023. Path Tracing in Production: The Path of Water. In ACM SIGGRAPH 2023 Courses (SIGGRAPH ’23). Association for Computing Machinery, New York, NY, USA, Article 12, 66 pages. https://doi.org/10.1145/3587423.3595519
Georgiev et al. (2012) Iliyan Georgiev, Jaroslav Křivánek, Tomáš Davidovič, and Philipp Slusallek. 2012. Light Transport Simulation with Vertex Connection and Merging. ACM Trans. Graph. 31, 6, Article 192 (nov 2012), 10 pages. https://doi.org/10.1145/2366145.2366211
Grittmann et al. (2018) Pascal Grittmann, Arsène Pérard-Gayot, Philipp Slusallek, and Jaroslav Křivánek. 2018. Efficient Caustic Rendering with Lightweight Photon Mapping. Computer Graphics Forum 37, 4 (2018), 133–142. https://doi.org/10.1111/cgf.13481
Hachisuka and Jensen (2009) Toshiya Hachisuka and Henrik Wann Jensen. 2009. Stochastic Progressive Photon Mapping. In ACM SIGGRAPH Asia 2009 Papers (Yokohama, Japan) (SIGGRAPH Asia ’09). Association for Computing Machinery, New York, NY, USA, Article 141, 8 pages. https://doi.org/10.1145/1661412.1618487
Hachisuka and Jensen (2011) Toshiya Hachisuka and Henrik Wann Jensen. 2011. Robust Adaptive Photon Tracing Using Photon Path Visibility. ACM Trans. Graph. 30, 5, Article 114 (oct 2011), 11 pages. https://doi.org/10.1145/2019627.2019633
Hachisuka et al. (2008) Toshiya Hachisuka, Shinji Ogaki, and Henrik Wann Jensen. 2008. Progressive photon mapping. ACM Trans. Graph. 27, 5, Article 130 (dec 2008), 8 pages. https://doi.org/10.1145/1409060.1409083
Huang et al. (2023) Jiawei Huang, Akito Iizuka, Hajime Tanaka, Taku Komura, and Yoshifumi Kitamura. 2023. Online Neural Path Guiding with Normalized Anisotropic Spherical Gaussians. arXiv:2303.08064 [cs.CV]
Jensen (1995) Henrik Wann Jensen. 1995. Importance Driven Path Tracing using the Photon Map. In Rendering Techniques ’95, Patrick M. Hanrahan and Werner Purgathofer (Eds.). Springer Vienna, Vienna, 326–335.
Jensen (1996) Henrik Wann Jensen. 1996. Global illumination using photon maps. In Rendering Techniques’ 96: Proceedings of the Eurographics Workshop in Porto, Portugal, June 17–19, 1996 7. Springer, 21–30.
Kelemen et al. (2002) Csaba Kelemen, László Szirmay-Kalos, György Antal, and Ferenc Csonka. 2002. A Simple and Robust Mutation Strategy for the Metropolis Light Transport Algorithm. Computer Graphics Forum 21, 3 (2002), 531–540. https://doi.org/10.1111/1467-8659.t01-1-00703
Kingma and Ba (2014) Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
Misso et al. (2022) Zackary Misso, Benedikt Bitterli, Iliyan Georgiev, and Wojciech Jarosz. 2022. Unbiased and Consistent Rendering Using Biased Estimators. ACM Trans. Graph. 41, 4, Article 48 (jul 2022), 13 pages. https://doi.org/10.1145/3528223.3530160
Müller et al. (2017) Thomas Müller, Markus Gross, and Jan Novák. 2017. Practical Path Guiding for Efficient Light-Transport Simulation. Comput. Graph. Forum 36, 4 (jul 2017), 91–100. https://doi.org/10.1111/cgf.13227
Müller et al. (2019) Thomas Müller, Brian Mcwilliams, Fabrice Rousselle, Markus Gross, and Jan Novák. 2019. Neural Importance Sampling. ACM Trans. Graph. 38, 5, Article 145 (oct 2019), 19 pages. https://doi.org/10.1145/3341156
Oren and Nayar (1994) Michael Oren and Shree K. Nayar. 1994. Generalization of Lambert’s reflectance model. In Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’94). Association for Computing Machinery, New York, NY, USA, 239–246. https://doi.org/10.1145/192161.192213
Rath et al. (2020) Alexander Rath, Pascal Grittmann, Sebastian Herholz, Petr Vévoda, Philipp Slusallek, and Jaroslav Křivánek. 2020. Variance-Aware Path Guiding. ACM Trans. Graph. 39, 4, Article 151 (jul 2020), 12 pages. https://doi.org/10.1145/3386569.3392441
Ruppert et al. (2020) Lukas Ruppert, Sebastian Herholz, and Hendrik P. A. Lensch. 2020. Robust Fitting of Parallax-Aware Mixtures for Path Guiding. ACM Trans. Graph. 39, 4, Article 147 (aug 2020), 15 pages. https://doi.org/10.1145/3386569.3392421
Trowbridge and Reitz (1975) T. S. Trowbridge and K. P. Reitz. 1975. Average irregularity representation of a rough surface for ray reflection. J. Opt. Soc. Am. 65, 5 (May 1975), 531–536. https://doi.org/10.1364/JOSA.65.000531
Veach and Guibas (1997) Eric Veach and Leonidas J. Guibas. 1997. Metropolis light transport. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’97). ACM Press/Addison-Wesley Publishing Co., USA, 65–76. https://doi.org/10.1145/258734.258775
Vorba et al. (2014) Jiří Vorba, Ondřej Karlík, Martin Šik, Tobias Ritschel, and Jaroslav Křivánek. 2014. On-Line Learning of Parametric Mixture Models for Light Transport Simulation. ACM Trans. Graph. 33, 4, Article 101 (jul 2014), 11 pages. https://doi.org/10.1145/2601097.2601203
Walter et al. (2005) Bruce Walter, Sebastian Fernandez, Adam Arbree, Kavita Bala, Michael Donikian, and Donald P. Greenberg. 2005. Lightcuts: a scalable approach to illumination. ACM Trans. Graph. 24, 3 (jul 2005), 1098–1107. https://doi.org/10.1145/1073204.1073318
Wang et al. (2021) Yu-Chen Wang, Yu-Ting Wu, Tzu-Mao Li, and Yung-Yu Chuang. 2021. Learning to Cluster for Rendering with Many Lights. ACM Trans. Graph. 40, 6, Article 277 (dec 2021), 10 pages. https://doi.org/10.1145/3478513.3480561
Yuksel (2020) Cem Yuksel. 2020. Stochastic Lightcuts for Sampling Many Lights. IEEE Transactions on Visualization and Computer Graphics (2020), 11 pages. https://doi.org/10.1109/TVCG.2020.3001271
Zheng et al. (2022) Shaokun Zheng, Zhiqian Zhou, Xin Chen, Difei Yan, Chuyan Zhang, Yuefeng Geng, Yan Gu, and Kun Xu. 2022. LuisaRender: A High-Performance Rendering Framework with Layered and Unified Interfaces on Stream Architectures. ACM Trans. Graph. 41, 6, Article 232 (nov 2022), 19 pages. https://doi.org/10.1145/3550454.3555463
Zhu et al. (2020) Shilin Zhu, Zexiang Xu, Henrik Wann Jensen, Hao Su, and Ravi Ramamoorthi. 2020. Deep Kernel Density Estimation for Photon Mapping. Computer Graphics Forum 39, 4 (2020), 35–45. https://doi.org/10.1111/cgf.14052
Šik and Křivánek (2019) Martin Šik and Jaroslav Křivánek. 2019. Implementing One-Click Caustics in Corona Renderer. In Eurographics Symposium on Rendering - DL-only and Industry Track, Tamy Boubekeur and Pradeep Sen (Eds.). The Eurographics Association. https://doi.org/10.2312/sr.20191221

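| Scene | Metric | Reference | Full | Gaussians | Light Sampler | Uniform |
| --- | --- | --- | --- | --- | --- | --- |
| (a) Tabletop | SSIM | | 0.9842 | 0.9792 | 0.9816 | 0.9743 |
| | MSE $\times 10^{5}$ | | 7.41 | 9.96 | 8.73 | 12.5 |
| | Sample count | | 712 | 716 | 941 | 953 |
| (b) Pool | SSIM | | 0.9583 | 0.9579 | 0.9407 | 0.9419 |
| | MSE $\times 10^{4}$ | | 4.18 | 4.22 | 10.7 | 10.6 |
| | Sample count | | 737 | 757 | 1165 | 1168 |
| (c) Ring | SSIM | | 0.9587 | 0.9577 | 0.9421 | 0.9416 |
| | MSE $\times 10^{3}$ | | 2.33 | 2.39 | 4.47 | 5.10 |
| | Sample count | | 718 | 840 | 1028 | 1046 |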

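Figure 7 (metrics only; rendered images omitted).

| Scene | Metric | Reference | PT-64K | G3D (Ours) | H2D | MLT | PM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (a) Shelf | SSIM | | (0.9908) | 0.9820 | 0.9703 | 0.9771 | 0.9697 |
| | MSE $\times 10^{4}$ | | (1.32) | 2.18 | 7.92 | 3.56 | 8.29 |
| | Time | | | 49s | 65s | 116s | 37s |
| (b) Torus | SSIM | | (0.9967) | 0.9906 | - | 0.9901 | 0.9797 |
| | MSE $\times 10^{5}$ | | (2.49) | 8.08 | - | 8.52 | 29.2 |
| | Time | | | 57s | - | 113s | 34s |
| (c) Pool | SSIM | | (0.9462) | 0.9755 | 0.9674 | 0.9783 | 0.9642 |
| | MSE $\times 10^{4}$ | | (10.9) | 5.70 | 7.80 | 5.49 | 11.8 |
| | Time | | | 82s | 1s | 132s | 1s |
| (d) Tabletop | SSIM | | (0.9763) | 0.9810 | 0.9700 | 0.9794 | 0.9524 |
| | MSE $\times 10^{4}$ | | (3.19) | 1.88 | 3.01 | 2.56 | 5.25 |
| | Time | | | 87s | 92s | 175s | 65s |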