Physically based rendering is widely used for its ability to create compelling, photorealistic images. Generating these images requires evaluating a multidimensional integral, which is typically estimated through Monte Carlo (MC) sampling of the scene's light transport. However, these samples are computationally costly, and many are needed for a converged result; otherwise, objectionable noise remains in the final image.
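Concretely, each pixel value is an integral that the renderer estimates by averaging weighted samples; in generic notation (illustrative only, not tied to any particular formulation of light transport),
\[
I \;=\; \int_{\Omega} f(x)\,\mathrm{d}x \;\approx\; \frac{1}{N}\sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)},
\]
where the samples $x_i$ are drawn from a density $p$. The estimator's standard deviation falls off only as $O(1/\sqrt{N})$, which is why so many samples are required before the noise becomes acceptable.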
To address these concerns, there has been substantial work on reducing the variance of the MC estimate. In this dissertation, we observe that this work can be categorized by where in the rendering pipeline a method is applied: pre-rendering, intra-rendering, and post-rendering. The first category refers to methods that perform preprocessing to make subsequent rendering more efficient. Intra-rendering methods provide feedback during rendering to accelerate convergence of the MC estimate. Finally, post-rendering approaches are applied after the sampling budget has been exhausted to produce images that better match the ground truth.
Previous methods in these categories are far from optimal, owing to inaccurate approximations, costly initialization, and fallible heuristics. Machine learning approaches, on the other hand, have made steady progress across countless applications in recent years thanks to their ability to model complex relationships with significantly more accuracy and robustness than their hand-crafted counterparts. Inspired by this progress, we propose to apply machine learning to all three stages of the pipeline to avoid the aforementioned pitfalls of prior work in MC variance reduction.
For pre-rendering, we introduce the first deep network for prefiltering that can accurately capture the appearance of large environments with complex geometry and materials for level of detail. Our approach forgoes traditional geometric simplification and volumetric modeling, and it robustly handles a wide range of cases.
Within the context of intra-rendering, previous approaches to importance sampling are costly: they require many samples and, in some cases, additional per-scene training before they can effectively guide samples toward paths with higher light contributions. We demonstrate the first offline deep network that can guide sampling for nearly the entire rendering process of an arbitrary scene.
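To make the role of guided sampling concrete, the sketch below shows generic importance-sampled MC integration in Python; `contribution`, `learned_sample`, and `learned_pdf` are hypothetical stand-ins for a path's light contribution and a learned guiding distribution, not our actual interface:

```python
import numpy as np
from math import gamma

# Toy integrand standing in for a path's light contribution
# (hypothetical; the real quantity comes from the scene's light transport).
def contribution(x):
    return np.exp(-8.0 * (x - 0.7) ** 2)  # peaked "bright" region near x = 0.7

# Stand-in for a learned guiding distribution on (0, 1): a Beta(8, 4)
# density we can both sample from and evaluate (purely illustrative).
A, B = 8.0, 4.0
NORM = gamma(A) * gamma(B) / gamma(A + B)  # Beta normalization constant

rng = np.random.default_rng(0)

def learned_sample(n):
    return rng.beta(A, B, size=n)          # draws concentrated near x ~ 0.7

def learned_pdf(x):
    return x ** (A - 1) * (1 - x) ** (B - 1) / NORM

# Unbiased importance-sampling estimate of the integral of `contribution`
# over (0, 1): average f(x_i) / p(x_i) for samples x_i ~ p.
N = 100_000
x = learned_sample(N)
estimate = np.mean(contribution(x) / learned_pdf(x))
print(f"estimate: {estimate:.4f}")
```

Sampling roughly in proportion to where the integrand is large drives the per-sample variance of f/p toward zero, which is precisely the behavior a guiding network aims to learn during rendering.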
Finally, we bring deep learning to post-rendering MC denoising, replacing the heuristics previous methods used to set the parameters of explicit filters. Our network predicts the filter kernel around each pixel (and can even filter the image directly), and it handles general scenes and varying levels of noise without retraining.
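As a minimal sketch of the kernel-prediction idea (the array shapes, names, and softmax normalization here are assumptions for illustration, not our exact architecture), applying a predicted k-by-k kernel at every pixel amounts to a normalized weighted sum over that pixel's neighborhood:

```python
import numpy as np

def apply_predicted_kernels(noisy, kernels):
    """Filter a noisy float image with per-pixel predicted kernels.

    noisy:   (H, W, 3) noisy RGB image.
    kernels: (H, W, k, k) raw per-pixel kernel logits from a network
             (hypothetical layout; a real network's output may differ).
    """
    H, W, _ = noisy.shape
    k = kernels.shape[-1]
    r = k // 2
    # One common choice: softmax-normalize each kernel so its weights are
    # positive and sum to 1, preserving overall image brightness.
    flat = kernels.reshape(H, W, k * k)
    w = np.exp(flat - flat.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    w = w.reshape(H, W, k, k)

    # Edge-pad so every pixel has a full k x k neighborhood.
    padded = np.pad(noisy, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.zeros((H, W, 3), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += w[:, :, dy, dx, None] * padded[dy:dy + H, dx:dx + W]
    return out

# Example: filter a random "image" with random kernel logits.
H, W, k = 64, 64, 21
denoised = apply_predicted_kernels(
    np.random.rand(H, W, 3), np.random.randn(H, W, k, k))
```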
In this thesis, we demonstrate how our proposed learning-based solutions offer a significant advantage over prior state-of-the-art methods in each of these cases. We also discuss avenues for future work in the research community to bring machine learning to additional applications within all components of the MC rendering pipeline.