Reconstructing and Generating the Physical World with Subtle Realism
- Wang, Zhen
- Advisor(s): Kadambi, Achuta
Abstract
This dissertation advances subtle realism in digital representations by developing methods for physiologically aware video synthesis, efficient 3D reconstruction, and temporally consistent 4D modeling. We propose a scalable framework that generates bio-realistic face videos by preserving physiological signals, addressing demographic bias in remote health sensing. In 3D reconstruction, we introduce ALTO, a method that alternates between latent topologies to achieve high-fidelity shape recovery with fast inference. Extending to 3D generation, we present a multi-view diffusion model (MVDD) that synthesizes detailed 3D shapes from multi-view depth maps, improving upon point-based generative models. Finally, we develop a framework for dynamic 4D surface reconstruction from monocular video, ensuring temporal coherence for applications such as simulation and editing. Collectively, these contributions form a cohesive progression toward realistic, scalable digital modeling of the physical world, with applications across healthcare, graphics, and AI-driven simulation. Future directions include training-free 3D reasoning, real-time dynamic modeling, and broader cross-modal generative modeling.