Analysis by Synthesis: 3D Image Parsing Using Spatial Grammar and Markov Chain Monte Carlo
- Author(s): Qi, Siyuan
- Advisor(s): Zhu, Song-Chun
- et al.
Scene understanding is a fundamental problem in computer vision research. We
address this problem in an “analysis by synthesis” fashion: we explain the observed
data (a 2D image) with a spatial grammar that generates it. The grammar describes
the underlying functional arrangement and 3D geometric structure of a scene.
Inference is carried out in a Bayesian framework. The posterior probability
combines a prior that reflects knowledge of indoor 3D scene structure encoded by
the grammar with a likelihood that evaluates the accuracy of the re-projected
image and the physical plausibility. The most reasonable explanation of the image
is the parse tree that maximizes the posterior probability; it is found by
reversible-jump Markov Chain Monte Carlo sampling.
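
The Bayesian formulation described above can be written compactly. This is a sketch in assumed notation, not the thesis's own: I stands for the observed image, pt for a candidate parse tree, and q for the proposal distribution of the sampler. The last line is the standard Metropolis-Hastings acceptance probability; for reversible-jump moves that change the dimension of the parse tree, the ratio generally also carries a Jacobian factor for the dimension-matching transform, which is omitted here.

```latex
% Assumed notation (not from the source):
%   I  = observed 2D image, pt = parse tree, q = proposal distribution.
\begin{align}
  p(\mathit{pt} \mid I) &\propto p(\mathit{pt})\, p(I \mid \mathit{pt}), \\
  \mathit{pt}^{*} &= \arg\max_{\mathit{pt}} \; p(\mathit{pt})\, p(I \mid \mathit{pt}), \\
  \alpha(\mathit{pt} \to \mathit{pt}') &= \min\!\left(1,\;
    \frac{p(\mathit{pt}' \mid I)\, q(\mathit{pt} \mid \mathit{pt}')}
         {p(\mathit{pt} \mid I)\, q(\mathit{pt}' \mid \mathit{pt})}\right).
\end{align}
```

Here p(pt) plays the role of the grammar prior and p(I | pt) the likelihood; the normalizing constant p(I) cancels in both the maximization and the acceptance ratio, so it never needs to be computed.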