Efficient analysis-by-synthesis in vision: A computational framework, behavioral tests, and comparison with neural representations
eScholarship
Open Access Publications from the University of California

Abstract

A glance at an object is often sufficient to recognize it and recover fine details of its shape and appearance, even under highly variable viewpoint and lighting conditions. How can vision be so rich, but at the same time fast? The analysis-by-synthesis approach to vision offers an account of the richness of our percepts, but it is typically considered too slow to explain perception in the brain. Here we propose a version of analysis-by-synthesis in the spirit of the Helmholtz machine (Dayan, Hinton, Neal, & Zemel, 1995) that can be implemented efficiently, by combining a generative model based on a realistic 3D computer graphics engine with a recognition model based on a deep convolutional network. The recognition model initializes inference in the generative model, which is then refined by brief runs of MCMC. We test this approach in the domain of face recognition and show that it meets several challenging desiderata: it can reconstruct the approximate shape and texture of a novel face from a single view, at a level indistinguishable to humans; it accounts quantitatively for human behavior in “hard” recognition tasks that foil conventional machine systems; and it qualitatively matches neural responses in a network of face-selective brain areas. Comparisons with other models provide insight into the success of our approach.
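The inference scheme sketched in the abstract can be illustrated with a minimal toy example. The sketch below is a hypothetical stand-in, not the paper's implementation: a linear-plus-tanh map plays the role of the 3D graphics engine, a least-squares inverse plays the role of the trained deep convolutional recognition network, and a brief Metropolis-Hastings run refines the recognition model's initial guess. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "graphics engine": deterministic render from an 8-d latent z to a 64-d "image".
W = rng.normal(size=(64, 8))

def render(z):
    return np.tanh(W @ z)

def recognize(x):
    """Stand-in "recognition model": a least-squares inverse of the renderer.
    In the paper's framework this is a trained deep convolutional network."""
    y = np.arctanh(np.clip(x, -0.999, 0.999))
    return np.linalg.lstsq(W, y, rcond=None)[0]

def log_likelihood(z, x, sigma=0.1):
    r = x - render(z)
    return -0.5 * np.sum(r * r) / sigma**2

def refine_mcmc(z0, x, steps=200, step_size=0.05):
    """Brief Metropolis-Hastings run, started from the recognition guess."""
    z, ll = z0.copy(), log_likelihood(z0, x)
    for _ in range(steps):
        zp = z + step_size * rng.normal(size=z.shape)
        llp = log_likelihood(zp, x)
        if np.log(rng.uniform()) < llp - ll:  # accept/reject
            z, ll = zp, llp
    return z

# Observed "image" generated from an unknown latent, plus sensor noise.
z_true = rng.normal(size=8)
x_obs = render(z_true) + 0.05 * rng.normal(size=64)

z_init = recognize(x_obs)            # fast bottom-up initialization
z_post = refine_mcmc(z_init, x_obs)  # brief top-down MCMC refinement
```

Because the recognition step lands near a posterior mode, only a short MCMC run is needed, which is the source of the efficiency claim: the generative model supplies rich reconstructions while the recognition model keeps inference fast.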
