Energy-based models are a powerful and flexible tool for studying emergent properties in systems with many interacting components. Energy functions of complex systems are highly non-convex because the local modes encode the rich variety of probable phenomena.
This work focuses on energy models of image signals, which are treated as complex systems of interacting pixels that contribute to the emergent appearance of the whole image. By adapting classical Maximum Likelihood (ML) learning to the parametric family of ConvNet functions, one can learn energy-based models capable of realistic image synthesis. The observed behaviors of ML learning with ConvNet potentials are surprising and not well understood, despite the widespread use of such potentials in recent studies. This work rigorously diagnoses the learning process to correct systematic problems with the steady-state distributions of learned potentials, while revealing an unexpected non-convergent outcome of learning.
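For orientation, the ML learning setup referred to above can be written in its standard textbook form. This is a sketch under assumptions, not the formulation of this work: the symbols $U(x;\theta)$ (a ConvNet potential with weights $\theta$), $Z(\theta)$ (the normalizing constant), and $q$ (the data distribution) are notation introduced here for illustration.

```latex
% Gibbs density defined by a potential U with parameters \theta (assumed notation)
p_\theta(x) = \frac{1}{Z(\theta)} \exp\{-U(x;\theta)\},
\qquad
Z(\theta) = \int \exp\{-U(x;\theta)\}\, dx

% Negative log-likelihood under the data distribution q
\mathcal{L}(\theta) = \mathbb{E}_{q}\!\left[-\log p_\theta(X)\right]
                    = \mathbb{E}_{q}\!\left[U(X;\theta)\right] + \log Z(\theta)

% Its gradient: a data expectation minus a model expectation
\nabla_\theta \mathcal{L}(\theta)
  = \mathbb{E}_{q}\!\left[\nabla_\theta U(X;\theta)\right]
  - \mathbb{E}_{p_\theta}\!\left[\nabla_\theta U(X;\theta)\right]
```

The second expectation is taken under the model itself and is generally intractable, so in practice it is approximated with samples drawn by MCMC. This is why the quality of the sampler's steady-state distribution bears directly on what the learned potential represents.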
The true value of an energy-based model lies in the structure of its energy landscape. The energy barriers of the landscape encode explicit quantitative relations between local modes. Concepts in the training data form macroscopic energy basins that contain many shallow local modes. Powered by a new Markov Chain Monte Carlo algorithm that efficiently detects macroscopic energy structures, energy landscape mapping experiments are conducted to discover meaningful image concepts. The mapping framework is a powerful tool for unsupervised clustering based on geodesic distances in a learned potential.