Perceptual Scale-Space and Its Application
- Author(s): Wang, Yizhou
- Bahrami, Siavosh
- Zhu, Song-Chun
- et al.
When an image is viewed at decreasing resolutions in a Gaussian pyramid, information is lost gradually. Amid continuous intensity changes across scales, there are "quantum jumps" or "perceptual transitions" in our inner representation. In this paper, we study a representational paradigm called the perceptual scale-space, which augments the Gaussian pyramid of traditional image scale-space theory by constructing a so-called sketch pyramid. Each level of the sketch pyramid is a generic attribute graph, called a primal sketch, which is inferred from the image at the same level of the Gaussian pyramid by Bayesian inference using a generative image model. Perceptual jumps are then represented by structural changes in the primal sketch, expressed as graph operators such as death-birth and split-merge of vertices and edges in the generic attribute graph. In a training stage, we ask seven human subjects to label transitions of graphs over scales for a set of images. We learn the most frequent atomic graph operators, composite operators, and thresholds of transitions across human subjects. This information is then used in a generative model for inferring a sketch pyramid up and down a Gaussian pyramid. In experiments, we show that the sketch pyramid is a more parsimonious representation than a multi-resolution wavelet transform, and we demonstrate an application to adaptive image display: showing a large image on a small screen (e.g., a PDA) through a selective tour of its image pyramid. In this application, the sketch pyramid provides a means of calculating the information gain from zooming in on different areas of an image by counting the number of operators that expand the primal sketches, so that the maximum information is displayed in a given number of frames. We argue that the perceptual scale-space enriches conventional scale-space theory and provides an important representation for many vision applications, such as super-resolution and multi-scale object recognition.
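As a point of reference for the Gaussian pyramid that the sketch pyramid augments, the following is a minimal sketch of pyramid construction only. It does not implement primal sketch inference or the graph operators. The 5-tap binomial kernel, the reflected borders, the factor-of-two subsampling, and the function names `blur` and `gaussian_pyramid` are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

# Assumed smoothing kernel: a 5-tap binomial approximation to a Gaussian.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(image):
    """Separable binomial blur with reflected borders (output has the input shape)."""
    height, width = image.shape
    padded = np.pad(image, 2, mode="reflect")
    # Filter horizontally, then vertically.
    rows = sum(w * padded[:, i:i + width] for i, w in enumerate(KERNEL))
    return sum(w * rows[i:i + height, :] for i, w in enumerate(KERNEL))


def gaussian_pyramid(image, levels):
    """Return a list of images; each level is the blurred, 2x-subsampled previous level."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    for level, img in enumerate(gaussian_pyramid(image, 4)):
        print(level, img.shape)
```

Under the paradigm described in the abstract, a primal sketch graph would be inferred at each of these levels, and the differences between adjacent graphs would be expressed as graph operators.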