UC San Diego
Applications of visual saliency to video processing
- Author(s): Jacobson, Natan Haim, et al.
Our understanding of the human visual system has advanced significantly over the past quarter-century. With the availability of modern computers and the development of sophisticated algorithms, it is now possible to efficiently predict human attention patterns for images and video. A saliency map, which measures how important each portion of a scene is to the human visual system, can be generated easily. A region with a high saliency value is more likely to be fixated upon by a human than a region with a low saliency value. In this work, we explore the application of saliency to video processing. In our first project, saliency is applied to Frame Rate Up-Conversion. By restricting motion vector refinement to salient regions, we reduce processing time while maintaining a high level of visual quality for the up-converted video sequence. In our second project, we propose a new method for saliency detection which accounts for object scale using a scale-space model. Excellent results are demonstrated, including improved performance of our saliency-based Frame Rate Up-Conversion algorithm. Finally, an experiment is conducted on the salient power of the stereoscopic depth feature using two different datasets. While local contrasts in luminance, color, orientation and motion are known to be highly salient, less is understood about local contrasts in depth. Using a mirror stereoscope for 3D display to subjects and an eye-tracking system, we measure human fixations for 2D (no depth) and 3D scenes. We determine that contrast in stereoscopic depth repels human fixations for natural scenes, while attracting them for synthetic scenes. This conflict may arise from different stages of human attention (bottom-up vs. top-down), activated by the different scene content in the two datasets.
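The idea of refining motion vectors only in salient regions can be illustrated with a minimal sketch. This is not the algorithm from the dissertation. The function name `salient_block_mask`, the 16-pixel block size, and the 0.5 threshold are all assumptions chosen for illustration. The sketch averages a saliency map over fixed-size blocks and marks the blocks whose mean saliency exceeds the threshold as candidates for refinement.

```python
def salient_block_mask(saliency, block=16, threshold=0.5):
    """Mark blocks whose mean saliency exceeds the threshold.

    `saliency` is a 2D list of values in [0, 1]. The block size and
    threshold are illustrative assumptions, not values from the source.
    """
    rows = len(saliency) // block
    cols = len(saliency[0]) // block
    mask = []
    for by in range(rows):
        row = []
        for bx in range(cols):
            # Sum the saliency values inside this block.
            total = 0.0
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += saliency[y][x]
            # Compare the block's mean saliency against the threshold.
            row.append(total / (block * block) > threshold)
        mask.append(row)
    return mask


# Toy 64x64 saliency map with a single fully salient 16x16 region.
saliency = [[0.0] * 64 for _ in range(64)]
for y in range(16, 32):
    for x in range(32, 48):
        saliency[y][x] = 1.0

mask = salient_block_mask(saliency)

# Only the flagged blocks would be passed to a (hypothetical) refinement step.
to_refine = [(by, bx) for by, row in enumerate(mask)
             for bx, flag in enumerate(row) if flag]
```

In this toy case only one of the sixteen blocks is flagged, so a refinement step gated on `mask` would skip the rest of the frame. That is the kind of processing-time saving the abstract describes.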