Scene Representations for Video Compression
- Author(s): Georgiadis, Georgios
- Advisor(s): Soatto, Stefano
- et al.
Video analysis has become a fundamental research problem following the unprecedented growth in video content consumption over the last few years. Its applications span a wide range of tasks, including video editing, indexing, classification, surveillance, and autonomous driving. Representations that capture the main properties of the scene have been investigated for these tasks, but not yet for the purpose of video compression. Current video compression systems largely ignore the underlying scene and instead model only its projection onto the video frames. The goal of this thesis is to show that, by modeling the scene, further improvements can be achieved in compressing video data.
In the first part of the thesis, it is shown how a scene can be decomposed into two types of regions, called "Visual Structures" and "Visual Textures". These regions are defined, and algorithms that allow their inference are proposed. Based on this partition, a video compression system is designed that achieves a significant improvement over current state-of-the-art methods. In addition, the definition of textures is expanded to model transformations of the domain and range of the image, and applications such as texture compression and segmentation are discussed. In the second part, extensions of the video compression system are proposed that take advantage of further modeling of the scene, such as homogeneity, occlusions, and tight boundaries of regions. Finally, further applications of scene modeling are shown in problems such as video hole-filling and independent motion detection.