Video packet loss visibility models and their application to packet prioritization
- Author(s): Lin, Ting-Lan
- et al.
In video transmission, packets can be lost for many reasons. Traditionally, the impact of a packet loss is measured by the mean squared error that the loss induces in the pixel domain. However, mean squared error does not correlate well with human perception. In this dissertation, we aim to predict how human observers respond to different video packet losses. Based on the estimated visual importance of each packet, we can insert a prioritization bit before sending the packet over a lossy network, or apply unequal channel protection to packets before transmission over a wireless channel. The models are developed from data collected in subjective tests. They predict packet loss visibility, that is, the probability that a given packet, if lost, produces a glitch that the end user observes. We discuss the development and application of both encoder-based and network-based packet loss visibility models.
We develop an encoder-based packet loss visibility model using three subjective experiment data sets that span different encoding standards (H.264 and MPEG-2), group-of-picture structures, and decoder error concealment choices. Scene cuts, camera motion, and reference distance are highly significant factors for packet loss visibility. The encoder-based model exploits factors in the pixel domain as well as reference frame information. The first application of the encoder-based model is packet prioritization for a video stream. When a network becomes congested at an intermediate router, the router can decide which packets to drop so that the visual quality of the video is minimally impacted. We run experiments comparing our perceptual-quality-based packet prioritization approach with the existing Drop-Tail and cumulative-MSE-based prioritization methods. The results show that our prioritization method produces videos of higher perceptual quality across different network conditions and group-of-picture structures.
The second application of the encoder-based model is unequal error protection. For an AWGN channel, we aim to minimize the end-to-end video quality degradation using rate-compatible punctured convolutional codes under a given channel rate budget. We solve the resulting integer programming problem using the branch-and-bound method, K-means clustering, and the subgradient method. We also exploit the advantage of not sending or not coding packets of lower importance. The algorithm is compared to an existing method.
To reduce the computational complexity of the encoder-based model so that a model can be implemented at the router, we develop a network-based model that uses only information within a single packet to predict the importance of that packet, requiring neither frame-level reconstruction nor any information on the reference frame. We conduct subjective experiments on visual quality following packet loss at SDTV and HDTV resolutions. We design models for both resolutions and discuss how the important factors differ between the SDTV and HDTV models. We then use the model to measure the visual importance of packets arriving at the router. During network congestion, we drop the least visible frames and/or the least visible packets until the required bit reduction rate is achieved. Our algorithm performs better than dropping B packets or B frames. This approach estimates frame importance by summing the visibility of all slices in a frame, which is an indirect measure. Therefore, we also conduct subjective experiments that collect responses from human observers directly on whole-frame losses, and we develop a model that predicts the visibility of whole-frame losses for B frames. This model could be useful for designing an intelligent frame-dropping approach for use at a router during congestion.
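As an illustration only, the router-side policy described in the abstract (drop the least visible packets until a required bit reduction rate is reached, and estimate frame importance by summing slice visibilities) can be sketched in Python as follows. The `Packet` class, its field names, and both function names are hypothetical and are not taken from the dissertation. The visibility values are assumed to come from some already-fitted visibility model, and the sketch covers packet-level dropping only, not the combined frame and packet dropping that the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet record; field names are assumptions for this sketch."""
    frame_id: int      # frame the slice/packet belongs to
    size_bits: int     # packet size in bits
    visibility: float  # predicted probability that losing this packet is visible


def frame_importance(packets):
    """Indirect frame importance: sum of the visibility of all slices in each frame."""
    totals = {}
    for p in packets:
        totals[p.frame_id] = totals.get(p.frame_id, 0.0) + p.visibility
    return totals


def drop_least_visible(packets, reduction_rate):
    """Drop packets in order of increasing visibility until at least
    `reduction_rate` (a fraction in [0, 1]) of the total bits is removed.

    Returns (kept, dropped), each in the original arrival order.
    """
    if not 0.0 <= reduction_rate <= 1.0:
        raise ValueError("reduction_rate must be in [0, 1]")

    target_bits = reduction_rate * sum(p.size_bits for p in packets)
    # Indices sorted from least visible to most visible.
    order = sorted(range(len(packets)), key=lambda i: packets[i].visibility)

    dropped_idx = set()
    removed_bits = 0
    for i in order:
        if removed_bits >= target_bits:
            break
        dropped_idx.add(i)
        removed_bits += packets[i].size_bits

    kept = [p for i, p in enumerate(packets) if i not in dropped_idx]
    dropped = [p for i, p in enumerate(packets) if i in dropped_idx]
    return kept, dropped


if __name__ == "__main__":
    stream = [
        Packet(frame_id=0, size_bits=1000, visibility=0.90),
        Packet(frame_id=0, size_bits=1000, visibility=0.10),
        Packet(frame_id=1, size_bits=1000, visibility=0.05),
        Packet(frame_id=1, size_bits=1000, visibility=0.40),
    ]
    kept, dropped = drop_least_visible(stream, reduction_rate=0.5)
    print("kept visibilities:", [p.visibility for p in kept])
    print("dropped visibilities:", [p.visibility for p in dropped])
    print("frame importance:", frame_importance(stream))
```

A greedy sort by visibility is the simplest reading of "drop the least visible packets"; it ignores packet size when ranking, so a policy that weighs visibility against bits saved could behave differently.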