Learning Ranking Functions for Video Search on the Web
- Author(s): Lam, Antony Ming
- Advisor(s): Shelton, Christian R
- Roy-Chowdhury, Amit K
- et al.
Videos on the Internet have become widespread. However, search engines are still mostly limited to using associated text data to find desired content. In this dissertation, we build ranking functions that directly analyze image and video content and rank the items in a database with respect to user queries.
A common approach to building ranking functions is to use a machine learning algorithm to perform a priori training of class concepts and then use the trained classifier as the ranking function. However, a priori training of class concepts for retrieval is daunting, since user queries can be very diverse. In addition, a priori training cannot capture the subjective component of user queries. For example, if a user were searching for videos of "nice basketball shots," a classifier trained in advance would have no way of knowing what that user considers "nice." Relevance feedback (RF) is an interactive search framework that captures user subjectivity and supports on-the-fly learning of target classes.
However, RF is limited by its need for large amounts of user feedback when the data being searched are complex (e.g., Internet content). Transfer learning (TL) is a machine learning formulation in which existing knowledge about a related "source" classification task is used to improve the generalization performance of a "target" task for which training data are scarce. In this dissertation, we explore the combination of RF and TL and present a framework that can learn more from the user with less feedback. We report extensive experiments on real-world data taken from the Internet and show improved performance over past RF frameworks.
Although our RF and TL framework is effective for a wide range of queries, we acknowledge that some highly specific but common user queries would benefit from a ranking function designed specifically for them. For example, finding particular people using face recognition is an important type of query on the Internet. The problem in this case is well defined and objective, and although it is specific, it is important enough to warrant a dedicated ranking function. We therefore complete our studies in this dissertation by exploring a robust ranking function based on face recognition, and we show strong results on a challenging face identity retrieval task.