Learning Application-Oriented Classifiers for Multi-frame Visual Recognition
- Pal, Anwesan
- Advisor(s): Christensen, Henrik Iskov
Abstract
Classification, a supervised learning problem, is a technique to categorize a given set of data points into two (binary classification) or more (multi-class classification) targets or labels, based on their characteristics. It has wide applications in the domain of Computer Vision and Robotics. Despite the tremendous success of classification in recent times, a fundamental question remains as to whether a classifier truly learns the correct representation (i.e., the ground-truth representation) needed to solve a problem. The uncertainty in learning often limits generalization of such methods across different domains. This thesis presents an in-depth study of how different classifiers leverage the intrinsic properties of a dataset in order to robustly model the data distribution. The study further extends to the application of classifiers in two challenging domains: Visual Semantic Place Categorization and Human Activity Recognition.
The first part of the thesis focuses on the concept of bias in action recognition videos. This is illustrated by the fact that, given just a single frame of a video, some actions can be easily identified due to the presence of certain objects and the background. The second part generalizes classification into the domain of place categorization, where the focus is to learn semantic information about a robot's environment without any human guidance. This is done through a deep fusion of the scene and object attributes. Finally, an algorithm is proposed for human action recognition that incorporates the information learnt from optical flow and a visual attention mechanism in order to perform classification.