Learning with Limited Supervision for Static and Dynamic Tasks
- Author(s): Paul, Sujoy
- Advisor(s): Roy-Chowdhury, Amit K, et al.
The recent successes in computer vision have been driven largely by training recognition models on huge corpora of intricately labeled data. In real-world settings, however, acquiring such large datasets requires extensive manual annotation, which can be strenuous, beyond budget, or prone to errors. In contrast, large volumes of data generated daily can be acquired at little to no annotation cost. Such data may be unlabeled or carry only tag/meta-data information, termed weak annotation. Our goal is to develop methods that learn recognition models from such data with limited manual supervision. In this thesis, we explore two dimensions of learning with limited supervision: first, reducing the number of manually labeled data points required to learn recognition models, and second, reducing the level of supervision from strong to weak labels, which can be mined from the web, easily queried from an oracle, or imposed as rule-based labels derived from domain knowledge.
In the first dimension of learning with limited supervision, we show that context information, often present in natural data, can be used to reduce the number of annotations required. We take an information-theoretic approach that considers the relationships among data points when selecting them for labeling, unlike prior work that uses only the uncertainty of individual samples. In the second dimension, i.e., reducing the level of supervision, we use weak labels instead of dense strong labels to learn dense prediction tasks. We develop frameworks that learn from weak labels for action detection in videos and for domain adaptation of semantic segmentation models on images. In action detection, instead of the frame-wise annotations used in the literature, we use only video-level annotations, which are much easier to obtain from an annotator and can also be mined from the web. In domain adaptation of semantic segmentation models, we use weak image-level labels in two forms: pseudo weak labels, estimated using the source segmentation model at no annotation cost, and oracle weak labels, obtained from a human annotator at very low cost. Despite using such weak labels, our methods perform close to frameworks trained with strong supervision.
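The context-aware selection idea above can be sketched as a greedy procedure that scores each unlabeled sample by its predictive entropy and discounts samples that are strongly related to ones already chosen. This is only an illustrative sketch of the information-theoretic intuition; the scoring rule and function names here are assumptions, not the thesis's actual formulation.

```python
import numpy as np

def entropy(p):
    # Shannon entropy (in nats) of a predicted class distribution
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def select_batch(probs, similarity, k):
    """Greedily pick k samples to label: prefer uncertain samples,
    but penalize candidates similar (related) to already-chosen ones.
    probs: (n, c) predicted class probabilities per unlabeled sample.
    similarity: (n, n) pairwise relationship strength in [0, 1]."""
    n = probs.shape[0]
    chosen, remaining = [], set(range(n))
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in remaining:
            # redundancy: strongest tie to any already-selected sample
            red = max((similarity[i, j] for j in chosen), default=0.0)
            score = entropy(probs[i]) - red
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

With two near-duplicate uncertain samples and one confident sample, the procedure labels one of the duplicates and then prefers the confident sample over the redundant twin, which plain per-sample uncertainty ranking would not do.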
Continuing in the direction of learning from weak labels, we explore sequential decision-making problems, learning robotics tasks from a small set of expert human demonstrations. Traditional imitation learning methods require many demonstrations and can at best match the expert. We devise a strategy that divides a complex task into subgoals and solves them sequentially with reinforcement learning. We learn the subgoal partitions from the human demonstrations alone, without any partition labels from the annotator, by imposing only a weak temporal-ordering constraint among the subgoals, which arises naturally in most real-world tasks. Our method solves tasks from a small number of demonstrations that other methods in the literature are unable to solve.
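The temporal-ordering constraint on subgoals can be illustrated with a simple segmentation sketch: because subgoals must occur in order, partitioning a demonstration reduces to choosing contiguous time segments. The dynamic program below splits a trajectory into k contiguous segments minimizing within-segment variance; this is a hypothetical stand-in for subgoal discovery, not the method developed in the thesis.

```python
import numpy as np

def segment_demo(states, k):
    """Partition a demo trajectory into k contiguous segments
    (contiguity enforces the temporal ordering of subgoals),
    minimizing total within-segment squared deviation."""
    states = np.asarray(states, dtype=float)
    n = len(states)

    def cost(i, j):
        # sum of squared deviations of states[i:j] from their mean
        seg = states[i:j]
        return float(np.sum((seg - seg.mean(axis=0)) ** 2))

    INF = float("inf")
    # dp[end][j]: best cost covering states[:end] with j segments
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    cut = [[0] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for j in range(1, k + 1):
        for end in range(j, n + 1):
            for start in range(j - 1, end):
                c = dp[start][j - 1] + cost(start, end)
                if c < dp[end][j]:
                    dp[end][j] = c
                    cut[end][j] = start
    # backtrack the segment boundaries
    bounds, end = [], n
    for j in range(k, 0, -1):
        start = cut[end][j]
        bounds.append((start, end))
        end = start
    return bounds[::-1]
```

On a trajectory that visits three distinct state plateaus, the recovered boundaries align with the transitions between plateaus, mimicking how ordered subgoal partitions could be read off a demonstration without any partition labels.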