Unsupervised Learning for Object Representations by Watching and Moving
- Author(s): Yang, Yanchao
- Advisor(s): Soatto, Stefano
The power of deep neural networks comes mainly from huge labeled datasets. Although it shines on many computer vision tasks, supervised learning offers little hope of reaching the core of intelligent visual systems. Unsupervised learning, on the other hand, is widely believed to be the future of AI, yet its performance typically lags behind that of its supervised counterpart. The goal of our research is to develop unsupervised learning algorithms for computer vision tasks that match or even outperform supervised ones. The key is a representation that is as informative as the supervisory labels but can be constructed from an unlimited amount of unlabeled data. In theory, this representation contains richer information than the processed supervisory signal. Moreover, we develop algorithms that can leverage existing labeled datasets to expedite information extraction from the unlimited unlabeled data. Our research is organized in an order mirroring visual development in early infancy, which also lets us investigate the interplay between different visual functionalities. The ultimate goal is a robotic visual system akin to a human's: one that can automatically acquire semantics from concepts of objects, fostered by basic perception of motion and depth, with minimal human supervision.