eScholarship
Open Access Publications from the University of California

Labeling Transformation and Introspective Learning with Convolutional Nets

  • Author(s): Jin, Long
  • Advisor(s): Tu, Zhuowen
  • et al.
Abstract

Convolutional neural networks have been widely used in machine learning and computer vision tasks for either discriminative purposes or generative modeling.

This thesis first presents the discriminative power of CNNs for the instance segmentation task, with a focus on the intrinsic challenge of the problem: the presence of a quotient space (swapping the labels of different instances leads to the same result). We propose a simple segmentation-based framework as well as three instance labeling transformation methods, namely pixel-based affinity mapping, superpixel-based affinity learning, and boundary-based component segmentation. Our methods require neither object proposals nor object detection, and achieve competitive results on various benchmark datasets.
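The quotient-space issue can be illustrated with a small sketch. This is not the thesis's implementation: `pairwise_affinity` is a hypothetical helper, and the six-pixel labelings are made-up toy data. It only shows why a pairwise "same instance or not" representation is unaffected by swapping instance ids, whereas the raw label maps differ.

```python
def pairwise_affinity(labels):
    """Return a matrix A where A[i][j] is 1 if pixels i and j
    carry the same instance label, and 0 otherwise."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


# Two labelings of the same six pixels (toy data).
# Instance ids 1 and 2 are swapped; the grouping of pixels is identical.
labels_a = [0, 1, 1, 2, 2, 2]
labels_b = [0, 2, 2, 1, 1, 1]

affinity_a = pairwise_affinity(labels_a)
affinity_b = pairwise_affinity(labels_b)

# The raw label maps disagree, but the affinity matrices coincide.
same_labels = labels_a == labels_b
same_affinity = affinity_a == affinity_b
print(same_labels, same_affinity)
```

A per-pixel loss on the raw labels would penalize `labels_b` even though it describes the same segmentation; a target defined on pixel pairs does not have that ambiguity.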

Secondly, this thesis presents introspective classification with convolutional neural networks, which emphasizes the importance of empowering convolutional neural networks with generative capabilities. We propose the reclassification-by-synthesis algorithm to perform training using a formulation stemming from Bayes theory. The single CNN classifier learned is at the same time generative, being able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets using state-of-the-art CNN architectures, and observe improved classification results.

Finally, this thesis applies introspective learning to unsupervised vision tasks. We develop a generative model by progressively learning a sequence of CNN classifiers. The resulting generator is also a discriminator, able to self-evaluate the difference between its generated samples and the given training data. In the experiments, we observe encouraging results on a number of applications, including face modeling, texture modeling, unsupervised feature learning, and image-to-image translation.
