eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Leveraging Label Information in Representation Learning for Multi-label Text Classification

Abstract

This thesis studies the problem of multi-label text classification and argues that the task can benefit from bringing the classification question into the language-understanding stage. Specifically, rather than limiting the use of annotated labels to supervising the classifier, we also rely on them as auxiliary information to guide the learning of an effective representation that is tailored to the downstream task. Two approaches are discussed: (a) learning a label-word attention layer that composes word embeddings into document vectors; and (b) learning a high-level latent abstraction via an auto-encoder generative model with structured priors conditioned on labels. We introduce two designs of label-enhanced representation learning, the Label-embedding Attention Model (LEAM) and the Conditional Variational Document model (CVDM), and apply them to real-world datasets to demonstrate their ability to improve classification performance together with interpretability.
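
As a rough illustration of approach (a), the sketch below pools word embeddings into a document vector using attention weights derived from label embeddings. It is a simplified stand-in, not the thesis's LEAM implementation: the function names, the cosine-similarity scoring, the max over labels, and the toy two-dimensional vectors are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_attention_pool(word_vecs, label_vecs):
    """Compose word vectors into one document vector (hypothetical helper).

    Each word is scored by its similarity to the best-matching label,
    the scores are normalized with a softmax over words, and the document
    vector is the attention-weighted sum of the word vectors.
    """
    # One relevance score per word: similarity to its closest label.
    scores = [max(cosine(w, c) for c in label_vecs) for w in word_vecs]
    # Numerically stable softmax over the words of the document.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of word vectors, one dimension at a time.
    dim = len(word_vecs[0])
    doc = [sum(wt * w[d] for wt, w in zip(weights, word_vecs)) for d in range(dim)]
    return doc, weights

# Toy data (assumed): three words, two labels, 2-d embeddings.
words = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]
doc_vec, attn = label_attention_pool(words, labels)
```

In this toy input the first two words each align with one label and receive equal weight, while the third word, which matches neither label, is down-weighted. Inspecting the per-word weights is one way such a layer can support interpretability.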
