Open Access Publications from the University of California

Learning Latent Hierarchical Structures via Probabilistic Models and Deep Learning

  • Author(s): Arabshahi, Forough
  • Advisor(s): Singh, Sameer
  • et al.

Hierarchical structures arise in many real-world applications and domains. For example, in social networks, people's relationships and the groups to which they belong form a hierarchy. In natural language and computer programs, parse trees (which have a hierarchical structure) are used to represent the compositionality of expressions. These hierarchies strongly affect the statistics and the behavior of the data. Hence, it is important to develop algorithms that take these structures into account when modeling such data. Apart from these hierarchical structures, some datasets are best explained with hierarchical models even though there is no apparent hierarchy in the data itself. For instance, when modeling the occurrence of words in a document, it is more realistic to assume that the words are drawn in a hierarchical manner from a topic distribution rather than independently from a single topic. In this dissertation, we focus on capturing these hierarchies and leveraging them for modeling high-dimensional datasets.
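The contrast between the two generative assumptions above can be sketched as follows. This is a minimal illustration in the style of a topic model (not the dissertation's specific model), with made-up toy topic-word distributions: a "flat" model draws every word from one fixed topic, while a hierarchical model first draws a topic for each word from a per-document topic distribution, then draws the word from that topic.

```python
import random

# Toy topic-word distributions; the words and probabilities are illustrative.
TOPICS = {
    "sports": {"game": 0.5, "team": 0.4, "model": 0.1},
    "ml":     {"model": 0.6, "data": 0.3, "game": 0.1},
}

def sample_word(dist, rng):
    """Draw one word from a {word: probability} distribution."""
    words, probs = zip(*dist.items())
    return rng.choices(words, weights=probs, k=1)[0]

def flat_document(topic, n_words, rng):
    """Single-topic model: every word is drawn from one fixed topic."""
    return [sample_word(TOPICS[topic], rng) for _ in range(n_words)]

def hierarchical_document(topic_weights, n_words, rng):
    """Hierarchical model: each word first draws its own topic from a
    per-document topic distribution, then a word from that topic."""
    names, weights = zip(*topic_weights.items())
    doc = []
    for _ in range(n_words):
        topic = rng.choices(names, weights=weights, k=1)[0]
        doc.append(sample_word(TOPICS[topic], rng))
    return doc

rng = random.Random(0)
print(flat_document("sports", 5, rng))
print(hierarchical_document({"sports": 0.3, "ml": 0.7}, 5, rng))
```

Under the hierarchical model, a single document can mix words from several topics, which matches real documents far better than the single-topic assumption.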

Hierarchical structures underlying the data are either observed or latent. For example, in the context of computer programs, the syntax tree is inherent to the program and is therefore observed. On the other hand, the statistical dependence of a social network's users is latent. In this dissertation, we study both types of hierarchies and develop models under both structures, because both arise in many applications and are equally important. Nevertheless, capturing latent hierarchical structures is more challenging. We develop novel probabilistic models to capture latent hierarchies and present statistically efficient and provably consistent parameter learning algorithms for them. When capturing observed hierarchical structures, we develop deep learning models that learn low-dimensional continuous representations for the discrete symbols and variables.
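The "observed" case can be made concrete with a small sketch: the syntax tree of a program is recoverable directly from its source text, so no structure needs to be inferred. The toy embedding table below is purely illustrative (it is not the dissertation's model); it stands in for the low-dimensional continuous vectors a deep learning model would learn for each discrete symbol.

```python
import ast

# Toy 2-dimensional "embeddings" for a few node types; illustrative values only.
EMBED = {"BinOp": [0.1, 0.2], "Name": [0.3, 0.1],
         "Add": [0.0, 0.5], "Mult": [0.4, 0.4]}

def describe(node, depth=0):
    """Recursively render the syntax tree, pairing each discrete node type
    with its (toy) continuous representation."""
    kind = type(node).__name__
    lines = ["  " * depth + f"{kind} -> {EMBED.get(kind, [0.0, 0.0])}"]
    for child in ast.iter_child_nodes(node):
        lines += describe(child, depth + 1)
    return lines

# The expression's hierarchical structure is observed: the parser gives it to us.
tree = ast.parse("(a + b) * c", mode="eval").body
print("\n".join(describe(tree)))
```

The nesting of the printed tree mirrors the compositionality of the expression: the multiplication node contains the addition node, which contains the leaf variables.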
