A Generative Account of Latent Abstractions
- Xie, Sirui
- Advisor(s): Terzopoulos, Demetri; Zhu, Song-Chun
Abstract
Abstractions are fundamental to human intelligence, extending far beyond pattern recognition. They enable the distillation and organization of complex information into structured knowledge, facilitate the succinct communication of intricate ideas, and empower us to navigate complex decision-making scenarios with consistent value prediction. The ability to abstract is particularly fascinating because abstractions are not inherently present in raw data --- they are latent variables underlying our observations. Despite the recent phenomenal advances in modeling data distributions, Generative Artificial Intelligence (GenAI) systems still lack robust principles for the autonomous emergence of latent abstractions.
This dissertation studies the problem of unsupervised latent abstraction learning, focusing on developing modeling, learning, and inference methods for latent-variable generative models across diverse high-dimensional data modalities. The core premise is that by incorporating algebraic, geometric, and statistical structures into the latent space and generator, we can cultivate representations of latent variables that explain observed data in alignment with human understanding.
The dissertation consists of four parts. The first three explore the generative constructs of latent abstractions for Category, Object, and Decision, respectively.
Part I examines the basic structure of categories, emphasizing their symbol-vector duality. We develop a latent-variable text model that couples symbols and vectors in its representations, and we investigate another representation that is at once discrete and continuous --- iconic symbols --- in a visual communication game.
Part II enriches the abstract structure by shifting focus to object-centric abstractions in visual data. We introduce a generative model that disentangles objects from backgrounds in the latent space. We then rethink the algebraic structure of object abstractions and propose a novel metric that measures compositionality, a more general notion than disentanglement.
Part III incorporates situational context by introducing sequential decision-making with trajectory data. Here, latent abstractions manifest as actions and plans. We bridge the theories of decision-making and generative modeling, proving that inferring latent decisions improves consistency with the model's understanding while optimizing intrinsic values.
Whereas these three parts adopt the paradigm of learning directly from raw data, Part IV opens a dialectic discussion with an alternative paradigm, Knowledge Distillation. We demonstrate how to distill from, and accelerate, state-of-the-art massive-scale data-space models by repurposing our methods and techniques for latent-variable generative modeling.
Together, the contributions of this dissertation enable GenAI systems to overcome the critical bottlenecks of alignment, efficiency, and consistency in representation, inference, and decision-making.