Exploring Generalization Inductive Biases in the Brain and Deep Neural Networks: Experimental and Computational Approaches
- Zolfaghar, Maryam
- Advisor(s): O'Reilly, Randall C.
Abstract
What enables humans to learn and generalize effortlessly across diverse tasks, an ability that even advanced deep neural networks struggle to replicate? This thesis explores this question, which is central to both neuroscience and computer science. Despite their achievements, deep neural networks require vast amounts of data to learn, sometimes more than a human lifetime's worth of experience. In contrast, humans adapt existing knowledge to novel challenges, indicating the presence of cognitive mechanisms that facilitate adaptive out-of-domain generalization. This thesis explores some of these mechanisms, including how humans learn in a way that supports the development of representations applicable across diverse contexts, how they adjust and deploy these representations based on contextual demands, how they form such representations over different time scales, especially when rapid learning is required, and how they achieve such learning without forgetting past knowledge.
The human brain has a remarkable ability to predict what will happen next and then compare these predictions to what actually happens. The differences form prediction errors. These errors serve as self-generated teaching signals that guide us in adjusting our understanding and mental models to better match reality. This process, known as deep predictive learning, helps us adapt and refine our internal representations, fostering adaptable abstract representations that capture patterns within sensory inputs and are central to out-of-domain generalization. Additionally, our brain forms abstract map-like representations (i.e., cognitive maps) that support advanced reasoning and bridge learned knowledge with novel challenges, facilitating successful problem-solving. Cognitive control mechanisms within the prefrontal cortex, known for systematic generalization, dynamically interact with these representations to meet multiple objectives. This capability is instrumental in navigating complex tasks, generalizing to unfamiliar situations, and taking on new challenges. Constructing such abstract representations demands the gradual integration of experience. However, according to the Complementary Learning Systems (CLS) framework, the brain also possesses mechanisms for rapid learning in new environments without catastrophically forgetting prior knowledge.
Generalization is a cornerstone of human intelligence, enabling us to tackle both everyday and novel challenges. This thesis enriches our understanding through a multidisciplinary approach that integrates experimental and computational techniques. It combines neural EEG and behavioral data with machine learning models to explore the predictive learning process. Additionally, it builds deep neural networks to replicate fMRI findings, with a particular emphasis on cognitive control processes associated with the generation of map-like representations. The study quantifies generalization in these networks, introduces cognitively inspired inductive biases to these models, and develops models consistent with the CLS framework for tasks requiring generalization across various time scales. By merging computational and experimental methods, this research offers insights into scenarios that are challenging to replicate with human participants, inspires the development of advanced models, and contributes to the ongoing evolution of AI systems.