eScholarship: Open Access Publications from the University of California
UC Irvine Electronic Theses and Dissertations

An Energy-based Approach to Learning and Memory in Artificial Neural Networks

No data is associated with this publication.
Creative Commons Attribution 4.0 (CC BY 4.0) license
Abstract

The standard method for training artificial neural networks (ANNs) is stochastic gradient descent (SGD) implemented by the backpropagation (BP) algorithm. Neuroscientists, however, readily agree that it is unlikely the brain uses BP, and BP/SGD has well-known performance limitations, such as slow convergence and catastrophic forgetting. This raises the question of whether there are alternative learning algorithms that better fit what is known about the brain and that may mitigate some of the performance drawbacks of BP/SGD. One class of more biologically plausible alternatives is energy-based algorithms (EBAs). Although EBAs have been used for decades, their theoretical foundations and their performance on machine learning tasks are poorly understood. This thesis, through the presentation and synthesis of three papers, aims to develop the theoretical foundations of EBAs and to test them empirically on machine learning tasks. The first paper develops a novel theoretical description of one EBA, known as predictive coding (PC). Its main claim is that predictive coding approximates a proximal algorithm, which is distinct from BP/SGD and was found to train faster than SGD, especially in online learning scenarios. The second paper describes proximal algorithms, which, unlike SGD, are sensitive to second-order information, and uses this property to develop and test a novel optimizer for PC networks that reduces memory and compute requirements and improves convergence. The third paper develops a novel EBA, called a sparse quantized Hopfield network, which performs various memory tasks and learns in online and continual settings; this model was shown to significantly outperform similar models trained with BP. The contributions of these papers are synthesized to support the broader claims that 1) EBAs are proximal algorithms sensitive to second-order information, 2) ANNs trained with EBAs are effective, principled online learners, and 3) sparse ANNs trained with EBAs can be effective, principled online-continual learners. Together, these claims suggest that EBAs could be a promising alternative to BP/SGD for training ANNs and for developing theories of how brains learn.
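
As a point of reference for the abstract's claims about proximal algorithms (a generic textbook illustration, not material from the thesis itself): for a loss $f(\theta)$ and step size $\eta$, SGD updates parameters as

$$\theta_{k+1} = \theta_k - \eta \,\nabla f(\theta_k),$$

whereas a proximal-point update solves a small optimization problem at each step,

$$\theta_{k+1} = \operatorname{prox}_{\eta f}(\theta_k) = \arg\min_{\theta} \Big\{ f(\theta) + \tfrac{1}{2\eta}\,\lVert \theta - \theta_k \rVert^{2} \Big\}.$$

For a quadratic loss with Hessian $H$ (and $I + \eta H$ invertible), the proximal step reduces to $\theta_{k+1} = \theta_k - \eta\,(I + \eta H)^{-1} \nabla f(\theta_k)$, a gradient step preconditioned by curvature, which is one way to see why proximal methods, unlike plain SGD, are sensitive to second-order information.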

This item is under embargo until February 2, 2026.