
Beyond Transformers for Function Learning

Creative Commons BY 4.0 license
Abstract

The ability to learn and predict simple functions is a key aspect of human intelligence. Recent work has begun to explore this ability using transformer architectures; however, it remains unclear whether these architectures are sufficient to recapitulate the extrapolation abilities of people in this domain. Here, we propose to address this gap by augmenting the transformer architecture with two simple inductive learning biases, directly adapted from recent models of abstract reasoning in cognitive science. The results we report demonstrate that these biases are helpful in the context of large neural network models, and they shed light on the types of inductive learning biases that may contribute to human extrapolation abilities.
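As a concrete illustration of the setup the abstract describes, the sketch below frames function learning as next-value sequence prediction with a transformer, augmented by one hypothetical inductive bias: an explicit linear-trend component blended into the network's output. This is a minimal sketch under stated assumptions, not the paper's implementation; the specific bias, module names, and hyperparameters here are illustrative choices, since the abstract does not specify which two biases the authors use.

```python
# Minimal sketch (assumptions, not the paper's method): a transformer that
# reads an observed sequence of function values and predicts the next value,
# with one illustrative inductive bias added on top.
import torch
import torch.nn as nn


class FunctionLearner(nn.Module):
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)  # embed each scalar observation
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)   # predict the next value
        # Hypothetical inductive bias: a learned gate that mixes in a simple
        # linear extrapolation computed directly from the context.
        self.trend_gate = nn.Parameter(torch.zeros(1))

    def forward(self, y):
        # y: (batch, seq_len, 1) -- observed values of an unknown function
        h = self.encoder(self.embed(y))
        pred = self.head(h[:, -1])  # transformer's guess for y_{t+1}
        # Linear extrapolation from the last two observed points.
        trend = y[:, -1, :] + (y[:, -1, :] - y[:, -2, :])
        return pred + torch.sigmoid(self.trend_gate) * trend


model = FunctionLearner()
context = torch.linspace(0, 1, 16).view(1, 16, 1)  # observations of f(x) = x
print(model(context).shape)  # torch.Size([1, 1])
```

The design point this sketch is meant to convey: the bias is not learned from scratch but wired into the architecture, so extrapolation beyond the training range can lean on the structured component even when the transformer's own predictions degrade.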
