Neural Software Abstractions

Abstract

The desire to efficiently solve problems has driven humans to create tools to accomplish more with less. To be useful in a variety of contexts, a tool must encode knowledge of how to solve a general problem, knowledge that models the system that the tool manipulates. For most of human history, tools enabled humans to better manipulate only physical systems, such as using a lever for lifting heavy objects. These tools implicitly modeled the physical system via their specialized design. The computer is significant because it was the first universal tool for modeling and manipulating any system.

Unfortunately, this universality has historically been restricted to systems that only humans can manually model and manipulate, via code. Humans have long acted as the interface between computers and the physical world, but we will increasingly become the bottleneck to progress as computers become more powerful and the world becomes more complex. If we could build machines that automatically model and manipulate systems on their own, then we would solve more problems with less effort: we would need only specify what the problem is rather than bother with how to solve it.

The problem of building machines that automatically model and manipulate systems is not new and arguably encompasses the entire field of artificial intelligence (AI). Solving such a problem implies two things: first, that the machine can represent system interactions and second, that the machine can learn such representations automatically. To represent system interactions is to represent the entities in the environment, the transformations that change the state of these entities, and the choices the agent makes to apply these transformations. To learn representations automatically is for these representations to be learned functions of the machine's raw sensorimotor stream. For such representations to be effective for automatically modeling and manipulating systems, they need to generalize over the combinatorial space of possible combinations of entities, of transformations, and of choices, a criterion that I call combinatorial generalization.

Neither of the two paradigms that have dominated AI since the mid-1900s has yet offered a complete solution to both desiderata. The symbolic paradigm offers solutions for how to represent system interactions but not for how to learn representations. Conversely, the connectionist paradigm offers solutions for how to learn representations, but generally such representations do not directly expose the entities, transformations, and choices of the underlying system interaction in question. In the last half century these two paradigms have grown into the modern disciplines of software programming and deep learning, largely retaining their original complementary strengths and weaknesses. How can we achieve the strengths of both?

One prominent class of approaches for combining both paradigms is to use neural networks for processing symbolic data or searching over symbolic code. These methods have achieved great success in natural language processing, code generation, and symbolic search, but they all assume a human-defined abstraction of the system to begin with. To actually address the problem of automatically modeling and manipulating systems, we need the machine to create these abstractions from its own sensorimotor experience. We need to combine both paradigms in a different way. What we would want instead are AI methods that can learn directly from raw data as deep learning algorithms do, with learned representations that generalize over the combinatorial space of system interactions as software does.

My central thesis is that there is a deep similarity between electronic circuits and neural networks, and that adapting the methods we invented almost a century ago for creating modular software programs on top of analog circuits can enable neural networks to exhibit generalization properties similar to those of software. I argue that separation of concerns was the key design principle that enabled representations in software to generalize, and that contextual refinement was the key technique that enabled us to implement separation of concerns at every level of the computing stack. This thesis presents various ways to instantiate contextual refinement in neural networks and shows the gains in combinatorial generalization that this technique brings.
