Stochastic Analog Computation for Machine Learning
- Author(s): Fang, Yuan Sheng
- Advisor(s): DeWeese, Michael R
- et al.
Analog computers model logical and mathematical operations by exploiting the physical properties of continuously evolving systems. Variables are often represented directly by easily measurable quantities such as voltage and current. By contrast, in typical digital computers, the representation is much more abstract and is manipulated in discrete time. Despite certain advantages of analog computers, such as their power efficiency and speed, they were eventually made obsolete by the better scalability of their digital counterparts. However, with recent advances in machine learning, specialized applications of analog computation, such as optical neural networks, are becoming more viable.
Presented here is a collection of topics relevant to analog computing for machine learning. Rather than being viewed as sets of algorithmic procedures, machine learning models can be treated as physical systems and studied accordingly. Specifically discussed are parameter estimation of the Ising spin glass, deep learning with photonic networks, and fully analog implementation of latent variable models.
For the most part, experimental physics is interested in the observation and measurement of a system under various conditions. On the other hand, much of machine learning is concerned with inferring the underlying properties of a system from known observations. This type of inference is often referred to as the inverse problem by physicists. In Chapter 1, one such concretely formalized task, parameter estimation, is used to study the Ising model and Hopfield network.
The next chapter explores certain practical problems associated with implementing neural networks with photonic components, known as optical neural networks (ONNs). Noise and imprecision are present in all analog computers and impact their performance. Accordingly, the proper characterization of ONNs requires quantifying the effects of fabrication errors and other noise on their operation. The trade-off between expressivity and robustness is explored through a comparison of two ONN architectures.
The last chapter demonstrates that, rather than being minimized, the effects of noise in analog systems can be leveraged for more efficient computation. Manipulation and analysis of probabilistic models often require the generation of continuous random variables. We demonstrate that this task can be done much more efficiently with an analog computer that has inherent variability than with a deterministic, digital computer.
A more in-depth summary of these topics is presented in the introduction. While each chapter is self-contained, the common theme is a departure from discrete, deterministic approaches to computation and a step toward continuous and stochastic dynamics. Taken as a whole, the thesis acts as a reference for further study in analog computation.