Energy-Efficient Neuromorphic Computing with CMOS-Integrated Memristive Crossbars
UC Santa Barbara Electronic Theses and Dissertations

Abstract

The von Neumann architecture, in which the central processing unit (CPU) is separated from the memory unit, has been broadly adopted in modern computing systems. During processing, data must be transferred between the memory and the CPU. For data-intensive applications such as deep neural networks, this data movement becomes a significant bottleneck to high-throughput, energy-efficient implementation as data sizes grow. In-memory computing is a paradigm that tackles this challenge by performing computation within the memory, i.e., where the data are stored. It is therefore a promising approach for energy-efficient neuromorphic systems, since it minimizes data transfer between memory and the processing units. The central component of neuromorphic circuits is a nanoscale memory device responsible for weight storage and analog computation. Resistive random-access memory (RRAM) is one of the most promising memory candidates due to its long-term retention, analog storage, low-power operation, and compact nanoscale footprint.

The first part of this thesis explores nonidealities of RRAM technology, such as temperature dependence, stuck-at faults, and tuning error, and their impact on the accuracy of neuromorphic hardware implementations. We show that these imperfections can significantly degrade the inference accuracy of neuromorphic circuits. To mitigate them, we propose a holistic approach based on hardware-aware training, with modifications made in the tuning, circuit, and ex-situ training phases of hardware development. The proposed method significantly decreases the accuracy drop across the 25–100 °C temperature range, reduces the energy consumption of the memory arrays during inference by 2.5× to 9×, and improves defect tolerance by more than 100×.

In the second part of this thesis, we study the impact of device uniformity in passive memristive circuits and the tradeoffs among computing accuracy, crossbar size, switching-threshold variations, and target precision. These nonidealities are investigated in two representative deep neural networks (DNNs), and several solutions, including hardware-aware training, an improved tuning algorithm, and switching-threshold modification, are proposed to enhance performance. These techniques allow us to implement advanced DNNs with almost no accuracy drop using state-of-the-art analog 0T1R technology.

In the last part, we focus on integrating passive and active RRAM with CMOS circuits to implement efficient demonstrations for applications such as neural networks. First, focusing on passive technology, we present the building-block circuits that facilitate the forming, programming, reading, inference, and monitoring of RRAM arrays. We discuss several neuromorphic networks and prototype demonstrations with integrated analog passive RRAM and CMOS. The designs were fabricated in two wafer-scale tapeout runs in 180 nm CMOS technology, and encouraging preliminary experimental results were obtained. Second, we demonstrate a large-scale DNN accelerator fabricated in a standard 65 nm CMOS process with integrated active analog RRAM devices. The main focus is on novelties in the design of the vector-matrix multiplication (VMM) and tuning circuits, which reduce the impact of IR drop, improve area efficiency, and enable massively parallel programming in this chip.
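As a concrete illustration of the analog computation discussed above, the sketch below models a crossbar performing a vector-matrix multiplication, where each column current is the Kirchhoff sum of Ohmic currents, I_j = Σ_i V_i · G_ij, together with two of the nonidealities studied in this thesis: conductance tuning error and stuck-at faults. This is a minimal NumPy sketch for intuition only; the function and parameter names (crossbar_vmm, g_var, stuck_mask) are illustrative assumptions, not taken from the thesis.

    import numpy as np

    rng = np.random.default_rng(0)

    def crossbar_vmm(v_in, g, g_var=0.0, stuck_mask=None, g_stuck=1e-6):
        """Analog vector-matrix multiply on a memristive crossbar.

        Each output current is the Kirchhoff sum of Ohmic currents:
            I_j = sum_i V_i * G_ij
        g_var models relative conductance tuning error (std. dev.);
        stuck_mask marks stuck-at-fault cells pinned to g_stuck (siemens).
        Names and noise model are illustrative, not from the thesis.
        """
        g_eff = g * (1.0 + g_var * rng.standard_normal(g.shape))
        if stuck_mask is not None:
            g_eff = np.where(stuck_mask, g_stuck, g_eff)
        return v_in @ g_eff  # column currents on the shared bit lines

    # Toy example: 4-input, 3-output crossbar, conductances in 1-100 uS.
    g = rng.uniform(1e-6, 100e-6, size=(4, 3))
    v = rng.uniform(0.0, 0.2, size=4)  # small read voltages
    ideal = crossbar_vmm(v, g)
    noisy = crossbar_vmm(v, g, g_var=0.05,  # 5% tuning error
                         stuck_mask=rng.random(g.shape) < 0.01)
    print("ideal currents:       ", ideal)
    print("with nonidealities:   ", noisy)

In this picture, hardware-aware (ex-situ) training amounts to injecting perturbations like these into the network's forward pass at training time, so that the learned weights remain accurate once mapped onto imperfect devices.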
