eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Architecting Non-volatile Memory for High Bandwidth Systems

Abstract

High-performance processors and emerging deep learning applications require a tremendous amount of data access. Meeting this bandwidth demand within the available energy budget is a major challenge for current memory systems. Processing-in-Memory (PIM) is a promising solution to this bandwidth bottleneck: it performs a portion of the computation inside the memory. Existing state-of-the-art PIM technologies fall into two categories: logic implemented in the sensing circuit, and cell-based logic-in-memory. We present novel designs that improve performance and energy efficiency in each of these two areas. For designs using the sensing circuit, we present a PIM architecture based on the latch-up effect of thyristors, which enables single-cycle addition (ADD) and significantly improves the performance of multiplication (MUL). This design requires no additional cell array for processing, and is therefore an excellent candidate for storage class memory, which has been considered the main application of memristor-based products. We also present a method to carry out requests on multiple bit-lines (BLs) under a MUX in parallel, greatly accelerating PIM applications. Our designs achieve 16x and 12.7x performance improvements over the state-of-the-art PIM designs, respectively.

Because cell-based PIM technology mainly targets high-density applications, we present technologies that contribute to energy efficiency and integration. Our UPIM design exploits unipolar switching memristors to reduce sneak current compared to the existing bipolar-based structure, and takes advantage of a 3D vertical crossbar array (CBA) structure to increase memory utilization per unit area for high-density applications. Compared to the state-of-the-art PIM design based on the bipolar switching mode, our design achieves 3.1x lower energy consumption and an 84% area saving. We also propose a novel design that enables fast, low-overhead computation for very long byte processing by using a multi-block parallelizing method. The proposed design achieves a 51x energy saving and a 16% area saving compared to the state-of-the-art PIM accelerators, with a superior cell efficiency of 45%.
