eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Software-Hardware Co-design for Processing In-Memory Accelerators

Abstract

The explosive growth of data in emerging applications poses grand challenges to computing systems because the bandwidth between compute and memory cannot keep pace. Processing in memory (PIM) is a promising technology that addresses this memory bottleneck by performing some key operations directly in and near memory. Several challenges remain in fully unleashing the power of PIM, on both the software and the hardware sides. On the software side, PIM requires that each operation happen where the data resides, so the optimal data layout for each application must be found before running it in PIM. On the hardware side, due to the limited functionality of PIM operations, PIM acceleration may require customized logic to achieve high performance. Software-hardware co-design therefore plays a critical role in fully exploiting PIM acceleration, but it faces challenges of its own. First, software mapping (data layout) in PIM architecture has an extremely large design space. Second, hardware customization should incur minimal overhead to maximize memory capacity. This thesis presents several novel techniques for PIM software-hardware co-design. To tackle the challenges of mapping applications to PIM architecture, it proposes a PIM data layout framework that efficiently optimizes the data layout of widely used machine learning (ML) operators on general PIM architectures. It also presents software-hardware co-design for both convolutional and non-convolutional ML models using PIM architecture. The presented optimizations provide at least 3.7× speedup over conventional PIM acceleration methods. Finally, this thesis proposes a software-hardware co-design for fully homomorphic encryption, a challenging and critical application in cryptography, resulting in 30× speedup and energy efficiency over existing PIM solutions.
The target applications in this thesis cover a wide range of challenging operations and data transfer patterns needed for various emerging computing tasks, shedding light on the software-hardware co-design for future systems with PIM acceleration.
