Architecture Supports and Optimizations for Memory-Centric Processing System
- Author(s): Gu, Peng
- Advisor(s): Xie, Yuan
For the past two decades, the scaling of main memory has lagged behind the advancement of computation in both bandwidth and capacity. First, the conventional compute-centric architecture struggles to scale memory bandwidth because of limited off-chip interconnect resources and the energy inefficiency of long-distance data movement. In addition, emerging big-data workloads demand ever-larger memory capacity, which traditional DRAM technology scaling cannot satisfy.
To address these challenges, this dissertation explores memory-centric architectures and design optimizations that deliver higher memory bandwidth and larger memory capacity. Three categories of memory-centric designs are investigated. The analog process-in-memory architecture merges computation logic inside memory arrays: it exploits the in-situ computing capability of resistive memory arrays to eliminate data movement and benefits from massive data parallelism. The digital process-near-memory architecture integrates computation units close to memory arrays: the lightweight near-memory components exploit the abundant internal bandwidth of the memory arrays while the design optimizations preserve hardware programmability. The enhanced memory design contributes a simulation framework for emerging non-volatile memory technologies, which can greatly boost memory capacity. Building on both emerging non-volatile memory and 3D-stacked memory technologies, this dissertation presents four architectures and one simulation framework, covering a wide spectrum of application domains including deep learning, image processing, and high-performance parallel computing.