General, Flexible and Unified Near-Data Computing
UCLA Electronic Theses and Dissertations

Abstract

Over the past decades, the memory hierarchy has increasingly become the bottleneck in general-purpose processors due to the widening gap between the growing demand for large data and the much slower scaling of conventional memory hierarchies. Conventional in-core computing therefore incurs increasingly expensive overheads to bring data from memory to the computing cores: excessive request messages, unnecessary data movement and coherence traffic, and limited off-chip bandwidth. To continue performance and energy-efficiency scaling, architects have proposed near-data computing (NDC), in which computations are offloaded to where the data resides. However, existing NDC techniques fall short of providing generality and flexibility across different application domains, programming paradigms, and computing substrates, which are crucial to the wide adoption of NDC.

Our key insight is that the critical missing cornerstone for general and flexible near-data computing is a novel rich-semantic memory abstraction. Unlike existing byte-grained load/store operations, the new interface should express a wide range of rich semantics, such as access patterns, reuse distances, and near-data computations. Such high-level information is essential for the system to promptly recognize the program's long-term behavior and adjust its policies accordingly to reach an optimal state. More importantly, the new interface should remain as transparent as possible to programmers, with automatic compiler analysis and runtime library support. On this foundation, we can fundamentally revolutionize the memory interface and co-optimize computation and data together.

This dissertation explores a new ISA interface, streams, to precisely capture the program's long-term memory and compute activities. Streams are incorporated into the program's functional semantics and are exposed to the entire system stack to guide various policies. Our evaluation and analysis suggest several key findings. First, a set of useful and prevalent stream patterns covers a wide range of program behaviors and can be embedded into the program in a lightweight way while still maintaining sequential ordering. Second, streams naturally decouple address generation and computation from the core pipeline and can be offloaded as the basic unit of near-data computing. Third, by exposing high-level semantics to the system, we can unify different computing paradigms and co-design the software and data structures. Overall, this dissertation aims to enable a general, end-to-end near-data computing system that eliminates the boundary between computation and data: computation is freely scheduled in the system near the data, and data is carefully mapped to memory resources to provide maximal locality and parallelism. Such data-computation orchestration is the key to continued performance and energy-efficiency scaling.
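To make the stream idea concrete, the minimal C sketch below models a stream in software: the two affine accesses of a dot-product loop are described once as explicit streams (base, stride, trip count), and the multiply-accumulate becomes a coarse-grained kernel that could be executed near memory. The stream_t type and the stream_config/stream_reduce_fma helpers are illustrative placeholders for the concept, not the dissertation's actual ISA encoding.

```c
#include <stddef.h>

/* Minimal software model of a "stream": a base address, an element
 * stride, and a trip count describe the entire affine access pattern
 * up front, instead of the core issuing one load per loop iteration.
 * (Illustrative only; the dissertation defines streams at the ISA level.) */
typedef struct {
    const float *base;
    size_t stride;   /* in elements */
    size_t length;   /* trip count  */
} stream_t;

static stream_t stream_config(const float *base, size_t stride, size_t length) {
    stream_t s = { base, stride, length };
    return s;
}

/* A near-data "kernel": given two configured streams, perform the
 * multiply-accumulate without the core generating per-element addresses.
 * In hardware, this unit of work could run near memory. */
static float stream_reduce_fma(stream_t a, stream_t b) {
    float acc = 0.0f;
    for (size_t i = 0; i < a.length && i < b.length; i++)
        acc += a.base[i * a.stride] * b.base[i * b.stride];
    return acc;
}

/* Conventional form: address generation, loads, and the reduction
 * are all interleaved in the core pipeline. */
float dot(const float *x, const float *y, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Stream-decoupled form: the access patterns and the computation are
 * separate, coarse-grained units that the system is free to offload. */
float dot_stream(const float *x, const float *y, size_t n) {
    stream_t sx = stream_config(x, 1, n);
    stream_t sy = stream_config(y, 1, n);
    return stream_reduce_fma(sx, sy);
}
```

In the decoupled form, the core issues two stream configurations and one kernel invocation instead of 2n individual loads, which is what allows both the access pattern and the computation to be scheduled near the data.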
