SpVM Acceleration with Latency Masking Threads on FPGAs
eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Previously Published Works


Abstract

Long memory latencies are mitigated by large cache hierarchies in multi-core architectures, by SIMD execution in GPU architectures, and by data streaming in FPGA-based accelerators. However, none of these approaches benefits irregular applications, which exhibit no locality and rely on extensive pointer dereferencing for data accesses. Multi-threaded execution has been shown to handle such applications effectively by masking the memory latency. In the MT-FPGA model, a multi-threaded engine is implemented on the FPGA accelerator specifically to mask memory latency during the execution of irregular applications: after a thread issues a memory access, execution switches to a ready thread while suspended threads wait for the requested data to return from memory. The multi-threaded engine is generated automatically from C code by the CHAT compilation tool and is customized to the specific application. In this paper, we use the Sparse Vector Matrix (SpVM) application to evaluate the performance of MT-FPGA execution and compare it with the latest GPU architectures over a wide range of benchmarks.
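To make the irregular access pattern concrete, the following is a minimal sketch of a sparse matrix-vector product in plain C. It is an illustration only, not the authors' kernel or the code fed to CHAT. The compressed sparse row (CSR) layout, the function name `spmv_csr`, and all parameter names are assumptions, since the abstract does not name a storage format. The load `x[col_idx[k]]` is the data-dependent, pointer-style access that defeats caches and streaming: its address is known only after `col_idx[k]` has itself been fetched.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of y = A * x with A stored in CSR form.
 * Assumption: CSR is chosen here for illustration; the abstract
 * does not say which sparse format the paper uses. */
void spmv_csr(size_t n_rows,
              const size_t *row_ptr,  /* n_rows + 1 offsets into col_idx/values */
              const size_t *col_idx,  /* column index of each nonzero */
              const double *values,   /* value of each nonzero */
              const double *x,        /* dense input vector */
              double *y)              /* dense output vector, n_rows entries */
{
    assert(row_ptr && col_idx && values && x && y);

    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* Indirect load: the address into x depends on col_idx[k],
             * so consecutive iterations may touch unrelated locations. */
            sum += values[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

In a latency-masking design of the kind the abstract describes, each such indirect load would be a natural point at which to suspend the issuing thread and switch to a ready one. The loop above is sequential and shows only where those long-latency accesses arise.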


Main Content

MT-FPGA-techreport.pdf
