Pointer Cache Assisted Speculative Precomputation
Data prefetching effectively reduces the performance cost of long load latencies in modern processors. Hardware prefetchers use dedicated hardware structures to predict future memory addresses from previously observed patterns. Thread-based prefetchers execute portions of the actual program code to compute future load addresses. In this paper, we combine the two techniques to improve the memory performance of pointer-based applications: we pair a thread-based prefetcher, based on speculative precomputation, with a pointer cache. The pointer cache is a new hardware address predictor that tracks pointer transitions. Previously proposed thread-based prefetchers are limited in how far they can run ahead of the main thread when they encounter recurrent dependent loads. When combined with the pointer cache, a speculative thread can make better progress ahead of the main thread, rapidly traversing data structures despite cache misses on pointer transitions. The pointer cache allows the consumers of a pointer load that misses to issue before the data actually arrives. Our results show that using a pointer cache with speculative precomputation achieves a 65% speedup on average over a speculative precomputation architecture with a larger L3 cache.
Pre-2018 CSE ID: CS2002-0712