A larger instruction window in Out-of-Order (OoO) cores enables better exploitation of inherent Instruction-Level Parallelism (ILP). However, the branch misspeculation penalty limits scaling to larger instruction windows. Branch instructions that depend on hard-to-predict load data are the leading contributors to mispredictions. Computer architects continuously refine branch prediction algorithms and increase predictor size to mitigate mispredictions, yet current state-of-the-art history-based branch predictors still have low prediction accuracy for these branches. Prior research supports this observation: increasing the size of a 256-Kbit history-based branch predictor to its 1-Mbit variant yields only a 10% reduction in branch mispredictions.
In this dissertation, I present the novel Load Driven Branch Predictor (LDBP), which specifically targets hard-to-predict branches that depend on a load instruction. Although random load data determines the outcomes of these branches, the load addresses for most of this data follow a predictable pattern. This behavior is commonly observed in data structures such as arrays and maps. LDBP exploits it to trigger the future loads associated with a branch ahead of time and uses the loaded data to precompute the branch’s outcome. The predictable loads are tracked, and the precomputed branch outcomes are buffered for making predictions. Experimental results show that on a modern Zen2-like OoO core, augmenting a standalone 256-Kbit IMLI predictor with LDBP reduces average branch mispredictions by 12% and improves average IPC by 7.14% on benchmarks from the SPEC CINT2006 and GAP benchmark suites.
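The load-driven idea described above can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the LDBP hardware design: the class `LoadDrivenSketch`, the `distance` parameter, the flat `memory` dictionary, the threshold comparison, and the last-outcome fallback predictor are all hypothetical choices made for illustration. As a further simplification, buffered outcomes are keyed by load address, whereas a real front end would not know the load address at prediction time.

```python
import random


class LoadDrivenSketch:
    """Toy model: one strided load feeding one compare-and-branch."""

    def __init__(self, memory, threshold, distance=4):
        self.memory = memory        # address -> value (hypothetical flat memory)
        self.threshold = threshold  # branch is taken when value > threshold
        self.distance = distance    # iterations to run ahead (assumed parameter)
        self.last_addr = None
        self.stride = None
        self.confident = False
        self.outcomes = {}          # future load address -> precomputed outcome

    def predict(self, addr):
        """Return a buffered outcome for this load, or None to fall back."""
        return self.outcomes.pop(addr, None)

    def train(self, addr):
        """Observe a load address, learn its stride, and run ahead."""
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Confident once the same non-zero stride is seen twice in a row.
            self.confident = stride == self.stride and stride != 0
            self.stride = stride
        self.last_addr = addr
        if self.confident:
            future = addr + self.distance * self.stride
            if future in self.memory:
                # Trigger the future load and buffer the branch outcome.
                self.outcomes[future] = self.memory[future] > self.threshold


def simulate(n=1000, seed=0):
    """Compare a last-outcome fallback with the load-driven sketch."""
    rng = random.Random(seed)
    base, stride, threshold = 0x1000, 8, 49
    # Random data at predictable (strided) addresses, as in an array walk.
    memory = {base + i * stride: rng.randrange(100) for i in range(n)}
    ldbp = LoadDrivenSketch(memory, threshold)
    last_outcome = False
    baseline_miss = ldbp_miss = 0
    for i in range(n):
        addr = base + i * stride
        actual = memory[addr] > threshold
        fallback = last_outcome          # stand-in for a history-based predictor
        guess = ldbp.predict(addr)
        if guess is None:
            guess = fallback
        baseline_miss += fallback != actual
        ldbp_miss += guess != actual
        ldbp.train(addr)
        last_outcome = actual
    return baseline_miss, ldbp_miss
```

Calling `simulate()` returns a pair of misprediction counts, one for the fallback alone and one for the fallback augmented with the load-driven path. In this toy model the load-driven path can only mispredict during warm-up, before the stride is learned and the first precomputed outcome becomes available, because the modeled memory never changes between the early load and the branch. A real design must also handle intervening stores, limited table capacity, and load latency, none of which are modeled here.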