Energy Efficient Memory Speculation With Memory Latency Tolerance Supporting Sequential Consistency Without A Coherence Protocol
Modern out-of-order processor architectures devote significant effort to the high-performance execution of memory operations. Because memory instructions impose ordering requirements, their execution becomes a significant bottleneck for out-of-order execution, particularly in the case of slow-executing loads.
Many high-overhead structures such as StoreSets, Load Queues, and Store Queues are included in these processors to support memory speculation, in an attempt to relax these ordering hazards wherever possible. However, each of these structures presents a new and significant source of energy consumption and design complexity to processor architects.
The execution of memory instructions is further complicated by the introduction of multi-core processors. Memory coherence is needed, which often requires a coherence protocol and an interconnection network. Additionally, the timing and ordering of memory instructions' execution between cores can have critical impacts on program functionality and output, which programmers must account for in the form of a memory consistency model.
This work proposes a decoupled memory execution verification mechanism that supports memory speculation without costly, complex, and scaling-limited structures. This in-order verification can reduce the average energy dissipation by over 16% with a simpler design that removes the Load and Store Queues, StoreSets, and even invalidation-based cache coherence protocols. These benefits are realized by a system providing the straightforward and intuitive Sequential Consistency memory model.