Optimized Trace Binaries for Architectural Evaluation
Open Access Publications from the University of California


The increasing demand for high performance forces computer architects to employ a plethora of hardware optimizations. These optimizations and new architecture features are often evaluated with past-generation compilers that do not take the new features into account. Using a compiler that is unaware of a set of architecture optimizations may lead to an incorrect estimate of their impact. This problem is exacerbated for in-order architectures, which rely on the compiler to assist in scheduling. Our research focuses on efficient techniques for generating a highly optimized and scheduled binary for VLIW and in-order architectures. We propose to use a modified out-of-order simulator to generate a trace-scheduled binary for in-order execution. Our constrained out-of-order machine resolves dependencies and allows independent instructions to move above stalled instructions, exposing the available parallelism within a program and performing the appropriate level of loop unrolling and inlining for the architecture. This new trace binary can then be used to guide architectural research, even when an optimizing compiler does not yet exist for the architecture being evaluated. In this paper we examine the merits of the simulator-based trace optimizer and show how it performs compared to unscheduled code on an in-order machine.
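
To make the idea of moving independent instructions above stalled ones concrete, the following is a minimal sketch under stated assumptions, not the simulator described in the paper. It assumes a single-issue machine, a toy trace in which each register is written only once (so only read-after-write dependencies matter), and made-up latencies. The names `Instr`, `in_order_cycles`, `reschedule`, and the `window` parameter are hypothetical. Loop unrolling and inlining are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str          # label used only for printing
    dest: str          # register written (assumed written once in the trace)
    srcs: tuple        # registers read
    latency: int = 1   # cycles until the result is available (assumed values)


def in_order_cycles(trace):
    """Cycles a single-issue in-order machine needs to complete `trace`."""
    ready = {}    # register -> cycle at which its value becomes available
    cycle = 0     # earliest cycle at which the next instruction may issue
    finish = 0
    for ins in trace:
        # In-order issue: stall until every source operand is ready.
        start = max([cycle] + [ready.get(s, 0) for s in ins.srcs])
        ready[ins.dest] = start + ins.latency
        finish = max(finish, ready[ins.dest])
        cycle = start + 1
    return finish


def reschedule(trace, window=4):
    """Issue order chosen by a constrained out-of-order machine.

    Each cycle, the oldest instruction among the first `window` pending
    ones whose operands are ready is issued. The issue order is returned
    as the new, scheduled trace.
    """
    pending = list(trace)
    ready = {}
    order = []
    cycle = 0
    while pending:
        unissued_dests = set()   # results of older, still-pending instructions
        chosen = None
        for ins in pending[:window]:
            operands_ok = all(
                s not in unissued_dests and ready.get(s, 0) <= cycle
                for s in ins.srcs
            )
            if operands_ok:
                chosen = ins
                break
            unissued_dests.add(ins.dest)
        if chosen is not None:
            pending.remove(chosen)
            ready[chosen.dest] = cycle + chosen.latency
            order.append(chosen)
        cycle += 1   # advance whether an instruction issued or the machine stalled
    return order


# Toy trace: two independent load/add chains feeding a final add.
# Registers "a" and "b" are live-ins and are treated as ready at cycle 0.
trace = [
    Instr("ld1", "r1", ("a",), latency=3),
    Instr("add1", "r2", ("r1",)),
    Instr("ld2", "r3", ("b",), latency=3),
    Instr("add2", "r4", ("r3",)),
    Instr("add3", "r5", ("r2", "r4")),
]

scheduled = reschedule(trace, window=4)
print([ins.name for ins in scheduled])
print(in_order_cycles(trace), in_order_cycles(scheduled))
```

On this toy trace the second load is hoisted above the stalled first add, and the in-order cycle count under these assumed latencies drops from 9 to 6. A smaller `window` constrains how far an instruction may move, which is one way such a sketch could mimic the limits of the target machine.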

Pre-2018 CSE ID: CS2002-0711
