Open Access Publications from the University of California

The case for dynamic optimization: improving memory-hierarchy performance by continuously adapting the internal storage layout of heap objects at run-time


We present and evaluate a simple yet effective dynamic optimization technique that increases memory-hierarchy performance for pointer-centric applications by up to 24% and reduces cache misses by up to 35%. Based on temporal profiling information, our algorithm reorders individual data members in dynamically allocated objects to increase spatial and temporal locality. Our optimization is applicable to all type-safe programming languages that fully abstract away the physical storage layout of objects; examples of such languages are Java and Oberon.
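The reordering heuristic can be illustrated with a minimal sketch. The function below is hypothetical (the paper's actual algorithm uses temporal profiling data, not just raw counts): it packs the most frequently accessed fields at the front of the object, so that hot fields are more likely to share a cache line.

```python
# Hypothetical sketch of frequency-based field reordering: pack the
# "hot" fields first so they are likely to share a cache line.
# Field names and counts are illustrative, not taken from the paper.

def reorder_fields(field_sizes, access_counts):
    """Return field names sorted hottest-first, ties broken by name."""
    return sorted(field_sizes, key=lambda f: (-access_counts.get(f, 0), f))

# Example: a linked-list node whose traversal touches 'next' and 'key'
# far more often than the payload fields.
field_sizes = {"key": 4, "payload": 64, "next": 8, "timestamp": 8}
access_counts = {"next": 9000, "key": 8500, "timestamp": 120, "payload": 40}

print(reorder_fields(field_sizes, access_counts))
# ['next', 'key', 'timestamp', 'payload']
```

A real implementation would additionally weigh field affinity (which fields are accessed together in time) rather than per-field frequency alone.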

In our implementation, the optimization is fully automatic and operates at run-time on live data structures, guided by dynamic profiling data. Whenever the results of profiling suggest that a running program could benefit from data-member reordering, optimized versions of the affected procedures are constructed on-the-fly in the background. As soon as it is safe to do so, the dynamically generated code is substituted in place of the previously executing version and all affected live data objects are simultaneously transformed to the new storage layout. The program then continues its execution using the improved data arrangement, until profiling again suggests that re-optimization would be beneficial. Hence, storage layouts in our system are continuously adapted to reflect current access profiles.
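The layout-switch step above can be simulated in a few lines. The sketch below is an assumption-laden toy (objects as flat tuples under a mutable layout, with `AdaptiveHeap` and `retarget` as invented names), not the paper's implementation; it shows only the moment where all live objects are transformed to a new storage layout in one step.

```python
# Hypothetical simulation of the run-time adaptation step: objects are
# stored as flat tuples under a shared field order, and when profiling
# suggests a better order, every live object is rewritten at once.
# All names here are illustrative, not from the paper.

class AdaptiveHeap:
    def __init__(self, layout):
        self.layout = list(layout)   # current field order
        self.objects = []            # live objects, as flat tuples

    def alloc(self, **fields):
        self.objects.append(tuple(fields[f] for f in self.layout))

    def retarget(self, new_layout):
        """Transform all live objects to the new storage layout."""
        idx = [self.layout.index(f) for f in new_layout]
        self.objects = [tuple(obj[i] for i in idx) for obj in self.objects]
        self.layout = list(new_layout)

heap = AdaptiveHeap(["key", "payload", "next"])
heap.alloc(key=1, payload="a", next=None)
# Profiling suggests 'next' is now the hottest field -> adopt new layout.
heap.retarget(["next", "key", "payload"])
print(heap.objects)  # [(None, 1, 'a')]
```

In the real system this transformation happens on machine-level object layouts, paired with dynamically generated code that expects the new field offsets, and only at points where the swap is safe.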

Our results indicate that it is often worthwhile to re-optimize an already optimized, executing program when the user's behavior changes. The main beneficiaries of such re-optimizations are shared libraries, which at different times can be optimized in the context of the currently dominant client application. In our experiments, we optimized a system library in the context of four different usage patterns and then evaluated each of these specialized libraries under all four usage patterns. In some contexts, the specialized library performed 13% better than the other specialized libraries, and 7% better than a statically optimized library. Hence, in systems where such re-optimizations can be executed rapidly, it becomes worthwhile to construct specialized versions at run-time.
