eScholarship
Open Access Publications from the University of California

Data organization for improved performance in embedded processor applications

Abstract

Code generation for embedded processors opens up several performance optimization techniques that traditional compilers have ignored, since they typically do not exploit architectural features of embedded processors such as parameterized caches. In this report, we present techniques that take the parameters of the data cache into account when organizing the variables declared in embedded code in memory, with the objective of improving data cache performance. We present techniques for clustering variables to minimize compulsory cache misses, and for solving the memory assignment problem to minimize conflict cache misses. Our experiments with benchmark code kernels from DSP and other domains on the CW4001 embedded processor from LSI Logic indicate significant improvements in data cache performance (an average improvement of 42% in hit ratios) when our memory organization technique is applied.
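
To make the notion of a conflict miss concrete, the following is a minimal sketch and not the report's algorithm. It simulates a direct-mapped data cache and counts misses for a loop that reads a[i] and b[i] in turn, under two memory assignments for b. The cache parameters (16-byte lines, 64 lines), the 4-byte element size, the base addresses, and all function names are assumptions chosen for illustration; they are not the CW4001's parameters.

```python
def count_misses(trace, line_size=16, num_lines=64):
    """Simulate a direct-mapped cache and count misses for a byte-address trace.

    line_size and num_lines are hypothetical cache parameters.
    """
    tags = [None] * num_lines          # memory block currently held by each cache line
    misses = 0
    for addr in trace:
        block = addr // line_size      # memory block containing this byte
        idx = block % num_lines        # cache line that block maps to
        if tags[idx] != block:
            tags[idx] = block          # load the block, evicting the previous occupant
            misses += 1
    return misses


def interleaved_trace(base_a, base_b, n, elem_size=4):
    """Byte addresses touched by a loop reading a[i] then b[i] for i in 0..n-1."""
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem_size)
        trace.append(base_b + i * elem_size)
    return trace


CACHE_BYTES = 16 * 64   # total capacity of the assumed cache
N = 64                  # elements per array (illustrative)

# Assignment 1: b starts exactly one cache size after a, so a[i] and b[i]
# always map to the same cache line and keep evicting each other.
conflicting = count_misses(interleaved_trace(0, CACHE_BYTES, N))

# Assignment 2: b is shifted by one cache line, so a[i] and b[i] map to
# different lines and only first-touch (compulsory) misses remain.
padded = count_misses(interleaved_trace(0, CACHE_BYTES + 16, N))

print(conflicting, padded)
```

Under these assumed parameters, the first assignment misses on all 128 accesses, while the second misses only 32 times, once for each distinct memory block touched. The access pattern is identical in both cases; only the placement of b in memory differs, which is the kind of effect a memory assignment technique aims to exploit.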
