Performance and Power Optimization for Multi-core Systems using Multi-level Scaling
- Author(s): Almatouq, Munirah
- Advisor(s): Gaudiot, Jean-Luc
- et al.
Integrating more cores per chip to increase the performance of processors has been trending
for the past decade. However, this trend cannot be sustained because the reduction in power
consumption per core has slowed down while the power budget per chip has not increased.
Modern processor chips are becoming so power constrained that not all of their
devices can be powered at once, a phenomenon often referred to as dark silicon. To maximize
performance within these power constraints, the system must carefully select the set of
resources to be used.
To solve this problem, several power management techniques such as Dynamic Voltage/Frequency
Scaling (DVFS), core scaling, and resource scaling have been the subject of active research
and have proven to be effective. However, most of these solutions are sub-optimal because
they explore only one layer of the architecture. Although considering one layer reduces the
complexity of the technique, it limits the exploitation of potential improvement in performance
and energy consumption.
The problem is an order of magnitude more complex for power constrained multi-core architectures.
We need power management systems that can take advantage of different scaling
techniques. Many studies have been conducted on scaling with the sole objective of performance
improvement. Nevertheless, few of them have considered both performance and energy
consumption in the optimization process.
This dissertation proposes an optimization technique that balances performance and energy
consumption by applying a joint control of core, resource and frequency scaling. This
system finds the optimal configuration for a given application and adapts accordingly.
The proposed technique consists of three stages: configuration sampling, response surface
models to approximate performance and energy consumption, and online optimization using
a genetic algorithm (GA). To evaluate the system, experiments were conducted on a
simulated 12-core architecture. The experiments show that performance can
improve by 15% on average while achieving energy savings of up to 26%. Using a per-core
configuration improves performance by 25% on average and reduces energy consumption by 18%.
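The three stages named above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the configuration space, the toy `measure` function standing in for a simulator run, the second-order surface fitted to log-transformed responses, the energy-delay-product objective, and all GA parameters are assumptions chosen only to make the example self-contained.

```python
import math
import random

# Hypothetical configuration space (all values are illustrative assumptions).
CORES = list(range(1, 13))           # number of active cores
FREQS = [1.0, 1.5, 2.0, 2.5, 3.0]    # clock frequency in GHz
WAYS = [1, 2, 4, 8]                  # stand-in for a scalable per-core resource
SPACE = [(c, f, w) for c in CORES for f in FREQS for w in WAYS]


def measure(cfg):
    """Toy stand-in for one simulator run; returns (time, energy)."""
    c, f, w = cfg
    time = (0.1 + 0.9 / c) / f * (1.0 + 0.5 / w)   # Amdahl-like scaling
    power = c * (0.3 * f ** 3 + 0.05 * w) + 1.0    # assumed power model
    return time, power * time


def features(cfg):
    """Second-order response-surface terms on normalized inputs (10 terms)."""
    x = (cfg[0] / 12.0, cfg[1] / 3.0, cfg[2] / 8.0)
    return [1.0, *x] + [x[i] * x[j] for i in range(3) for j in range(i, 3)]


def fit_surface(configs, responses, ridge=1e-12):
    """Least-squares fit via normal equations and Gauss-Jordan elimination."""
    rows = [features(c) for c in configs]
    n = len(rows[0])
    # Augmented matrix [A^T A + ridge * I | A^T y].
    m = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, responses))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [a - factor * b for a, b in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def predict(coef, cfg):
    return sum(b * t for b, t in zip(coef, features(cfg)))


def optimize(score, rng, pop_size=20, generations=40, mutation=0.2):
    """Small GA; returns the best configuration and the best score per generation."""
    axes = (CORES, FREQS, WAYS)
    pop = [tuple(rng.choice(a) for a in axes) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)
        history.append(score(pop[0]))
        children = pop[:2]                                      # elitism
        while len(children) < pop_size:
            p1 = min(rng.sample(pop, 3), key=score)             # tournament selection
            p2 = min(rng.sample(pop, 3), key=score)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            for i, axis in enumerate(axes):
                if rng.random() < mutation:                     # random-reset mutation
                    child[i] = rng.choice(axis)
            children.append(tuple(child))
        pop = children
    return min(pop, key=score), history


def run(seed=0, n_samples=60):
    rng = random.Random(seed)
    samples = rng.sample(SPACE, n_samples)                      # stage 1: sampling
    results = [measure(c) for c in samples]
    t_coef = fit_surface(samples, [math.log(t) for t, _ in results])  # stage 2: models
    e_coef = fit_surface(samples, [math.log(e) for _, e in results])

    def score(cfg):
        # Predicted log(time) + log(energy), i.e. log of the energy-delay product.
        return predict(t_coef, cfg) + predict(e_coef, cfg)

    return optimize(score, rng)                                 # stage 3: GA search


if __name__ == "__main__":
    best, history = run()
    print("selected (cores, GHz, ways):", best)
```

In this sketch the GA only queries the fitted surfaces and never calls `measure`, so the search cost is independent of the cost of a simulator run. Fitting the logarithm of each response keeps the predicted time and energy positive. Both are design choices of the sketch, not claims about the proposed system.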