Thermal Safety and Real-Time Predictability on Heterogeneous Embedded SoC Platforms
Recent embedded systems are built on high-performance systems-on-chip (SoCs) to satisfy the computational needs of complex, widely deployed applications such as airplane controllers, autonomous vehicles, medical devices, drones, and hand-held devices. Modern SoCs integrate multi-core CPUs with various types of accelerators, including graphics processing units (GPUs), digital signal processors (DSPs), and video encoding and decoding units. The performance gain of such SoCs comes at the cost of high power consumption, which in turn leads to high heat dissipation. Uncontrolled heat dissipation is one of the main sources of interference that can adversely affect the reliability and real-time performance of safety-critical applications. The mechanisms currently available to protect SoCs from overheating, such as frequency throttling and core shutdown, may exacerbate the problem because they cause unpredictable delays and deadline misses. Dynamic changes in ambient temperature make the problem even harder to solve.
This dissertation addresses the challenges caused by thermal interference in real-time mixed-criticality systems (MCSs) built on heterogeneous embedded SoC platforms. We propose a novel thermal-aware system framework with analytical timing and thermal models to guarantee safe execution of real-time tasks under the thermal constraints of a multi-core CPU/GPU integrated SoC. For mixed-criticality tasks, the proposed framework bounds the heat generation of the system at each criticality level and provides different levels of assurance against ambient temperature changes. In addition, we propose a data-driven thermal parameter estimation scheme that is directly applicable to MCSs built with commercial off-the-shelf (COTS) multi-core processors and that obtains a precise thermal model without requiring special measurement instruments or access to proprietary information. We have evaluated the practicality and effectiveness of our solutions on real SoC platforms, and our contributions will help developers build systems with thermal safety and real-time predictability.