I/O Design and Core Power Management Issues in Heterogeneous Multi/Many-Core System-on-Chip
- Author(s): Kim, Myoungseo
- Advisor(s): Gaudiot, Jean-Luc
- Nicolau, Alexandru
- et al.
Since the advent of dark silicon and the end of multicore scaling, multi/many-core system-on-a-chip (SoC) platform designs have faced two conflicting pressures in product development: design complexity keeps increasing while time-to-market keeps shrinking. Hence, designers are seeking a more efficient and reliable methodology for designing complex multimillion-gate SoCs under such harsh conditions.
In particular, the complexity of the generic pin control block in a multimedia SoC, which implements the input/output (I/O) paths for off-chip communication, has increased exponentially in recent years. Accordingly, the likelihood of introducing human errors when designing such a block has grown. The operation of a generic pin control block needs to be validated against top-level RTL from the early stages of design, so that the full-chip interface is checked correctly. However, a generic pin control block has several inherent design issues, since function registers and multi-I/O paths are usually fixed only in the relatively late stages of design. Moreover, because the block shares a limited number of pins, pin assignments change frequently. As a result, current design approaches for a generic pin control block are no longer adequate to meet the challenges of design productivity, design reusability, and shorter time-to-market, and a traditional hand-written RTL description leaves many opportunities for human error. In response to this problem, we developed a design-automation-based approach that reduces the possibility of human errors. In the case study presented, we succeeded in auto-generating a generic pin control block for multimedia SoC platforms that has more than 400 general-purpose I/O interfaces, including both inputs and outputs, as well as 1200 PAD pins. Ultimately, we reduced the amount of manual description needed to generate a generic pin control block by 98%.
The Overhead of Data Preparation (ODP) is a major concern in the design of future multi/many-core systems on a single chip. We therefore examined this issue under an extended Amdahl’s law and applied it to three “traditional” multi/many-core system scenarios: homogeneous symmetric, asymmetric, and dynamic. In addition, we extended the analysis to two new scenarios spanning heterogeneous and dynamic CPU-GPU multi/many-core systems. Based on our evaluation, we found that innovations in heterogeneous system architecture are indispensable for decreasing ODP.
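As a rough illustration of why ODP matters, the following Python sketch compares the classical Amdahl speedup with a variant that charges a separate data-preparation share of the run time. The abstract does not give the extended formula, so the model here (a normalized run time split into a computation share and a data-preparation share, each with its own parallel fraction) and all names such as `speedup_with_odp`, `odp_fraction`, and `odp_parallel_fraction` are assumptions made for illustration, not the formulation used in the thesis.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Classical Amdahl's law: speedup of a workload on n_cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)


def speedup_with_odp(
    parallel_fraction: float,
    n_cores: int,
    odp_fraction: float,
    odp_parallel_fraction: float = 0.0,
) -> float:
    """Assumed ODP-aware variant (illustrative only).

    The single-core run time is normalized to 1 and split into
    computation (1 - odp_fraction) and data preparation (odp_fraction).
    Each part is sped up according to its own parallel fraction.
    """
    compute_share = 1.0 - odp_fraction
    compute_time = compute_share * (
        (1.0 - parallel_fraction) + parallel_fraction / n_cores
    )
    prep_time = odp_fraction * (
        (1.0 - odp_parallel_fraction) + odp_parallel_fraction / n_cores
    )
    return 1.0 / (compute_time + prep_time)


if __name__ == "__main__":
    for cores in (4, 16, 64):
        print(
            cores,
            round(amdahl_speedup(0.9, cores), 2),
            round(speedup_with_odp(0.9, cores, odp_fraction=0.3), 2),
        )
```

Under these assumptions, a data-preparation share that is not parallelized caps the achievable speedup well below the classical bound, and parallelizing the preparation step recovers part of the loss.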
Furthermore, delivering low power consumption at the cost of only a small decrease in performance and throughput is a main challenge in designing future heterogeneous multi/many-core architectures on a single chip. Our design incorporates heterogeneous cores that represent different points in the power-performance design space during an application's execution. Under this circumstance, system software dynamically chooses the most appropriate core to meet specific performance and power requirements. To this end, we present a power-aware core management scheme based on tightly coupled hardware and software interaction: (1) a heuristic thread consolidation scheme at the software level and (2) a 3-bit core power control scheme at the hardware level. The scheme builds on efficient core power management methods for heterogeneous multi/many-core architectures as a mechanism to reduce the large number of clock cycles of latency incurred when a core goes from powered down to powered up. The 3-bit core power control scheme operates on distinct scenarios by switching among five states: active, hot, cold, idle, and powered down. Each state switch is triggered by two pieces of information: the process ID information allocated by the OS scheduler, and the decision of the heuristic thread consolidation scheme, which aims to maximize power-performance efficiency. Experiments show that our model reduces power on average by 2.3% compared to a system with an efficient power-aware policy and by up to 15% with respect to the basic policy.
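The five-state, 3-bit control scheme described above can be sketched in Python as follows. The abstract names the states but not their bit encodings or transition rules, so the encodings, the `next_state` rule, and every identifier below are hypothetical and serve only to show how five states fit in a 3-bit field and how the two inputs (scheduler process ID and consolidation decision) might be combined.

```python
from enum import IntEnum
from typing import Optional


class CorePowerState(IntEnum):
    # Hypothetical encodings: the source only states that the control
    # field is 3 bits wide and that there are five states.
    ACTIVE = 0b000
    HOT = 0b001
    COLD = 0b010
    IDLE = 0b011
    POWERED_DOWN = 0b100


def encode(state: CorePowerState) -> str:
    """Return the 3-bit control word for a state as a bit string."""
    return format(int(state), "03b")


def decode(bits: str) -> CorePowerState:
    """Parse a 3-bit control word; raises ValueError for unused codes."""
    return CorePowerState(int(bits, 2))


def next_state(
    assigned_pid: Optional[int],
    consolidation_target: CorePowerState,
) -> CorePowerState:
    """Assumed rule: a core holding a scheduled process stays active;
    otherwise it follows the thread consolidation decision."""
    if assigned_pid is not None:
        return CorePowerState.ACTIVE
    return consolidation_target


if __name__ == "__main__":
    for state in CorePowerState:
        print(state.name, encode(state))
```

Three bits allow eight codes, so five states leave three codes unused in this sketch; `decode` rejects them rather than guessing a meaning.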
With respect to energy efficiency on the same chip, we propose a performance-energy efficiency analytical model for future integrated heterogeneous parallel multi/many-core systems, which are promising for big data applications. The model extends the traditional computing-centric model by considering ODP, which can no longer be neglected in heterogeneous multi/many-core systems. The analysis shows that higher parallelism gained from either computation or data preparation brings greater energy efficiency. Improving the performance-energy efficiency of data preparation is another promising approach to reducing power consumption. Therefore, more informed tradeoffs should be made when designing modern heterogeneous multi/many-core systems within a limited energy budget.