With dark silicon and the end of multicore scaling, multi/many-core system-on-a-chip (SoC) platform design faces two conflicting pressures in product development: increasing design complexity and decreasing time-to-market. Designers are therefore seeking a more efficient and reliable methodology for designing complex multimillion-gate SoCs under these conditions.
In particular, the complexity of the generic pin control block in a multimedia SoC, which implements the input/output (I/O) paths for off-chip communication, has increased exponentially in recent years. Accordingly, the likelihood of introducing human errors when designing such a block has grown. The operation of a generic pin control block needs to be validated against top-level RTL from the early stages of design so that the full-chip interface is checked correctly. However, the block has several inherent design issues, because function registers and multi-I/O paths are usually fixed only in the relatively late stages of design. In addition, because the block shares a limited number of pins among many functions, pin assignments change frequently. Current design approaches for a generic pin control block are therefore no longer adequate to meet the demands of design productivity, design reusability, and shorter time-to-market, and a traditional hand-written RTL description leaves many opportunities for human error. In response to this problem, we developed a design-automation-based approach that reduces the possibility of human errors. In the case study presented, we succeeded in auto-generating a generic pin control block for multimedia SoC platforms with more than 400 general-purpose I/O interfaces, including both input and output, as well as 1200 PAD pins. Ultimately, we reduced the amount of manual description needed to generate a generic pin control block by 98%.
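As a rough illustration of the design-automation idea, the sketch below generates the output multiplexer for one pad from a table of its alternate functions. It is not the generator used in this work: the function `generate_pin_mux`, the pad and signal names, and the `<pad>_sel` / `<pad>_out` naming convention are all hypothetical, chosen only to show how a pin table can replace hand-written RTL.

```python
def generate_pin_mux(pad, functions):
    """Return Verilog text for one pad's output multiplexer.

    `functions` is a list of source signal names; a signal's position in
    the list is its function-select value. All names are hypothetical.
    """
    if not functions:
        raise ValueError("a pad needs at least one function")
    # Number of select bits needed to address every function.
    sel_bits = max(1, (len(functions) - 1).bit_length())
    lines = [
        f"// auto-generated multiplexer for {pad}",
        "always @(*) begin",
        f"  case ({pad}_sel[{sel_bits - 1}:0])",
    ]
    for value, signal in enumerate(functions):
        lines.append(f"    {sel_bits}'d{value}: {pad}_out = {signal};")
    # Unused select values drive a safe constant.
    lines.append(f"    default: {pad}_out = 1'b0;")
    lines += ["  endcase", "end"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_pin_mux("pad_a0", ["gpio_a0", "uart0_tx", "i2c1_scl"]))
```

Under this approach a change in pin assignment becomes a one-line change in the table, and the RTL is regenerated rather than edited by hand.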
The overhead of data preparation (ODP) is a major concern in the future design of multi/many-core systems on a single chip. We therefore considered this issue under the extended Amdahl’s law and applied it to three “traditional” multi/many-core system scenarios: homogeneous symmetric, asymmetric, and dynamic. In addition, we extended it to two new scenarios covering heterogeneous and dynamic CPU-GPU multi/many-core systems. Based on our evaluation, we found that innovations in heterogeneous system architecture are indispensable for decreasing ODP.
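To make the role of ODP concrete, the following sketch computes speedup for the simplest case, a symmetric system of n identical cores. The exact form of the extended law is not given here, so the formula is an assumption: execution time is split into a computation part and a data preparation part, each with its own parallel fraction. The function `speedup_with_odp` and the parameter names `d`, `f_c`, and `f_d` are illustrative only.

```python
def speedup_with_odp(n, d, f_c, f_d):
    """Speedup on n identical cores under an assumed ODP-aware Amdahl model.

    d   : fraction of single-core time spent on data preparation
    f_c : parallelizable fraction of the computation part
    f_d : parallelizable fraction of the data preparation part
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    for name, value in (("d", d), ("f_c", f_c), ("f_d", f_d)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    # Normalized time of each part: serial share plus parallel share over n.
    compute_time = (1.0 - d) * ((1.0 - f_c) + f_c / n)
    prep_time = d * ((1.0 - f_d) + f_d / n)
    return 1.0 / (compute_time + prep_time)


if __name__ == "__main__":
    for f_d in (0.0, 0.5, 0.9):
        print(f_d, round(speedup_with_odp(64, 0.3, 0.95, f_d), 2))
```

With `d = 0` the expression reduces to the classic Amdahl's law. With `d > 0` and a poorly parallelized data preparation part, the serial share of ODP bounds the achievable speedup no matter how parallel the computation is.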
Furthermore, achieving low power consumption at the cost of only a small decrease in performance and throughput is the main challenge in designing future heterogeneous multi/many-core architectures on a single chip. Our design incorporates heterogeneous cores that represent different points in the power-performance design space; during an application's execution, system software dynamically chooses the most appropriate core to meet specific performance and power requirements. To this end, we present a power-aware core management scheme built on tightly coupled hardware and software interaction: (1) a heuristic thread consolidation scheme at the software level and (2) a 3-bit core power control scheme at the hardware level. The scheme relies on efficient core power management on a heterogeneous multi/many-core architecture to reduce the large clock-cycle latency incurred when a powered-down core is powered up. The 3-bit core power control scheme handles distinct scenarios by switching among five statuses: active, hot, cold, idle, and powered down. Status switching is triggered by two pieces of information: the process ID information allocated by the OS scheduler, and the decision of the heuristic thread consolidation scheme, which aims to maximize power-performance efficiency. Experiments show that our model reduces power by 2.3% on average compared to a system with an efficient power-aware policy and by up to 15% with respect to the basic policy.
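The five statuses fit in a 3-bit field, and the switching logic can be sketched as a small state machine driven by the two inputs named above. The bit encodings, the function names, and the transition policy below are assumptions made for illustration; they are not the encodings or the heuristic used in this work.

```python
from enum import IntEnum


class CoreState(IntEnum):
    """Five core power statuses; the 3-bit values are hypothetical."""
    POWERED_DOWN = 0b000
    IDLE = 0b001
    COLD = 0b010
    HOT = 0b011
    ACTIVE = 0b100


def control_bits(state):
    """Render a status as the 3-bit word written to the core's control field."""
    return format(int(state), "03b")


def next_state(current, pid, consolidation_target):
    """Pick the next status from the two inputs described in the text.

    pid                  : process ID assigned by the OS scheduler, or None
    consolidation_target : True if the thread consolidation heuristic
                           expects to place work on this core soon
    The policy itself is an assumed example, not the authors' heuristic.
    """
    if pid is not None:
        return CoreState.ACTIVE  # a process is running on the core
    if consolidation_target:
        return CoreState.HOT  # stay warm to avoid a long power-up latency
    # Otherwise step one level toward powered down.
    return CoreState(max(int(current) - 1, int(CoreState.POWERED_DOWN)))


if __name__ == "__main__":
    state = CoreState.ACTIVE
    for _ in range(5):
        state = next_state(state, None, False)
        print(state.name, control_bits(state))
```

Stepping down one status at a time, rather than powering off immediately, is one way to trade a little idle power for a shorter wake-up path when work returns to the core.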
With respect to energy efficiency on a single chip, we propose a performance-energy efficiency analytical model for future integrated heterogeneous parallel multi/many-core systems, which are promising for big data applications. The model extends the traditional computing-centric model by considering ODP, which can no longer be neglected in heterogeneous multi/many-core systems. The analysis shows that higher parallelism, gained from either computation or data preparation, brings greater energy efficiency. Improving the performance-energy efficiency of data preparation is another promising way to lower power consumption. Therefore, more informed tradeoffs should be made when designing modern heterogeneous multi/many-core systems within a limited energy budget.