UC San Diego
Configurable energy-efficient co-processors to scale the utilization wall
- Author(s): Venkatesh, Ganesh
- et al.
Transistor density continues to increase exponentially, but power dissipation per transistor improves only slightly with each generation of Moore's law. Given the constant chip-level power budgets, this exponentially decreases the fraction of the transistors that can be active simultaneously with each technology generation. Hence, while the area budget continues to increase exponentially, the power budget has become a first-order design constraint in current processors. In this regime, utilizing transistors to design specialized cores that optimize energy-per-computation becomes an effective approach to improve the system performance. To pursue this goal, this thesis focuses on specialized processors that reduce energy and energy-delay for general purpose computing. The focus on energy makes these specialized cores an excellent match for many of the commonly used programs that would be poor candidates for SIMD-style hardware acceleration (e.g. compression, scheduling). However, there are many challenges, such as lack of flexibility and limited computational power, that limit how effective these specialized cores are at targeting general purpose computing. Without addressing these concerns, these specialized cores would be limited in the scope of applications that they can effectively target. This thesis addresses these various challenges involved in making specialization a viable approach to optimize general-purpose computing. To this end, this thesis proposes Patchable Conservation Cores which are flexible, energy-efficient co-processors that contain the ability to be patched, enabling them to remain useful across versions of their target application. To demonstrate the effectiveness of these conservation cores in targeting a system workload, this thesis utilizes them to design a mobile application processor targeting the Android software stack. The results show that these specialized cores can cover significant fraction of the system execution while staying within a modest area budget. To further increase the fraction of the system execution that these specialized cores cover, this thesis proposes QASICs, specialized co-processors capable of executing multiple general-purpose computations. QASIC design flow exploits the similar code patterns present within and across applications to reduce redundancy across specialized cores as well as improve their computational power