Computer architecture is always changing. Now, more than ever, we see deeper vertical integration with domain-specific software, faster emergence of paradigm-shifting computing devices and memory technologies, and unprecedented security and privacy vulnerabilities. These changes all present opportunities for innovation in architecture design and, along with them, uncertainties that we have to deal with carefully. An uncertainty-aware approach is essential when we design computer architectures and systems for the future, as variations in technology alone have been demonstrated to have the potential to eliminate the performance gain of an entire CMOS generation. To navigate the new dark waters exposed by changes in fabrication, design constraints, and programming models, a clear understanding and rigorous quantification of the uncertainties in architecture designs, as well as the risks that accompany them, is a critical first step.
To achieve this goal across a tremendous space of designs, from the smallest embedded system to the largest warehouse-scale computing infrastructure, and from the most well-characterized CMOS technology node to novel devices at the edge of our understanding, we need new systematic support across the design stack, from high-level analytical models for evaluating design decisions in the early stages to detailed cycle-accurate simulations for projecting actual system performance.
This thesis establishes a route to a new uncertainty- and risk-aware architecture design process. We first demonstrate how even a very high-level definition and understanding of uncertainty and risk can expose a new trade-off space between average-case performance and the amount of uncertainty and risk to which a design is exposed. We then generalize the framework and design a new modeling language that systematically supports such analysis through a combination of symbolic execution, graph transformation, and compiler optimizations, and we demonstrate the applicability and benefits of this modeling language as a foundation for building high-quality analyses with closed-form models. Next, we take the exploration down to the most common practice in architecture studies: cycle-accurate simulation. There, we propose a new way of quantifying uncertainty and risk while keeping the computational cost at bay, by adapting generalized polynomial chaos theory to build accurate surrogate models.
Finally, we peek into the future and envision what needs to be researched before quantifying uncertainties and risks becomes an essential and well-supported practice of architecture design in this new golden era.