eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Cross-Layer Approaches for Monitoring, Margining and Mitigation of Circuit Variability

  • Author(s): Lai, Liangzhen
  • Advisor(s): Gupta, Puneet
Abstract

With technology scaling, circuit performance has become more sensitive to various sources of variability, including manufacturing variations, ambient fluctuations, and circuit wear-out. These increased variations have created new challenges for conventional hardware guardbanding, as the additional design margin diminishes the benefits of technology scaling. This dissertation aims to reduce total system design margin through cross-layer approaches to monitoring, margining, and mitigation of circuit variability.

Hardware and software adaptation can reduce design margin by exploiting the hardware variability that hardware monitors expose, so we start by proposing two different types of performance monitors that achieve better monitoring accuracy with smaller monitoring overhead. We also demonstrate the use of these performance monitors in system adaptation with our end-to-end implementation of software testbeds.

We also study the dynamic variations and reliability margining problem in the presence of monitor-and-actuate adaptation and emerging system contexts. In a system with monitor-and-actuate adaptation, dynamic variations require extra margin to cover the monitoring and actuation latencies. We analyze the margining problem for different choices of monitor and actuator types. We also propose system reliability margining strategies for circuits in the “dark silicon” era, where the low-level design margin should account for high-level power and thermal constraints.

Finally, we propose a clock gating methodology to mitigate aging-induced clock skew, which is difficult to monitor and resolve through adaptation. For certain phenomena and variation sources, for example soft error rates at different locations and altitudes, we also propose system/cloud-based monitors. An emulation platform is built to study the impact of dynamic power management schemes on system reliability.
