Design-for-Reliability on Thermal Management and ESD Protection of Integrated Circuits
- Author(s): Li, Cheng
- Advisor(s): Wang, Albert
- et al.
Transistor self-heating is a grand challenge in modern integrated circuit (IC) chip design, and chip-scale thermal management is required to ensure IC reliability. As IC technologies rapidly advance into few-nm nodes while the performance, complexity and size of chips continue to increase, heat dissipation becomes a technical bottleneck for advanced chips. For example, data servers and mobile electronics increasingly rely on high-clock-frequency, multi-core CPUs and GPUs, which are extremely power-hungry and generate substantial heat. However, portable devices such as smartphones have little room to accommodate traditional heat dissipation means. Further, advanced IC technologies, such as silicon-on-insulator (SOI) and FinFET, are inherently poor in thermal conduction, making transistor self-heating even more severe at the chip level. In addition, advanced 3D packaging makes it much harder to dissipate heat from IC dies. Together, high-performance, complex chips made in advanced IC technologies are essentially big heat generators, and their poor thermal conduction makes them increasingly suffer from heat-induced performance degradation and reliability problems. Existing on-chip thermal sensors used to monitor temperature typically use thermocouple, thermistor and PN diode devices, which are placed on a chip in a coplanar manner, i.e., laterally side-by-side with the circuit blocks to be monitored. The existing thermal-sensing techniques have several major technical disadvantages that make accurate full-chip thermal management impractical. First, a thermal sensor is laterally far away from the transistor to be monitored, making it impossible to accurately detect real hot spots, which are often at the corners of a transistor's conduction channel. Second, these thermal sensors are bulky, so it is impossible to construct a very large thermal sensor network on a chip to realize full-chip thermal mapping.
Consequently, existing on-chip thermal sensing may only achieve circuit-block-level thermal-sensing resolution, which makes effective thermal management impractical. To fundamentally address the transistor self-heating problem and enable effective full-chip thermal management, novel thermal sensing techniques are needed to accurately detect transient hot spots at the single-transistor level, which requires transistor-level thermal sensing resolution. Further, accurate run-time full-chip thermal management requires a large on-chip thermal sensor mesh network with transistor-level thermal sensing resolution, which in turn makes full-chip thermal management practical. In this dissertation, I proposed and prototyped a novel under-FET thermal sensor device structure, which utilizes a vertical PN junction. This PN junction thermal sensor is made inside a through-silicon via (TSV) type vertical hole placed directly under a MOSFET, and is hence able to detect the transient hot spot in the MOSFET channel. Since the TSV-like under-FET thermal sensor is placed right underneath the heating source, i.e., the MOSFET channel, and does not take any extra “lateral” Si die area, it allows constructing a large thermal sensor mesh network on a chip to realize transistor-level thermal sensing resolution. Using the new under-FET thermal sensing technique, a machine-learning (ML) algorithm is proposed to enable run-time full-chip thermal management, which can fundamentally resolve the transistor self-heating-induced chip performance degradation and thermal reliability problems.
Electrostatic discharge (ESD) protection is another major IC reliability challenge, particularly for complex chips implemented in advanced IC technologies. For the past six decades, substantial R&D efforts have been devoted to developing various on-chip ESD protection solutions.
Yet, as IC technologies continue to shrink while IC performance and complexity continuously increase, on-chip ESD protection for advanced ICs becomes extremely challenging. In general, any ESD protection structure inevitably induces ESD-design overhead, including parasitic capacitance, noise and leakage, as well as ESD device size and layout problems. The semiconductor industry urgently needs novel on-chip ESD protection solutions to overcome the ESD-design overhead problem. In this dissertation, I proposed and demonstrated three novel on-chip ESD protection structures aiming to address the ESD-design overhead problem. The first new ESD protection structure is a vertical TSV-like diode, a truly vertical PN-diode ESD protection device residing inside a TSV-like vertical hole under a bonding pad. Unlike traditional PN diode ESD protection devices, which always require lateral diffusion elements for electrical interconnections, the new TSV-like ESD diode can conduct ESD pulses and ESD-induced heat vertically, significantly improving ESD protection while minimizing the Si die area consumed by large ESD protection structures. The second new ESD protection structure is a cell-based Sudoku-type diode-triggered silicon-controlled rectifier (DTSCR) low-triggering ESD protection sub-circuit structure. The third novel ESD protection structure is a single-crystalline graphene-based nano-electromechanical system (gNEMS) ESD protection switch device. This gNEMS ESD protection structure is novel in that it is a mechanical switch made in the back-end-of-line (BEOL) of CMOS, which can be turned on/off extremely fast to provide both human body model (HBM) and charged device model (CDM) ESD protection. Both theoretical and experimental studies were conducted to validate the three novel ESD protection structures.
Chapter 1 introduces the transistor self-heating problem, which is the root cause of the thermal reliability problem of modern IC chips.
Chapter 2 discusses the new TSV-type under-FET thermal sensor device and the ML-based full-chip thermal management method. Chapter 3 gives an introduction to ESD protection fundamentals and graphene material background. Chapter 4 presents the novel vertical TSV-like ESD protection diode structure and the TCAD mixed-mode ESD protection design calibration flow. Chapter 5 discusses the new Sudoku-type ESD protection array and its theoretical design analysis. Chapter 6 discusses the novel single-crystalline graphene-based gNEMS ESD switch structure. Chapter 7 summarizes my Ph.D. research achievements.