eScholarship
Open Access Publications from the University of California


UCLA Electronic Theses and Dissertations

Analog In-Memory Multiply-and-Accumulate Engine Fabricated in 22nm FDSOI Technology

Abstract

This dissertation presents the first on-chip demonstration of a Multiply-and-Accumulate (MAC) function implemented with the Charge-Trap Transistor (CTT) in 22nm CMOS on SOI.

Recent developments in machine learning and AI focus on digital von Neumann architectures that accelerate computation with massively parallel processing platforms such as Graphics Processing Units (GPUs), Tensor Processing Units (TPUs), and Application-Specific Integrated Circuits (ASICs). While these platforms have dramatically improved system performance, they are inherently limited by the von Neumann memory bottleneck. In response, digital and analog in-memory and near-memory computing (iMC) techniques have been proposed to perform computation directly where the data are stored, eliminating unnecessary memory accesses and minimizing memory-access energy.

A hybrid approach to designing high-performance AI computation platforms combines learning on the cloud with energy-efficient inference at the edge. In this dissertation, we explore the latter through the use of the Charge-Trap Transistor (CTT), a commercial high-k logic nFET device on SOI, as an ideal candidate nonvolatile memory device for analog-based in-memory computing.
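To make the in-memory MAC idea concrete, the sketch below gives a behavioral model of a MAC performed inside a memory array: weights are stored as device conductances, inputs are applied as row voltages, and each column's output current is a dot product by Ohm's law and Kirchhoff's current law. The array size, conductance range, and differential weight mapping are illustrative assumptions, not the CTT circuit described in this dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_rows, n_cols = 64, 16               # hypothetical crossbar dimensions
g_max = 1e-6                          # assumed maximum device conductance (S)

weights = rng.uniform(-1.0, 1.0, (n_rows, n_cols))

# Map signed weights onto a differential pair of conductances (G+, G-),
# a common scheme for representing negative weights with unipolar devices.
g_pos = np.clip(weights, 0, None) * g_max
g_neg = np.clip(-weights, 0, None) * g_max

v_in = rng.uniform(0.0, 0.2, n_rows)  # input activations as row voltages (V)

# Column output currents: I_j = sum_i V_i * (G+_ij - G-_ij),
# i.e., the whole matrix-vector MAC happens where the weights are stored.
i_out = v_in @ (g_pos - g_neg)

# The same MAC computed digitally, for comparison.
digital = (v_in * g_max) @ weights
print(np.allclose(i_out, digital))    # True: both compute sum_i V_i * w_ij * g_max
```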

Past results show that the CTT can be programmed accurately, with excellent resolution, low device programming variance, and good retention characteristics. We propose a NeuroCTT inference architecture and present experimental results based on two test chips taped out in GlobalFoundries 22FDX technology. A first-time demonstration of an on-chip analog MAC Engine using the CTT in a commercial CMOS technology is provided, and accurate on-chip weight programming with sufficient retention is demonstrated in hardware. In addition, we introduce a CTT-Hardware-based Inference Realistic Circuit Universal Simulator (CIRCUS) Platform for studying the effects of circuit-induced errors and device non-idealities on system performance and accuracy.
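A minimal, self-contained sketch of one such study follows: Gaussian programming noise is injected into the weights of a small trained classifier, and the resulting accuracy is averaged over many programming trials. The model, dataset, and multiplicative noise model are hypothetical stand-ins; the CIRCUS Platform captures circuit-induced errors and CTT non-idealities in far greater detail.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-class data (a hypothetical stand-in for a real workload).
n = 2000
x = np.vstack([rng.normal(-1, 1, (n // 2, 8)),
               rng.normal(+1, 1, (n // 2, 8))])
y = np.repeat([0, 1], n // 2)

# Train a single-layer logistic "network" with plain gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(300):
    p = 1 / (1 + np.exp(-(x @ w + b)))
    w -= 0.1 * x.T @ (p - y) / n
    b -= 0.1 * np.mean(p - y)

def accuracy(w_prog, b_prog):
    """Classification accuracy with the weights as actually programmed."""
    return np.mean(((x @ w_prog + b_prog) > 0) == y)

# Multiplicative Gaussian noise on each weight models device-to-device
# programming variance; average over repeated programming trials.
for sigma in [0.0, 0.05, 0.1, 0.2, 0.5]:
    accs = [accuracy(w * (1 + rng.normal(0, sigma, w.shape)), b)
            for _ in range(100)]
    print(f"sigma={sigma:4.2f}  mean accuracy={np.mean(accs):.3f}")
```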

We conclude by evaluating the resiliency of general-purpose neural network applications, studying the effect of weight programming variance on analog-based in-memory computing and of bit errors on digital-based architectures. As a baseline for digital, energy-efficient ASICs, an IBM TrueNorth Neurosynaptic System is exposed to 4 MeV protons that corrupt the on-chip model file of a trained 12-layer Convolutional Neural Network (CNN); the TrueNorth continues to perform classification with negligible degradation in accuracy. For larger-scale networks and memory-intensive applications, reliability studies are also performed on 3D-stacked (3DS) DRAM to assess the effect of radiation on these more advanced memory architectures.
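The sketch below mirrors the digital-baseline experiment in software: random bit flips, at a swept bit-error rate, are injected into a quantized weight memory to emulate radiation-induced upsets. The int8 format and error rates are illustrative assumptions and do not reflect TrueNorth's actual model-file encoding.

```python
import numpy as np

rng = np.random.default_rng(2)

weights = rng.uniform(-1, 1, 4096).astype(np.float32)

# Quantize to int8, a typical storage format for inference weights.
scale = np.max(np.abs(weights)) / 127
q = np.round(weights / scale).astype(np.int8)

def inject_bit_errors(q, ber):
    """Flip each stored bit independently with probability `ber`."""
    bits = np.unpackbits(q.view(np.uint8))
    flips = rng.random(bits.shape) < ber
    return np.packbits(bits ^ flips).view(np.int8)

for ber in [1e-5, 1e-4, 1e-3, 1e-2]:
    q_err = inject_bit_errors(q, ber)
    # RMS weight perturbation, back in the original weight units.
    rms = np.sqrt(np.mean((q_err.astype(float) - q) ** 2)) * scale
    print(f"BER={ber:.0e}  RMS weight error={rms:.4f}")
```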
