Elastic Processing and Hardware Architectures for Machine Learning

Abstract

Machine Learning (ML) techniques, especially Deep Neural Networks (DNNs), have been driving innovation in many application domains. These breakthroughs are powered by improvements in processor technology driven by Moore's Law. However, the demand for computational resources is insatiable when applying ML to large-scale real-world problems. Energy efficiency is another major concern of large-scale ML: the enormous energy consumption of ML models not only increases costs in data centers and shortens the battery life of mobile devices but also has a severe environmental impact. Entering the post-Moore's Law era, sustaining performance and energy efficiency as ML continues to scale remains challenging.

This dissertation addresses the performance and energy-efficiency challenges of ML. The thesis can be encapsulated in a few questions. Do we need all the computations and data movements involved in conventional ML processing? Does redundancy exist at the hardware level? How can we better approach large-scale ML problems with new computing paradigms? This dissertation explores the elasticity in ML processing and hardware architectures: from the algorithm perspective, redundancy-aware processing methods are proposed for DNN training and inference, as well as for large-scale classification problems and long-range Transformers; from the architecture perspective, balanced, specialized, and flexible designs are presented to improve efficiency.
