A Novel Systolic Architecture for Efficient Acceleration of Deconvolutional Neural Networks at the Edge
- Daly, Jake M
- Advisor(s): Kreutz-Delgado, Kenneth
Abstract
A new era of processing has dawned: the demands for low-latency, low-power processing at the edge have ushered in unprecedented opportunities for computer architects and embedded designers. In pursuit of new performance standards, chip designers in industry and academia have begun the march toward domain-specific processors, a paradigm whose core philosophy and methods are in many ways contrary to the mantras that dominate the general-purpose processors found in today's datacenters and technology hubs. The increasing complexity of the neural networks and deep learning algorithms deployed at these edge locations has made this pursuit anything but trivial. Some of the most powerful models being deployed, known as deep generative models, are effectively capable of generating new data by capturing the full joint distribution over some input space. These models frequently use upsampling layers to map lower-dimensional latent spaces to higher-dimensional ones before making inferences about the world. In this work, we perform a deep analysis of one of these upsampling techniques, known as deconvolution (or, equivalently, transposed convolution), and propose a novel computer architecture for low-latency acceleration in edge applications. Our work is the first to fuse systolic processing with an algorithmic transformation known in this area as the TDC method (Chang et al., 2018). We illustrate how and why this pairing is so powerful for inference acceleration and provide preliminary performance numbers benchmarked against a pre-existing Wasserstein Generative Adversarial Network (GAN).
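To make the core operation concrete, the following NumPy sketch (illustrative only, not drawn from the thesis) computes a 1-D, single-channel transposed convolution two ways: directly, by scattering scaled copies of the kernel at stride-spaced output positions, and via the zero-insertion view, in which the input is dilated with zeros and passed through an ordinary full convolution. All function and variable names are our own.

import numpy as np

def transposed_conv1d_scatter(x, w, stride):
    """Direct 1-D transposed convolution: scatter each input element,
    scaled by the kernel, at stride-spaced output positions."""
    n, k = len(x), len(w)
    y = np.zeros((n - 1) * stride + k)
    for i in range(n):
        y[i * stride : i * stride + k] += x[i] * w
    return y

def transposed_conv1d_zero_insert(x, w, stride):
    """Equivalent formulation: insert (stride - 1) zeros between input
    elements, then run an ordinary full convolution on the dilated input."""
    n = len(x)
    dilated = np.zeros((n - 1) * stride + 1)
    dilated[::stride] = x
    return np.convolve(dilated, w, mode="full")

x = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 0.5, 0.25])
assert np.allclose(
    transposed_conv1d_scatter(x, w, 2),
    transposed_conv1d_zero_insert(x, w, 2),
)

The assertion confirms the two formulations agree. The zero-insertion form is the naive one: most of its multiply-accumulates are against inserted zeros, and it is this redundancy that transformations such as the TDC method, as described in the deconvolution-acceleration literature, are designed to eliminate before the workload is mapped onto hardware such as a systolic array.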