UC San Diego Electronic Theses and Dissertations
End-to-End Inference Optimization for Deep Learning-based Image Upsampling Networks

Abstract

Many computer vision problems require image upsampling, in which the number of pixels per unit area is increased by inferring values in a high-dimensional image space from low-dimensional representations. Recent research has shown that deep learning-based solutions achieve state-of-the-art performance on such tasks by training deep neural networks (DNNs) on large annotated datasets. Yet their adoption in real-time applications hinges on the deployment costs of the resulting models, since end-user devices impose significant compute and memory constraints on inference pipelines. To address this, many researchers and practitioners have proposed methods that reduce inference costs without sacrificing model quality; however, most of these works focus on DNNs designed for image downsampling. In this thesis, we study inference optimization techniques designed for deep learning-based image upsampling networks. While some inference optimizations apply to both upsampling and downsampling networks, we show that tailoring optimizations specifically to image upsampling workloads leads to more efficient and effective deployment.
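To make the workload concrete, the sketch below shows one common form of deep learning-based upsampling: an ESPCN-style sub-pixel convolution block, in which a convolution predicts r^2 output channels per low-resolution pixel and a pixel-shuffle rearranges them into a higher-resolution image. This is a minimal illustrative example in PyTorch, not a reproduction of the specific architectures studied in the thesis.

```python
import torch
import torch.nn as nn

class SubPixelUpsampler(nn.Module):
    """Minimal x2 upsampling block: conv predicts C * r^2 channels on the
    low-resolution grid; PixelShuffle rearranges (C*r^2, H, W) -> (C, r*H, r*W)."""
    def __init__(self, channels: int = 3, scale: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * scale ** 2,
                              kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.conv(x))

if __name__ == "__main__":
    x = torch.randn(1, 3, 64, 64)   # low-resolution input
    y = SubPixelUpsampler()(x)
    print(y.shape)                  # torch.Size([1, 3, 128, 128])
```

Note that all heavy convolution work happens on the low-resolution grid, which is precisely why upsampling networks have a different cost profile from downsampling ones.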

We maintain a holistic view of inference optimization, from training through deployment to execution, by integrating hardware-aware deep learning techniques, compute graph transformations, and computer architecture optimizations into an end-to-end pipeline. We begin by characterizing this pipeline and the differing requirements of image upsampling and downsampling workloads. We then introduce novel statistical approaches to hardware-aware deep learning based on quantization and pruning. Once models are trained, we introduce novel compute kernels and graph transformations that reduce the compute costs of common upsampling workloads by up to a factor of 3.3. Finally, we adapt our inference algorithms to a specialized hardware architecture that reduces resource utilization and improves dataflow on FPGA-based accelerators.
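For background on the compression stage of such a pipeline, the sketch below shows the standard magnitude-pruning and post-training dynamic-quantization baselines using stock PyTorch utilities. The thesis introduces novel statistical approaches beyond these; the toy model here is a hypothetical stand-in, not an architecture or method from the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for part of an inference pipeline (hypothetical).
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),
)

# Magnitude (L1) pruning: zero out 50% of the conv weights,
# then bake the pruning mask into the weight tensor.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")

# Post-training dynamic quantization: linear layers to int8.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 3, 8, 8)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Hardware-aware variants of these techniques fold the target platform's cost model into the choice of which weights to prune and which tensors to quantize, rather than applying a uniform budget as this baseline does.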

We evaluate our approaches on a wide range of computer vision benchmarks covering both stochastic and deterministic models, showing that they improve power efficiency, throughput, and resource utilization without degrading model quality. Our research highlights the importance of end-to-end inference optimization for deep learning-based image upsampling networks and provides an effective solution for reducing the deployment costs of DNNs designed for real-time computer vision applications on resource-constrained platforms.
