Skip to main content
eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations bannerUC Berkeley

Implementing Efficient, Portable Computations for Machine Learning

Abstract

Computers are powerful tools which perform fast, accurate calculations over huge sets of data. However, many layers of abstraction are required to use computers for any given task. Recent advances in machine learning employ compute-intensive operations embedded in complex overall flows. Further, deployment of these systems must balance many concerns: accuracy, speed, energy, portability, and cost. Currently, for each target, a good implementation of the needed software layers requires many programmer-years of effort.

To address this, we explore new tools and methods to amplify programmer effort for machine learning applications. In particular, we focus on portability and speed for machine learning operations, algorithms, and flows. Additionally, we wish to maintain accuracy and carefully control the complexity of the overall software system.

First, we motivate our approach with a case study in developing libHOG, which provides high-speed primitives for calculating image gradient histograms, where we achieve a 3.6X speedup over the state of the art. Next, in DenseNet, we enable previously prohibitively slow multiscale sliding window object detection using dense convolutional neural network features. Finally, we propose our Boda framework for implementing artificial neural network computations, based on metaprogramming, specialization, and autotuning. In Boda, we explore in depth the development of efficient convolution operations across various types of hardware. With only a few months of effort, we achieve speed within 2X of the highly-tuned vendor library on NVIDIA Graphics Processing Units (GPUs). Further, in only a few weeks, we achieve up to 30\% efficiency on Qualcomm mobile GPUs, where no vendor library exists.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View