eScholarship
Open Access Publications from the University of California

Power Efficient Image Classification and Generation using Fixed Point Gibbs Sampling

  • Author(s): Kan, Chih-yin
  • Advisor(s): Kreutz-Delgado, Ken
  • et al.
Abstract

Machine learning-based algorithms are essential tools for extracting and analyzing information in applications such as classification, pattern recognition, denoising, and reconstruction. It has become commonplace for smart, connected, and Internet-of-Things (IoT) devices to run machine learning algorithms in the cloud. However, as the number of connected devices increases, processing information in the cloud can encounter privacy, latency, and reliability problems. As a result, hardware for machine learning on the "edge" of the cloud is gaining popularity, since a real-time processor embedded locally in devices and sensors can ameliorate these issues and decrease the power requirements associated with continuous, sustained connectivity to the cloud. Locally embedded hardware algorithms must maintain accuracy close to that of cloud-based algorithms (which serve as ideal benchmarks) while addressing limitations due to hardware cost, size, and power consumption, which makes designing hardware for machine learning challenging. In this thesis, we study the power-at-performance efficiency of Restricted Boltzmann Machine (RBM)-based machine learning algorithms using Gibbs samplers implemented with fixed-point arithmetic and functional approximations of the sigmoid activation function. We discuss how hardware designers can determine the trade-offs between different styles of activation function approximation and different levels of bitwidth quantization, and we develop a design methodology that we implement and verify in a Verilog environment. Metrics are developed for comparing the performance of both discriminative and generative RBMs, and various choices of bitwidth and activation approximation style are explored to obtain a good performance-power trade-off.
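
To make the two design knobs named in the abstract concrete (bitwidth quantization and sigmoid approximation), the following is a minimal software sketch of one Gibbs sampling step for a binary RBM. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the default of 12 total bits with 8 fractional bits, and the piecewise-linear "hard sigmoid" are all hypothetical choices made here, and the thesis may use different approximations and bitwidths.

```python
import random


def quantize(x, frac_bits=8, total_bits=12):
    """Round x onto a signed fixed-point grid and saturate on overflow.

    The bitwidths are illustrative assumptions, not values from the thesis.
    Python's round() rounds halves to the nearest even integer.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale


def sigmoid_pwl(x):
    """Piecewise-linear sigmoid approximation: clamp(0.25 * x + 0.5, 0, 1).

    One common hardware-friendly choice (the 0.25 factor is a 2-bit shift);
    assumed here for illustration only.
    """
    return max(0.0, min(1.0, 0.25 * x + 0.5))


def sample_layer(inputs, weights, biases, frac_bits, total_bits, rng):
    """Sample binary units of one layer given the binary state of the other.

    weights[i][j] connects input unit i to output unit j.
    """
    out = []
    for j, b in enumerate(biases):
        act = b + sum(weights[i][j] * s for i, s in enumerate(inputs))
        # Quantize the pre-activation, approximate the sigmoid, then
        # quantize the resulting firing probability.
        p = quantize(sigmoid_pwl(quantize(act, frac_bits, total_bits)),
                     frac_bits, total_bits)
        out.append(1 if rng.random() < p else 0)
    return out


def gibbs_step(v, W, b_h, b_v, frac_bits=8, total_bits=12, rng=random):
    """One visible -> hidden -> visible Gibbs step.

    W has shape (n_visible, n_hidden), stored as a list of rows.
    """
    h = sample_layer(v, W, b_h, frac_bits, total_bits, rng)
    W_T = [list(col) for col in zip(*W)]  # transpose for the downward pass
    v_new = sample_layer(h, W_T, b_v, frac_bits, total_bits, rng)
    return v_new, h
```

Sweeping `frac_bits` and `total_bits`, or swapping `sigmoid_pwl` for another approximation, is the kind of exploration the abstract describes; a software model like this only reflects numerical behavior and says nothing about power, which requires a hardware implementation to measure.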
