Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Electronic Theses and Dissertations

Methods for Efficient Deep Reinforcement Learning

  • Author(s): Green, Samuel Brooks
  • Advisor(s): Koç, Çetin K; Eğecioğlu, Ömer; et al.

Reinforcement learning (RL) is a broad family of algorithms for training autonomous agents to collect rewards in sequential decision-making tasks. Shortly after deep neural networks (DNNs) matured, they were incorporated into RL algorithms as high-dimensional function approximators. Recently, "deep" RL algorithms have been applied to many tasks that were once approachable only by humans, e.g., expert-level play at the game of Go and dexterous control of a high-degree-of-freedom robotic hand. However, standard deep RL approaches are computationally, and often financially, expensive. This high cost limits RL's real-world application and slows research progress.

In this dissertation, we introduce methods for developing efficient DNN-based RL agents. Our approaches to increasing efficiency draw upon recent developments in the optimization of DNN inference. Specifically, we present quantization, parameter pruning, parameter sharing, and model distillation algorithms that reduce the computational cost of DNN-based policy execution. We also introduce a new algorithm for the automatic design of DNNs that attain high performance while meeting specific resource constraints such as latency and power. Intuition, backed by empirical results, suggests that a naive reduction in DNN model capacity should lead to a reduction in model performance. However, our results show that, by taking a principled approach, it is often possible to maintain high agent performance while simultaneously lowering the computational expense of decision-making.

Finally, a policy must be evaluated on hardware, and there is currently an explosion of non-von Neumann architectures for the acceleration of neural algorithms. We analyze one such device and propose rigorous methods for analyzing such devices in future applications.
