Risk-Aware Algorithms for Learning-Based Control With Applications to Energy and Mechatronic Systems

Abstract

This dissertation leverages and develops the powerful out-of-sample safety certificates of Wasserstein ambiguity sets to create a suite of data-driven control algorithms that help solve safety-critical industrial problems. This work is motivated by the ongoing relevance of robustness and safety when applying data-driven decision making in the real world. For example, lithium-ion batteries are driving transitions to renewable energy sources. Optimizing their performance and longevity is of the utmost importance, but highly difficult due to their complex, nonlinear, and safety-critical electrochemical dynamics. While data-driven control can dramatically improve the performance of systems like lithium-ion batteries, certifying system safety remains an open challenge. This dissertation explores certifying learning-based controllers via distributionally robust optimization (DRO). We focus on Wasserstein ambiguity sets, DRO constructions that hedge against worst-case realizations of random variables under relatively permissive assumptions. This makes them ideal for learning-based control, where data can be highly limited and the controller is likely to encounter new experience unaccounted for in its training data.
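For concreteness, the canonical Wasserstein DRO problem (in the form popularized by Mohajerin Esfahani and Kuhn; the dissertation's exact notation may differ) optimizes against the worst distribution within a Wasserstein ball centered on the empirical distribution of the data:

```latex
% Wasserstein DRO: hedge over all distributions within radius \varepsilon
% (in the Wasserstein metric W) of the empirical distribution \hat{P}_N
% built from N samples \hat{\xi}_1, \dots, \hat{\xi}_N.
\min_{x \in \mathcal{X}} \;
\sup_{Q \,:\, W(Q,\, \hat{P}_N) \le \varepsilon}
  \mathbb{E}_{\xi \sim Q}\!\big[ \ell(x, \xi) \big],
\qquad
\hat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\hat{\xi}_i}.
```

Because a suitably chosen radius makes the ball contain the true data-generating distribution with high probability, the worst-case value upper-bounds the true expected cost out of sample, which is the source of the safety certificates referenced above.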

In Chapter 2, we begin by presenting simple mathematical arguments that extend an existing reformulation of Wasserstein DRO to cases where the dependence on the decision variables x and the random variables R can be nonconvex, as long as x and R are separable. By suitably modeling the stochasticity in the model uncertainty, we augment nonconvex optimal control problems with Wasserstein ambiguity sets to obtain idealized probabilistic safety certificates.
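One way to see why separability matters (a sketch via the standard dual reformulation, not the dissertation's full argument): if the cost splits as l(x, R) = f(x) + g(R), the ambiguity touches only the g term, so nonconvexity in x passes through the DRO layer untouched:

```latex
% Separable cost: the worst-case expectation decouples from x, and the
% remaining supremum admits the usual finite-dimensional dual.
\begin{align*}
\sup_{W(Q,\, \hat{P}_N) \le \varepsilon} \mathbb{E}_{R \sim Q}\!\big[ f(x) + g(R) \big]
  &= f(x) + \sup_{W(Q,\, \hat{P}_N) \le \varepsilon} \mathbb{E}_{R \sim Q}\!\big[ g(R) \big] \\
  &= f(x) + \inf_{\lambda \ge 0} \Big\{ \lambda \varepsilon
     + \tfrac{1}{N} \textstyle\sum_{i=1}^{N} \sup_{R}
       \big( g(R) - \lambda\, d(R, \hat{R}_i) \big) \Big\}.
\end{align*}
```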

The remaining chapters extend this theoretical result across the range of model-based and model-free reinforcement learning. Chapter 3 explores offline model-based reinforcement learning within a latent state space, with application to real-time fast charging of lithium-ion batteries using electrochemical information. By leveraging the results of Chapter 2, we can hedge against model and data errors to probabilistically guarantee safe distributional data-driven control.

Chapter 4 presents an end-to-end framework for safe learning-based control using nonlinear stochastic MPC. We focus on scenarios where the controller is applied directly to a system with which it has highly limited experience, treating safety during tabula-rasa learning-based control as a challenging validation case. We validate our findings with case studies of extreme lithium-ion battery fast charging and autonomous vehicle obstacle avoidance using a basic perception system.
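A sketch of the general shape such a distributionally robust stochastic MPC problem takes (the notation and the CVaR surrogate here are illustrative assumptions, not the dissertation's exact formulation):

```latex
% Finite-horizon stochastic OCP with state constraints g(x_k) <= 0 made
% distributionally robust over a Wasserstein ball around the empirical
% disturbance / model-error distribution \hat{P}_N.
\min_{u_0, \dots, u_{H-1}} \;
  \mathbb{E}\Big[ \textstyle\sum_{k=0}^{H-1} \ell(x_k, u_k) \Big]
\quad \text{s.t.} \quad
  x_{k+1} = f(x_k, u_k) + w_k, \qquad
  \sup_{W(Q,\, \hat{P}_N) \le \varepsilon}
    \mathrm{CVaR}^{Q}_{1-\delta}\!\big[ g(x_k) \big] \le 0.
```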

Finally, in Chapter 5, we apply the same DRO architecture to value-based RL. We describe a structure for deep Q-learning within the framework of constrained Markov decision processes (CMDPs). By characterizing the uncertainty of constraint cost functions based on their temporal-difference errors, we augment the relevant constraints with tightening offset variables based on the DRO theory of Chapter 2.
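To make the mechanism concrete, here is a minimal tabular sketch in Python (hypothetical names throughout: wasserstein_mean_bound, SafeQAgent, and the fixed buffer size and radius are our illustrative assumptions, not the dissertation's deep Q-learning architecture). A constraint-cost critic is learned alongside the reward critic, its recent temporal-difference errors feed a Wasserstein-style worst-case bound, and that bound tightens the CMDP budget during action selection:

```python
import numpy as np

def wasserstein_mean_bound(samples, eps):
    # Worst-case mean over the 1-Wasserstein ball of radius eps around the
    # empirical distribution: for a scalar identity loss, the dual
    # reformulation collapses to (empirical mean + eps).
    return np.mean(samples) + eps

class SafeQAgent:
    def __init__(self, n_states, n_actions, budget, eps, alpha=0.1, gamma=0.99):
        self.q = np.zeros((n_states, n_actions))   # reward critic
        self.qc = np.zeros((n_states, n_actions))  # constraint-cost critic
        self.td_errors = []                        # constraint-critic TD errors
        self.budget, self.eps = budget, eps
        self.alpha, self.gamma = alpha, gamma

    def act(self, s):
        # Tighten the CMDP budget by a DRO offset built from the magnitude
        # of the constraint critic's recent TD errors.
        offset = (wasserstein_mean_bound(np.abs(self.td_errors[-100:]), self.eps)
                  if self.td_errors else 0.0)
        safe = self.qc[s] <= self.budget - offset  # tightened constraint
        if not safe.any():                         # fall back to least-risky action
            return int(np.argmin(self.qc[s]))
        return int(np.argmax(np.where(safe, self.q[s], -np.inf)))

    def update(self, s, a, r, c, s2):
        # Standard Q-learning update for the reward critic.
        td_r = r + self.gamma * self.q[s2].max() - self.q[s, a]
        self.q[s, a] += self.alpha * td_r
        # The constraint critic tracks expected cumulative constraint cost;
        # its TD errors drive the uncertainty estimate used in act().
        td_c = c + self.gamma * self.qc[s2].min() - self.qc[s, a]
        self.qc[s, a] += self.alpha * td_c
        self.td_errors.append(td_c)
```

Using the minimum over next-state constraint values as the bootstrap target is a deliberately optimistic design choice for this sketch; any consistent constraint-critic target would produce TD errors usable by the same offset mechanism.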

In our concluding remarks, we discuss the broader relevance of our findings and map directions for future work.
