eScholarship
Open Access Publications from the University of California

UC Santa Cruz Electronic Theses and Dissertations

Measure-valued Proximal Recursions for Learning and Control

Abstract

In this dissertation, we investigate convex optimization problems over the space of probability measures, highlighting applications in stochastic control, stochastic modeling, and stochastic learning. The theory and algorithms we develop apply to problems such as sampling from unnormalized priors, policy iteration in reinforcement learning, simulating mean-field dynamics, Wasserstein GANs, optimal distribution steering, also known as the Schr\"{o}dinger bridge, and its zero-noise limit, optimal mass transport. We propose proximal recursions for solving these measure-valued optimization problems, offering novel algorithms that extend the concept of gradient steps to the space of probability measures.
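
As a point of reference for such proximal recursions, a minimal JKO-type proximal step over the space of probability measures can be written as follows; the functional $F$ and step size $h$ below are generic placeholders, not the dissertation's specific constructions:
\[
\mu_{k+1} = \underset{\mu \in \mathcal{P}_2(\mathbb{R}^d)}{\arg\min} \; \frac{1}{2}\, W_2^2(\mu, \mu_k) + h\, F(\mu), \qquad k = 0, 1, 2, \ldots,
\]
where $W_2$ denotes the 2-Wasserstein distance. This mirrors the Euclidean proximal step $x_{k+1} = \arg\min_x \frac{1}{2}\|x - x_k\|_2^2 + h\, f(x)$, with the squared Euclidean distance replaced by the squared Wasserstein distance.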

We propose new algorithms for solving generalized Schr\"{o}dinger bridge problems where the drift and/or diffusion coefficients may be nonlinear in the state and either affine or non-affine in the control. We illustrate our results on both model-based and model-free numerical case studies.
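
For orientation, a classical (non-generalized) instance of the dynamic Schr\"{o}dinger bridge problem, stated here with generic, assumed notation rather than the dissertation's exact setup, is
\[
\min_{u} \; \mathbb{E}\left[ \int_{0}^{T} \frac{1}{2} \| u_t \|_2^2 \, \mathrm{d}t \right]
\quad \text{subject to} \quad
\mathrm{d}x_t = u_t \, \mathrm{d}t + \sqrt{2\varepsilon} \, \mathrm{d}w_t, \quad x_0 \sim \rho_0, \; x_T \sim \rho_T,
\]
where the controlled diffusion must steer the prescribed initial distribution $\rho_0$ to the prescribed terminal distribution $\rho_T$, and letting the noise intensity $\varepsilon \to 0$ recovers the optimal mass transport problem. The generalized problems treated in the dissertation relax these structural assumptions on the drift, diffusion, and control.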

Furthermore, we demonstrate that our measure-valued proximal recursions are also useful in stochastic modeling, specifically offering insights for a controlled mean field model. We illustrate these ideas by deriving a controlled mean field dynamics model for chiplet dynamics in microassembly applications. This model extends finite population dynamics to a continuum, yielding a nonlocal, nonlinear PDE that encapsulates stochastic forces and nonlinear interactions between chiplets and electrodes. The deduced mean field evolution is found to be a Wasserstein gradient flow of a Lyapunov-like energy functional.
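
As a generic illustration of this gradient flow structure, with a placeholder energy functional $E$ rather than the chiplet-specific functional derived in the dissertation, a Wasserstein gradient flow takes the continuity-equation form
\[
\frac{\partial \rho}{\partial t} = \nabla \cdot \left( \rho \, \nabla \frac{\delta E}{\delta \rho} \right),
\]
where $\delta E / \delta \rho$ is the functional derivative of $E$ with respect to the density $\rho$. The energy $E$ decays along such a flow, which is the sense in which it plays the role of a Lyapunov-like functional.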

We then turn to applying our measure-valued proximal recursions in stochastic learning. Here we propose two algorithms: one centralized and one distributed. The proposed centralized algorithm solves the mean field learning dynamics of a neural network in the over-parameterized limit. The proposed distributed algorithm generalizes the well-known Euclidean alternating direction method of multipliers (ADMM) to the space of probability measures. Numerical examples are given to illustrate the performance of the proposed algorithms against the state of the art.
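
For reference, the Euclidean ADMM iterations being generalized, written in scaled dual form for a problem $\min_{x,z} f(x) + g(z)$ subject to $x = z$ with penalty parameter $\beta > 0$, are
\[
\begin{aligned}
x_{k+1} &= \underset{x}{\arg\min} \; f(x) + \frac{\beta}{2} \| x - z_k + u_k \|_2^2, \\
z_{k+1} &= \underset{z}{\arg\min} \; g(z) + \frac{\beta}{2} \| x_{k+1} - z + u_k \|_2^2, \\
u_{k+1} &= u_k + x_{k+1} - z_{k+1}.
\end{aligned}
\]
A measure-valued analogue, roughly speaking, replaces these quadratic penalties with squared Wasserstein distances and optimizes over probability measures; the exact form used in the dissertation may differ from this sketch.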
