In this dissertation, we present our work on automating the discovery of governing equations for stochastic dynamical systems from noisy, vector-valued time series. By discovery, we mean learning both a drift vector field and a diffusion matrix for an It\^{o} stochastic differential equation (SDE) in $\mathbb{R}^d$. In particular, we develop, test, and compare numerical methods for computing likelihoods of SDE models. We focus on likelihood computation because, in most cases, the likelihood is intractable and has no closed-form expression. It is therefore the bottleneck for both frequentist and Bayesian inference of stochastic systems.
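To fix ideas, the model class can be sketched as follows; the symbols $f$, $g$, $W_t$, and $\theta$ are notation assumed here for illustration only.
\[
dX_t = f(X_t)\,dt + g(X_t)\,dW_t, \qquad X_t \in \mathbb{R}^d,
\]
where $f$ is the drift vector field, $g$ is the diffusion matrix, $W_t$ is a vector-valued Brownian motion, and $\theta$ collects whatever parameters determine $f$ and $g$. If, as a simplifying assumption, the states $x_0, \ldots, x_N$ were observed without measurement noise at times $t_0 < \cdots < t_N$, the Markov property would give the likelihood
\[
L(\theta) = p(x_0 \mid \theta) \prod_{i=0}^{N-1} p(x_{i+1} \mid x_i, \theta),
\]
in which the transition densities $p(x_{i+1} \mid x_i, \theta)$ generally have no closed form.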
In the first part of the dissertation, we develop an iterative algorithm that combines expectation maximization (EM) with data augmentation via diffusion bridge sampling. We focus on nonparametric models for high-dimensional SDEs in the low-data, high-noise regime. To our knowledge, this is the most general EM approach to learning an SDE with a multidimensional drift vector field and diffusion matrix. Data augmentation has a two-fold advantage: the expectation of the log-likelihood in the E step reduces to a summation, and the optimization in the M step reduces to a batch-wise least-squares problem.
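Schematically, the two reductions can be illustrated under assumptions made here only for the sake of the sketch: observation noise is ignored, the drift is expanded in scalar basis functions $\phi_j$ with coefficient vectors $\beta_j \in \mathbb{R}^d$, the SDE is discretized by the Euler--Maruyama scheme with step $h$, and $z^{(1)}, \ldots, z^{(R)}$ denote $R$ sampled paths on the fine time grid, each with $M$ steps and each passing through the observed data. The E step then replaces the expected log-likelihood by a sample average,
\[
Q(\theta, \theta^{(k)}) \approx \frac{1}{R} \sum_{r=1}^{R} \log p(z^{(r)} \mid \theta),
\]
where the paths $z^{(r)}$ are drawn using the current iterate $\theta^{(k)}$. If the diffusion matrix is held fixed and taken, for simplicity, to be constant and diagonal, the Gaussian form of the Euler--Maruyama increments makes maximizing this average over the drift coefficients equivalent to the least-squares problem
\[
\min_{\beta} \sum_{r=1}^{R} \sum_{n=0}^{M-1} \left\| \frac{z^{(r)}_{n+1} - z^{(r)}_{n}}{h} - \sum_{j} \beta_j\, \phi_j\bigl(z^{(r)}_{n}\bigr) \right\|^2 .
\]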
In the second part of the dissertation, we consider the problem of Bayesian filtering and inference for lower-dimensional parametric SDE models in the low-data, high-noise regime. Our goal is to infer both the model parameters (the inference problem) and the true states of the processes (the filtering problem). We develop a numerical approximation of the SDE likelihood using an innovative density tracking by quadrature (DTQ) method. The posterior can be tracked deterministically on a temporal and spatial grid as it evolves over each time interval. We focus on generating accurate estimates of the likelihood function, which enable accurate maximum likelihood (MLE) and maximum a posteriori (MAP) estimates and allow vanilla Monte Carlo samplers to explore the posterior distribution.
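In the scalar case, the idea behind density tracking can be sketched as follows; the time step $h$, grid spacing $k$, grid points $x_i = ik$ and $y_j = jk$, and kernel $G$ are notation assumed here for illustration, and $g$ is assumed to be nonzero. Discretizing the SDE by the Euler--Maruyama scheme gives a Markov chain whose density $p_n$ satisfies the Chapman--Kolmogorov equation
\[
p_{n+1}(x) = \int_{\mathbb{R}} G(x, y)\, p_n(y)\, dy, \qquad
G(x, y) = \frac{1}{\sqrt{2 \pi h\, g(y)^2}} \exp\left( -\frac{\bigl(x - y - f(y)\,h\bigr)^2}{2 h\, g(y)^2} \right).
\]
Replacing the integral by a quadrature rule on the spatial grid yields a deterministic update for the approximate density $\hat{p}_n$,
\[
\hat{p}_{n+1}(x_i) = k \sum_{j} G(x_i, y_j)\, \hat{p}_n(y_j),
\]
which is applied repeatedly to carry the density from one observation time to the next.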