Strategic Monte Carlo and Variational Methods in Statistical Data Assimilation for Nonlinear Dynamical Systems
- Author(s): Shirman, Aleksandra
- Advisor(s): Abarbanel, Henry D. I.
- et al.
Data Assimilation (DA) is a method through which information is extracted from measured quantities and, with the help of a mathematical model, transferred through a probability distribution to the unknown or unmeasured states and parameters that characterize the system under study. With an estimate of the model parameters, quantitative predictions may be made and compared to subsequent data.
Many recent DA efforts rely on an optimization of the probability distribution that locates the most probable state and parameter values given a set of data. The procedure developed and demonstrated here extends the optimization by appending a biased random walk around the states and parameters of high probability, which generates an estimate of the structure of the probability density function (PDF) in state space. This estimate of the PDF's structure facilitates more accurate estimates of expectation values, such as the means, standard deviations, and higher moments of the states and parameters that characterize the behavior of the system under study. The ability to calculate these expectation values allows an error bar or tolerance interval to be attached to each estimated state or parameter, in turn giving significance to any results generated. The estimation method’s merits are demonstrated on a simulated, well-known chaotic system, the Lorenz 96 system, and on a toy model of a neuron. In both cases the model system poses distinct challenges for estimation: in chaotic systems, any small error in estimation generates extremely large prediction errors, while in neurons only one of the (at minimum) four dynamical variables can be measured, leaving only a small amount of data with which to work.
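As a rough illustration of the idea of sampling around a most probable point to obtain error bars, the following minimal sketch runs a plain Metropolis random walk starting from a known mode and estimates means and standard deviations from the samples. It is not the procedure developed in this thesis: the function names `action` and `metropolis`, the two-dimensional Gaussian target, the step size, and the sample count are all assumptions chosen for illustration, and the mode is simply written down rather than found by an optimizer.

```python
import numpy as np

def action(x):
    """Toy negative log-probability (hypothetical): a correlated 2-D Gaussian."""
    precision = np.array([[2.0, 0.6], [0.6, 1.0]])
    d = x - np.array([1.0, -0.5])
    return 0.5 * d @ precision @ d

def metropolis(action, x0, n_steps=20000, step=0.5, seed=0):
    """Plain Metropolis random walk targeting the density exp(-action(x))."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    a = action(x)
    samples = np.empty((n_steps, x.size))
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal(x.size)
        a_new = action(proposal)
        # Accept with probability min(1, exp(a - a_new)).
        if np.log(rng.random()) < a - a_new:
            x, a = proposal, a_new
        samples[i] = x
    return samples

# Stand-in for the output of an optimization step (assumed, not computed here).
x_mode = np.array([1.0, -0.5])
samples = metropolis(action, x_mode)

mean = samples.mean(axis=0)  # estimated expectation value of each variable
std = samples.std(axis=0)    # estimated error bar for each variable
```

Starting the walk at the mode means no burn-in period is needed in this toy case; in a realistic setting the step size and chain length would have to be tuned to the problem.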
This thesis concludes with an exploration of the equivalence of machine learning and the formulation of statistical DA. The application of previous DA methods is demonstrated on a classic machine learning problem: the characterization of handwritten images from the MNIST data set. The results of this work are used to validate common assumptions in machine learning, such as the dependence of the quality of results on the amount of data presented and the size of the network used. Finally, DA is proposed as a method for discerning an 'ideal' network size for a given set of data, one that optimizes predictive capability while minimizing computational cost.