## On the Construction of Transport Maps for Scalable Inference and Optimal Communication with Applications in Global Health

- Author(s): Mesa, Diego Alberto
- Advisor(s): Coleman, Todd P
- et al.

## Abstract

The need to analyze large, complex, and multi-modal data sets has become increasingly common across modern scientific environments. Addressing the unique challenges that these data sets pose has been the focus of much recent effort across fields such as computer science, biology, and information theory. While different approaches have been developed to deal with these issues as efficiently as possible, many provide point estimates without a clear indication of the uncertainty associated with those estimates. This uncertainty is of critical importance for decision making in many different areas (e.g. digital health and large-scale system design), but has not been adequately addressed in modern, large-scale techniques. A characterization of the uncertainty associated with an estimate can be obtained from an accurate representation of the posterior distribution in a Bayesian inference framework. Traditionally this has been unobtainable, as a full characterization of the posterior requires calculation of a highly nontrivial integral and/or the ability to efficiently draw samples from the posterior. Motivated by these issues, we extend previous results on the convexity of Bayesian inference through an optimal transport framework (Kim et al. 2013) and provide a more general push-forward theorem for pushing a distribution P to a distribution Q. Moreover, we demonstrate that under mild assumptions, by leveraging the Alternating Direction Method of Multipliers (ADMM) (Boyd et al. 2012), the push-forward can be carried out in a distributed and scalable manner. We show how the efficient Bayesian inference framework described in (Kim et al. 2013) is a special case of this more general push-forward theorem, and how, through ADMM, solving for the optimal Bayesian map reduces to solving a series of maximum a posteriori (MAP) estimation problems. This greatly simplifies adoption by leveraging existing infrastructure for solving large-scale MAP problems.
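To make the notion of pushing a distribution P to a distribution Q concrete, the following is a minimal one-dimensional sketch and not the construction developed in this thesis. It assumes P is a standard normal and Q is a unit-rate exponential (both chosen purely for illustration) and uses the classical monotone map $S = F_Q^{-1} \circ F_P$, so that $S(X) \sim Q$ whenever $X \sim P$. The function names are hypothetical.

```python
import math
import random
from statistics import fmean


def push_forward_map(x: float) -> float:
    """Monotone map S = F_Q^{-1}(F_P(x)) for P = N(0, 1), Q = Exponential(1).

    Illustrative assumption only: these choices of P and Q are not from the thesis.
    """
    # 1 - F_P(x) is the normal survival function, computed with erfc for tail accuracy.
    survival = 0.5 * math.erfc(x / math.sqrt(2.0))
    # F_Q^{-1}(u) = -log(1 - u) for the unit-rate exponential.
    return -math.log(survival)


def sample_q_via_transport(n: int, seed: int = 0) -> list[float]:
    """Draw n samples from P and push each one through S to obtain samples from Q."""
    rng = random.Random(seed)
    return [push_forward_map(rng.gauss(0.0, 1.0)) for _ in range(n)]


samples = sample_q_via_transport(20_000)
# The mean of Exponential(1) is 1, so the sample mean should be close to 1.
print(f"sample mean of pushed samples: {fmean(samples):.3f}")
```

In higher dimensions no such closed-form map is generally available, which is where an optimization-based construction of the transport map becomes relevant.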

Using the theory of optimal transport, we also consider the dual problem of optimal communication. Many problems can naturally be cast as an optimal communication problem where a message $W$ is signaled sequentially with feedback across a noisy channel. We model these problems as $W \in \cW \subset \reals^d$ and consider optimizing encoding strategies that map $W$ and $Y_1, \ldots, Y_{n-1}$ into $X_n$ and that have a dynamical systems flavor. The decoder, with knowledge of the encoder's strategy, simply performs Bayesian updates to sequentially construct a posterior belief $\psi_n$ about the message after observations $Y_1,\ldots,Y_n$. In this thesis we use the theory of optimal transport to expand the Posterior Matching (PM) scheme (Shayevitz and Feder) to address two unmet needs: (a) We develop a generalization of the PM scheme for arbitrary memoryless channels where $\cW \subset \reals^d$ for any $d \geq 1$. Specifically, we develop recursive encoding schemes that share the same mutual-information maximizing and iterative, time-invariant properties and reduce to the original scheme when $\cW=[0,1]$; (b) We define notions of reliability and achievability in a manner analogous to (Shayevitz & Feder 2011) but in terms of almost-sure convergence of random variables. With this, we then develop necessary and sufficient conditions for the scheme to be reliable and/or attain the optimal convergence rate (e.g. achieve capacity). We show that both of these properties share the same necessary and sufficient condition: the ergodicity of a random process $(\tW_n)_{n \geq 1}$ within the encoder of a PM scheme. Using the theory of optimal transport, we construct the schemes in (a), exploiting an invariability property implicit in these schemes to show the equivalent conditions in (b).

To instantiate the framework described above, we consider an important application in public health: Fetal Alcohol Spectrum Disorders (FASD). In FASD, early identification of affected infants is critical to successfully treating children affected by the disease. As part of an ongoing longitudinal cohort study in Ukraine, infants were evaluated at 6 and 12 months with the Cardiac Orienting Response (COR), an inexpensive, easy-to-administer assessment tool for identification of developmental delay, and with the Bayley Scales of Infant Development, II. These CORs were collected during a habituation/dishabituation learning paradigm. We consider the COR's effectiveness in assessing an individual's risk of later developmental delay and compare its predictive utility to that of the 6-month Bayley, showing that the COR paradigm significantly outperforms the 6-month Bayley. As the resources required to obtain a Bayley score are substantially greater than those required by a COR-based paradigm, the findings are suggestive of the COR's utility as an early, scalable screening tool. Further work is needed to incorporate this initial screening test within a large-scale sequential risk segmentation framework.