## Algebraic methods for evaluating integrals In Bayesian statistics

- Author(s): Lin, Shaowei
- Advisor(s): Sturmfels, Bernd
- et al.

## Abstract

The accurate evaluation of marginal likelihood integrals is a difficult fundamental problem in Bayesian inference that has important applications in machine learning and computational biology. Following the recent success of algebraic statistics in frequentist inference and inspired by Watanabe's foundational approach to singular learning theory, the goal of this dissertation is to study algebraic, geometric and combinatorial methods for computing Bayesian integrals effectively, and to explore the rich mathematical theories that arise in this connection between statistics and algebraic geometry. For these integrals, we investigate their exact evaluation for small samples and their asymptotics for large samples.

According to Watanabe, the key to understanding singular models lies in desingularizing the Kullback-Leibler function K(w) of the model at the true distribution. This step puts the model in a standard form so that various central limit theorems can be applied. While general algorithms exist for desingularizing any analytic function, applying them to non-polynomial functions such as K(w) can be computationally expensive. Many singular models are however represented as regular models whose parameters are polynomial functions of new parameters. Discrete models and multivariate Gaussian models are all examples. We call them regularly parametrized models. One of our main contributions is showing how this polynomiality can be exploited by defining fiber ideals for singular models and relating the properties of these algebraic objects to the statistics. In particular, we prove that a model is put in standard form if we monomialize the corresponding fiber ideal. As a corollary, the learning coefficient of a model is equal to the real log canonical threshold (RLCT) of the fiber ideal.

While complex log canonical thresholds are well-studied in algebraic geometry, little is known about their real analogs. In Chapter 4, we prove their fundamental properties and simple rules of computation. We also extend Varchenko's notion of Newton polyhedra and nondegeneracy for functions to ideals. Using these methods, we discover a formula for the RLCT of a monomial ideal with respect to a monomial amplitude. For all other ideals, this formula is an upper bound for their RLCT. Our tools are then applied to a difficult statistical example involving a naive Bayesian network with two ternary random variables.

Because our statistical models are defined over compact semianalytic parameter spaces W, we need to extend standard asymptotic theory of real analytic functions over neighborhoods of the origin to functions over domains like W. Chapter 3 summarizes these results which are critical for other proofs in this dissertation. We also give explicit formulas for the full asymptotic expansion of a Laplace integral over W in terms of the Laurent coefficients of the associated zeta function. In Chapter 5, we apply these formulas to Laplace integrals Z(n) with nondegenerate phase functions, and describe algorithms for computing the coefficient C in the first term asymptotics Z(n) = C n^{-l} (log n)^{t-1}. Procedures for calculating all higher order coefficients are also developed and explained.

Watanabe's treatment of singular models assumes knowledge of the true distribution. In this dissertation, we also explore marginal likelihood integrals of exponential families given data where the true distribution is unknown. This is the context in which Schwarz, Haughton, and Geiger and Rusakov studied the Bayesian Information Criterion (BIC). We find here that the log likelihood ratio of the data is equal to the Kullback-Leibler function of the model at the maximum likelihood distribution. Therefore, all the methods we developed for Kullback-Leibler functions apply, so we describe how to compute the full asymptotics of the marginal likelihood integral by monomializing the associated fiber ideal.

Lastly, to complement developments in asymptotic estimation as well as in Markov Chain Monte Carlo (MCMC) estimation, we present, in Chapter 2, symbolic algorithms for computing marginal likelihood integrals exactly for discrete data of small samples. The underlying statistical models are mixtures of independent distributions, or, in geometric language, secant varieties of Segre-Veronese varieties. For these models, the numerical value of the integral is a rational number, and exact evaluation means computing that rational number rather than a floating point approximation. These exact results provide a gold standard with which approximation methods can be compared.