Fast MCMC algorithms, Stability and DeepTune
- Author(s): Chen, Yuansi
- Advisor(s): Yu, Bin
- et al.
Drawing samples from a known distribution is a core computational challenge common to many disciplines, with applications in statistics, probability, operations research, and other areas involving stochastic models. In statistics, sampling methods are useful for both estimation and inference, including problems such as estimating expectations of desired quantities, computing probabilities of rare events, gauging volumes of particular sets, exploring posterior distributions and obtaining credible intervals etc.
Facing massive high dimensional data, both computational efficiency and good statistical guarantees are more and more important in modern statistical and machine learning applications. In this thesis, centered around sampling algorithms, we consider the fundamental questions on their computational and statistical guarantees: How to design a fast sampling algorithm and how long should it be run? What are the statistical learning guarantee of these algorithms? Are there any trade-offs between computation and learning?
To answer these questions, first we start with establishing non-asymptotic convergence guarantees for popular MCMC sampling algorithms in Bayesian literature: Metropolized Random Walk, Metropolis-adjusted Langevin algorithm and Hamiltonian Monte Carlo. To address a number of technical challenges arise enroute, we develop results based on the conductance profile in order to prove quantitative convergence guarantees general continuous state space Markov chains. Second, to confront a large class of constrained sampling problems, we introduce two new algorithms, Vaidya and John walks, to sample from polytope-constrained distributions with convergence guarantees. Third, we prove fundamental trade-off results between statistical learning performance and convergence rate of any iterative learning algorithm, including sample algorithms. The trade-off results allow us to show that a too stable algorithm can not converge too fast, and vice-versa. Finally, to help neuroscientists analyze their massive amount of brain data, we develop DeepTune, a stability-driven visualization and interpretation framework via optimization and sampling for the neural-network-based models of neurons in visual cortex.