eScholarship
Open Access Publications from the University of California

UC San Diego Electronic Theses and Dissertations

Algorithms for Statistical and Interactive Learning Tasks

Abstract

In the first part of this thesis, we examine the computational complexity of three fundamental statistical tasks: maximum likelihood estimation, maximum a posteriori estimation, and approximate posterior sampling. We show that maximum likelihood estimation for mixtures of spherical Gaussians is NP-hard. We also demonstrate that hardness of maximum likelihood estimation implies hardness of maximum a posteriori estimation and approximate posterior sampling in many instances.
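
For concreteness, maximum likelihood estimation for a mixture of k spherical Gaussians in d dimensions asks for component means (and, depending on the variant, weights) that maximize the log-likelihood of the data x_1, ..., x_n. The display below, in LaTeX notation, is the standard textbook form of that objective with a common known variance and fixed mixing weights w_1, ..., w_k; it is included only as background, and the precise variant treated in the thesis is specified there.

    L(\mu_1, \ldots, \mu_k) \;=\; \sum_{i=1}^{n} \log \sum_{j=1}^{k} \frac{w_j}{(2\pi\sigma^2)^{d/2}} \exp\!\left( -\frac{\lVert x_i - \mu_j \rVert^2}{2\sigma^2} \right)

The NP-hardness claim concerns maximizing an objective of this kind over all problem instances.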

In the second part of this thesis, we explore the behavior of a common sampling algorithm known as the Gibbs sampler. We show that in the context of Bayesian Gaussian mixture models, this algorithm can take a very long time to converge, even when the data looks as though it were generated by the model. We also demonstrate that a particular variant of the Gibbs sampler, when applied to a class of bipartite graphical models called Restricted Boltzmann Machines, is guaranteed to converge quickly in certain instances.
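
As background on the algorithm named above, a standard Gibbs sampler for a Bayesian mixture of spherical Gaussians alternates between resampling the cluster assignments, the mixing weights, and the component means from their conditional posteriors. The sketch below is a generic version of that scheme under assumed conjugate priors (a symmetric Dirichlet prior on the weights, an isotropic Gaussian prior on the means, and a known common variance); it illustrates the sampler's structure only and is not the exact model, prior, or initialization analyzed in the thesis.

    # Generic Gibbs sampler for a Bayesian mixture of k spherical Gaussians.
    # Assumed priors for this sketch: weights ~ Dirichlet(alpha, ..., alpha),
    # each mean ~ N(0, tau2 * I); the component variance sigma2 is known.
    import numpy as np

    def gibbs_gmm(X, k, iters=500, sigma2=1.0, tau2=10.0, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        X = np.asarray(X, dtype=float)
        n, d = X.shape
        z = rng.integers(k, size=n)                          # cluster assignments
        mu = X[rng.choice(n, size=k, replace=False)].copy()  # means start at data points
        w = np.full(k, 1.0 / k)                              # mixing weights
        for _ in range(iters):
            # 1. Resample each assignment given the current means and weights.
            sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)   # (n, k)
            logp = np.log(w) - sq / (2.0 * sigma2)
            logp -= logp.max(axis=1, keepdims=True)
            p = np.exp(logp)
            p /= p.sum(axis=1, keepdims=True)
            z = np.array([rng.choice(k, p=p[i]) for i in range(n)])
            # 2. Resample the weights from their Dirichlet posterior.
            counts = np.bincount(z, minlength=k)
            w = rng.dirichlet(alpha + counts)
            # 3. Resample each mean from its Gaussian posterior.
            for j in range(k):
                post_var = 1.0 / (1.0 / tau2 + counts[j] / sigma2)
                post_mean = post_var * X[z == j].sum(axis=0) / sigma2
                mu[j] = rng.normal(post_mean, np.sqrt(post_var), size=d)
        return z, mu, w

A call such as z, mu, w = gibbs_gmm(X, k=3) returns the final assignments, means, and weights; the slow-convergence phenomenon described above concerns how many such sweeps are needed before the chain is close to its stationary distribution.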

In the third part of this thesis, we consider learning problems in which the learner is allowed to solicit interaction from a user. In the context of classification, we present an efficient active learning algorithm whose performance is guaranteed to be comparable to that of any active learning algorithm on the particular instance under consideration. We also introduce a generic framework, termed interactive structure learning, for interactively learning complex structures over data, and we present a simple and effective algorithm for this setting that enjoys favorable statistical properties.
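
The abstract does not describe these algorithms themselves, so no attempt is made to reproduce them here. Purely to illustrate the interaction pattern that active learning refers to, the generic pool-based loop below repeatedly fits a classifier on the labels gathered so far and asks a user (the oracle) for the label of the point it is least confident about. This uncertainty-sampling loop is a common baseline, not the algorithm or the comparative guarantee described in the thesis; model is assumed to be any classifier exposing scikit-learn-style fit and predict_proba methods, and oracle is a hypothetical labeling function.

    # Generic pool-based active learning via uncertainty sampling (illustration only).
    import numpy as np

    def active_learn(model, X_pool, oracle, n_queries=50, seed=0):
        rng = np.random.default_rng(seed)
        labeled = list(rng.choice(len(X_pool), size=2, replace=False))
        y = {i: oracle(X_pool[i]) for i in labeled}   # initial seed labels
        for _ in range(n_queries):
            # Assumes the labeled set contains at least two distinct classes.
            model.fit(X_pool[labeled], np.array([y[i] for i in labeled]))
            conf = model.predict_proba(X_pool).max(axis=1)  # confidence per point
            conf[labeled] = np.inf                          # never re-query
            i = int(np.argmin(conf))                        # least confident point
            y[i] = oracle(X_pool[i])                        # ask the user
            labeled.append(i)
        return model, labeled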
