Playing Games to Reduce Supervision in Learning
In this dissertation, we explore two fundamental sets of inference problems arising in machine learning and statistics. We present robust, efficient, and straightforward algorithms for both, adapting sensitively to structure in data by viewing these problems as playing games against an adversary representing our uncertainty.
In the problem of classification, there is typically much more unlabeled data than labeled data, but classification algorithms are largely designed to be supervised, only taking advantage of labeled data. We explore how to aggregate the predictions of an ensemble of such classifiers as accurately as possible in a semi-supervised setting, using both types of data. The insight is to formulate the learning problem as playing a game over the unlabeled data against an adversary, who plays the unknown true labels. This formulation uses unlabeled data to improve performance over labeled data alone in an extremely general and efficient way, without model assumptions or tuning parameters. We demonstrate this by devising and evaluating a number of practical, scalable semi-supervised learning algorithms. The theoretical contributions include a proof that the optimal aggregation rules in this semi-supervised setting are artificial neurons for many natural loss functions, with efficient convex algorithms for learning them.
We also provide fundamental results for a second set of problems relating to sequential learning and testing. Random variation in such situations can typically be described by a martingale, a generalization of a random walk that describes any repeated fair game. We describe the concentration behavior of a martingale's sample path, extending to finite times the law of the iterated logarithm, a classic result of probability theory. With this powerful tool, we are able to show how to design simple sequential tests that use as few samples as possible to detect an effect, provably adapting to the unknown effect size. We also apply our results to optimally correct the p-values of many common statistical hypothesis tests, making them robust to the common practice of 'peeking' at test results and waiting for a significant one to report.