Demographic Inference from Large Samples: Theory and Methods
The emergence of next-generation sequencing has revolutionized our ability to interrogate the genome, leading to fascinating new insights into the nature of humans and many other species. At the same time, it has created theoretical and computational challenges associated with the need to perform robust and efficient inference on increasingly massive genetic data sets.
In this thesis, I focus on these challenges in the context of demographic inference, which estimates the past history of a population from genetic data sampled in the present. I derive the basic theoretical models that enable such inferences. I then refine them to formulate a new inference procedure for reconstructing size history using hundreds of whole genomes at a time, a significant increase over existing methods. I complement this algorithmic advance with theoretical results on the accuracy of demographic inference as sample sizes grow large.