Generative models in an inverse graphics framework are appealing models for visual perception. How might children acquire them? We present a computational procedure for learning generative models of human faces using developmentally plausible input. Our statistical model of shape and appearance initially uses the average face as a template with a simple Gaussian process model of deformations. We iteratively learn the statistical distribution of faces by performing analysis-by-synthesis on a small number of images and combine the results to construct an improved generative model. Our analysis-by-synthesis framework combines a convolutional neural network for fast inference with a Markov chain Monte Carlo process for detailed refinement. This learning strategy quickly captures the variation of natural faces and demonstrates an efficient way to learn the distribution of faces.
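
The bootstrapping loop described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes 2D landmark shapes only (no appearance model), a squared-exponential kernel for the Gaussian process, and PCA as the way fitted results are combined into an improved model. The function names (`gp_basis`, `sample_face`, `refit_model`) and all parameter values are hypothetical. The analysis-by-synthesis step (CNN initialization followed by MCMC refinement) is replaced by a stand-in that draws noisy shapes from the current model.

```python
import numpy as np

def gp_basis(template, sigma=0.1, length_scale=0.5, rank=4):
    """Low-rank deformation basis from a squared-exponential GP (assumed kernel)."""
    n, dim = template.shape
    sq_dist = np.sum((template[:, None, :] - template[None, :, :]) ** 2, axis=-1)
    kernel = sigma**2 * np.exp(-sq_dist / (2.0 * length_scale**2))
    eigvals, eigvecs = np.linalg.eigh(kernel)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:rank]     # indices of the largest ones
    scaled = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
    # Apply the same scalar basis independently to each coordinate axis,
    # matching the point-major layout of template.ravel().
    return np.kron(scaled, np.eye(dim))        # shape (n*dim, rank*dim)

def sample_face(mean, basis, rng):
    """Draw one flattened shape: mean plus a Gaussian combination of basis columns."""
    return mean + basis @ rng.standard_normal(basis.shape[1])

def refit_model(fitted, rank):
    """Combine fitted shapes into a new mean and PCA basis (assumed combination rule)."""
    data = np.stack(fitted)
    mean = data.mean(axis=0)
    _, sing, vt = np.linalg.svd(data - mean, full_matrices=False)
    k = min(rank, len(fitted) - 1)
    std = sing[:k] / np.sqrt(len(fitted) - 1)  # per-component standard deviation
    return mean, vt[:k].T * std

rng = np.random.default_rng(0)
template = rng.uniform(-1.0, 1.0, size=(20, 2))  # hypothetical average-face landmarks
mean = template.ravel()
basis = gp_basis(template)                       # initial model: template + GP deformations

for _ in range(3):
    # Stand-in for analysis-by-synthesis on a small batch of images:
    # a real system would fit the current model to each image instead.
    fitted = [sample_face(mean, basis, rng) + 0.01 * rng.standard_normal(mean.size)
              for _ in range(10)]
    mean, basis = refit_model(fitted, rank=4)    # improved statistical model
```

Each pass replaces the generic smoothness prior with statistics estimated from the fitted examples, which is the sense in which the model is learned iteratively from a small number of images.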