We address the problem of modeling the spatial and temporal second-order statistics of video sequences that exhibit both spatial and temporal regularity, intended in a statistical sense. We model such sequences as dynamic multiscale autoregressive models, and introduce an efficient algorithm to learn the model parameters. We then show how the model can be used to synthesize novel sequences that extend the original ones in both space and time, and illustrate the power, and limitations, of the models we propose with a number of real image sequences.