Optimal prediction methods compensate for a lack of resolution in the numerical solution of time-dependent differential equations through the use of prior statistical information. We present a new derivation of the basic methodology, show that field-theoretical perturbation theory provides a useful device for dealing with quasi-linear problems, and provide a nonlinear example that illuminates the difference between a pseudo-spectral method and an optimal prediction method with Fourier kernels. Along the way, we explain the differences and similarities between optimal prediction, the representer method in data assimilation, and duality methods for finding weak solutions. We also discuss the conditions under which a simple implementation of the optimal prediction method can be expected to perform well.