An Integrative Framework of Model Evaluation
An important aspect of empirical research is the construction of a model that represents the data. In psychological and educational measurement, models are typically evaluated by how well they fit the observed data. Philosophers of science have long recognized that goodness of fit to the realized data is an insufficient metric of a model’s usefulness; models should also be appraised for their generalizability to unseen data. Frequentist statistics, Bayesian inference, and information theory seem to offer philosophically and methodologically dissimilar perspectives on model evaluation; this dissertation, however, develops a simple theoretical framework that integrates all three. Within this framework, the information-theoretic principle of minimum description length is explored in the context of item response theory modeling. The findings reveal that model complexity in item response theory is determined not by the number of freely estimated parameters alone, but by the model’s functional form. The frequentist, Bayesian, and information-theoretic approaches are then applied to evaluate the usefulness of a unidimensional 3-parameter logistic model of item response data from the Programme for International Student Assessment (PISA). Philosophical ramifications, future research directions, and implications for educational and psychological measurement are discussed.
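For reference, the unidimensional 3-parameter logistic (3PL) model mentioned above is conventionally written as follows; the symbols here follow the standard item response theory parameterization rather than any notation defined in this abstract:

\[
P(X_{ij} = 1 \mid \theta_i) \;=\; c_j + \frac{1 - c_j}{1 + \exp\{-a_j(\theta_i - b_j)\}},
\]

where \(\theta_i\) is the latent ability of person \(i\), and \(a_j\), \(b_j\), and \(c_j\) are the discrimination, difficulty, and lower-asymptote (guessing) parameters of item \(j\). The dissertation’s claim about functional form can be read against this equation: two 3PL models with the same number of free parameters can still differ in complexity through how \(a_j\), \(b_j\), and \(c_j\) enter the response function.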