The Affordances of Item Response Theory: The Case of a Complex Learning Progression about Evolution
The National Research Council (NRC) has published multiple reports in recent years that describe potential improvements to elementary science instruction and call for higher standards in the development of related assessments. To meet this charge, the National Science Foundation funded the Development of the Conceptual Underpinnings of Evolution (CUE) project, which produced the data analyzed with item response theory in this dissertation.
The CUE project team designed curricula for introducing concepts of micro-evolution to second- and third-grade students, engaging young learners in reasoning about complex scientific phenomena. The team studied students’ capacities for understanding concepts in micro-evolution while investigating the interplay of cognitive development, learning, and instruction. The team also developed one-on-one interview assessments based on a learning progression that describes children’s pathways toward more sophisticated reasoning about micro-evolution. The design of this progression was a novel attempt to model early learning in this context, informed by cognitive development theory, classroom experiences, and assessment activities in an iterative process. Throughout this work, the CUE team examined the complexity of children’s conceptualization of micro-evolution, especially the differentiation and ordering of conceptual levels within the learning progression and their multidimensional nature.
This dissertation examines the affordances of Item Response Theory (IRT) for investigating and validating the hypothesized learning progression and representing the range of student thinking, while enabling the quantitative assessment of learning gains. The empirical findings address practical questions about item function, multidimensionality, open-ended items, and pre-post effects, and ultimately support the use of this complex learning progression to model student thinking. This dissertation presents an uncommon case in which IRT is applied to interview data and complementary outcome measures are chosen to best represent the complex phenomenon under study. This work provides a model for strengthening validity arguments by examining assumptions about how items, responses, and coding schemes relate to the theoretical construct, and how the scores they yield inform what we know about student reasoning and learning gains.