Causal Inference for Personalized Educational Systems
Open Access Publications from the University of California


UCLA Electronic Theses and Dissertations


Educational systems have traditionally been evaluated using cross-sectional studies, that is, designs that examine a pretest, a posttest, and a single intervention. Although this approach is popular in education, it does not model valuable information such as confounding variables, feedback to students, and other real-world deviations of studies from ideal conditions. Moreover, learning is inherently a sequential process and should involve a sequence of interventions. Because large volumes of educational data are now available, researchers can develop more intelligent inference algorithms. We propose to exploit the rich features of time series data and use them to develop more intelligent and individualized educational systems. Our approach is five-fold. First, we model the sequential nature of education using hidden Markov models and show that a sequence of student actions is predictive of posttest results. Second, we propose more intelligent experimental designs that collect richer data from students by including questions on potential confounders in the diagnostic test and by recording instructor interventions during office hours. Third, we propose various experimental and quasi-experimental designs for educational systems and quantify them using the language of graphical models and directed acyclic graphs (DAGs). We discuss the application and limitations of each method in education. Fourth, we propose to model the educational system in terms of time-varying treatments and confounders, with treatment-confounder feedback. We show that if we control for a sufficient set of confounders and use an appropriate inference technique, such as inverse probability of treatment weighting (IPTW) or the g-formula, we can close the backdoor paths and obtain an unbiased estimate of the causal effect of joint interventions on the outcome. Fifth, we compare the performance of the g-formula and IPTW and discuss the pros and cons of each method.
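To make the fourth and fifth points concrete, the following is a minimal sketch of the nonparametric g-formula and IPTW for two sequential binary interventions. It is not taken from the thesis: the simulated data, the variable names (u for unmeasured prior ability, a0 and a1 for the two interventions, l1 for an intermediate quiz result, y for the posttest score), and all coefficients are assumptions chosen for illustration. In this toy setup the true effect of receiving both interventions versus neither is 3.0 by construction.

```python
import numpy as np

def simulate(n, seed=0):
    """Hypothetical two-step course with treatment-confounder feedback."""
    rng = np.random.default_rng(seed)
    u = rng.binomial(1, 0.5, n)                      # unmeasured prior ability
    a0 = rng.binomial(1, 0.5, n)                     # first intervention (randomized)
    l1 = rng.binomial(1, 0.2 + 0.3 * a0 + 0.4 * u)   # quiz result, affected by a0 and u
    a1 = rng.binomial(1, 0.2 + 0.6 * l1)             # second intervention depends on l1
    y = 1.0 * a0 + 2.0 * a1 + 3.0 * u + rng.normal(0.0, 1.0, n)  # posttest score
    return a0, l1, a1, y

def g_formula(a0, l1, a1, y, x0, x1):
    """E[Y^(x0,x1)] = sum_l E[Y | A0=x0, L1=l, A1=x1] * P(L1=l | A0=x0)."""
    total = 0.0
    for l in (0, 1):
        p_l = np.mean(l1[a0 == x0] == l)
        cell = (a0 == x0) & (l1 == l) & (a1 == x1)
        total += y[cell].mean() * p_l
    return total

def iptw(a0, l1, a1, y, x0, x1):
    """Weighted mean of Y among students who actually received (x0, x1)."""
    p_a0 = np.mean(a0 == x0)
    p_a1 = np.ones(len(y))
    for l in (0, 1):
        stratum = (a0 == x0) & (l1 == l)
        p_a1[stratum] = np.mean(a1[stratum] == x1)   # P(A1=x1 | A0=x0, L1=l)
    followed = (a0 == x0) & (a1 == x1)
    w = 1.0 / (p_a0 * p_a1[followed])                # inverse probability weights
    return np.sum(w * y[followed]) / np.sum(w)

a0, l1, a1, y = simulate(200_000)

# Naive contrast: compares observed groups and ignores confounding by l1.
naive = y[(a0 == 1) & (a1 == 1)].mean() - y[(a0 == 0) & (a1 == 0)].mean()
effect_g = g_formula(a0, l1, a1, y, 1, 1) - g_formula(a0, l1, a1, y, 0, 0)
effect_iptw = iptw(a0, l1, a1, y, 1, 1) - iptw(a0, l1, a1, y, 0, 0)

print(f"naive: {naive:.2f}  g-formula: {effect_g:.2f}  IPTW: {effect_iptw:.2f}")
```

In this sketch every probability is estimated by a simple stratum frequency, so the two estimators coincide up to floating-point error. Differences between the g-formula and IPTW arise in practice when the outcome model or the treatment model is parametric and one of them is misspecified.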
