UCLA Electronic Theses and Dissertations

Causal Inference and Large Language Models from the Causal Invariance Framework

Abstract

Statistics serves as the grammar of all science, and central to the goal of science is understanding cause-effect relationships. Scientists rely on research methodology and statistical tools to uncover causal relationships, and engineers rely on statistical methods to create artificial assistants that aid daily life. Yet neither statistical learning nor next-word prediction (the training objective behind models aimed at artificial general intelligence, AGI) is consistent with rational causal learning and reasoning in humans. The present thesis examines the fundamental goals and assumptions of dominant statistical methods and discusses their implications for statistical inference and for commonsense reasoning in AGI. The first section introduces and evaluates a causal alternative to logistic regression, which estimates the causal power (defined under the causal invariance framework) of treatments in the presence of covariates. Causal invariance is defined as the property that the influence of a candidate cause (elemental or conjunctive) is independent of background causes, with the aim of acquiring usable knowledge, in the minimal sense of being able to generalize from a learning context to an application context. The second and final section investigates current benchmark tasks used to evaluate causal reasoning in large language models (e.g., GPT-3, GPT-4) and introduces a stricter test informed by the psychological literature on human causal cognition under the causal invariance framework.
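As a rough illustration of the quantity the abstract refers to (a sketch based on the standard power PC definition of causal power for a generative cause, not on the thesis's own estimator, which is not shown on this page), causal power rescales the contingency ΔP = P(e|c) − P(e|¬c) by how much room background causes leave for the candidate cause to act. The function and example values below are illustrative assumptions.

```python
# Minimal sketch (illustrative, not from the thesis): causal power of a
# generative candidate cause c under the power PC / causal invariance
# framework, q = (P(e|c) - P(e|not-c)) / (1 - P(e|not-c)).

def causal_power(p_e_given_c: float, p_e_given_not_c: float) -> float:
    """Estimate the causal power of a generative cause.

    Unlike a raw regression coefficient, q is defined so that, if the
    cause's influence is invariant across contexts (independent of
    background causes), the same q should generalize from a learning
    context to an application context with a different base rate.
    """
    delta_p = p_e_given_c - p_e_given_not_c
    return delta_p / (1.0 - p_e_given_not_c)

# Example: the effect occurs in 75% of treated cases and 50% of untreated
# cases. The contingency (delta P) is 0.25, but the causal power is 0.5:
# the cause produces the effect in half of the cases where background
# causes alone would not have produced it.
print(causal_power(0.75, 0.50))  # 0.5
```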
