Does reading words help you to read minds? A comparison of humans and LLMs at a recursive mindreading task

Abstract

There is considerable debate about the origin, mechanism, and extent of humans’ capacity for recursive mindreading: the ability to represent beliefs about beliefs about beliefs (and so on). Here we quantify the extent to which language exposure could support this ability, using a Large Language Model (LLM) as an operationalization of distributional language knowledge. We replicate and extend O’Grady et al.’s (2015) finding that humans can mindread up to 7 levels of embedding, using both their original method and a stricter measure. In Experiment 2, we find that GPT-3, an LLM, performs comparably to humans up to 4 levels of embedding but falters at higher levels, despite being near ceiling on 7th-order non-mental control questions. The results suggest that distributional information (and the transformer architecture in particular) can be used to track complex recursive concepts (including mental states), but that human mentalizing likely draws on resources beyond distributional likelihood.
