Yiu, Eunice; Sandbrink, Kai J; Gopnik, Alison

To observe or to bet? Investigating purely exploratory and purely exploitative actions in children, adults, and computational models.

2024

Creative Commons 'BY' version 4.0 license

Abstract

Autonomous agents often need to decide between choosing actions that are familiar and have previously yielded positive results (exploitation) and seeking new information that could help uncover more effective actions (exploration). We present an “observe or bet” task that separates “pure exploration” from “pure exploitation”: 75 five-to-seven-year-old children, 60 adults and computational agents have to decide either to observe an outcome without reward, or to bet on an action without immediate feedback at varying probability levels. Their performances were measured against solutions from the partially observable Markov decision process and meta-RL models. Children and adults tended to choose observation more than both algorithm classes would suggest. Children also modulated their betting policy based on the probability structure and amount of evidence, exhibiting “hedging behavior,” a strategy not evident in standard bandit tasks. The results provide a benchmark for reasoning about reward and information in humans and neural network models.

Proceedings of the Annual Meeting of the Cognitive Science Society
