eScholarship
Open Access Publications from the University of California

Mapping the unknown: The spatially correlated multi-armed bandit

Abstract

We introduce the spatially correlated multi-armed bandit as a task coupling function learning with the exploration-exploitation trade-off. Participants interacted with bi-variate reward functions on a two-dimensional grid, with the goal of either gaining the largest average score or finding the largest payoff. By providing an opportunity to learn the underlying reward function through spatial correlations, we model to what extent people form beliefs about unexplored payoffs and how that guides search behavior. Participants adapted to assigned payoff conditions, performed better in smooth than in rough environments, and, surprisingly, sometimes performed equally well in short as in long search horizons. Our modeling results indicate a preference for local search options, which when accounted for, still suggests participants were best-described as forming local inferences about unexplored regions, combined with a search strategy that directly traded off between exploiting high expected rewards and exploring to reduce uncertainty about the spatial structure of rewards.
