Song, Mingyu; Niv, Yael; Cai, Ming Bo

Learning what is relevant for rewards via value-based serial hypothesis testing

2020

Creative Commons 'BY' version 4.0 license

Abstract

Learning what is relevant for reward is a ubiquitous and crucial task in daily life, where stochastic reward outcomes can depend on an unknown number of task dimensions. We designed a paradigm tailored to study such complex scenarios. In the experiment, participants configured three-dimensional stimuli by selecting features for each dimension and received probabilistic feedback. Participants selected more rewarding features over time, demonstrating learning. To investigate the learning process, we tested two learning strategies, feature-based reinforcement learning and serial hypothesis testing, and found evidence for both. The extent to which each strategy was engaged depended on the instructed task complexity: when instructed that there were fewer relevant dimensions (and therefore fewer reward-generating rules were possible) people tended to serially test hypotheses, whereas they relied more on learning feature values when more dimensions were relevant. To explain the behavioral dependency on task complexity and instructions, we tested variants of the value-based serial hypothesis testing model. We found evidence that participants constructed their hypothesis space based on the instructed task condition, but they failed to use all the information provided (e.g., reward probabilities). Our current best model can qualitatively capture the difference in choice behavior and performance across task conditions.

Proceedings of the Annual Meeting of the Cognitive Science Society
