A computationally rational analysis of response strategy in a probability learning task
Intelligent behavior requires the ability to adapt to an ever-changing environment. But do humans adapt in a rational, or normative, manner? We apply a resource-rational analysis to data from a probability learning task (Gagne et al., 2020). Our analysis hypothesizes that people seek to maximize the expected utility of their behavior while simultaneously minimizing the complexity of their behavioral policies. We report evidence consistent with this hypothesis. We also show that people adopt simpler policies when the environment is more stable, and we interpret this as a consequence of reward maximization.