Open Access Publications from the University of California

Playing the Lottery of a Lifetime: The Effect of Socially Induced Aspiration on Q-Learning Agents


Our aspirations are influenced by the rewards obtained by people around us. How adaptive are these inherited aspirations in stochastic, lottery-like environments? We study the behavior of social Q-learning agents in two multi-armed bandit (MAB) settings: (1) a standard task in which one arm gives a higher reward than the others, and (2) a lottery task in which all arms give a high reward with some small probability. We define aspiration as a function of the rewards attained by a previous generation, and happiness as a linear combination of rewards and aspiration. We find that in the standard MAB task, higher aspiration encourages exploration, and agents who learn from the ‘top’ agents accumulate more rewards and happiness. In the lottery task, however, higher aspiration does not improve performance; instead, agents who learn from the ‘top’ agents are less happy. Together, these results highlight the context-dependent nature of aspirations and their implications for modern society.
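The setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: the function names (`happiness`, `aspiration_from`, `run_agent`), the weights of the linear combination, the choice of the mean per-step reward of the top share of the previous generation as the aspiration, Gaussian arm rewards, epsilon-greedy action selection, and the use of happiness rather than raw reward as the learning signal.

```python
import random

def happiness(reward, aspiration, w_reward=1.0, w_aspiration=1.0):
    """Linear combination of reward and aspiration (weights are assumed)."""
    return w_reward * reward - w_aspiration * aspiration

def aspiration_from(previous_totals, top_fraction=1.0, n_steps=1):
    """Mean per-step reward of the top share of a previous generation.

    top_fraction=1.0 means learning from everyone; a small value means
    learning only from the 'top' agents (an assumed operationalization).
    """
    ranked = sorted(previous_totals, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / (k * n_steps)

def run_agent(arm_means, aspiration, n_steps=500, alpha=0.1, epsilon=0.1, rng=None):
    """Epsilon-greedy Q-learning on a bandit with happiness as the signal."""
    rng = rng or random.Random(0)
    q = [0.0] * len(arm_means)           # one Q-value per arm
    total_reward = 0.0
    total_happiness = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:       # explore
            arm = rng.randrange(len(q))
        else:                            # exploit the current best estimate
            arm = max(range(len(q)), key=q.__getitem__)
        reward = rng.gauss(arm_means[arm], 1.0)   # assumed Gaussian payoff
        h = happiness(reward, aspiration)
        q[arm] += alpha * (h - q[arm])   # incremental Q-update
        total_reward += reward
        total_happiness += h
    return total_reward, total_happiness
```

Under these assumptions, a high aspiration makes the happiness from already-sampled arms negative, so untried arms (initialized at zero) look comparatively attractive. This is one possible mechanism by which aspiration could encourage exploration; the paper itself may model it differently.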
