Learning to Control Schedules of Reinforcement
The ability to quickly learn about relationships between actions and outcomes is essential for adaptive behavior. Such learning can be complicated when the action-outcome relationship depends on the latency with which the action is performed. Inferences about such latency-modified contingencies can greatly improve an agent’s performance by allowing responses to be timed so as to optimize the probability of an outcome given an action. Here we specify a Bayesian reinforcement learner that infers the functional and causal form of the relationship between response latencies and response-contingent outcome probabilities. The performance of this Bayesian learner is contrasted with that of a model-free actor-critic algorithm, using behavioral data from three latency-modified schedules of reinforcement. Results suggest that a model-based characterization of latency-modified schedules provides a superior account of free-operant behavior. Next, the test of latency-modified schedules is extended to multi-action contexts where the latency of performing one action modifies the probability of an outcome given performance of a different action. Participants quickly discovered optimal rates of responding on both the modified and modifying actions, even when rewards contingent on performing the modifying action provided local incentives for deviating from the optimal rate. A final set of experiments tested the hypothesized explicit inference about latency-modified schedules. In summary, these models and experiments highlight the advantages of incorporating latency into a causal structure of actions and goals.