
Multitask Learning Via Interleaving: A Neural Network Investigation

Abstract

The most common settings used to study multitask learning in machine learning assume either that a random task is selected on each training trial, or that one task is trained to mastery before training advances to the next. We study an intermediate setting in which tasks are interleaved, i.e., training proceeds on task A for some period of time, switches to another task B before A is mastered, and continues to alternate. We examine properties of modern neural net learning algorithms and architectures in this setting. The networks exhibit effects of task sequence that are qualitatively similar to established phenomena in human learning and memory, including: forgetting with relearning savings, task switching costs, and better memory consolidation with interleaved training. By improving our understanding of such properties, one can design learning schedules that are suitable given the temporal structure of the environment. We illustrate with a momentum optimizer that resets momentum following a task switch and leads to reliably better online cumulative learning accuracy.
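The momentum-reset idea mentioned in the closing sentence can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a plain SGD-with-momentum update, a hypothetical `grad_fn(w, task)` that returns the gradient of the current task's loss, and a `task_schedule` giving the (interleaved) task id at each training step. The only change relative to standard momentum is that the velocity buffer is zeroed whenever the task id changes.

```python
import numpy as np

def sgd_momentum_with_reset(grad_fn, w, task_schedule, lr=0.01, beta=0.9):
    """SGD with momentum whose velocity is zeroed at every task switch.

    grad_fn(w, task) -> gradient of the current task's loss at w (assumed helper)
    task_schedule    -> sequence of task ids, one per training step
    """
    v = np.zeros_like(w)            # momentum (velocity) buffer
    prev_task = None
    for task in task_schedule:
        if task != prev_task:       # task switch detected
            v[:] = 0.0              # discard momentum accumulated on the old task
            prev_task = task
        g = grad_fn(w, task)        # gradient for the current task
        v = beta * v + g            # standard momentum accumulation
        w = w - lr * v              # parameter update
    return w
```

Resetting the buffer prevents gradient directions accumulated on task A from steering the first updates on task B, which is the intuition behind the reported improvement in online cumulative accuracy; the exact optimizer and schedule used in the paper may differ.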
