Nascimento Ferreira, Lucas

Controlling Neural Language Models for Affective Music Composition

2021

Advisor(s): Whitehead, Jim

Creative Commons 'BY-NC-SA' version 4.0 license

Abstract

Deep generative models are currently the leading method for algorithmic music composition. However, a major problem with these models is controlling them, once trained, to generate compositions with given characteristics. This dissertation explores how to control deep generative models to compose music with a target emotion. Because labeled data is limited, it focuses on search-based methods that use a music emotion classifier to steer the distribution of a pre-trained musical language model. Three search-based approaches are proposed. The first is a genetic algorithm that optimizes a language model towards a given sentiment. The second is a decoding algorithm, called Stochastic Bi-Objective Beam Search (SBBS), which controls the language model at generation time. The third is also a decoding algorithm, but is based on Monte Carlo Tree Search. SBBS has been applied to generate background music for tabletop role-playing games, matching the emotion of the story being told by the players. A dataset of symbolic piano music called VGMIDI was created to support this work. VGMIDI currently has 200 pieces labeled according to the circumplex model of emotion and an additional 3,640 unlabeled pieces. The three methods were evaluated with listening tests, in which human subjects indicated that the methods could convey different target emotions.


UC Santa Cruz
