Humans are remarkably proficient at decomposing and recombining concepts
they have learned. In contrast, while deep learning-based methods have
been shown to fit large datasets and outperform humans at some tasks,
they often fail when presented with conditions even slightly outside the
distribution they were trained on. In particular, machine learning
models fail at compositional generalization, where a model must predict
how concepts fit together without having seen that exact combination
during training.
This thesis proposes several learning-based methods that take
advantage of the compositional structure of tasks and shows how they
perform better than black-box models when presented with novel
compositions of previously seen subparts. The first type of method
directly decomposes a neural network into separate modules that are
trained jointly in varied combinations, as sketched below.
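A minimal sketch of this first idea, assuming a PyTorch setup; ModularNet, module_pool, and the specific module combinations are illustrative names, not the architectures actually used in the thesis:

```python
import torch
import torch.nn as nn

class ModularNet(nn.Module):
    """A pool of small modules, chained in different orders per task."""

    def __init__(self, num_modules: int, dim: int):
        super().__init__()
        # One small MLP per reusable module.
        self.module_pool = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_modules)
        )

    def forward(self, x: torch.Tensor, combination: list) -> torch.Tensor:
        # Chain only the selected modules; different tasks use different chains.
        for idx in combination:
            x = self.module_pool[idx](x)
        return x

net = ModularNet(num_modules=4, dim=16)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# Each training example pairs data with a module combination, e.g. [0, 2].
x, y = torch.randn(8, 16), torch.randn(8, 16)
opt.zero_grad()
loss = nn.functional.mse_loss(net(x, combination=[0, 2]), y)
loss.backward()  # gradients flow only into the modules used in this chain
opt.step()
```

Because gradients from each example flow only into the modules selected for it, every module is shaped jointly by all the combinations it participates in.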
The second type of method learns representations of tasks and objects
that obey arithmetic properties, such that task representations can be
summed or subtracted to indicate their composition or decomposition, as
illustrated in the second sketch below. We show results in diverse
domains including games, simulated environments, and real robots.
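A minimal sketch of the second idea, also in PyTorch; task_emb, predictor, composite_embedding, and the particular task indices are hypothetical, chosen only to illustrate the arithmetic rather than the thesis's actual models:

```python
import torch
import torch.nn as nn

num_atomic_tasks, task_dim, obs_dim = 5, 8, 16
# One learned embedding per atomic task.
task_emb = nn.Embedding(num_atomic_tasks, task_dim)
# A predictor conditioned on the (possibly composite) task representation.
predictor = nn.Sequential(nn.Linear(obs_dim + task_dim, 32), nn.ReLU(),
                          nn.Linear(32, obs_dim))
opt = torch.optim.Adam(
    list(task_emb.parameters()) + list(predictor.parameters()), lr=1e-3)

def composite_embedding(atomic_ids: list) -> torch.Tensor:
    # Composition is literal addition of the atomic task embeddings.
    return task_emb(torch.tensor(atomic_ids)).sum(dim=0)

# Training: condition the predictor on summed embeddings of seen combinations.
obs, target = torch.randn(4, obs_dim), torch.randn(4, obs_dim)
z = composite_embedding([0, 3]).expand(4, -1)
opt.zero_grad()
loss = nn.functional.mse_loss(predictor(torch.cat([obs, z], dim=-1)), target)
loss.backward()
opt.step()

# Test time: an unseen combination, or a decomposition, is formed by pure
# arithmetic on the learned embeddings, with no further training.
z_novel = composite_embedding([1, 3])
z_decomposed = composite_embedding([0, 3]) - task_emb(torch.tensor(3))
```

The point of the sketch is the last two lines: once the representations are trained to respect addition, novel compositions and decompositions of tasks are obtained by summing and subtracting embeddings.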