eScholarship
Open Access Publications from the University of California

Developing Object Permanence from Videos

Abstract

Humans learn that temporarily occluded objects continue to exist within the first months of their lives. Deep learning models, on the other hand, struggle to generalize such concepts from observations, due to missing proper inductive biases. Here, we introduce the first self-supervised interpretable machine learning model that learns about object permanence directly from video data without supervision. We augment a slot-based autoregressive deep learning system with the ability to adaptively and selectively fuse latent imaginations with pixel-based observations into consistent object-specific ‘what’ and ‘where’ encodings over time. We show that (i) Loci-Looped tracks objects through occlusions and anticipates their reappearance while outperforming state-of-the-art baseline models, (ii) Loci-Looped shows signs of surprise when the principle of object permanence is violated, and (iii) Loci-Looped's internal latent loop is key for learning object permanence.
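The fusion of latent imaginations with pixel-based observations can be pictured with a small sketch. This is not the Loci-Looped implementation: the function names (`fuse_slot`, `predict`, `rollout`), the two-number slot state, the constant-velocity predictor, and the hand-set gate values are all assumptions made for illustration. In the model described above the gate is learned and adaptive; here it is simply supplied per time step, with 1.0 meaning "trust the observation" and 0.0 meaning "rely on the model's own prediction", as one might expect while an object is occluded.

```python
from typing import Callable, List, Sequence

State = List[float]


def fuse_slot(observed: Sequence[float], imagined: Sequence[float], gate: float) -> State:
    """Blend one slot's observed and imagined codes (hypothetical helper).

    gate = 1.0 keeps the observation, gate = 0.0 keeps the imagination.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    if len(observed) != len(imagined):
        raise ValueError("codes must have the same length")
    return [gate * o + (1.0 - gate) * i for o, i in zip(observed, imagined)]


def predict(state: Sequence[float]) -> State:
    """Toy 'imagination': constant-velocity motion for a [position, velocity] state."""
    position, velocity = state
    return [position + velocity, velocity]


def rollout(
    initial: Sequence[float],
    observations: Sequence[Sequence[float]],
    gates: Sequence[float],
    predictor: Callable[[Sequence[float]], State] = predict,
) -> List[State]:
    """Autoregressive loop: imagine the next state, then fuse it with the observation."""
    state = list(initial)
    trajectory = []
    for observed, gate in zip(observations, gates):
        imagined = predictor(state)
        state = fuse_slot(observed, imagined, gate)
        trajectory.append(state)
    return trajectory


# The object is visible at steps 1 and 4 and occluded at steps 2 and 3,
# where the observation carries no information (zeros) and the gate is closed.
observations = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [4.0, 1.0]]
gates = [1.0, 0.0, 0.0, 1.0]
trajectory = rollout([0.0, 1.0], observations, gates)
print([state[0] for state in trajectory])  # [1.0, 2.0, 3.0, 4.0]
```

With the gate closed during occlusion, the tracked position keeps advancing from the internal prediction alone, so the estimate already sits where the object reappears at step 4.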
