eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Relational Inductive Biases for Visual-Semantic Embeddings

Abstract

Neural networks have been shown to be effective at learning rich low-dimensional representations of high-dimensional data such as images and text. Many recent works have also used neural networks to learn a common embedding between data of different modalities, specifically between images and their textual descriptions, a task commonly referred to as learning visual-semantic embeddings. This is typically achieved using separate encoders for images and text, trained with a contrastive loss. Inspired by recent work in relational reasoning and graph neural networks, this work studies the effect of a relational inductive bias on the quality of the learned visual-semantic embeddings. Training and evaluation are done using caption-to-image and image-to-caption retrieval on the MS-COCO dataset.
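
The contrastive objective and the retrieval evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not specify the exact loss, so the bidirectional sum-of-hinges ranking loss below is an assumption, as are the function names (`contrastive_loss`, `recall_at_k`), the margin value, and the simplification that each image is paired with exactly one caption (row i of the image embeddings matches row i of the caption embeddings). The encoders themselves are omitted; the inputs stand in for their outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def contrastive_loss(img_emb, txt_emb, margin=0.2):
    # Hypothetical sum-of-hinges ranking loss (an assumed form, not
    # necessarily the one used in the thesis).
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    sim = img @ txt.T                # sim[i, j]: image i vs. caption j
    pos = np.diag(sim)               # similarities of the matching pairs
    # Image-to-caption: a wrong caption j should score below the match for image i.
    cost_i2t = np.maximum(0.0, margin + sim - pos[:, None])
    # Caption-to-image: a wrong image i should score below the match for caption j.
    cost_t2i = np.maximum(0.0, margin + sim - pos[None, :])
    # Matching pairs are not negatives, so zero out the diagonal.
    np.fill_diagonal(cost_i2t, 0.0)
    np.fill_diagonal(cost_t2i, 0.0)
    return float(cost_i2t.sum() + cost_t2i.sum())


def recall_at_k(sim, k):
    # Rows are queries; the correct item for query i is assumed to be column i.
    # The rank is the number of items scored strictly higher than the match.
    pos = np.diag(sim)[:, None]
    ranks = (sim > pos).sum(axis=1)
    return float((ranks < k).mean())
```

For image-to-caption retrieval, `recall_at_k` would be called on the similarity matrix as computed; for caption-to-image retrieval, on its transpose. On MS-COCO each image has several captions, so a real evaluation would need a mapping from queries to sets of correct items rather than the one-to-one diagonal assumed here.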
