
Preserved Structure Across Vector Space Representations

Abstract

Certain concepts, words, and images are intuitively more similar than others (dog vs. cat, dog vs. spoon), though quantifying such similarity is notoriously difficult. Indeed, this kind of computation is likely a critical part of learning the category boundaries for words within a given language. Here, we use a set of 27 items (e.g. 'dog') that are highly common in infants' input, and use both image- and word-based algorithms to independently compute similarity among them. We find three key results. First, the pairwise item similarities derived within image-space and word-space are correlated, suggesting preserved structure among these extremely different representational formats. Second, the closest 'neighbors' for each item, within each space, showed significant overlap (e.g. both found 'egg' as a neighbor of 'apple'). Third, items with the most overlapping neighbors are later-learned by infants and toddlers. We conclude that this approach, which does not rely on human ratings of similarity, may nevertheless reflect stable within-class structure across these two spaces. We speculate that such invariance might aid lexical acquisition, by serving as an informative marker of category boundaries.
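
The abstract outlines a three-part analysis: compute pairwise similarities among the 27 items within each space, correlate those similarities across spaces, and measure per-item nearest-neighbor overlap. The sketch below illustrates that pipeline under stated assumptions: the embeddings are random placeholders (the abstract does not specify the image- and word-based algorithms or their dimensionalities), and the neighbor count k and the use of Spearman correlation and cosine similarity are illustrative choices, not the paper's confirmed settings.

```python
# Minimal sketch of the cross-space analysis described in the abstract.
# Assumptions: random placeholder embeddings stand in for the paper's
# word- and image-derived vectors; cosine similarity, Spearman correlation,
# and k = 5 neighbors are illustrative choices.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_items, k = 27, 5
word_vecs = rng.normal(size=(n_items, 300))    # placeholder word-space vectors
image_vecs = rng.normal(size=(n_items, 2048))  # placeholder image-space vectors

# Pairwise cosine similarity within each space (1 minus cosine distance).
word_sim = 1 - squareform(pdist(word_vecs, metric="cosine"))
image_sim = 1 - squareform(pdist(image_vecs, metric="cosine"))

# Result 1: correlate the two sets of pairwise similarities, using only
# the upper triangle so the diagonal and duplicate pairs are excluded.
iu = np.triu_indices(n_items, k=1)
rho, p = spearmanr(word_sim[iu], image_sim[iu])
print(f"cross-space similarity correlation: rho={rho:.3f}, p={p:.3g}")

# Result 2: per-item overlap between each space's k nearest neighbors.
def top_k_neighbors(sim, i, k):
    order = [j for j in np.argsort(-sim[i]) if j != i]  # sort by similarity, drop self
    return set(order[:k])

overlap = [
    len(top_k_neighbors(word_sim, i, k) & top_k_neighbors(image_sim, i, k)) / k
    for i in range(n_items)
]
print(f"mean neighbor overlap: {np.mean(overlap):.3f}")
```

With real word and image embeddings in place of the random vectors, the same two statistics correspond to the abstract's first two results; relating the per-item overlap scores to infant age-of-acquisition norms would correspond to the third.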
