
Leveraging context for perceptual prediction using word embeddings

Abstract

Pre-trained word embeddings have been used successfully to represent words in semantic NLP tasks. However, there is continued debate over whether they encode useful information about the perceptual qualities of concepts, and previous research has shown mixed performance when embeddings are used to predict these qualities. Here, we tested whether performance could be improved by providing an informative context. To this end, we generated decontextualised (“charcoal”) and contextualised (“the brightness of charcoal”) word2vec and BERT embeddings for a large set of concepts and compared their ability to predict human ratings of the concepts’ brightness. We repeated this procedure to probe the shape of the same concepts, finding that shape can be predicted better than brightness. We consider the potential advantages of using context to probe specific aspects of meaning, including those currently thought to be poorly represented by language models.
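
To illustrate the kind of pipeline the abstract describes, the sketch below extracts decontextualised and contextualised BERT embeddings for a handful of concepts and regresses hypothetical human brightness ratings on them. This is not the authors’ code: the model name, the context template, the placeholder ratings, and the ridge-regression evaluation are all illustrative assumptions, and the word2vec arm of the comparison is omitted.

```python
# Minimal sketch (not the authors' code) of the comparison described in the
# abstract: decontextualised vs. contextualised BERT embeddings used to
# predict human brightness ratings. The model, template, and ratings below
# are placeholder assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def concept_embedding(concept, template=None):
    """Average the subword embeddings of `concept`, either presented alone
    (decontextualised) or inside a context template such as
    'the brightness of {concept}' (contextualised)."""
    text = template.format(concept) if template else concept
    n_pieces = len(tokenizer.tokenize(concept))
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    # The concept ends the phrase, so its subword tokens sit just before [SEP].
    return hidden[-(n_pieces + 1):-1].mean(dim=0)

# Hypothetical concepts and human brightness ratings (placeholders only).
ratings = {"charcoal": 1.2, "snow": 6.8, "lemon": 6.1, "coal": 1.4, "milk": 6.5}

for label, template in [("decontextualised", None),
                        ("contextualised", "the brightness of {}")]:
    X = torch.stack([concept_embedding(c, template) for c in ratings]).numpy()
    y = list(ratings.values())
    # Leave-one-out ridge regression: how well does the embedding
    # predict the human rating?
    scores = cross_val_score(Ridge(alpha=1.0), X, y,
                             cv=len(ratings),
                             scoring="neg_mean_absolute_error")
    print(label, -scores.mean())
```

The same loop could be repeated with a shape-probing template (e.g. “the shape of {}”) and shape ratings to mirror the paper’s second analysis.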
