Word prediction is more than just predictability: An investigation of core vocabulary

Abstract

What words are central in our semantic representations? In this experiment, we compared the core vocabularies derived from different association-based and language-based distributional models of semantic representation. Our question was: what kinds of words are easiest to guess given the surrounding sentential context? This task strongly resembles the prediction tasks on which distributional language models are trained, so core words from distributional models might be expected to be easier to guess. Results from 667 participants revealed that people's guesses were affected by word predictability, but that aspects of their performance could not be explained by distributional language models and were better captured by association-based semantic representations.
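As a rough illustration of the kind of predictability measure at issue, below is a minimal sketch of scoring how guessable a word is in context using a pretrained causal language model. The paper does not specify its models or tooling: GPT-2 via the HuggingFace `transformers` library is an illustrative stand-in, and `word_surprisal` is a hypothetical helper, not the authors' method.

```python
# A sketch, not the paper's procedure: estimate a word's surprisal
# (negative log probability, in bits) given its left context, using
# GPT-2 as a stand-in distributional language model.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def word_surprisal(context: str, word: str) -> float:
    """Surprisal, in bits, of `word` given the preceding `context`."""
    ctx_ids = tokenizer.encode(context)
    word_ids = tokenizer.encode(" " + word)  # leading space matches GPT-2's BPE
    input_ids = torch.tensor([ctx_ids + word_ids])
    with torch.no_grad():
        logits = model(input_ids).logits  # (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    # The logits at position i predict the token at position i + 1,
    # so the first word token is scored from the last context position.
    nats = sum(
        -log_probs[0, len(ctx_ids) + i - 1, tok].item()
        for i, tok in enumerate(word_ids)
    )
    return nats / math.log(2)  # convert nats to bits

# A highly predictable continuation should receive low surprisal:
print(word_surprisal("The cat sat on the", "mat"))
```

Lower surprisal corresponds to higher predictability from sentential context; the paper's finding is that this kind of score tracks part, but not all, of what makes a word easy for people to guess.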
