eScholarship
Open Access Publications from the University of California

What Do Computers Know About Semantics Anyway? Testing Distributional Semantics Models Against a Broad Range of Relatedness Ratings

Creative Commons 'BY' version 4.0 license
Abstract

Distributional Semantics Models (DSMs) are a primary method for distilling semantic information from corpora. However, a key question remains: What types of semantic relations do DSMs detect? Prior work has addressed this question using a limited set of ratings that typically are either amorphous (association norms) or restricted to semantic similarity (SimLex, SimVerb). We tested four DSMs (SkipGram, CBOW, GloVe, PPMI) using multiple hyperparameters on a theoretically motivated, rich set of relations involving words from multiple syntactic classes spanning the abstract-concrete continuum (21 sets of ratings). Results show wide variation in the DSMs’ ability to account for the ratings, and that hyperparameter tuning buys comparatively little for improving correlations. For CBOW and SkipGram, we included word and context embeddings. For SkipGram, there was a marked improvement in simulating the human data by averaging them. Our results yield important insights into the types of semantic relations that are captured by DSMs.
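
The abstract's idea of averaging word and context embeddings and then comparing model scores with human ratings can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which similarity measure or correlation statistic was used (cosine similarity and Spearman rank correlation are common choices and are used here), and the vectors, word pairs, and ratings are invented toy values, not data from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_embeddings(word_vecs, context_vecs):
    """Element-wise mean of each word's word vector and context vector."""
    return {
        w: [(a + b) / 2.0 for a, b in zip(word_vecs[w], context_vecs[w])]
        for w in word_vecs
    }

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy embeddings (NOT from the paper): one word vector and one
# context vector per word, as a SkipGram-style model would provide.
word_vecs = {
    "dog": [1.0, 0.2, 0.0], "cat": [0.9, 0.3, 0.1],
    "bone": [0.2, 1.0, 0.1], "idea": [0.0, 0.1, 1.0],
}
context_vecs = {
    "dog": [0.8, 0.6, 0.0], "cat": [0.7, 0.2, 0.2],
    "bone": [0.6, 0.9, 0.0], "idea": [0.1, 0.0, 0.9],
}

# Hypothetical relatedness ratings for four word pairs (invented values).
pairs = [("dog", "cat"), ("dog", "bone"), ("cat", "idea"), ("bone", "idea")]
ratings = [6.5, 5.8, 1.2, 1.0]

averaged = average_embeddings(word_vecs, context_vecs)
scores_word = [cosine(word_vecs[a], word_vecs[b]) for a, b in pairs]
scores_avg = [cosine(averaged[a], averaged[b]) for a, b in pairs]

rho_word = spearman(scores_word, ratings)
rho_avg = spearman(scores_avg, ratings)
print("word vectors only:", rho_word)
print("averaged vectors: ", rho_avg)
```

With real data, the same comparison would be repeated for each of the 21 rating sets and each model configuration. The toy numbers here say nothing about which representation performs better.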
