
Which Sentence Embeddings and Which Layers Encode Syntactic Structure?

Licensed under a Creative Commons Attribution (CC BY) 4.0 license.
Abstract

Recent models of language have eliminated syntactic-semantic dividing lines. We explore the psycholinguistic implications of this development by comparing different types of sentence embeddings in their ability to encode syntactic constructions. Our study uses contrasting sentence structures known to cause syntactic priming effects, that is, the tendency in humans to repeat sentence structures after recent exposure. We compare how syntactic alternatives are captured by sentence embeddings produced by a neural language model (BERT) or by the composition of word embeddings (BEAGLE, HHM, GloVe). Dative double object vs. prepositional object and active vs. passive sentences are separable in the high-dimensional space of the sentence embeddings and can be classified with a high degree of accuracy. The results lend empirical support to the modern, computational, integrated accounts of semantics and syntax, and they shed light on the information stored at different layers in deep language models such as BERT.
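The abstract describes probing BERT's layers by classifying sentence embeddings of contrasting constructions. Below is a minimal sketch of such a layer-wise probe. The pooling strategy (mean over token vectors), the classifier (logistic regression), and the example sentence pairs are all illustrative assumptions of this sketch, not the paper's reported setup.

    # Layer-wise probe: can embeddings from each BERT layer separate
    # active vs. passive sentences? (sketch; pooling, classifier, and
    # data are assumptions, not the paper's exact method)
    import torch
    from transformers import BertTokenizer, BertModel
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained(
        "bert-base-uncased", output_hidden_states=True
    )
    model.eval()

    # Toy contrast set: active (0) vs. passive (1) versions of the same events.
    sentences = [
        ("The chef cooked the meal.", 0),
        ("The meal was cooked by the chef.", 1),
        ("The dog chased the cat.", 0),
        ("The cat was chased by the dog.", 1),
    ]

    def layer_embedding(sentence, layer):
        """Mean-pool one hidden layer's token vectors into a sentence embedding."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        # hidden_states: tuple of (input embeddings + 12 layers), each [1, seq, 768]
        return outputs.hidden_states[layer][0].mean(dim=0).numpy()

    for layer in range(1, 13):
        X = [layer_embedding(s, layer) for s, _ in sentences]
        y = [label for _, label in sentences]
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=2).mean()
        print(f"layer {layer:2d}: CV accuracy = {acc:.2f}")

A real probe would use many sentence pairs and held-out test data; with this toy set the accuracies are not meaningful, but the loop shows how per-layer separability of a syntactic contrast can be measured.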
