
Evidence Against Syntactic Encapsulation in Large Language Models

Creative Commons Attribution (CC BY) 4.0 License
Abstract

Transformer large language models (LLMs) perform exceptionally well across a variety of linguistic tasks. These models represent relationships between words in a sentence via “attention heads”, which assign weights between different words. Some attention heads automatically learn to “specialize” in identifying particular syntactic dependencies. Are syntactic computations in such heads encapsulated from non-syntactic information? Or are they penetrable to external information, as in the human mind, where information sources such as semantics influence parsing from the earliest moments? Here, we tested whether syntax-specialized attention heads in two LLMs (BERT, GPT-2) are modulated by the semantic plausibility of their preferred dependency. In 6 out of 7 cases, we found that implausible sentences reduced attention between the words constituting a head's preferred dependency. Therefore, even in the heads that are the best candidates for syntactic encapsulation, syntax is penetrable to semantics. These data are broadly consistent with the integration of syntax and semantics in the human mind.
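As a concrete illustration of the kind of measurement described above, the sketch below shows how the attention weight between the two words of a dependency can be read out from a single, pre-chosen head of BERT for a plausible versus an implausible sentence, using the Hugging Face transformers library. The layer and head indices, the example sentences, and the helper function are illustrative assumptions, not the authors' stimuli or analysis code.

```python
# Minimal sketch (not the authors' code): probe one attention head of BERT by
# extracting the weight that a dependent word assigns to its head word, then
# comparing a plausible and an implausible version of the same dependency.
# LAYER, HEAD, and the sentences are hypothetical choices for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

LAYER, HEAD = 8, 5  # hypothetical indices of a head specialized for, e.g., direct objects

def dependency_attention(sentence: str, dependent: str, head_word: str) -> float:
    """Return the attention weight from `dependent` to `head_word` in the chosen head."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    # out.attentions is a tuple with one tensor per layer,
    # each of shape (batch, num_heads, seq_len, seq_len)
    attn = out.attentions[LAYER][0, HEAD]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    i = tokens.index(dependent)   # assumes each word is a single wordpiece, for simplicity
    j = tokens.index(head_word)
    return attn[i, j].item()

# Same verb-object dependency, plausible vs. implausible object.
plausible = dependency_attention("the boy ate the cake", "cake", "ate")
implausible = dependency_attention("the boy ate the sofa", "sofa", "ate")
print(f"plausible: {plausible:.3f}  implausible: {implausible:.3f}")
```

In a full analysis one would average such weights over many sentence pairs per head; the sketch only shows where the relevant attention values live in the model's output.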
