Evidence Against Syntactic Encapsulation in Large Language Models

Abstract

Transformer large language models (LLMs) perform exceptionally well across a variety of linguistic tasks. These models represent relationships between words in a sentence via “attention heads”, which assign weights between different words. Some attention heads automatically learn to “specialize” in identifying particular syntactic dependencies. Are syntactic computations in such heads encapsulated from non-syntactic information? Or are they penetrable to external information, like the human mind, in which information sources such as semantics influence parsing from the earliest moments? Here, we tested whether syntax-specialized attention heads in two LLMs (BERT, GPT-2) are modulated by the semantic plausibility of their preferred dependency. In 6 out of 7 cases, we found that implausible sentences reduce attention between the words constituting a head’s preferred dependency. Therefore, even in the heads that are the best candidates for syntactic encapsulation, syntax is penetrable to semantics. These data are broadly consistent with the integration of syntax and semantics in human minds.
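As a rough illustration of the kind of measurement described above (not the authors' code), the attention weight that a particular head assigns between the two words of a dependency can be read out with the HuggingFace transformers library by requesting the model's attention tensors. The layer and head indices and the example sentences below are placeholders, not the specific syntax-specialized heads or stimuli identified in the paper; the sketch also assumes each word of interest maps to a single WordPiece token.

```python
# Minimal sketch: compare a head's attention between two words in a
# plausible vs. an implausible sentence (hypothetical layer/head indices).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

def dependency_attention(sentence, from_word, to_word, layer, head):
    """Attention weight from `from_word` to `to_word` at a given layer/head."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.attentions: tuple over layers, each (batch, heads, seq, seq)
    attn = outputs.attentions[layer][0, head]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # Assumes each word is a single token (no subword splitting).
    i = tokens.index(from_word.lower())
    j = tokens.index(to_word.lower())
    return attn[i, j].item()

# Hypothetical example: attention from an object noun to its verb.
plausible = dependency_attention("The chef chopped the onion.", "onion", "chopped", layer=7, head=9)
implausible = dependency_attention("The chef chopped the idea.", "idea", "chopped", layer=7, head=9)
print(f"plausible: {plausible:.3f}  implausible: {implausible:.3f}")
```

Under the paper's reported pattern, attention between the dependency's words would tend to be lower in the implausible condition, though any particular sentence pair and head need not show the effect.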
