eScholarship
Open Access Publications from the University of California

Revealing Long-term Language Change with Subword-incorporated Word Embedding Models

Abstract

We propose an augmented word embedding model that better incorporates subword information, with additional parameters that characterize the semantic weights of characters in composing words. Our model reveals some interesting patterns of long-term change in the Chinese language, providing novel evidence and methodology that enrich existing theories in evolutionary linguistics. The resulting word vectors also perform decently on NLP-related tasks.
