Skip to main content
eScholarship
Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Previously Published Works bannerUC Santa Barbara

Unsupervised morphological segmentation in a language with reduplication

Abstract

We present an extension of the Morfessor Baseline model of unsupervised morphological segmentation (Creutz and Lagus, 2007) that incorporates abstract templates for reduplication, a typologically common but computationally underaddressed process. Through a detailed investigation that applies the model to Māori, the Indigenous language of Aotearoa New Zealand, we show that incorporating templates improves Morfessor’s ability to identify instances of reduplication, and does so most when there are multiple minimally-overlapping templates. We present an error analysis that reveals important factors to consider when applying the extended model and suggests useful future directions.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View