Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Phonotactic probability in Amharic: a psycholinguistic and computational investigation


In this dissertation, the robustness of the relationship between the lexical frequency of phonotactic patterns and word acceptability is examined for words of Amharic, an understudied Semitic language. The patterns under investigation span the whole verb root and include both under-represented and over-represented consonant distributions in the lexicon. A state-of-the-art probabilistic model, the Maximum Entropy phonotactic learner, is used to acquire a phonotactic grammar from the input (the lexicon), and the predictions of that grammar are compared with the results of two Amharic nonce-word rating tasks designed specifically to investigate a range of consonantal phonotactic patterns. The first task investigates consonant co-occurrence patterns (homorganic consonants, identical consonants, and fricatives). In the Amharic verb lexicon, identical consonants are under-represented in some locations and over-represented in others, whereas homorganic consonants and fricatives (a previously unknown pattern independently acquired by the model) are under-represented. The phonotactic learner successfully learned the under-represented patterns, and the comparison between the model predictions and the experimental results shows evidence for a relationship between lexical frequency and word acceptability for under-representation. However, speaker judgments show no preference for over-representation. The second task examines the distribution of single consonants within the verb root with respect to under-representation, over-representation and positional restrictions. Evidence for a relationship between lexical frequency and phonotactic probability was observed for both under-represented and over-represented consonants, but tied to a particular location. The correlation between speaker judgments and model predictions is low for this task, due in part to the way the model deals with over-representation.
This investigation demonstrates not only that word acceptability is influenced by phonotactic probability for both under-represented and over-represented patterns, but also that probabilistic models can be used to investigate the phonotactics of a language, even in the absence of speaker judgment data. These models can therefore be used to assess the phonotactics of languages where experimental data is difficult to obtain and to broaden our knowledge of phonotactic typology.
