eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Towards grammaticality and fluency: characterizing and correcting ESL errors using dictionary random walks and other means

Abstract

Extending Park and Levy (2011), we present two novel classes of noisy channel models that address verb infinitive/present participle confusion and word choice production errors in text written by English as a second language (ESL) authors. In our word choice model, the primary contribution of this work, we model the English word choices made by ESL authors as a random walk across an undirected bipartite dictionary graph whose edges connect English words to associated words in the author's native language. We use cascades of weighted finite-state transducers (wFSTs) to model language model priors, verb form confusion and random walk-induced noise, and observed sentences; we learn model parameters with expectation maximization (EM). Additionally, we explore the use of online EM for model training. We show that such models can make intelligent verb form and dictionary-based word substitutions that improve grammaticality and fluency in an unsupervised setting.
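
To make the dictionary random walk idea concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a toy English-to-native dictionary with invented entries, uniform edge weights, a single two-step walk (English word to native word and back to English), and a hypothetical `walk_prob` parameter for the probability of taking the walk at all. The thesis itself learns its parameters with EM over wFST cascades, which this sketch does not attempt.

```python
from collections import defaultdict

# Toy English -> native-language dictionary.
# These entries are hypothetical and exist only for illustration.
TOY_DICTIONARY = {
    "big": ["grande"],
    "large": ["grande"],
    "great": ["grande", "genial"],
    "cool": ["genial"],
}


def build_reverse(dictionary):
    """Build the native -> English side of the undirected bipartite graph."""
    reverse = defaultdict(list)
    for english, natives in dictionary.items():
        for native in natives:
            reverse[native].append(english)
    return dict(reverse)


def channel_distribution(intended, dictionary, walk_prob=0.2):
    """Return P(observed | intended) under one two-step walk.

    With probability 1 - walk_prob the intended word is kept. Otherwise the
    walk moves to a uniformly chosen native-language neighbor and then to a
    uniformly chosen English neighbor of that word (assumed uniform weights).
    """
    reverse = build_reverse(dictionary)
    dist = defaultdict(float)
    dist[intended] += 1.0 - walk_prob
    natives = dictionary[intended]
    for native in natives:
        back = reverse[native]
        for english in back:
            dist[english] += walk_prob / (len(natives) * len(back))
    return dict(dist)


if __name__ == "__main__":
    for word, prob in sorted(channel_distribution("great", TOY_DICTIONARY).items()):
        print(f"{word}: {prob:.4f}")
```

In a noisy channel setting, a distribution of this kind would play the role of the noise model, to be combined with a language model prior when choosing the most likely intended word for each observed word.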
