Skip to main content
Open Access Publications from the University of California

Department of Statistics, UCLA

Department of Statistics Papers bannerUCLA

Genomewide Motif Identification Using a Dictionary Model


This paper surveys and extends models and algorithms for identifying binding sites in non-coding regions of DNA. These sites control the transcription of genes into messenger RNA in preparation for translation into proteins. We summarize the underlying biology, review three different models for binding site identification, and present a unified model that borrows from the previous models and integrates their main features. We then describe maximum likelihood and maximum a posteriori algorithms for fitting the unified model to data. Finally, we conclude with a prospectus of future data analyses and theoretical research.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View