Open Access Publications from the University of California

Department of Linguistics

UC Berkeley


With the first linguistics department to be established in North America (in 1901), Berkeley has a rich and distinguished tradition of rigorous linguistic documentation and theoretical innovation, making it an exciting and fulfilling place to carry out linguistic research. Its original mission, due to the anthropologist Alfred Kroeber and the Sanskrit and Dravidian scholar Murray B. Emeneau, was the recording and describing of unwritten languages, especially American Indian languages spoken in California and elsewhere in the United States. The current Department of Linguistics continues this tradition, integrating careful, scholarly documentation with cutting-edge theoretical work in phonetics, phonology, and morphology; syntax, semantics, and pragmatics; psycholinguistics; sociolinguistics and anthropological linguistics; historical linguistics; typology; and cognitive linguistics.

There are 1397 publications in this collection, published between 1957 and 2023.
University of California, Berkeley, Miscellaneous Papers and Publications (15)
Dissertations, Department of Linguistics (278)
Publications of the Survey of California and Other Indian Languages (109)
Proposals from the Script Encoding Initiative (133)

Proposal to Encode the Takri Script in ISO/IEC 10646

This is a proposal to encode the Takri script in the international character encoding standard Unicode. This script was published in Unicode Standard version 6.1 in January 2012. Takri was used in northern India and surrounding countries in South Asia. It was the writing system for the Chambeali and Dogri languages, as well as Jaunsari, Kulvi, and Mandeali. It was the official script in a number of states of north and northwestern India from the 17c until the mid-20c, when it was gradually replaced by Devanagari.

Proposal for encoding the Lepcha script in the BMP of the UCS

This is a proposal to encode the Lepcha script in the international character encoding standard Unicode. The script was published in Unicode Standard version 5.1 in March 2008. Lepcha, or Rong, is the name of the Sino-Tibetan language spoken in Sikkim and West Bengal and the script used to write it. The script is said to have been invented about 1720 CE.

Proposal to encode the Old Sogdian script in Unicode

This is a proposal to encode the Old Sogdian script into the Unicode Standard. The script was published in Unicode version 11.0 in 2018. Old Sogdian was used in the Kultobe inscriptions, the 'Ancient Letters' of Dunhuang, China, the upper Indus inscriptions dated from 4-7c CE, and on inscribed coins and vessels from the area of Chach (modern Tashkent, Uzbekistan).

Open Access Policy Deposits (344)

On the Pre-Columbian origin of Proto-Omagua-Kokama

Cabral (1995, 2007, 2011) and Cabral and Rodrigues (2003) established that Kokama and Omagua, closely related indigenous languages spoken in Peruvian and Brazilian Amazonia, emerged as the result of intense language contact between speakers of a Tupí-Guaraní language and speakers of non-Tupí-Guaraní languages. Cabral (1995, 2007) further argued that the language contact which led to the development of Kokama and Omagua transpired in the late 17th and early 18th centuries, in the Jesuit mission settlements located in the provincia de Maynas (corresponding roughly to modern northern Peruvian Amazonia). In this paper I argue that Omagua and Kokama were not the product of colonial-era language contact, but were rather the outcome of language contact in the Pre-Columbian period. I show that a close examination of 17th and 18th century missionary chronicles, Jesuit texts written in Omagua and Kokama, and modern data on these languages makes it clear that Omagua and Kokama already existed in a form similar to their modern forms by the time European missionaries arrived in Maynas in the 17th century. Moreover, I show that several key claims regarding ethnic mixing and Jesuit language policy that Cabral adduces in favor of a colonial-era origin for Kokama are not supported by the available historical materials. Ruling out a colonial-era origin for Omagua and Kokama, I conclude that Proto-Omagua-Kokama, the parent language from which Omagua and Kokama derive, was a Pre-Columbian contact language.
