eScholarship
Open Access Publications from the University of California

Department of Linguistics

Proposals from the Script Encoding Initiative, UC Berkeley

The “Proposals from the Script Encoding Initiative” series contains documents that propose scripts and characters for inclusion in the international standard Unicode. All documents have received technical review by the Unicode Technical Committee, and were funded (wholly or in part) by the Script Encoding Initiative in the Department of Linguistics at UC Berkeley or by its NEH-sponsored counterpart, the Universal Scripts Project.

Most of the scripts and characters in these documents have been published in the Unicode Standard. Occasionally the names, glyphs, or code points have been changed from the proposal document when published in the Unicode Standard. Comments or corrections to the contents of the proposals should be sent to the proposal author, who can be contacted through Deborah Anderson in the Dept. of Linguistics.

Proposal to add the Tangsa Script in the SMP

(2021)

This is a proposal to add the Tangsa script to the international encoding standard Unicode. The Tangsa script (ISO 15924: Tnsa) is used for writing the Tangsa languages (ISO 639-3: nst), which are spoken in Arunachal Pradesh, India, and in the Sagaing Region of Myanmar. The script was created in 1990 by Mr. Lakhum Mossang.

The characters are scheduled to be published in Unicode Standard version 14.0 in September 2021.

Financial support was provided by the National Endowment for the Humanities for the Universal Scripts Project (PR-253360-17), part of the Script Encoding Initiative at UC Berkeley.

Final proposal to encode Old Uyghur

(2020)

This is a proposal to encode the Old Uyghur script in the international standard Unicode. The script is scheduled to be published in the Unicode Standard version 14.0 in September 2021. The Old Uyghur script (ISO 15924: Ougr) flourished between the 8th and 17th centuries CE. Though originally used to write medieval Turkic languages, it was later used for writing languages such as Chinese, Mongolian, Tibetan, and Arabic.

Financial support was provided by the National Endowment for the Humanities for the Universal Scripts Project (PR-253360-17), part of the Script Encoding Initiative at UC Berkeley, and the Unicode Adopt-a-Character program. The Unicode Consortium hosts the document registry that contains this proposal. 

Proposal for encoding the Vithkuqi script in the SMP of the UCS

(2020)

This is a proposal to add the Vithkuqi script (sometimes referred to as Veqilharxhi, Büthakukye, or Beitha Kukju) to the international character encoding standard Unicode. The script was devised between 1824 and 1845 but did not take hold in the latter part of the 19th century. However, there are efforts in the 21st century to revive the script for artistic and cultural purposes. The Vithkuqi script (ISO 15924: Vith) was used to write the Albanian language (ISO 639-3: sqi).

The script is scheduled to be published in Unicode Standard version 14.0 in September 2021. Note: A number of the recently invented characters have not been approved yet. 

Financial support was provided by the National Endowment for the Humanities (PR-253360-17) for the Universal Scripts Project, part of the Script Encoding Initiative at UC Berkeley. The Unicode Consortium hosts the document registry that contains this proposal.

Final proposal to encode the Cypro-Minoan script in the SMP (WG2 N5135)

(2020)

This is a proposal to encode the Cypro-Minoan script in the international character encoding standard Unicode. The Cypro-Minoan script is an undeciphered syllabary used on Cyprus and in surrounding areas during the Late Bronze Age (ca. 1550-1050 BCE).

The script is scheduled to be published in Unicode Standard version 14.0 in September 2021. Note: An updated chart (31 Dec. 2020) with one additional character is located at: https://www.unicode.org/L2/L2020/20156r-n5137r-cyprominoan-font.pdf. The repertoire in this latter document should be used as a reference point until the publication of the script in Unicode 14.0.

Financial support was provided by the National Endowment for the Humanities (PR-253360-17) for the Universal Scripts Project, part of the Script Encoding Initiative at UC Berkeley. The Unicode Consortium hosts the document registry that contains this proposal.

Arabic additions for Quranic orthographies  

(2019)

This is a proposal to add 39 Arabic Quranic characters to the international character encoding standard Unicode. The characters are used to represent text in minority orthographies, and many were contained in earlier documents submitted by others (cited on page 1 of this proposal). The characters are scheduled to be published in Unicode Standard version 14.0 in September 2021. A few modifications have been made to the names or locations of characters in Unicode, so users should check the code charts when the Unicode Standard is published. The charts will be accessible at: https://www.unicode.org/charts/.

Proposal for encoding the Toto script in the SMP of the UCS

(2019)

This is a proposal to encode the Toto script in the international character encoding standard Unicode. The script is scheduled to be published in Unicode Standard version 14.0 in September 2021. The script is used to write the Toto language, which is spoken in a village in India near Bhutan. The ISO 15924 code is Toto.

Proposal to encode the Chorasmian script in Unicode

(2019)

This is a proposal to encode the Chorasmian script in the international character encoding standard Unicode. The script was published in Unicode Standard version 13.0 in 2020. The script was used to write the Chorasmian language from the 2nd century BCE until the 8th-9th century CE in the area of the Oxus river delta, located in present-day Uzbekistan, Kazakhstan, and Turkmenistan. Chorasmian is an extinct Eastern Iranian language. The ISO 15924 code for Chorasmian is Chrs.

Proposal for encoding the Yezidi script in the SMP of the UCS

(2019)

This is a proposal to encode the Yezidi script in the international character encoding standard Unicode. The script was published in Unicode Standard version 13.0 in 2020. The script was used to write the Kurmanji language in the Kurdistan region and surrounding areas. It was used to write two important manuscripts and was revived in 2013 by the Spiritual Council of Yezidis in Georgia.

Proposal to encode Dives Akuru in Unicode

(2018)

This is a proposal to encode the Dives Akuru script in the international character encoding standard Unicode. The script was published in Unicode Standard version 13.0 in 2020. The script was used to write the Dhivehi language in the Maldives from the 9th to the 20th century.

Proposal to encode the Elymaic script in Unicode

(2017)

This is a proposal to encode the Elymaic script in the international character encoding standard Unicode. The script was published in Unicode Standard version 12.0 in March 2019. The script was used to write Achaemenid Aramaic from the 2nd century BCE to the 3rd century CE in the area of present-day southwest Iran.