Open Access Publications from the University of California


Himalayan Linguistics is a free peer-reviewed web journal and archive devoted to the study of the languages of the Himalayas. It includes the series Languages and Peoples of the Eastern Himalayan Region, which incorporates the North East Indian Linguistics (NEIL) volumes.

Himalayan Linguistics



Burushaski and unique Slavic isoglosses

Comparative historical studies have established over five hundred lexical correspondences between autochthonous Burushaski words and Indo-European, as well as significant grammatical correlations. A genetic relationship has been proposed. Within these correspondences, the correlations of Burushaski with Slavic together with other branches are numerous and regular. These are not the subject of this paper. We concentrate exclusively on Burushaski isoglosses with words or meanings found uniquely in Slavic, which consequently often have unclear, difficult or competing etymologies. The stratification of these isoglosses is complex, and we may be dealing with several layers. In some cases, the phonetic and formal makeup suggests a correlation of remote antiquity, yet in many instances it is difficult to establish a chronology. Most of the isoglosses involve cultural borrowing, with the direction of borrowing unclear, but a significant number (notably the considerable correspondences in the names of body parts and in grammatical particles) may point to a closer genetic relationship.

Sub-grouping Kho-Bwa based on shared core vocabulary

Tianshin Jackson Sun (Sun 1992, Sun 1993) was the first to suggest the phylogenetic relatedness of a number of highly divergent, endangered and poorly described languages of Western Arunachal Pradesh, later named the ‘Kho-Bwa cluster’ by Van Driem (2001). In this paper, we make use of predominantly new data from our own field work, covering a total of 22 linguistic varieties. In a list of 100 lexical entries, cognate roots were tagged, and a pairwise “cognacy percentage” was then computed, which forms the basis for a hierarchical cluster analysis. The result of this analysis and some further considerations confirm earlier reported views of a phylogenetic relationship between these languages. The appendix contains the full data set with cognacy statements. All computer code is available and documented on GitHub (

  • 1 supplemental file
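The computation described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the variety names (`V1` to `V4`), the five concepts, and the cognate class numbers are invented toy data, and average linkage is assumed here only because the abstract does not say which hierarchical clustering method was used.

```python
from itertools import combinations

# Toy data (hypothetical): variety -> {concept: cognate class id}.
# Two varieties share a cognate for a concept when the ids match.
DATA = {
    "V1": {"hand": 1, "water": 1, "fire": 1, "stone": 1, "dog": 1},
    "V2": {"hand": 1, "water": 1, "fire": 1, "stone": 1, "dog": 2},
    "V3": {"hand": 2, "water": 1, "fire": 2, "stone": 2, "dog": 3},
    "V4": {"hand": 2, "water": 1, "fire": 2, "stone": 3, "dog": 4},
}


def cognacy_percentage(a, b):
    """Percentage of concepts attested in both varieties that are cognate."""
    shared = [c for c in a if a[c] is not None and b.get(c) is not None]
    if not shared:
        return 0.0
    same = sum(1 for c in shared if a[c] == b[c])
    return 100.0 * same / len(shared)


def pairwise_matrix(data):
    """Cognacy percentage for every unordered pair of varieties."""
    return {
        frozenset((x, y)): cognacy_percentage(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


def average_linkage(data):
    """Agglomerative clustering: repeatedly merge the most similar clusters.

    Similarity between clusters is the mean pairwise cognacy percentage
    (average linkage, an assumption of this sketch).
    Returns the merges in order as (cluster_a, cluster_b, similarity).
    """
    sims = pairwise_matrix(data)
    clusters = [(name,) for name in sorted(data)]
    merges = []

    def sim(c1, c2):
        vals = [sims[frozenset((x, y))] for x in c1 for y in c2]
        return sum(vals) / len(vals)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: sim(clusters[p[0]], clusters[p[1]]),
        )
        merges.append((clusters[i], clusters[j], sim(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    for left, right, similarity in average_linkage(DATA):
        print(left, right, similarity)
```

With this toy data, `V1` and `V2` merge first (80% cognacy), then `V3` and `V4` (60%), and the two groups join last at 20%. A real analysis would use the 100-item list and the 22 varieties from the paper's appendix.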

Segmenting and POS tagging Classical Tibetan using a memory-based tagger

This paper presents a new approach to two challenging NLP tasks in Classical Tibetan: word segmentation and Part-of-Speech (POS) tagging. We demonstrate how both problems can be approached in the same way, by generating a memory-based tagger that assigns 1) segmentation tags and 2) POS tags to a test corpus consisting of unsegmented lines of Tibetan characters. We propose a three-stage workflow and evaluate the results of both the segmentation and the POS tagging tasks. We argue that the Memory-Based Tagger (MBT) and the proposed workflow not only provide an adequate solution to these NLP challenges but are also highly efficient tools for building larger annotated corpora of Tibetan.
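Treating segmentation as a tagging problem can be sketched as follows. This is an illustration only: the two-tag scheme (`B` for a word-initial syllable, `I` for a word-internal one) is an assumption and may differ from the tag set used in the paper, and the example uses transliterated syllables rather than Tibetan script. A tagger trained on such pairs predicts one tag per syllable, and the word boundaries are then read back off the tags.

```python
def words_to_tags(words):
    """Encode a segmented line as (syllable, tag) pairs.

    `words` is a list of words, each a list of syllables.
    Tag "B" marks the first syllable of a word, "I" any later one
    (a hypothetical scheme chosen for this sketch).
    """
    pairs = []
    for word in words:
        for position, syllable in enumerate(word):
            pairs.append((syllable, "B" if position == 0 else "I"))
    return pairs


def tags_to_words(pairs):
    """Decode (syllable, tag) pairs back into a list of words."""
    words = []
    for syllable, tag in pairs:
        if tag == "B" or not words:
            words.append([syllable])  # start a new word
        else:
            words[-1].append(syllable)  # continue the current word
    return words


if __name__ == "__main__":
    # Illustrative transliterated input: two disyllabic words.
    segmented = [["bkra", "shis"], ["bde", "legs"]]
    tagged = words_to_tags(segmented)
    print(tagged)
    print(tags_to_words(tagged))
```

Because the encoding is reversible, the same per-syllable prediction machinery that assigns POS tags can also be used to assign segmentation tags.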

Re-evaluation of the evidential system of Lhasa Tibetan and its atypical functions

This paper describes the specific contexts in which evidentials may be used in Lhasa Tibetan. I first give a brief presentation of the notion of evidentiality in interaction with pragmatics, which has been the main framework used for describing Lhasa Tibetan. Then I re-evaluate the analysis of the evidential verb system in Lhasa Tibetan. I show that there are indeed eight evidentials, and I focus only on the first six: egophoric, sensorial, factual, inferential, mnemic and self-corrective with controllable verbs. The other two evidentials are the quotative and the hearsay particles.

Then, I focus on the specific functions of the evidentials with controllable verbs. I present the use of the intentional egophoric with the non-SAP and controllable verbs when the speaker refers to personal knowledge and I also discuss some of its restrictions. Then, I present the specific uses of the sensorial, factual and inferential evidentials.

King’s pig: A story in Lhagang Tibetan with a grammatical annotation on a narrative mode

The article primarily provides one full narrative, King’s pig, with a grammatical annotation, in Lhagang Tibetan, a dialect of Minyag Rabgang Khams spoken in the easternmost Tibetosphere, i.e., Kangding Municipality, Ganzi Prefecture, Sichuan Province, China. It also analyses a basic narrative construction and its differences from general speech, and shows that the narrative mode has an additional strategy for evidential expressions and TAM marking that is observed neither in general conversation nor in elicitation. This implies that a reference grammar of this language needs different descriptions depending on style.