Caijia: Cross-dialectal documentation of a highly endangered language in Guizhou Province of China
Depositor: Lü Shanshan, Wang Jian, Li Lan
The deposited data cover several dialects of Caijia spoken in villages in Weining and Hezhang counties in Guizhou Province, China.
The data were collected by three trained linguists, Li Lan, Wang Jian, and Lü Shanshan, together with community members themselves.
The collection created within this project concerns speakers of Caijia. Many local gazetteers and official historical documents record the Caijia people as descendants of the State of Cai (1045-447 BC). Although the Caijia people were officially designated Caijia Miao 'Hmong of Caijia', their customs are very different from those of the Hmong. The Caijia people nevertheless remain an unidentified ethnic group at present, and they do not possess an independent ethnic identity.
Language information
Caijia 蔡家 or meŋ²¹ni⁵⁵ŋoŋ⁵⁵ is an under-described and critically endangered language spoken by fewer than 1,000 people (1982 figure) in the northwest of Guizhou 贵州 province in southwestern China, mainly in Hezhang 赫章 and Weining 威宁 counties of Bijie 毕节 prefecture. Its genetic affiliation remains unclassified. Hu (2013) considers it to be a Chinese dialect, while Hsiu (2017) proposes that it belongs to the Tibeto-Burman branch. Zhengzhang (2010) confirms a strong connection between Caijia and Old and Middle Chinese. The only certainty is that Caijia is a Sino-Tibetan language.
Caijia is a tonal and typically isolating language. There are four tones: 21, 33, 55, and 24. It possesses very little derivational morphology. Most words are monosyllabic, and compounding is the major strategy of word formation. Neither parts of speech nor grammatical relations are reflected morphologically; grammatical relations are instead coded by word order and by prepositions (for oblique arguments), which are often derived from verbs, as in Chinese. The basic word order is SVO. However, OSV (object topicalisation) and SOV (the object marking construction) commonly occur as well. Oblique arguments occur both pre- and postverbally.
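The tone notation in the autonym above can be illustrated with a short sketch. It assumes that tones are written as superscript digits directly after each syllable, as in meŋ²¹ni⁵⁵ŋoŋ⁵⁵; this may not match the conventions used in the deposited transcriptions. The function names `split_syllables` and `unknown_tones` are hypothetical and are not part of the collection or of any tool named here.

```python
import re

# The four tone contours listed in the description above.
CAIJIA_TONES = {"21", "33", "55", "24"}

# Map Unicode superscript digits to plain ASCII digits.
SUPERSCRIPT_TO_ASCII = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")

# Assumption: a syllable is a run of segments followed by superscript tone digits.
SYLLABLE = re.compile(r"([^⁰¹²³⁴⁵⁶⁷⁸⁹\s]+)([⁰¹²³⁴⁵⁶⁷⁸⁹]+)")


def split_syllables(transcription):
    """Return (segments, tone) pairs for a tone-marked transcription."""
    return [
        (segments, tone.translate(SUPERSCRIPT_TO_ASCII))
        for segments, tone in SYLLABLE.findall(transcription)
    ]


def unknown_tones(transcription):
    """Return tone values that are not among the four listed tones."""
    return [
        tone
        for _, tone in split_syllables(transcription)
        if tone not in CAIJIA_TONES
    ]


autonym = "meŋ²¹ni⁵⁵ŋoŋ⁵⁵"
print(split_syllables(autonym))  # [('meŋ', '21'), ('ni', '55'), ('ŋoŋ', '55')]
print(unknown_tones(autonym))    # []
```

A check of this kind could flag tone values outside the four listed contours, which may indicate a typo or a dialectal difference that deserves a closer look.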
The deposited data, collected over three years, include: 120 hours of video, 60 hours of audio, IMDI metadata, 12 hours of ELAN (fully annotated), 2 hours of ELAN (transcriptions), 45 minutes of Praat, a collection of stories with annotated transcription and literal and free translations into English and Chinese (PDF/A), a sketch grammar, and a dictionary.
Acknowledgement and citation
Users of any part of the collection should acknowledge Li Lan, Wang Jian and Lü Shanshan as the data collectors and researchers. Users should also acknowledge the Endangered Languages Documentation Programme (ELDP) as the funder of the project. Individual speakers whose words and/or images are used should be acknowledged by name. Any other contributor who has collected, transcribed or translated the data or was involved in any other way should be acknowledged by name. All information on contributors is available in the metadata.
Please cite the corpus as follows when referring to any data from it:
Li, Lan, Wang Jian & Lü Shanshan. 2018-2021. An archive of Caijia. London: SOAS, Endangered Languages Archive.