Comprehensive documentation and archiving of Teleut, Eushta-Chat, and Melets Chulym
Depositor: Andrey Filchenko, Denis Tokmashev, Valeriya Lemskaya
The primary archiving output of the project is language data in the form of media products, which best meets the current demands of language documentation. The collection of media files consists of video, audio and the corresponding metadata. Approximately 50 hours of multimedia materials (video and audio) were recorded for each language during the project. Video recording has been prioritized as a more representative record of culturally specific communication patterns: it captures not only the traditional linguistic modality but also the wider multimodal aspects of communication (speech situations, speaker positioning, gesture, facial expression, etc.).
It is expected that by the end of the project, at least 20% of the recorded data (narrations, tales, songs, etc.) will receive full interlinearization and free translation using FLEx, and will be integrated with multimedia formats using ELAN, both of which the depositors are familiar with. It is furthermore expected that the resulting corpora of Teleut, Eushta-Chat, and Melets Chulym will consist of no fewer than 12,000-15,000 words for each language, representing authentic, spontaneous, interlinearized texts with culturally appropriate content, cross-referenced with the respective video and/or audio recordings (where available). Each recorded speech event will be provided with appropriate metadata detailing the participants, events, locations and technical specifics (date; place; speaker name, date of birth and brief relevant sociolinguistic background; collector; titles, keywords, etc. describing the recorded event).
Metadata are collected and archived operationally as text in uniform tables (MS Excel, .xls) and then integrated with the language data into the archive framework using Arbil. When available, photographs, pictures, maps, etc. complement the textual metadata.
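The uniform metadata table described above can be pictured with a minimal sketch. It assumes one row per recorded speech event and uses the fields listed in the text. The class name `SessionMetadata`, the column names, the helper `to_table` and the placeholder values are all hypothetical illustrations, not the depositors' actual template, and CSV stands in here for the .xls tables the project uses.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SessionMetadata:
    """One row of a uniform metadata table (hypothetical column names)."""
    title: str
    date: str
    place: str
    speaker_name: str
    speaker_birth_date: str
    sociolinguistic_background: str
    collector: str
    keywords: str


def to_table(records):
    """Serialize records to CSV text with one header row and one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(SessionMetadata)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Placeholder values only; they do not describe any real session in the deposit.
example = SessionMetadata(
    title="PLACEHOLDER title",
    date="YYYY-MM-DD",
    place="PLACEHOLDER village",
    speaker_name="PLACEHOLDER speaker",
    speaker_birth_date="YYYY",
    sociolinguistic_background="PLACEHOLDER background",
    collector="PLACEHOLDER collector",
    keywords="narrative; village life",
)
table = to_table([example])
print(table)
```

Keeping every session in the same fixed set of columns is what makes the later import into an archive tool straightforward, whichever spreadsheet format is actually used.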
At present, all three languages/dialects of the project remain among the most neglected of the Turkic languages of Siberia. The adjacent Shor, Khakas, Chulym-Turkic and Tofa have already been the subjects of both Russian and international scholarly projects, whereas Teleut, Eushta-Chat and Melets Chulym are just as highly endangered but less documented. Urbanization and the destruction of traditional land-use practices, the outflow of youth from villages to the cities, and the lack of a clear policy in the field of language education all lead to a constant narrowing of the scope and function of the three languages as languages of daily communication. The communities live in settled villages of the common Russian type, usually not traditional in appearance. Domestic life likewise follows the pattern common throughout Siberia. Traditional practices occur only rarely, and the groups appear to be quite assimilated due to the overall economic and social situation, as well as to mixed marriages with other groups (other Turkic ethnic groups, some local Uralic ethnic groups, Russians, Ukrainians, Belarusians, etc.). Based on previous experience and recent contacts (ELDP SG 0277), there is local demand and readiness to cooperate in the documentation and preservation of these endangered languages and cultures: establishing electronic resources, particularly in modern multimedia data formats, and preserving and analyzing natural discourse and communication patterns. During the ELDP SG 0277 project in Teleut communities in 2014, it was observed that one impact of the project was raised awareness of the indigenous Turkic languages of this area, their level of documentation and description, their degree of endangerment, the need for their urgent documentation, and the value of integrating them into current debates in linguistics and anthropology.
The ELDP SG 0277 Teleut documentation pilot project was one of the first projects of its kind. It is strongly anticipated that the impact will be more considerable once data that are more representative in size and diversity have been collected, processed and made accessible. The Teleut pilot project enjoyed the growing interest and support of the local community and of activists in language and culture preservation and revival (the local school, the local library, individual representatives of the Teleut community). Among the project consultants were local educators and senior schoolchildren, who organized extracurricular activities involving Teleut language and culture. These children, young Teleut speakers, were involved in the project as consultants and displayed interest in the project activities and readiness to train in basic techniques and methods of language documentation and archiving under the supervision and auspices of the local school and library. These and other representatives of the communities (Eushta-Chat, Melets Chulym) expressed their desire to cooperate further on documentation at all stages and in all activities: recording data and metadata, transcription and annotation, archiving (locally and centrally), and using archived data to produce applied materials (reference, pedagogical, etc.). Furthermore, representatives of the adjacent Siberian Turkic communities, Eushta-Chat (Tomsk Tatars) and Melets Chulym, expressed interest in the project and the documentation program. Thus, the project has the support of the respective communities: the Teleut community of Kemerovo region, the Eushta-Chat communities and the Melets Chulym community. As for Teleut, one of the co-applicants, Dr. Denis Tokmashev, is an ethnic Teleut who is actively involved with the community and enjoys family and local support in Teleut documentation efforts.
This collection is of special importance for a number of reasons.
First, it presents a comprehensive documentation of a critically endangered language, Melets Chulym, which has fewer than 10 fluent speakers. The collectors have tried to document as many spheres of language use as possible and to make the material accessible to ELAR users for further research.
Second, the majority of videos recorded for Melets Chulym show speakers of Melets Chulym talking to each other, something that has rarely been recorded so far.
Third, no comprehensive documentation of Eushta Chat has been made so far. This project is the first to do so using currently accepted documentation methods.
Moreover, the collection will be considerably enriched both by recordings from 2017 and 2018 and by digitized cassettes recorded in the 1970s and stored in the Tomsk State Pedagogical University archives, which the public has had no access to since they were recorded.
Some educational materials for the Melets Chulym language that were collected and processed previously are being re-checked within the project (a Russian-Melets Chulym dictionary accompanied by a grammar reference and text samples with translation). Community representatives have long been asking academia to prepare such materials. A copy of the book will be uploaded upon completion and/or publication.
The majority of bundles in this collection are audio-video recordings. There are ca. 12 hours of video for Teleut, 16 hours for Eushta Chat, and 16.5 hours for Melets Chulym, all recorded in 2016. The recordings include narratives, conversations (monologues and polylogues) and interviews of various genres (stories, personal narratives, historical narratives, fictional narratives, songs and folk poetry, friendly talks, and others).
There are also 8 hours of Eushta Chat and 28.5 hours of Melets Chulym audio, which include recordings of field work on language peculiarities and structure (Russian-Turkic translation, discussions of the languages and community history, ethnographic information and metadata recorded in Russian, and more).
All the speakers have given their consent for the recordings to be shared with academia.
As of May 2018, there are interlinearized and translated ELAN transcriptions for 1 hour of recordings for each language. ELAN transcriptions and interlinearizations will continue to be added as they are processed.
The deposit consists of three parts: Teleut, Eushta Chat and Melets Chulym. Teleut and Melets Chulym were previously recorded briefly by the team under field trip and small grants:
FTG0135: the deposit was submitted to ELAR on schedule in October 2008. The deposit is a sample data collection including monologic and dialogic speech events, autobiographical narratives, jokes, brief exchanges, humorous songs and stories from village life, in video, audio and graphic formats, with partial morphological annotation (glossing) of approx. 50% of the texts. The data were supplemented by digital metadata in IMDI format. Some parts of the corpus were annotated using ELAN. Language and metadata formats mostly comply with ELAR guidelines.
SG0277: completed in December 2014. The deposit was submitted to ELAR on schedule in December 2014 and January 2015. It includes monologues and dialogues, autobiographical narratives, jokes, brief exchanges, humorous songs and biographical stories from village life, and contains over 220 sessions totaling over 22 hours of recording, over 25% of which is fully interlinear-glossed using FLEx and integrated in ELAN format, with the respective metadata integrated in Arbil format.
In addition, the collectors (Denis Tokmashev and Valeriya Lemskaya) are adding the personal collections they made while working on their Russian doctoral theses and postdoctoral research (since 2008 and 2005 respectively). It should also be mentioned that Denis Tokmashev's late father was a native speaker of Teleut and a language activist, whose personal archive has been used and will be added to the project deposit.
None of the data in this collection may be used as evidence in court.
Acknowledgement and citation
Users of any part of the collection should acknowledge Andrey Filchenko as the principal investigator and Denis Tokmashev and Valeriya Lemskaya as the data collectors and researchers. Users should also acknowledge the Endangered Languages Documentation Programme as the funder of the project. Individual speakers whose words and/or images are used should be acknowledged by name. Any other contributor who has collected, transcribed or translated the data or was involved in any other way should be acknowledged by name. All information on contributors is available in the metadata.
To refer to any data from the corpus, please cite the corpus in this way:
Filchenko, Andrey. 2016-2019. Comprehensive documentation and archiving of Teleut, Eushta-Chat, and Melets Chulym: three areally adjacent critically endangered Turkic languages of Siberia. London: SOAS, Endangered Languages Archive. [insert deposit URL here]. Accessed on [insert date here].