An Audiovisual Corpus of Caquinte (Arawak)


Language: Caquinte (Arawak)
Depositor: Zachary O'Hagan
Location: Peru
Deposit Id: 0413
Grant id: IGS0282
Funding body: ELDP
Level: Deposit

Summary of deposit

Caquinte is an Arawak language of southeastern Peruvian Amazonia. This collection represents the general documentation of Caquinte in the form of audiovisual recordings of historical, autobiographical, and mythological texts; lexical and grammatical elicitation; interviews; conversations; meetings; etc. Some of these materials were originally written by consultants; others were delivered orally and transcribed. The result as of December 2018 is a segmented, glossed FLEx corpus of approximately 9,750 lines, most of which is translated into Spanish, with additional transcriptions of some texts in ELAN.

The data was collected by Zachary O'Hagan as part of the research for his PhD in linguistics at the University of California, Berkeley, beginning in 2011, and regularly from 2014 forward (before the granting period). All fieldwork was carried out in the community of Kitepampani, and the principal consultants were Antonina Salazar Torres, Joy Salazar Torres, Emilia Sergio Salazar, and Miguel Sergio Salazar.

Group represented
Caquinte tradition has it that all Caquintes originally lived at the mouth of the Pogeni River. Perhaps around the middle of the 19th century, they fled to the headwaters of this river due to intense conflicts with Ashaninkas, who continued to raid these upriver settlements well into the 20th century. The deeds of famous warriors such as Taatakini, known for their bravery against Ashaninkas, date from this period.

Caquintes seem to have remained isolated in the Pogeni headwaters until the 1950s, when some began to migrate over the hills into the headwaters of the Mipaya and Huitiricaya rivers. Upon the death of a prominent warrior, Shankentini, in about 1959, one extended Caquinte family moved to the Matsigenka community of Puerto Huallana (Picha River), recently formed by the Summer Institute of Linguistics (SIL). SIL undertook an expedition to the upper Pogeni in 1969, and in 1975 to the Ageni River. At the latter location, the family that had moved to Puerto Huallana returned to clear an area for a community that would become Kitepampani, where SIL members resided from the following year. The Caquintes who founded Kitepampani encouraged many of their relatives in the Pogeni basin to move, and many did. Beginning in the early 1980s, the concentration of Caquintes in Kitepampani began to radiate outwards, eventually resulting in the founding of the set of communities where Caquintes live today.

The fieldwork on which this documentation project is based was conducted in Kitepampani, which as of December 2018 has a little over 100 residents. Since 2006 the petrochemical company Repsol has been operating in Caquinte territory, resulting in significant changes in material culture in the form of cash, outside goods, cement homes, and a health post. Over a similar period, the municipality of Echarate and, since 2016, that of Megantoni have undertaken the construction of a primary school, a community center, and a system of running water. Everyone who lives in Kitepampani speaks either Caquinte or the related Matsigenka as their native and daily language.

Language information

The Caquinte language belongs to the Kampa branch of the Arawak language family. It is spoken in Peru by a few hundred people in some half-dozen communities in the headwaters of the Pogeni River (Junín region), Mipaya River (Cuzco region), and Huitiricaya River (Cuzco region). Depending on the community, the daily language of any given household may be Caquinte or related Ashaninka and/or Matsigenka.

Caquinte is a polysynthetic, head-marking, largely agglutinative language, with remarkably complex verbal morphology. Basic word order is VSO, with preverbal positions available for topics and foci. Nouns can be categorized according to gender and alienability, but, unlike in other Kampa languages, not animacy.

Special characteristics
This collection is the only archival collection of materials related to Caquinte in the world. Of special note is the large FLEx corpus, which allows easy searching for lexical and grammatical patterns. The focus on mythological texts and interviews about traditional lifeways serves as crucial documentation of traditional Caquinte cultural practices at a time of rapid cultural change.

Deposit contents
This deposit is focused on audiovisual recordings, approximately 48 hours of .wav files and 26.5 hours of .mp4 files in the genres described in the deposit summary above. In addition, there are 5 hours of transcription in the form of .eaf files; an .xml export of a FLEx database of approximately 9,750 segmented and glossed lines (with most translated into Spanish) and 3,450 headwords; and some field notes.

Deposit history
Depositor Zachary O'Hagan first visited Kitepampani for a one-week pilot trip in September 2011, returning for annual 8- to 12-week periods beginning in 2014. Field trips in 2014 and 2015 were funded by an Oswalt Endangered Language grant administered by the Survey of California and Other Indian Languages at the University of California, Berkeley; field trips in 2016, 2017, and 2018 were funded by ELDP. Early documentation focused on processing written versions of traditional stories, at the request of speakers, some of whom liked to record read versions at the end. Later, the number of audio and video recordings increased, as did the range of genres, as described in the preceding summary of the deposit. The main focus of data processing has been on segmentation, glossing, and translation of texts in FieldWorks Language Explorer (FLEx). December 2018 is the date of the first deposit with ELAR.

Other information
An equivalent deposit, and one that will be developed further with materials beyond the 2018 field season, is available via the Survey of California and Other Indian Languages, here:

Acknowledgement and citation

Users should acknowledge Zachary O'Hagan as the original researcher and depositor of any of the materials contained in this collection. Use of the materials in this collection is strictly for non-commercial purposes only. This deposit is part of an active, ongoing research project. The depositor requests that researchers interested in aspects of this collection for linguistic research contact him directly at They are strongly encouraged to consult the digital catalogue of the Survey of California and Other Indian Languages for more up-to-date information regarding this research project and collection: Citation for the latter collection is available at this link.

Please cite this ELAR collection as:

O'Hagan, Zachary. 2018. Caquinte Field Materials. London: SOAS, Endangered Languages Archive. URL: Accessed on [insert date].




Zachary O'Hagan
Affiliation: University of California, Berkeley


Showing 1 - 10 of 103 Items

The speaker tells a story about Woolly Monkey and Anteater.

Recorded on: 2018-07-20

The speaker tells a story about the 'ashibanti' spirit, associated with shamans, and a woman.

Recorded on: 2018-08-10

Two videos, with separate accompanying audio files, illustrating the views of two Caquinte women regarding the local activities of Repsol.

Recorded on: 2016-08-05

Inalienable (classificatory) nouns; lexicon; focus; demonstratives and markers -tika, =ga, =Npani; interrogatives with second-position clitics; meaning of clitics =sa, =sakanika, =te; contrastive topic =ga.

Recorded on: 2017-08-19

July 11: meaning and ordering of second-position clitics =kea, =mpa, =sa, =sakanika, =tari, =te; distribution and ordering of contrastive topic =ga and ostensive -tika; interaction of some of these markers with demonstratives. July 15: verbal affix ordering. July 18: verbal affix ordering; lexical elicitation. July 21: ordering of second-position clitics =geti, =ka, =me, =sano, =shia, =ta, =tari; meaning of =gitatsi, =ha, =ka, =kea, =satine. July 31: verbal affix ordering; meaning of -amaNpeg, -na, =kea; distribution of focus pronouns; questions regarding traditional house construction, beliefs about death and the afterlife. August 3: different series of pronouns; lexical elicitation; beliefs regarding the places Kamameniari, Kobirichaigirini, Tsonkatagaroni, Kompiroshiato. August 5: lexical elicitation. August 10: contrastive topic =ga; lexical elicitation. August 14: lexical elicitation based on Jaame Ontsajigero Otsapapae (the draft of a book of stories produced by the Ministry of Education). August 18: meaning of verbal suffixes -ima, -giha; affix ordering; lexical elicitation; questions regarding fire fans.

Recorded on: 2017-07-11

Topics range over the entire lexicon.

Recorded on: 2014-07-06

Topics focus on flora and fauna.

Recorded on: 2015-08-07

Recordings made in the researcher's temporary lodgings while the consultant was a student at the Universidad Nacional de San Antonio Abad del Cusco.

Recorded on: 2011-08-29

Reviews the meaning of and reasoning behind Caquinte names collected during genealogy interviews; recorded in the home of JSS.

Recorded on: 2016-09-08

July 15: focus; contrastive topic; quantification; "always" and "never"; negation with -tsi; =ta; "-sati" pronouns; -ki vs. -panajaN; -itsi vs. -it. July 18: contrastive topic. July 27: review of unknown vocabulary from early recordings made by Kenneth Swift. July 29: semantics of different second-position clitics.

Recorded on: 2018-07-15