The Tocharians, the mysterious people that travelled more than 4000 km and ended up in a desert

The Takla Makan desert in Western China is in the middle of nowhere. Being there feels more like having landed on a deserted Tatooine than on Earth; most villages are very sparsely populated, and sand rocks, red desert sand, and dried salt rivers dominate the surroundings. The climate is horrible: winters are freezing and summers extremely hot and dry; springs and autumns are endurable, but temperatures between day and night often differ by 30 °C. In a village called Subashi I met a villager who had spent 20 years digging a well (by hand, I assume, considering the many years he had spent on the project). The well was obviously very deep, but it contained no water.
Nevertheless, French and German expeditions 100 years ago found the remnants of an Indo-European language in the sand-filled grottoes of this desert. The language, which was wrongly labelled ‘Tocharian’, after an Iranian tribe mentioned by the ancient Greeks, turned out to represent a branch of its own on the large Indo-European tree. In recent years, research has revealed new and interesting knowledge of this mysterious people, how they lived, where they came from, and what their language looked like.
During the first millennium CE, the Tocharian civilization flourished along the Silk Road. By that time, Tocharian had split into two languages, which for the sake of simplicity are labelled Tocharian A and Tocharian B. In important respects, Tocharian culture was not very different from other early medieval civilizations of the East: it had a warrior class, a nobility, royals, farmers, and a religious class of monks, who lived on alms given by the working population. The Tocharians were Buddhists and learned to write from Buddhist missionaries from India, and the system they used to write their language was an adaptation of the Indic Brahmi script. Accordingly, most texts, which date from between 300 and 1100 CE, are Buddhist in content. A large part of the literary sources are Tocharian adaptations of the Indian Buddhist canon, although parallels in Sanskrit cannot always be found. After the Islamic conquest of Central Asia and the closing of the Silk Road, the Tocharian kingdoms collapsed, the Tocharian language died out, the area was depopulated, and the desert sand quickly buried all traces of the Tocharian people and their language.
Even though our texts in Tocharian are of a relatively late date, at least compared to the ancient civilizations of the Mediterranean or the Fertile Crescent, archaeology, archaeogenetics and, most of all, language give us rich information about the prehistory of the Tocharians.
It is evident that the Tocharians left the Indo-European homeland very early and migrated towards the East. Even though Tocharian is a centum language and actually has more similarities with western than eastern Indo-European languages, it clearly forms its own branch on the Indo-European tree. The early separation from the Indo-European proto-language, together with a long period of isolation from other Indo-European languages, has resulted in two languages with highly unusual and complex structures. The languages have many case forms, like Uralic and Caucasian languages, and they have double causatives, like Turkic languages. But even though the Tocharian categories clearly show non-Indo-European influence in their typological structure, the inflectional forms themselves are all of Indo-European descent: the verbal system easily matches Greek or Sanskrit in its complexity and variety of forms. Most forms and categories reconstructed for Indo-European are there, but often in a reorganized structure and with changed use and meaning.
Even though most preserved texts are Buddhist in content, the language and the specifically Tocharian version of Buddhism show many traces of a pre-Buddhist, pagan faith, not very different from what we assume was present in early Indo-European. We have a sun-god and a moon-god, as well as remnants of the so-called heroic myths and the concept of ‘eternal glory’, which is well represented in epic tales such as the Iliad, the Odyssey, or the Mahabharata.
Tocharians borrowed words from the Turkic Uighur language, from Chinese, and from Sanskrit, the last in large amounts: almost half of the Tocharian lexicon has its source in Sanskrit. Uighur also borrowed from Tocharian. However, if we move back in time, Tocharian also borrowed a substantial amount of vocabulary, often administrative terms, from Iranian. From about 500 BCE onwards, Tocharian seems to have been mainly a recipient language, which indicates that Tocharian during this period was a less important regional language than, for instance, Chinese (in the East) or various Iranian languages (in the West). If we look further back than that, we find interesting and striking language contacts of Tocharian. Early forms of Tocharian words are found in Uralic languages, and very likely a pre-form of Tocharian is responsible for the Indo-European borrowings into Early Chinese. Therefore we may assume that the Tocharians had a more important cultural role in the archaic period than in the antique period, when they were mainly on the receiving end of language borrowing.

The archaeological record of the Tocharian-speaking area is astonishingly rich: most famous are the well-preserved mummies, which look like Celts with their pointy hats, tattoos and red braids. Studies of their DNA indicate several origins: in the earlier layers mainly Western European haplogroups, in later layers predominantly Central Asian or Eastern haplogroups. The patrilineal DNA is mainly R1a1, a haplogroup associated with the Proto-Indo-European migration out of Eastern Europe.
However, many enigmas still await a solution. One of the most complex issues is the large number of obscure lexemes in Tocharian. Even though the core vocabulary of Tocharian is completely Indo-European, most words of the lexicon (except for the many Sanskrit borrowings, of course) have either no etymology or a very uncertain etymology. It is possible that the Tocharians borrowed words from a long-lost substrate language, but what would that be? There are few traces of significantly different cultures in the area preceding the Tocharians. Alternatively, the Tocharians may have picked up words from several extinct, unrelated languages of Eurasia on their way from Eastern Europe to the Takla Makan desert. Very few reliable etymologies in Tocharian can be traced to any of the living language families of Asia.
Coming up next: Heroic, lethal, or filthy animal? The history of pig words
References: (Adams 2013; Carling 2005; Carling et al. 2009; Mallory and Mair 2000; Malzahn 2011-2018; Pinault 2008)
Adams, Douglas Q. (2013), Dictionary of Tocharian B: Revised and Greatly Enlarged (Amsterdam: Rodopi).
Carling, Gerd (2005), 'Proto-Tocharian, Common Tocharian, and Tocharian – on the value of linguistic connections in a reconstructed language', in Karlene Jones-Bley, et al. (eds.), Proceedings of the Sixteenth Annual UCLA Indo-European Conference (Journal of Indo-European Studies - Monograph Series; Washington: Institute of Man), 47-70.
Carling, Gerd, Pinault, Georges-Jean, and Winter, Werner (2009), Dictionary and thesaurus of Tocharian A (Wiesbaden: Otto Harrassowitz).
Mallory, J. P. and Mair, Victor H. (2000), The Tarim mummies: ancient China and the mystery of the earliest peoples from the West (London: Thames & Hudson).
Malzahn, Melanie (2011-2018), CEToM - A Comprehensive Edition of Tocharian Manuscripts.
Pinault, Georges-Jean (2008), Chrestomathie tokharienne : textes et grammaire (Leuven: Peeters).