Tocharian: An Indo-European language from China

Michaël Peyrot

Tocharian is a language that was spoken in the Tarim Basin in the northwest of present-day China (Xīnjiāng region, north of Tibet). In the middle of the Tarim Basin there is a large desert, which is surrounded by several oases and enclosed by high mountain ranges. Tocharian is an Indo-European language, related to Latin, Greek, Celtic, and, among many others, English. A few examples suffice to illustrate this: mātär ‘mother’; pātär ‘father’; protär ‘brother’; ñem ‘name’; ṣkas ‘six’; keu ‘cow’.

Michaël Peyrot studied Comparative Indo-European Linguistics and Dutch Language and Literature at Leiden University, where he also defended his PhD thesis The Tocharian subjunctive in 2010 (published 2013 with Brill). From 2011 to 2014, he worked at the University of Vienna for A comprehensive edition of Tocharian manuscripts. He then moved to Berlin for his Marie Curie project Niya Tocharian: language contact and prehistory on the Silk Road (2014–2016) at the Berlin-Brandenburg Academy of Sciences and Humanities. His NWO-funded VIDI project Tracking the Tocharians from Europe to China: a linguistic reconstruction at Leiden University runs from 2016 to 2021.

Today, the Tocharian language is extinct. How is it known at all? It is attested in paper manuscripts that have been found on the northern edge of the Tarim Basin, in the territory of the former city-states Kuča, Yānqí and Turfan. These manuscripts, dating from 500–1000 CE, have been preserved to the present day thanks to the extremely arid desert climate in the region. Nevertheless, the pieces that survive are only fragments of a Tocharian literature that must once have been quite substantial.

The number of manuscript fragments can be estimated at 9,000 for variety “B”, originally from Kuča, but also found in Yānqí and Turfan, and 2,000 for variety “A”, originally from Yānqí, but also found in Turfan.
However, these are mainly small pieces of larger leaves: only a couple of hundred leaves are completely preserved, and these are mostly just a single leaf of a larger text.

In order to decipher the content of the fragmentary manuscripts, better-preserved parallels in other languages are crucial. Fortunately, these do in many cases exist: Tocharian literature is almost entirely Buddhist. Buddhism arose in what is today northern India and Nepal, in the 6th century BCE. When Emperor Aśoka, who reigned over almost the entire Indian subcontinent, made Buddhism the state religion in the 3rd century BCE, it spread far beyond its place of origin. From Gandhāra in present-day northern Pakistan, it then expanded northwest into Afghanistan, where it flourished in the Kushan Empire, as well as north into the Tarim Basin, from where it spread further into central China.

The fact that anything is known at all about Tocharian is due entirely to the spread of Buddhism into the Tarim Basin. Not only can the texts be deciphered thanks to parallel texts in other languages; Buddhism was also the reason why Tocharian and several other languages of the region were written down in the first place. Initially, Buddhist literature was not written in the local languages, but only in the Middle Indian language of Gandhāra, Gāndhārī. The transmission of the texts must also, to a large extent, have been oral. From the middle of the first millennium CE onwards, texts were written down in the local vernaculars. These were Tocharian A and B in the northeast of the Tarim Basin, the Iranian language Khotanese in the southwest of the Tarim Basin, and later also Tumšuqese, related to Khotanese, in the northwest. All four languages are written in a variety of Brāhmī, a family of Indian scripts.
Parallel to texts in the local languages, Sanskrit Buddhist texts were produced, as Sanskrit had replaced Gāndhārī as the language of Buddhism in the region.

With all Tocharian Buddhist literature set in India, it comes as no surprise that the Tocharian language contains many words that are borrowed from Sanskrit. Almost the entire lexicon of religious terms is Sanskrit, and in most cases these words are easily recognisable because they contain letters that otherwise do not occur in native Tocharian words, such as th, d and dh, which must in normal spoken Tocharian all have been pronounced as t. Words of this type are e.g. Tocharian B bodhisātve ‘bodhisattva’ (an enlightened being who is to become a Buddha) and brāhmaṇe ‘brahmin’ (a member of the class of priests). Only some of the basic religious concepts are expressed with indigenous terms, such as pelaikne ‘law’ (Sanskrit dharma) and yāmor ‘act, fate’ (Sanskrit karma). Some words cannot come from Sanskrit, but point to a Gāndhārī source. These were apparently borrowed before Sanskrit became dominant. An example is ṣamāne ‘monk’, which goes back to Gāndhārī ṣamana, not to Sanskrit śramaṇa.

Before the arrival of Buddhism and Indian culture, Tocharian was also influenced by other languages. The most important among these were Iranian languages. Iranian is a large language family that comprises not only the Farsi language of Iran, but also, among others, Kurdish, Ossetic in the Caucasus, Pashto in Afghanistan and Pakistan, and smaller languages in Afghanistan, Tajikistan and western China. Some of the Iranian influence in Tocharian can be attributed to its two Iranian neighbours in the Tarim Basin: Khotanese in the southwest and Tumšuqese in the northwest. However, most must derive from several other Iranian varieties. Among these, a small group of words stands out because they derive from an archaic form of Iranian and point to contacts in the 1st millennium BCE, long before the attestation of Tocharian.
An example is Tocharian B etswe ‘mule’, which has been borrowed from Old Iranian *atswa- ‘horse’, the source of e.g. Avestan (the language of Zaraθuštra / Zoroaster) aspa- and Farsi asb. The Tocharian B word cannot be from Khotanese or Tumšuqese because the Khotanese word is aśśa-, whose śś could not have given Tocharian tsw.

Even though Tocharian is so heavily influenced by Sanskrit, Gāndhārī, and several Iranian languages, it is not at all close to them within the Indo-European language family. This is shown, for instance, by the word for ‘horse’, which can be reconstructed as *h₁eḱuo- (cf. Latin equus, Greek híppos). In Indian and Iranian, which together form the Indo-Iranian branch, the sound *ḱ is reflected as an s-sound: Avestan aspa-, Sanskrit áśva-, Khotanese aśśa-. However, in Tocharian it is reflected as a k: the inherited word for ‘horse’ is yakwe in Tocharian B.

The common ancestor of the Indo-European languages, Proto-Indo-European, was spoken in the Eastern European steppe, probably approximately from 4500 to 3500 BCE. There is increasing consensus that Indo-Iranian, together with other branches of Indo-European, descends from an Indo-European culture called Yamnaya, dated approximately from 3500 to 2500 BCE. From the Eastern European steppe, the Indo-Iranians moved east and then south through present-day Turkmenistan. The Indians moved southeast into India, while the Iranians remained to their north and moved west and east, into Iran and onto the Eurasian steppe.
With Indians south of the Tarim Basin, Iranians in the west of the Tarim Basin, and probably still more Iranians on the Kazakh steppe and possibly even north and partly east of the Tarim Basin, it is highly remarkable that Tocharian does not show any closer resemblance to the Indo-Iranian languages: all influence, even though some is early, is from a later date.

[Map: Languages of the Tarim Basin around 500 CE (© Michaël Peyrot). Languages shown: Tocharian A, Tocharian B, Tumšuqese, Khotanese, Gāndhārī. Places shown: Kuča, Yānqí, Turfan, Loulan, Tumšuq, Kašgar, Khotan, Niya.]

At present, the best explanation for this situation seems to be that the Tocharians moved east over the steppe before the Indo-Iranians started to spread. At the eastern end of the steppe, north of the Altai mountains, an archaeological culture is found that is termed “Afanas’evo”. This culture, close to and largely contemporary with Yamnaya (also 3500–2500 BCE), is often thought to represent a very early phase in the development of the Tocharians. Assuming that the Afanas’evo people, who have left no trace of their language, were early Tocharians, the main problem remaining is the enormous time gap of 3,000 years between the end of the Afanas’evo Culture and the attestation of the earliest manuscripts.

Possibly, the link between the Afanas’evo Culture and the Tarim Basin is formed by the so-called Tarim Mummies. The Tarim Mummies are not real mummies, but rather ancient humans that are surprisingly well preserved, due to the extremely arid and, in winter, very cold climate of the Tarim Basin. They come from several sites throughout the Tarim Basin, and from different periods. Most interesting are the oldest, which date from the early 2nd millennium BCE. They belong to the “Xiaohe Horizon”, which comprises the sites of Gǔmùgōu / Qäwriġul, Xiǎohé / Ördek and Ayala Mazar, all of which are today in uninhabitable parts of the desert.
Chronologically, it makes perfect sense to connect the early Tarim Mummies with the Afanas’evo Culture on the one hand and with the Tocharian city-states on the other. However, given the total absence of written sources, there is no way to be certain of the language of either the Afanas’evo people or the Tarim Mummies. But we can try to reconstruct the migration route of the Tocharians in order to see whether it is possible that Tocharian was already spoken in the Tarim Basin in the early 2nd millennium BCE. In the NWO-funded VIDI project Tracking the Tocharians from Europe to China, such a reconstruction is carried out on the basis of linguistic evidence. The many layers of contact for which there is evidence in the Tocharian language will be used to establish where and when the Tocharians were in contact with which other languages.