This week's blog post will deal with a complex topic: gender assignment.
As I have described in a previous post, gender involves a classification of nominal entities in language. Gender can generally be defined as classes of nouns which are reflected in the behaviour of associated words (Corbett 1991: 1). That is, gender is indicated by agreement of various elements. Gendered languages differ in the number of genders they have and in assignment, that is, in how individual lexical items receive a gender (Audring 2014, 2017). Some languages assign gender based on semantic principles (semantic assignment systems), in which gender reflects categories such as biological sex or animacy. Other languages have formal assignment systems, which can be divided into morphological and phonological assignment (Corbett 1991: 7-8). Thus, gender assignment may be guided by semantic qualities (e.g., male/female, level of abstractness, shape), by morphological criteria (e.g., stem formation, inflection class, derivational suffixes), or by phonological criteria (e.g., word-final vowels or consonants). Languages may use semantic factors only, or a combination of semantic and formal factors, but all gender languages have a semantic core (Corbett 1991: 8).

When looking at gender assignment in Indo-European culture vocabulary (the 100-culture list of our database, consisting of 8,500 gender- and cognacy-coded lexical items), some interesting tendencies emerge. We cannot investigate the phonological and morphological assignment principles on the data in its current shape (the words have not been coded for morphology or phonology), but many other interesting tendencies can be extracted from the data.
First, the total distribution of genders of lexical items in the data is straightforward as masculine<feminine<neuter<alternans (see below). This is also reflected in the timeline of evolution of genders (see below), where we see that the masculine dominates in the early period, weakens during the antique period, and then regains strength during the first and in particular the second millennium CE, at the expense of the feminine and in particular the neuter.
We code all concepts for various semantic properties listed in the literature as important for gender assignment, such as animacy, collectiveness, countability, sexus, concreteness, and form/shape. In addition, we divide gender by different concept classes, which we derive from patterns of colexification and semantic change in the data.
We find that animate concepts (animals in our data) are significantly associated with the masculine gender (we compile both male and female forms of animals, but the overrepresentation of masculine for the general terms is important in the data). Further, we find that collectives as well as concepts coded as materials are significantly associated with the neuter gender. Our data does not contain abstract nouns. Surprisingly, however, we find that sharp and piercing implements are significantly associated with the feminine gender.
These tendencies for semantic properties underlie the overrepresentation of particular genders in certain semantic classes, which can be seen in the heatmap of gender distribution in relation to different classes above. In this heatmap, which divides concepts into classes, we can observe that neuter is overrepresented for metals and materials and for drink and drugs, masculine is overrepresented for all animals, and feminine is overrepresented for weapons, trees and insects (honeybee). This indicates that assignment is not just caused by semantic property; it is very likely also caused by semantic class, but more research and data are required to confirm this assumption.
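For readers who want to see what "overrepresented" can mean in practice, here is a minimal sketch in Python. It is not the procedure used for our heatmap: it simply compares the observed count in each class-by-gender cell with the count expected if gender and semantic class were independent. The function name `overrepresentation` and the counts in `toy_counts` are invented for illustration and are not taken from the database.

```python
def overrepresentation(table):
    """Return observed/expected ratios for each (class, gender) cell.

    Expected counts assume that gender and semantic class are independent:
    expected = row_total * column_total / grand_total.
    A ratio above 1 means the gender is overrepresented in that class.
    """
    genders = sorted({g for row in table.values() for g in row})
    row_totals = {c: sum(row.values()) for c, row in table.items()}
    col_totals = {g: sum(row.get(g, 0) for row in table.values()) for g in genders}
    grand_total = sum(row_totals.values())

    ratios = {}
    for c, row in table.items():
        for g in genders:
            expected = row_totals[c] * col_totals[g] / grand_total
            ratios[(c, g)] = row.get(g, 0) / expected
    return ratios


# Invented toy counts (hypothetical, NOT from the database).
toy_counts = {
    "metals": {"m": 10, "f": 10, "n": 30},
    "animals": {"m": 30, "f": 15, "n": 5},
}

ratios = overrepresentation(toy_counts)
for cell, ratio in sorted(ratios.items()):
    print(cell, round(ratio, 2))
```

A ratio like this only describes the direction and size of a skew. Whether a skew is significant would need a proper statistical test on the real data.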

Audring, Jenny (2014), 'Gender as a complex feature', Language Sciences, 43, 5-17.
--- (2017), 'Calibrating complexity: How complex is a gender system?', Language Sciences, 60, 53-68.
Carling, Gerd (2019), Mouton Atlas of Languages and Cultures. Vol. 1: Europe, Caucasus, Western and Southern Asia (Berlin - New York: Mouton de Gruyter).
Corbett, Greville G. (1991), Gender (Cambridge Textbooks in Linguistics; Cambridge: Cambridge University Press).
--- (2014), The expression of gender (Berlin: De Gruyter Mouton).
Corbett, Greville G. and Fraser, Norman M. (2000), 'Gender assignment: a typology and a model', in Gunter Senft (ed.), Systems of Nominal Classification (Cambridge: Cambridge University Press), 293-325.
Corbett, Greville G. and Fedden, Sebastian (2016), 'Canonical Gender', Journal of Linguistics, 52 (3), 495-531.
Van Epps, Briana (2019), Sociolinguistic, comparative and historical perspectives on Scandinavian gender: With focus on Jamtlandic (PhD dissertation, Lund).
 


Distribution of the genders alternans, commune, neuter, feminine, and masculine in the dataset (lexemes of 104 concepts in 105 Indo-European languages)


Timeline of gender distribution in the lexical dataset (by Briana Van Epps).

Continuing my blog posts about gender, I will say a few words about gender stability. Over time, words often change their gender. This is well known: for instance, in Germanic languages the words for 'sun' and 'moon' are feminine and masculine respectively (as in German die Sonne and der Mond), whereas in other branches of Indo-European the situation is the reverse (Italian sole masculine 'sun' and luna feminine 'moon').
The important and interesting thing here is to investigate the reasons for gender stability or instability. Are they connected to a specific gender? Or are they connected to specific words? Or is gender stability a matter of frequency? There are still very few, if any, studies that look at gender stability using large-scale data sets.
If we first consider the issue of gender instability in our culture data set for Indo-European, we notice that there is little difference between the genders when it comes to stability in cognates. We distinguish three classes: cognates with more than 90% same gender (stable class), cognates with between 50% and 90% same gender (dominant class), and cognates with under 50% same gender (change class). We notice that all three genders, masculine, feminine, and neuter, have approximately the same distribution within the classes stable, dominant and changing gender (see picture below). The masculine is slightly overrepresented in the stable group, the feminine in the dominant group and the neuter in the change group, meaning that the masculine is most stable, the feminine a bit less stable, and the neuter most unstable. However, the differences are small.
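The three-way classification can be sketched in a few lines of Python. This is only an illustration under stated assumptions: I take "same gender" to mean the share of the most frequent gender within a cognate set, and I place the exact boundary values 90% and 50% in the dominant class, which the description above leaves open. The function name `stability_class` and the toy cognate sets are invented, not taken from the database.

```python
from collections import Counter


def stability_class(genders):
    """Classify one cognate set by the share of its most common gender.

    genders: non-empty list of gender labels, one per cognate.
    Returns (majority gender, its share, class label).
    Assumption: shares of exactly 0.9 and 0.5 count as 'dominant'.
    """
    counts = Counter(genders)
    majority_gender, majority_count = counts.most_common(1)[0]
    share = majority_count / len(genders)
    if share > 0.9:
        label = "stable"
    elif share >= 0.5:
        label = "dominant"
    else:
        label = "change"
    return majority_gender, share, label


# Invented toy cognate sets (hypothetical, NOT from the database).
toy_sets = {
    "set_a": ["m"] * 19 + ["f"],                  # 19 of 20 masculine
    "set_b": ["f"] * 6 + ["m"] * 4,               # 6 of 10 feminine
    "set_c": ["n"] * 4 + ["m"] * 3 + ["f"] * 3,   # 4 of 10 neuter
}

for name, genders in toy_sets.items():
    print(name, stability_class(genders))
```

Counting how often each majority gender lands in each class would then give a distribution of the kind compared across masculine, feminine, and neuter.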
What is more interesting though, and probably also promising for future research on gender stability, is that there is a large variation in the stability of different semantic classes. Crops, metals, trees, vegetables, and products are all highly stable, whereas drink & drugs, small cattle, tillage, and others are highly unstable. Whether there is a connection to general frequency remains to be tested for the entire Indo-European family, but a study on gender in Scandinavian languages only (Van Epps, Carling & Sapir 2019) found a correlation between frequency and gender instability.

Van Epps Briana, Gerd Carling & Yair Sapir to appear. “Gender assignment in six North Scandinavian languages: Patterns of variation and change”, to appear in a journal.
 


Heatmap of frequency of occurrence of various semantic classes in the different categories stable (

The large Scandinavian languages, such as Swedish and Danish, have reduced their three-gender system to a system of commune and neuter. However, several smaller dialects or languages, such as Jamtlandic and Elfdalian, have preserved the system of three genders. In a new study from our research group, Briana Van Epps and I investigate the assignment principles of gender in Jamtlandic. The dialect indicates an instability of the feminine gender, which is visible, among other things, in the gender assignment of loanwords.

DOI to the paper (Nordic Journal of Linguistics (2019), 1-33):
https://doi.org/10.1017/S0332586519000209

Abstract:
In this study, we present an analysis of gender assignment tendencies in Jamtlandic, a language variety of Sweden, using a word list of 1029 items obtained from fieldwork. Most research on gender assignment in the Scandinavian languages focuses on the standard languages (Steinmetz 1985; Källström 1996; Trosterud 2001, 2006) and Norwegian dialects (Enger 2011, Kvinlaug 2011, Enger & Corbett 2012). However, gender assignment principles for Swedish dialects have not previously been researched. We find generalizations based on semantic, morphological, and phonological principles. Some of the principles apply more consistently than others, some ‘win’ in competition with other principles; a multinomial logistic regression analysis provides a statistical foundation for evaluating the principles. The strongest tendencies are those based on biological sex, plural inflection, derivational suffixes, and some phonological sequences. Weaker tendencies include non-core semantic tendencies and other phonological sequences. Gender assignment in modern loanwords differs from the overall material, with a larger proportion of nouns assigned masculine gender.
 


Density heatmaps indicating the frequency of languages as source (y) and target (x) language in loan events, by their ranking in a Language Power Index rank.

A study in PLOS ONE shows that borrowing is hierarchical: borrowings are most likely to take place from a more prestigious language to a less prestigious one. In addition, the borrowability of culture concepts increases with their labour intensity.

Abstract
All languages borrow words from other languages. Some languages are more prone to borrowing, while others borrow less, and different domains of the vocabulary are unequally susceptible to borrowing. Languages typically borrow words when a new concept is introduced, but languages may also borrow a new word for an already existing concept. Linguists describe two causalities for borrowing: need, i.e., the internal pressure of borrowing a new term for a concept in the language, and prestige, i.e., the external pressure of borrowing a term from a more prestigious language. We investigate lexical loans in a dataset of 104 concepts in 115 Eurasian languages from 7 families occupying a coherent contact area of the Eurasian landmass, of which Indo-European languages from various periods constitute a majority. We use a cognacy-coded dataset, which identifies loan events including a source and a target language. To avoid loans for newly introduced concepts in languages, we use a list of lexical concepts that have been in use at least since the Chalcolithic (4000–3000 BCE). We observe that the rates of borrowing are highly variable among concepts, lexical domains, languages, language families, and time periods. We compare our results to those of a global sample and observe that our rates are generally lower, but that the rates between the samples are significantly correlated. To test the causality of borrowing, we use two different ranks. Firstly, to test need, we use a cultural ranking of concepts by their mobility (of nature items) or their labour intensity and “distance-from-hearth” (of culture items). Secondly, to test prestige, we use a power ranking of languages by their socio-cultural status. We conclude that the borrowability of concepts increases with increasing mobility (nature), and with increased labour intensity and “distance-from-hearth” (culture). 
We also conclude that language prestige is not correlated with borrowability in general (all languages borrow, independently of prestige), but prestige predicts the directionality of borrowing, from a more prestigious language to a less prestigious one. The process is not constant over time, with a larger inequality during the ancient and modern periods, but this result may depend on the status of the data (non-prestigious languages often remain unattested). In conclusion, we observe that need and prestige compete as causes of lexical borrowing.
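The directionality claim can be made concrete with a small Python sketch. It is not the analysis from the paper: it only counts, for a list of loan events, how many go from a more prestigious source to a less prestigious target. The function name `downhill_share`, the placeholder language labels, and the prestige scores are invented, and I assume here that a higher score means higher prestige.

```python
def downhill_share(loan_events, prestige):
    """Share of loans that go from a more to a less prestigious language.

    loan_events: list of (source, target) pairs.
    prestige: dict mapping language -> score (assumption: higher = more prestigious).
    Loans between equally ranked languages are ignored.
    Returns None if no loan event has a prestige difference.
    """
    downhill = 0
    uphill = 0
    for source, target in loan_events:
        if prestige[source] > prestige[target]:
            downhill += 1
        elif prestige[source] < prestige[target]:
            uphill += 1
    decided = downhill + uphill
    return downhill / decided if decided else None


# Invented toy data (hypothetical, NOT from the dataset).
toy_prestige = {"lang_A": 3, "lang_B": 2, "lang_C": 1}
toy_loans = [
    ("lang_A", "lang_B"),
    ("lang_A", "lang_C"),
    ("lang_B", "lang_C"),
    ("lang_C", "lang_A"),
]

share = downhill_share(toy_loans, toy_prestige)
print(share)
```

A share well above one half would point in the direction described in the abstract, although a real test also has to account for how unevenly languages are attested.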

In this blog, I will try – as far as possible – to switch between lexicon and grammar. Most topics are related to ongoing research either by me or by people in our research group. I will also try to have PhDs and other researchers writing guest posts, sharing their research. Contact me if you want to contribute!

Since I began by posting a picture on the Eurasian diversity for the words for WHEEL, my first post is lexical: I will talk about terms for vehicles. Within Indo-European studies, vehicle-related terms are an important issue. Generally, it is believed that the invention of the wheel as a means of transport during the early Chalcolithic was, together with the domestication of the horse for traction, the innovation that spread the Indo-European family over all Eurasia. However, there are several enigmas surrounding the origin of vehicles and wheeled transports. First, archaeology does not help us very much. The early wheels, hubs, and naves were made of wood, a non-durable material. Further, the spread of the wheel was so swift that we cannot know where it appeared first. Before the wheeled transport, there were other uses of the wheel: millstones for grinding, the pottery wheel, and spindles for spinning, so the word for wheel in the Indo-European proto-language had several potential functions. More important is the entire complexity of wheel and transport-related lexemes in Indo-European and its neighbors.
For Indo-European, a set of forms for wheel and transport can be reconstructed to the proto-language. Beginning with WHEEL, we have at least three common terms (PIE *h₂wērg-wn̥t-ōn 'wheel, circle', PIE *h₂urg-i- 'wheel, circle', PIE *kʷekʷlo-, *kʷel-o- 'wheel, circle' < PIE *kʷel(H)- 'to turn'; PIE *Hróth₂o- 'wheel, circle' < PIE *(H)reth₂- 'to run'). Besides, we have terms for HUB or NAVE, which also mean 'navel' (PIE *h₃enbh-, *h₃nebh- 'navel, nave, hub', PIE *h₃nobh-li- 'navel, nave'), and a reconstructed lexeme for WAGON (PIE *weǵhno- 'wagon' < PIE *uoǵh- 'to carry, drive'). The process of creating a word for 'wheel' from a verb meaning 'to roll' is found also outside of Indo-European, such as in Caucasian languages (Proto-Kartvelian *gor- 'wheel; to roll', Proto-Nakh *gur- 'wheel', Proto-Dagestanian *gur- 'to whirl, to roll; wheel' (Georgian gor-gor-a 'wheel', Chechen gur-ma 'wheel for plough'); Proto-Kartvelian *bor- 'rotation', Proto-Nakh *bor-a 'mill's wheel', Proto-Dagestanian *bor-a 'wheel' (Georgian borbali 'wheel', Laz bor-bol-ia 'wheel', Laz bur-in-i 'rotation; spinning', Beshta örræ 'wheel', Avar ber 'wheel')).
It is evident that the Indo-Europeans knew the wheel and also used wheeled transports. Whether these transports took them over large areas is questionable: the wagons were heavy, the wheels were of solid wood, and roads were absent. Wagons were more likely used for loading and traction, such as for pulling hay from the field to the barn. Caucasians also had a word for WAGON (PKv *sa-kʰum- 'carriage', PNWC *kwə 'carriage, cart', PD *hankʰwə- 'carriage, vehicle' (Megr o-kʰim-o 'carriage', Adyghe kʷə, kʰwə 'wagon', Ubykh kʰwə 'cart', Dargwa urkʰura 'carriage', Lezg akʰur 'carriage')). Apparently, these wagons were not fit to cross the high Caucasus Mountains and spread the languages over the open plains.
Proto-Indo-European also had several words for YOKE (e.g., PIE *yug-o- 'yoke'). YOKE is a highly stable word in Indo-European, which practically did not change its form and was not substituted in languages. If the root was substituted, new forms were derived from roots meaning 'to bind' (Proto-Slavic *arь̀mъ, *arьmò 'yoke, ox-yoke' < PIE *h₂er- 'join', Proto-Celtic *wedo- 'yoke, harness' < PIE *wedh- 'bind'). Interestingly enough, the Caucasians use the same root for the YOKE (PKv *uɣ-el- 'yoke', PNWC *ɣəw 'yoke', PD *ur- 'yoke' (Georgian uɣeli 'yoke', Megrelian uɣeli 'yoke', Ubykh ɣawə 'yoke', Tabasaran uɣ-in 'cart (drawn by a single ox)', Udi ọq' 'yoke')). The yoke, regardless of whether it was put on a bull, horse, donkey or human, had a very simple and straightforward function, which did not change over the millennia: to put a device over the neck for facilitating traction and carrying.
Vehicle words are highly interesting. Words for the parts of vehicles, such as the wheel or the hub, are seldom borrowed and remain stable in most languages. The words for WAGON and AXIS change more frequently: they are more often borrowed, and they often switch, expand or colexify their meanings, in particular with meanings referring to the sky and the firmament, e.g., ‘Polar star’, ‘axis’ or ‘firmament’. This tells us something about the cultural importance of the wheel and of transport: the words are frequently projected onto the firmament, something that has a natural cause.
References (Anthony 2007; Carling To appear (2019); Greenfield 2010; Mallory and Adams 2006; Piggott 1983)

Coming up next: the language of deixis

References
Anthony, David W. (2007), The horse, the wheel, and language: how Bronze-Age riders from the Eurasian steppes shaped the modern world (Princeton, N.J.: Princeton University Press).
Carling, Gerd (To appear (2019)), Mouton Atlas of Languages and Cultures. Vol. 1: Europe, Caucasus, Western and Southern Asia (Berlin - New York: Mouton de Gruyter).
Greenfield, Haskel J. (2010), 'The Secondary Products Revolution: the past, the present and the future', World Archaeology, 42 (1), 29-54.
Mallory, James P. and Adams, Douglas Q. (2006), The Oxford introduction to Proto-Indo-European and the Proto-Indo-European world (Oxford Linguistics; Oxford: Oxford University Press).
Piggott, Stuart (1983), The earliest wheeled transport: from the Atlantic coast to the Caspian Sea (London: Thames and Hudson).

Deixis refers to pointing by using language. Deixis seems to be universal – all languages have a system for denoting at least two dimensions of deixis: ‘here’ and ‘there’. Deixis is marked either by deictic markers without person reference (‘here’, ‘there’) or by deictic markers with person reference (‘s/he/that here’, ‘s/he/that there’). Almost without exception, deictic words are accompanied by gestures.
Deictic systems are very interesting – their purpose is clearly communicative and they are deeply rooted in our cognitive system. Think of a hunting situation: a speaker wants to communicate to a companion that a game animal is hiding among the bushes. Or that a dangerous snake has been seen among the rocks.

-Where? asks the second speaker.
-Over there! answers the first speaker, pointing in the direction of the presumed hiding animal.
-Where over there? Did you really see it yourself?
-No, I am not sure … I thought I saw something...

In situations such as these, languages have developed different effective ways to standardize the communication, often by means of intricate and complex systems of deixis. But even if the preconditions for deixis are imprinted in our brains, the ways in which the systems turn out are highly diverse and markedly cultural.

Deictic systems – at least the ones we are used to – typically distinguish two or three dimensions of deixis: ‘here’, ‘there’ and ‘over there’. In language, these dimensions are also mirrored in the sound structure of their words – a phenomenon that seems to be almost universal. Forms for ‘here’ are expressed by sounds that have higher frequency, e.g., the vowels i, e or consonants such as s, t. In contrast, forms for ‘there’ are expressed by sounds with lower frequency, such as the vowels a, o, u, and the consonants m, b. This has to do with our apprehension of our surrounding environment: we associate closeness with familiarity, safety, smallness and higher voice or pitch, whereas we associate distance with unfamiliarity, threat, large size, and a lower voice or pitch. To fully understand this phenomenon, think about the sound of a cat versus the growl of a tiger. Which one do we want to have closer? This opposition between high and low frequency in here- and there-forms is stable in languages. If it becomes distorted by change in the sound structure, the opposition is restored within generations.

Besides forms for the basic deictic distinctions, some languages have expanded their deictic systems in various directions, introducing a large amount of additional information.
A system such as this is found in Kamaiurá, a Tupí-Guaraní language spoken in Upper Xingu, Brazil. Kamaiurá is a prototypical Amazonian language: it has mother-in-law language, evidentiality (linguistic ‘truth-marking’), male versus female speech, and nominal tense. In the system of deictic terms, there are four basic dimensions of deixis: ‘s/he/that here’ (close to speaker), ‘s/he/that there’ (close to listener), ‘s/he/that over there’ (away from both speaker and listener), and ‘s/he/that over there’ (far away from both speaker and listener). Besides these four basic dimensions, there is a large set of forms, in total around 20. In normal speech, such as when someone tells a story or reports an event, these deictic forms are highly important: they communicate a number of dimensions of an event: the time, the place, the role of the speaker, what may come next, or what the speaker or the participants know or don’t know, as well as modalities, feelings, and so forth.
One deictic form denotes ‘s/he/that, close to speaker but invisible’ – a form used for instance about someone talking inside a house, who is heard through the door. Another form is used to mark that the referent is more or less close, heard but not seen, and yet another form marks that the referent is over there, neither heard nor seen, and that the speaker is uncertain about its status – the source is secondary, ‘hearsay’. There is also a form that refers to ‘that guy I don’t know the name of’ or ‘the guy I don’t remember’, and one form notes that someone is close but not visible: this is used for instance when talking about an absent son. Further, forms may denote that someone is moving away or is located close to something else of importance. The system is impossible to learn for an outsider: since the use of the forms is consolidated in each and every situation in which the language is used, only native speakers can learn to master the system in full.
 
References: (Carling et al. 2017; Diessel 2011; Johansson and Carling 2015)
 
Coming up next: The Tocharians, the mysterious people who travelled more than 4000 km and ended up in a desert
 
Carling, Gerd, et al. (2017), 'Deixis in narrative: a study of Kamaiurá, a Tupí-Guaraní language of Upper Xingu, Brazil', Revista Brasileira de Linguística Antropológica, 9 (1), 13-48.
Diessel, Holger (2011), 'Deixis and Demonstratives', in Claudia Maienborn, Klaus von Heusinger, and Paul Portner (eds.), Semantics: An International Handbook of Natural Language Meaning (Handbooks of Linguistics and Communication Science, 33 (1-3); Berlin: de Gruyter Mouton), 2407-32.
Johansson, Niklas and Carling, Gerd (2015), 'The de-iconization and rebuilding of iconicity in spatial deixis', Acta Linguistica Hafniensia: International Journal of Linguistics, 47 (1), 4.
 


In Scandinavian folklore, there is a story about a lethal pig, the Gloso (‘glowing sow’), which kills lonely hikers on their way home on Christmas Eve. The pig is black with glowing, red eyes, and its back is a sharp saw: running between humans’ legs, the creature cuts them in two. The only way to survive a Gloso is to jump into the ditch as soon as you spot the animal’s glowing eyes from a distance. Stories about lethal pigs are also found in Celtic mythology, in the tale about Mag Mucrime, pigs from the underworld, which haunt and ravage the lands, killing people and destroying the fields. However, the ancient Celts were also very fond of their pigs. Typically, helmets and shields of Celtic warriors were decorated with boars – and we should not forget that a pig is at the centre of one of the most important Celtic epic tales, the story of Mac Da Thó’s pig. Likewise, in Germanic mythology, the pig is the animal of the god of fertility, Frey, and the boar Sæhrímnir, which can be eaten again and again, plays a central role as provider of meat to the dead warriors and the gods of Valhalla.
      
How come the most important protein source of ancient Neolithic farmers had such different roles in various cultures of the Eurasian continent? Banned in some cultures, worshipped in others, and in others associated with death and the netherworld – apparently, ancient people were not neutral towards the pig. Our answers are partly found in language.

Like the cow, goat and sheep, the pig belongs to the earliest domesticated animals, dating back to 10-11,000 BP in Anatolia and West Asia. Very likely, the first pigs were wild pigs attracted to human settlements by the waste. The early farmers, who very quickly must have understood the value of pigs as a protein source, successively domesticated them by killing the males and keeping the females for reproduction. In fact, even today, pigs are the most effective protein source of farming, besides chicken. The great danger associated with the hunting of wild boars must have contributed to the early farmers’ high esteem of pig domestication.
Domestication of pigs spread with the spread of farming, but for some reason – maybe that pigs are useless for herding or that they are easily infected by sickness – the domestication did not reach as far as the domestication of cow, goat, and sheep. Pigs are extremely unusual in Ancient Egypt, and pig domestication never reached Central Asia. In parts of West Asia and Anatolia, there was a decline in pig domestication already in early antiquity, something that was later transformed into a complete ban through religion, as in Judaism and later on also in Islam. In cultures where pig domestication continued (Eastern and Western Europe, and the Mediterranean), the pig received a dual role: it was both an animal associated with death and the underworld, worshipped in chthonic sacrifices, and an animal symbolizing fertility and prosperity. This is found in Graeco-Roman, Celtic, and Germanic mythology.

Can linguistics help us solve this enigma? There are several ways of investigating cultural patterns by language: either to look at the origin of words and their etymology down to the proto-language, or to consider the colexification patterns (meanings that co-occur in a language) and the meaning change patterns of words in genetically related languages. Stability and spread of cognates, as well as borrowing tendencies, are important sources of evidence as well.

If we look at linguistic reconstructions, the picture is complex and interesting. Pig words, including a general word for ‘pig’ (generic), which is often the same as ‘sow’, as well as ‘piglet’, can be reconstructed to Proto-Indo-European (PIE *suH- ‘pig’, PIE *porḱo- ‘young pig, piglet’). These lexical roots, which had the meaning of ‘pig’ and ‘piglet’ already in the proto-language, indicate that the Indo-Europeans had domesticated pigs. They are represented in a vast majority of pig words in Indo-European languages. Besides, some sub-branches replaced the forms or added new words for the pig terms. In Germanic languages, the word for the male pig was derived from a root meaning ‘infertile’ (PGm *galtan- ‘boar’ < *gald(j)a- ‘infertile’ < PIE *ghol-tó-), indicating that male pigs or boars were gelded rather than killed. Several languages created new lexemes by referring to the grunting sound of pigs, such as Lithuanian čiūkà, kūkà ‘pig’ (Balto-Slavic *kyaw-, *kyū- < PIE *kew-, *kū- 'to howl') or Old and Modern Irish cráin ‘sow’ (Proto-Celtic *krākni- 'sow'). Some languages used the wide-spread Indo-European root for ‘young of animal’ (PIE *wetso- 'young of animal' < *wet- 'year').
The wild boar has its own root in Proto-Indo-European (PIE *h₁pr-o- '(wild) boar'), e.g., Latin aper, but very often, this root comes to represent both the wild and domesticated male pig, such as Croatian vȅpar, German Eber. Several languages use the Proto-Indo-European root PIE *h₂wŕ̥s-en- 'male' for the wild boar, such as Sanskrit varāha-, Hindi varāh, bā̆rāh. Else, a combination of a root meaning ‘wild’ and the root PIE *suH- 'pig' is very frequent, as in Bulgarian díva svinjá, German Wildschwein. In general, words with the meaning ‘wild boar’ also frequently mean ‘(domesticated) boar’, something that indicates that the wild boar was represented by the (male) boar, in contrast to the (female) sow, which represented the domesticated pig.     
The Caucasian proto-languages Proto-Kartvelian, Proto-North-West-Caucasian, Proto-Nakh, and Proto-Dagestanian all have reconstructed words with the meaning ‘pig’ (PKv *ɣor- ‘pig’, PNWC *ɣaw- ‘pig, piglet’, PN *eɣ-ə ‘pig’; PD *bol’- ‘pig’, PKv *burw- ‘gilt (female pig, 3-12 months old); suckling pig’, PNWC *bl˜’-ə ‘sow, female pig’, PN *borl’- ‘colourful’). This clearly indicates that the early Caucasians domesticated the pig, something that we know they did early on.
Uralic, on the other hand, borrowed its pig words from Indo-European or Iranian (Proto-Finnic *sika ‘pig’, Proto-Finno-Ugric *porśas, *porćas, loan from Indo-Iranian), indicating that the early Uralic tribes did not domesticate the pig themselves – they adopted pig domestication from Indo-European tribes.

However, the patterns of meaning change and colexification of pig words give an interesting picture. First, pig words often change to the meaning of other animals, often large and ‘chubby’ animals, such as ‘elephant’, ‘stallion’, or ‘camel’. In particular, this is the case in Caucasian languages. The domestic pig occasionally points in the direction of negative connotations, such as ‘filthy person’, ‘immoral person’, ‘fat’, or ‘greedy’. However, meaning changes and colexifications in the direction of power and fertility are frequent, such as 'bull', ‘hero’, ‘powerful’, 'king', ‘manly’, ‘chieftain’ and ‘husband’, in particular with the (wild) boar.

It is obvious that ancient people both worshipped and admired their pigs, but language indicates that they most of all respected the wild boars, probably because they were dangerous and hard to hunt. The domestic pigs were highly valued but also, apparently, looked down upon. The dangerous pigs we know from mythology have left little imprint on language.
 
(Carling To appear (2019); Gamkrelidze et al. 1995; Larson and Fuller 2014; Mallory and Adams 1997)
 
Carling, Gerd (To appear (2019)), Mouton Atlas of Languages and Cultures. Vol. 1: Europe, Caucasus, Western and Southern Asia (Berlin - New York: Mouton de Gruyter).
Gamkrelidze, Tamaz Valerianovič, Ivanov, Vjačeslav Vsevolodovič, and Winter, Werner (1995), Indo-European and the Indo-Europeans: a reconstruction and historical analysis of a proto-language and a proto-culture (Trends in Linguistics. Studies and Monographs, 80; Berlin: Mouton de Gruyter).
Larson, Greger and Fuller, Dorian Q. (2014), 'The Evolution of Animal Domestication', Annual Review of Ecology, Evolution & Systematics, 45, 115-36.
Mallory, James P. and Adams, Douglas Q. (1997), Encyclopedia of Indo-European culture (London: Fitzroy Dearborn).

Hyllested, Adam 2017. Again on ancient pigs in Europe.
 


Semantic network of colexifications (blue, purple) and meaning change in etymologies (red) of the core concepts (green) PIG, WILD BOAR, and PIGLET in 85 Indo-European languages. Graph by Niklas Johansson.

I was asked by my friend and colleague Victor Mair (University of Pennsylvania) to come up with my 'safe list' of loans from and into Tocharian. This is a very interesting and challenging topic, which I will continue working on in a couple of coming posts. I will start with the trickiest one: Tocharian loan contacts with Chinese.
Establishing Tocharian loans from and into Chinese is particularly complex for two reasons: first, the reconstruction of Chinese phonology at various stages in Chinese prehistory, which is connected to many uncertainties and a large amount of debate, and second, the reconstruction of Tocharian phonology, which is particularly tricky and complex. The fundamental question is: How can we be certain that a specific word was borrowed at a certain stage from one reconstructed language to another? The prehistory of both languages can be stratified into various stages: Pre-Proto- and Proto-Chinese, Old Chinese (Early and Late), and Middle Chinese on the one hand, and Pre-Proto- and Proto-Tocharian, Common Tocharian, Pre-A and Pre-B, and Tocharian A and B on the other. Beyond that, we have the proto-languages Proto-Indo-European and Proto-Sino-Tibetan, which can be further stratified into stages on their way to Proto-Chinese and Proto-Tocharian.
How can we know that a word, that obviously looks as if it was borrowed from Indo-European, is borrowed from Tocharian? The answer is that we have to show that specific Tocharian sound changes have taken place in the specific borrowed lexeme. These changes also have to be identified in the target language from the corresponding period. The process is very tricky, and the result is very few certain loans, more uncertain loans, and a huge number of uncertain loans.

Tocharian loans from Old Chinese (before 2nd ct BCE)
Toch. AB klu ‘rice’ was borrowed from Old Chinese: Mod. Ch. dào, Mid. Ch. *dawX, Old Ch. *C-luu-? ‘rice, rice-paddy’ (GSR 1078). In Middle Chinese, the initial cluster Old Ch. *gl- was simplified to d-.
Toch. B rapaññe ‘of the last month of the year’ (LP 12 a2 rapaññe meṃne ikäṃ-wine ‘on the second day of the month rapaññe’), an adjective formed on a noun *rāp, from Old Chinese: Mod. Ch. là, Mid. Ch. *lap, Old Ch. *raap (GSR 637j) ‘winter sacrifice’. It is likely that an earlier meaning of the Chinese word is reflected in Tocharian.
Toch. A ri, Toch. B rīye ‘town’ < Common Toch. *riye matches the Old Chinese reconstruction of Mod. Ch. lǐ, Mid. Ch. *liX, Old Ch. *r̯ǝ-? (GSR 978a) ‘walled city’. The word may also be a Tocharian loan in Old Chinese.
Further loans include Toch. A truṅk, Toch. B troṅk ‘cave’.

Tocharian loans from Early Middle Chinese (possibly 3-4th ct ACE)
Toch. A ṣoṣtäṅk ‘tax collector, banker’ (Skt. śreṣṭhin-) corresponds to Niya ṣoṭhaṃga ‘tax collector’, Bactr. σωταγγο < *šoštaṅgV. A possible source is Mod. Ch. shōucáng, Mid. Ch. *syuw+dzang, Old Ch. *xiw-N-s-(h)raŋ (GSR 1103a+727g´) ‘receive, accept, gather’ + ‘conceal, store’.
Toch. A ṣukṣ ‘(smaller) village’, Toch. B kwaṣo* ‘village’. Compare Mod. Ch. sù, Mid. Ch. *sjuwk, Old Ch. *suk (GSR 1029a) ‘lodge, mansion’. Itō & Takashima (1996:401) reconstruct Old Ch. *sjəkw-s with a final *-s (which has a localising function and forms nomina actionis, etc.).
Toch. A āṅk* ‘seal, stamp’, Mod. Ch. yìn, Mid. Ch. *ʔjinH, Old Ch. *ʔin-s (GSR 1251f), *ʔi̯əɳ (Takashima) ‘seal, stamp’.

Further loans include Toch. B cāk, tau ‘(dry measures)’, Toch. B cāne ‘money’, Toch. B śakuse ‘brandy’, Toch. B ṣaṅk ‘(measure of volume)’, Toch. A yāmutsi, Toch. B yāmuttsi ‘waterfowl’ < ‘parrot’, Toch. B ṣitsok ‘millet alcohol’, Toch. B ṣipāṅkiñc ‘abacus’, Toch. A, Toch. B cok ‘lamp’, Toch. A lyäk, Toch. B lyak ‘thief’, Toch. A < Toch. B tseṃ ‘blue’, Toch. A nkiñc, Toch. B ñkante ‘silver’.

These words give important indications of the impact of Chinese culture on Tocharian. I will follow this track further in coming posts.

Sources/References
Carling, Gerd. Proto-Tocharian, Common Tocharian, and Tocharian – on the value of linguistic connections in a reconstructed language. In: Jones-Bley, Karlene, Huld, Martin E., Volpe, Angela Vella, & Dexter, Miriam Robbins (eds.), Proceedings of the Sixteenth Annual UCLA Indo-European Conference. Journal of Indo-European Studies Monograph Series 50 (Institute for the Study of Man), p. 47–70.
Kim, Ronald. (1999). Observations on the absolute and relative chronology of Tocharian loanwords and sound changes. Tocharian and Indo-European studies, 8, p. 111–138.
Lubotsky, Alexander, & Starostin, Sergei. (2003). Turkic and Chinese loan words in Tocharian.
Židek, Jan. (2017). Tocharian Loanwords in Chinese [Dissertation]. Praha: Univerzita Karlova.


I found a fun map on Twitter, from the Foreign Service Institute (see below), which categorises the difficulty of learning a language by the number of weeks required. According to the map (which applies to (American) English speakers), Swedish and French are languages that are supposed to be very easy to learn, whereas, e.g., Russian is found among the more difficult languages. Even though it applies to English speakers, the map would not be very different for a speaker of Swedish or German. Why is that so? If you ask normal people (i.e., without a degree in linguistics), the answer would naturally be that languages like English and Swedish “have no grammar”. If you ask what they mean by “grammar”, many would come up with the answer “they have no cases”.
In learning a language like Russian, we have to start learning many case forms early on, and then learn the rules for how to apply them. This is difficult for most of us who use a language with prepositions (in, on, on top of, towards) rather than cases. But why do some languages have cases instead of prepositions? Or, to reverse the question, why do some languages have prepositions instead of cases? And are the uses of prepositions really easier to learn than the uses of cases? Very few languages (such as Hungarian, or Ossetian and other exotic languages in the Caucasian mountains) have as many cases as any normal language such as Swedish or English has prepositions. The rules of English prepositions are also hard to learn, and speakers of, e.g., Swedish often make mistakes in the use of prepositions.

However, if we compare the map of learning difficulties with the map of case system types below (data from the DiACL database), the correspondence between the two maps (in the parts that overlap) is striking. Analytic systems are the easiest, followed by fusional, and finally by agglutinating and other more complex systems. It would be very interesting to see what the map would look like for native speakers of Finnish or Russian.
 
Case systems are interesting, since they indicate that languages are circular in their evolution. Case systems are basically of three types:

isolating (or analytic), with no cases, where relations between participants in an event are expressed by prepositions (or postpositions);
agglutinating, with cases expressed by affixes with a simple function (plural, dative), which are attached to the noun stem;
fusional, where paradigms are built from cases which may mark several functions at once, such as feminine + dative + plural.


Case systems are typically of one of these types, where isolating systems are small, with 0-2 distinctions (e.g., English, Swedish, Danish), fusional systems are medium-large (e.g., Russian, German), often with many different forms in the language, whereas agglutinating systems tend to be large (e.g., Finnish, Hungarian, Turkic). Agglutinating systems are ruled by the principle, one suffix – one function (e.g., plural, dative).
Systems are seldom ‘pure’: most languages have case systems that are partly isolating, partly agglutinating, and partly fusional. That is what makes them difficult to learn.

Why is the case situation the way it is? The structure of case systems has multiple explanations, and linguists are not yet aware of all the details in the process and development of case systems. One important reason for the outcome is language change and the cyclical behaviour of case systems: fusional systems (e.g., Russian) tend to break down or erode into isolating systems (e.g., English), which in turn may merge their combinations of noun + adposition into an agglutinating system (e.g., Turkish). And agglutinating systems, again, may fuse their forms to become fusional. However, in this cycle, languages may become stuck for millennia between states, where various types of mixed, weird and complex systems, with many and irregular forms, become standard.
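The cycle described above can be written down as a tiny lookup table. This is only an illustrative sketch: the names `NEXT_STAGE` and `walk_cycle` are made up for this example, and the table encodes the idealised tendency only, not the mixed systems or the cases where a language moves in another direction.

```python
# Illustrative only: the three idealised case-system types and the
# direction of change described in the text. Real languages are often mixed.
NEXT_STAGE = {
    "fusional": "isolating",       # case endings erode
    "isolating": "agglutinating",  # noun + adposition merge into affixes
    "agglutinating": "fusional",   # affixes fuse into multi-function endings
}


def walk_cycle(start, steps):
    """Return the list of system types visited, beginning with `start`."""
    if start not in NEXT_STAGE:
        raise ValueError(f"unknown system type: {start!r}")
    path = [start]
    for _ in range(steps):
        path.append(NEXT_STAGE[path[-1]])
    return path


if __name__ == "__main__":
    # Three steps from a fusional system bring us back to a fusional one.
    print(" -> ".join(walk_cycle("fusional", 3)))
```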
Besides time and cyclic change, geography and language contact shape case systems. The situation is complex: case systems show clear tendencies to share similarities across language, branch and family boundaries. For instance, caseless systems are more frequent in Western Europe, fusional cases are more frequent in Eastern Europe and in various conservative pockets (islands, forests) such as Iceland, the Faroe Islands, Germany, and Dalecarlia, and agglutinating cases are more frequent on the Asian landmass (except in the east, in China). But the map is complex: when we try to explain the structure of case systems, historical explanations compete with geographic explanations, which in turn compete with explanations based on typological cyclic behaviour.


 
 


From https://twitter.com/AmericanGeo/status/1010364347502059520


Distribution of types of nominal case systems in modern (top) and ancient (bottom) languages. Dark red (1) marks systems with no cases, green represents fusional types, and pink/purple shades represent agglutinating systems.


Illustration of the morphological cycle of case systems. Tocharian is an example of a mixed system which has moved in the opposite direction, from fusional to agglutinating.


The very concept of ‘secret’ languages sounds as if it were taken out of a crime novel. We may think of military secret codes, jargons of criminal inmates, or suburban youth slang. However, are not all languages (except for, let us say, standard English) in some sense ‘secret’, as long as they are spoken by a closed group of people and unintelligible to outsiders? This is true in many cases: for instance, minority languages, immigrant languages, local languages or dialects, youth jargons, or ethnolects all represent communication systems that are restricted to a closed group of speakers and not shared by outsiders. So what makes a language a ‘secret’ language? The answer is complex.
A secret language is no one’s mother tongue; this is probably the most important distinction from a ‘normal’ language. Rather, secret languages traditionally represent a jargon that was passed from father to son, together with an occupation or a lifestyle, and whose purpose was to keep out not just outsiders but also members of one's own family. Secret languages connected to various occupations are found in Europe as well as in Africa and South America. They are very often the idiom of occupations with a distinct social function, most typically occupations that are exercised within society but have a special, often low, status. In Europe, pedlars, dealers, chimney sweepers, and circus people, but also those in various low-status occupations, such as the executioner's henchman or skinners, used to have their own secret languages. In Africa, to mention an example, we have documented secret languages among healers, skinners, and sandal flickers.
Linguistically, secret languages do not possess their own grammar, like ‘normal’ languages do. Their grammatical system relies on the grammar of another language, usually the majority language of the country where they occur. The grammar is often simplified and syntactic patterns are replaced by pidgin-like structures. A frequently occurring phenomenon is to borrow the ‘appearance’ of a language, by means of stress patterns, prosody, dialectal variation and gestures, but to replace all content words, sometimes the entire lexicon. The lexicon is either taken more or less completely from another language, or it is an ad hoc conglomerate of words from various adjacent source languages. Very often, secret languages ‘distort’ their words by various complex patterns of morphological transformation; for instance, they truncate words and add heavy suffixes, they reverse syllables or letters, or they add epenthetic vowels within words. The result is a language that ‘melts in’: from a distance it appears to be a native or indigenous idiom, but not one single word is understandable to outsiders.
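To make the three kinds of distortion mentioned above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the made-up suffix, the vowel set, and the toy inputs are invented, and none of the outputs are attested words of monsing or any other secret language.

```python
# Toy illustrations of three distortion patterns: syllable reversal,
# truncation plus a heavy suffix, and epenthetic vowels.
# All names and inputs are hypothetical; input is assumed to be lowercase.
VOWELS = set("aeiouyåäö")


def reverse_syllables(syllables):
    """Join pre-split syllables in reverse order (syllabification is left to the caller)."""
    return "".join(reversed(syllables))


def truncate_and_suffix(word, keep, suffix):
    """Keep the first `keep` letters of the word and attach a suffix."""
    return word[:keep] + suffix


def add_epenthetic_vowel(word, vowel="a"):
    """Insert `vowel` between any two adjacent consonant letters."""
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt and ch not in VOWELS and nxt not in VOWELS:
            out.append(vowel)
    return "".join(out)


if __name__ == "__main__":
    print(reverse_syllables(["ta", "bak"]))
    print(truncate_and_suffix("tobacco", 3, "ering"))
    print(add_epenthetic_vowel("strand"))
```

Real secret languages combine such patterns with borrowed vocabulary, so a sketch like this only shows why even a familiar word quickly becomes unrecognisable.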

On Scandinavian soil, there are several traditional secret languages. One is the pedlars' language, which is in fact two: one in the isolated county of Dalecarlia, and gråmål ‘grey language’ or monsing, the main pedlars’ secret language, which during the 20th ct. transformed into a prisoners’ language. The vocabulary of monsing is based on multiple languages. Many words are borrowed from Scandoromani, the language of the indigenous Swedish Romani speakers; other words are from Low German, from Rotwelsch (the medieval secret jargon of European outsiders), from Finnish and Russian, as well as from Swedish. Swedish words are changed beyond recognition by linguistic distortion. Sources of monsing go back to the 17th ct., and they give us a glimpse of the type of communication that monsing speakers had. Besides communication related to their occupation, much of the content is rude, such as talk about the farmers (who are supposed to be stupid) and in particular their wives and daughters (who are the targets of their sexual interest).
Even though there are no ‘real’ speakers of these languages in Sweden anymore, monsing is still, together with Scandoromani and knoparmoj, the secret language of chimney-sweepers, a very important source for words in the Scandinavian vernacular languages.

Reference
Carling, Gerd, Lenny Lindell & Gilbert Ambrazaitis (2014) Scandoromani. Remnants of a Mixed Language. Boston: Brill


Samples of the Swedish secret language Monsing (from 18th and 19th ct. sources). Most of them have found their way into Swedish slang.