Which principles govern gender assignment?

This week's blog post will deal with a complex topic: gender assignment.
As I have described in a previous post, gender involves a classification of nominal entities in language. Gender can generally be defined as classes of nouns which are reflected in the behaviour of associated words (Corbett 1991: 1). That is, gender is indicated by agreement of various elements. Gendered languages differ in the number of genders they have, and they also differ with respect to assignment, that is, how individual lexical items receive a gender (Audring 2014, 2017). Some languages assign gender based on semantic principles (semantic assignment systems), in which gender reflects categories such as biological sex or animacy. Other languages have formal assignment systems, which can be divided into morphological and phonological assignment (Corbett 1991: 7-8). Thus, gender assignment may be guided by semantic qualities (e.g., male/female, level of abstractness, shape), by morphological criteria (e.g., stem formation, inflection class, derivational suffixes), or by phonological criteria (e.g., word-final vowels or consonants). Languages may use semantic factors only, or a combination of semantic and formal factors, but all gender languages have a semantic core (Corbett 1991: 8).

When looking at gender assignment in Indo-European culture vocabulary (the 100-culture list of our database, consisting of 8,500 gender- and cognacy-coded lexical items), some interesting tendencies emerge. We cannot investigate the phonological and morphological assignment principles with the data in its current form (the words have not been coded for morphology or phonology), but many other tendencies can be extracted from the data.
First, the overall distribution of genders across lexical items in the data is straightforward: masculine > feminine > neuter > alternans (see below). This is also reflected in the timeline of the evolution of genders (see below), where we see that the masculine dominates in the early period, weakens during antiquity, and then regains strength during the first and in particular the second millennium CE, at the expense of the feminine and in particular the neuter.
We code all concepts for various semantic properties listed in the literature as important for gender assignment, such as animacy, collectiveness, countability, sex (sexus), concreteness, and form/shape. In addition, we break down gender by concept class; we infer these classes from patterns of colexification and semantic change in the data.
We find that animate concepts (animals in our data) are significantly associated with the masculine gender (we compile both male and female forms of animals, but the overrepresentation of the masculine for the general terms is substantial in the data). Further, we find that collectives as well as concepts coded as materials are significantly associated with the neuter gender. Our data does not contain abstract nouns, but surprisingly, we find that sharp and pointed implements are significantly associated with the feminine gender.
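To make the idea of a "significant association" concrete, here is a minimal sketch of one common way to test whether a semantic property and a gender co-occur more often than chance would predict: a Pearson chi-square test on a 2x2 table. This is not necessarily the test used for our database, the function name chi_square_2x2 is hypothetical, and the counts are invented purely for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                    masculine   other gender
        animate         a            b
        inanimate       c            d

    Returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts for illustration only; NOT figures from the database.
chi2, p = chi_square_2x2(30, 10, 20, 40)
print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
```

A small p-value would suggest that the property and the gender are not independent. With low cell counts, an exact test would be a safer choice than this approximation.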
These tendencies for semantic properties underlie the overrepresentation of particular genders in certain semantic classes, which can be seen in the heatmap of gender distribution in relation to different classes above. In this heatmap, which divides concepts into classes, we can observe that the neuter is overrepresented for metals and materials and for drink and drugs, the masculine is overrepresented for all animals, and the feminine is overrepresented for weapons, trees, and insects (honeybee). This indicates that assignment is not driven by semantic properties alone; it is very likely also driven by semantic class, but more research and data are required to test this assumption.

Audring, Jenny (2014), 'Gender as a complex feature', Language Sciences, 43, 5-17.
--- (2017), 'Calibrating complexity: How complex is a gender system?', Language Sciences, 60, 53-68.
Carling, Gerd (2019), Mouton Atlas of Languages and Cultures. Vol. 1: Europe, Caucasus, Western and Southern Asia (Berlin - New York: Mouton de Gruyter).
Corbett, Greville G. (1991), Gender (Cambridge Textbooks in Linguistics; Cambridge: Cambridge University Press).
--- (2014), The expression of gender (Berlin: De Gruyter Mouton).
Corbett, Greville G. and Fraser, Norman M. (2000), 'Gender assignment: a typology and a model', in Gunter Senft (ed.), Systems of Nominal Classification (Cambridge: Cambridge University Press), 293-325.
Corbett, Greville G. and Fedden, Sebastian (2016), 'Canonical Gender', Journal of Linguistics, 52 (3), 495-531.
Van Epps, Briana (2019), Sociolinguistic, comparative and historical perspectives on Scandinavian gender: With focus on Jamtlandic (PhD dissertation, Lund).

Distribution of the genders alternans, commune, neuter, feminine, and masculine in the dataset (lexemes of 104 concepts in 105 Indo-European languages)

Timeline of gender distribution in the lexical dataset (by Briana Van Epps).