Most languages are not English


Something has gone wrong with psychology experiments. Only 12% of the world’s population lives in Western industrialized countries, yet 96% of participants in psychology experiments come from those countries. It is therefore fair to ask to what extent the results of the many psychology experiments conducted every year can be generalized from that 12% to the remaining 88% of the world’s population.


Take, for example, attachment theory, which holds that a relationship with at least one primary caregiver is needed for normal social and emotional development in children, and which stresses the importance of having a secure base. The theory is largely based on Western-biased measures of sensitivity and security, which is no surprise given the study participants. In this view, the emphasis is on the child’s independence, uniqueness and exploration.

But in other parts of the world, for example in Japan, team spirit and cooperation are more important. Securely attached children communicate their feelings more openly – this is considered positive in the Western world but disliked in Japan.


Or take the field of numerical cognition, which studies the cognitive, developmental, and neural aspects of numerical and mathematical thinking. While the Western world organizes time, size, and numbers from left to right, indigenous people of lowland Bolivia organize numbers in either direction. These cross-cultural differences can have a significant impact on psychological theories, for example the Spatial Numerical Association of Response Codes (SNARC) theory, which claims that we use a mental number line in our mathematical reasoning, with low numbers on the left and high numbers on the right.

The importance of considering generalizability across populations was recognized more than a decade ago, when psychologists argued that most people are not WEIRD, that is, they are not from Western, educated, industrialized, rich, and democratic societies, and that findings from undergraduates, American undergraduates in particular, may not necessarily generalize.


Languages are not WEIRD either

For language research, the situation is not very different from that of participants in psychology experiments. The vast majority of psycholinguistics and computational linguistics studies rely on English as the target language.

It is tempting to generalize the results found for English to other languages. We may draw conclusions about language acquisition, language processing, and language disorders. We may use language input for social psychological theories, clinical practice, or language models for artificial intelligence.

However, English is only one of over 7,000 languages in the world, and it is not even the language with the most native speakers. English therefore cannot serve as a prototype for those other seven thousand languages.

In fact, more than half of the world’s languages have a structure quite different from that of English. For example, English has subject-verb-object (SVO) word order, which is common among Indo-European languages. But 58 percent of the world’s languages use another word order (SOV is the most common, but not the only one). Drawing generalizable conclusions about grammatical structures requires taking these other orders into account.

Or take another example. English lacks grammatical gender. Unlike the roughly 50% of the world’s languages that have grammatical gender, in English we do not say “the organization and her employees” or “the book and his pages.” One can imagine that the grammatical gender of a word might have an effect, however small, on the meaning of that word.


Here is a last example. Approximately 60 to 70 percent of the world’s languages feature case systems. In these languages, you can put the words in the order “John kissed Mary” while the case marking makes it clear that it was Mary who did the kissing. English is not one of them: in English, word order determines who did what to whom.
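The contrast between word order and case marking can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the `-NOM` and `-ACC` suffixes are made-up stand-ins for nominative and accusative case, not the morphology of any real language, and the two function names are hypothetical.

```python
# Toy illustration only: "-NOM" and "-ACC" are invented case suffixes,
# not the morphology of any actual language.

def roles_by_word_order(tokens):
    """English-style SVO reading: position alone decides who did what to whom."""
    subject, verb, obj = tokens
    return {"agent": subject, "action": verb, "patient": obj}


def roles_by_case(tokens):
    """Case-marking reading: suffixes decide the roles, so word order is free."""
    roles = {}
    for token in tokens:
        if token.endswith("-NOM"):      # nominative marks the one acting
            roles["agent"] = token[:-4]
        elif token.endswith("-ACC"):    # accusative marks the one acted upon
            roles["patient"] = token[:-4]
        else:                           # unmarked token is treated as the verb
            roles["action"] = token
    return roles


if __name__ == "__main__":
    # Same surface order, different readings.
    print(roles_by_word_order(["John", "kissed", "Mary"]))
    print(roles_by_case(["John-ACC", "kissed", "Mary-NOM"]))
```

In the first call John is the agent because he comes first; in the second, Mary is the agent despite coming last, because the (made-up) case suffix says so.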

Linguistic relations

It is perhaps not surprising that language researchers primarily analyze English. When psychologists were limited to offline experiments, it was understandable that experiments were conducted primarily with English-speaking participants. Once online experiments emerged, it became easier to reach participants all over the world and to step away from WEIRD populations.

Language researchers in psychological and computational fields have been largely tied to English because they simply did not have the tools needed for other languages. Existing text analysis software packages mainly focus on English. Studying languages other than English was more difficult, and comparison between languages was almost impossible.


We recently published an article that aims to change that. It introduces Lingualyzer, a linguistic analysis tool that computes the same 351 multidimensional linguistic measures for 41 different languages. This provides an easy way to compare languages or to analyze a single text written in a particular language.

The world’s 7,000 or so languages are organized into language families, with sister languages being more closely related because they historically stem from the same parent language. Using this tool, we are now able to examine the relationships between some of these languages.

A dendrogram generated from the distance matrix based on differences in Lingualyzer output between languages

Source: Guido Linders and Max Louwerse

One can view these language families from an evolutionary or linguistic perspective, as historical linguists have done. But it may now also be possible to look at these language families from within the languages themselves. Ideally, this requires parallel texts, that is, texts with the same content translated into multiple languages.

Take UN texts, for example: they must say exactly the same thing across multiple languages. We took the Universal Declaration of Human Rights, translated into 41 languages, had Lingualyzer analyze these texts along a large number of dimensions, and grouped the languages based on the linguistic results. The results were striking: the grouping of the languages was broadly consistent with the grouping that language taxonomists arrive at when looking at genetic relationships across languages.
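The general idea of going from per-language measurements to a grouping can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the actual Lingualyzer pipeline: the feature values are made-up toy numbers, and the choices of z-score scaling, Euclidean distance, and average-linkage clustering are assumptions picked for illustration.

```python
import math
from itertools import combinations

# Made-up toy values for three unnamed measures per language.
# These are NOT Lingualyzer output; they only illustrate the procedure.
FEATURES = {
    "English": [1.0, 1.0, 1.0],
    "Dutch":   [1.1, 1.0, 1.1],
    "German":  [1.3, 1.2, 1.2],
    "Spanish": [3.0, 3.0, 3.1],
    "Italian": [3.2, 3.0, 3.0],
}


def zscore_columns(table):
    """Standardize each measure so that no single one dominates the distance."""
    names = list(table)
    columns = list(zip(*(table[name] for name in names)))
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col)) or 1.0
        scaled.append([(x - mean) / sd for x in col])
    return {name: [col[i] for col in scaled] for i, name in enumerate(names)}


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distance(group_a, group_b, table):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(table[x], table[y]) for x in group_a for y in group_b)
    return total / (len(group_a) * len(group_b))


def average_linkage(table):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns the merges in order; read bottom-up, they describe a dendrogram.
    """
    clusters = [(name,) for name in table]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1], table),
        )
        merges.append((a, b, average_distance(a, b, table)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    for a, b, dist in average_linkage(zscore_columns(FEATURES)):
        print(f"{' + '.join(a)}  <->  {' + '.join(b)}  (distance {dist:.2f})")
```

With these toy numbers, the final merge joins the English, Dutch, and German cluster with the Spanish and Italian cluster, which mirrors the kind of family-level split described above. With real measurements the outcome would of course depend on the features, the scaling, and the linkage method chosen.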

Such studies show empirically which languages are closely related to one another. But more importantly, they make it possible to move beyond WEIRD languages.

Just as experimental psychologists and clinical psychologists should be careful about generalizing from WEIRD participants, psycholinguists, cognitive scientists, and other language researchers should be careful about generalizing from WEIRD languages. Thanks to some easy-to-use computational tools, avoiding that is now more feasible.

References

Eberhard, D. M., Simons, G. F., & Fennig, C. D. (2022). Ethnologue: Languages of the World (25th ed.). SIL International.

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83.

Louwerse, M. (2021). Keeping Those Words in Mind: How Language Creates Meaning. Rowman & Littlefield.

Hutchinson, S., & Louwerse, M. M. (2014). Language statistics explain the spatial-numerical association of response codes. Psychonomic Bulletin & Review, 21, 470-478.

Rothbaum, F., Weisz, J., Pott, M., Miyake, K., & Morelli, G. (2000). Attachment and culture: Security in the United States and Japan. American Psychologist, 55(10), 1093-1104.
