The short answer: probably southern Africa.
As I note in Chapter 7 of WEIT, it was Darwin who first pointed out, in The Origin, the similarity between the evolution of languages and the evolution of species. Languages evolve in a straight line, like some lineages of plants and animals, and they sometimes split, so that different languages like French and German nevertheless have a common ancestor. One can draw a “tree of life” for languages just as one can do for species. (The parallel isn’t perfect, of course: there is “horizontal transmission” of words across languages, and, as Steve Pinker pointed out to me, language “mutations” are not “random” in the sense that they don’t arise irrespective of their utility.)
But the parallel between genetic and linguistic evolution resulted in a really nice paper published this week in Science on the origin of human language. The author, Quentin Atkinson at the Department of Psychology at the University of Auckland, was inspired by earlier studies that traced the origin of modern Homo sapiens using genetics and morphology. Those earlier studies showed that both genetic and morphological variation were highest in African populations and declined with distance from Africa. To evolutionary anthropologists, this suggested that modern H. sapiens arose from an out-of-Africa migration that began about 60,000 years ago, and that the colonization of the rest of the world occurred through a series of sequential “founder events,” in which smallish groups of humans moved from one place to another.
Atkinson looked at a linguistic analogue of genetic variation: variation in phonemes, defined as “the smallest contrastive unit in the sound system of a language.” English, for example, has about 44 phonemes, including many consonant sounds like “b” and “p”, and fewer vowel phonemes (the “a” sounds in “about” and in “bad” are two distinct phonemes). It’s also been known for a while that different human languages vary widely in the number of phonemes. One site says the following:
The total number of English phonemes is about 44, the exact number depending on the speaker’s accent. In terms of the languages of the world, the smallest number of phonemes known to exist is the 11 of Rotakas, an Indo-Pacific language. [JAC: Hawaiian has 13.] The largest is the 141 of !Xu, a language spoken in southern Africa; the average number of phonemes in a language is in fact 31. About 70% of languages have between 20 and 37 phonemes.
And, as Pinker noted when I asked him about this paper (which he likes), “I always noticed that the San [JAC: previously called “Bushmen”] had more than a hundred phonemes, the Polynesians less than a dozen (hence the long, polysyllabic names in Hawaii and New Zealand).”
It’s also well known that the number of phonemes in a language is positively correlated with the number of its speakers, presumably because phonemes undergo stochastic loss by “phoneme drift” in small populations, just as genetic variation undergoes stochastic loss by genetic drift in small populations.
Atkinson decided to study the geographic distribution of phonemes in languages throughout the world. He looked at 504 languages for which phoneme number was available, and immediately observed that, like genetic variation itself, phoneme diversity was highest in Africa and lowest in Oceania, with “clinal” (gradual geographic) variation from high to low number in between. Here’s the plot from Figure 1 of his paper:
(He also confirmed that phoneme diversity was indeed correlated with the number of speakers of a language.)
The plot above suggested another hypothesis: that language originated in Africa, where it retains a high number of ancestral phonemes, and then spread through successive founder events to the rest of the world, losing phonemes through “linguistic drift” at each event. This, of course, would require something that linguists find nearly unbelievable: that modern languages retain vestiges of the structure they had 60,000 to 10,000 years ago, the period when modern humans colonized the globe. The clinal variation in phonemes in the plot above would then reflect the successive loss of “sound units” as each new population was established by a small number of migrating ancestors.
To test this, Atkinson made a model that assumed spoken language had originated in one place and spread throughout the globe, and also that phoneme number was correlated with present population size. He then removed the effect of population size to pinpoint an area where language could have originated. Here’s Figure 2A from his paper, with the most likely area of language origin being the lightest color, and successively darker regions showing the inverse relationship between phonemic diversity and distance. As you see, Ground Zero for language is southwest Africa:
This mirrors very nicely the path of human migration out of Africa suggested by genetic and phenotypic data: we moved into Eurasia, then eastern and southeastern Asia, crossed the Bering Strait about 20,000 years ago, and made it to Polynesia only a few thousand years ago. Compare the Old World part of the map above with this map of human genetic diversity taken from the New York Times. Both maps show the same pattern: a high-diversity focus in southwest Africa, with diversity decreasing as one moves farther from that area.
Another nice result was Atkinson’s observation that (after controlling for population size) phoneme diversity also declines in the Americas with distance from the Bering Strait, as expected if our ancestors hip-hopped southward after the migration from Asia.
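For readers who like to see the logic spelled out, here is a rough Python sketch of the kind of origin search described above. It is a simplified stand-in, not Atkinson's actual procedure: it first regresses phoneme count on log number of speakers, then asks which candidate origin has the most negative correlation between distance and the population-adjusted phoneme counts. Everything in it is an assumption made for illustration: the data are synthetic, the candidate locations are rough coordinates I picked, the function names are hypothetical, and distance is plain great-circle distance rather than any realistic migration route.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

# Hypothetical candidate origins as rough (latitude, longitude) pairs in degrees.
CANDIDATES = {
    "southern Africa": (-20.0, 20.0),
    "western Europe": (50.0, 10.0),
    "East Asia": (35.0, 105.0),
    "Australia": (-25.0, 135.0),
    "North America": (40.0, -100.0),
}


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(xs, ys):
    """Residuals of ys after a simple least-squares regression on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


def best_origin(languages, candidates):
    """Return (best_name, scores): the candidate whose distance correlates
    most negatively with population-adjusted phoneme counts, plus all scores."""
    log_speakers = [math.log10(lang["speakers"]) for lang in languages]
    phonemes = [lang["phonemes"] for lang in languages]
    adjusted = residuals(log_speakers, phonemes)  # remove the population effect
    scores = {}
    for name, location in candidates.items():
        dist = [great_circle_km(location, lang["location"]) for lang in languages]
        scores[name] = pearson(dist, adjusted)
    return min(scores, key=scores.get), scores


def make_synthetic_languages(true_origin, n=200, seed=0):
    """Toy data only: phoneme counts rise with log speakers and fall with
    distance from true_origin, plus a little noise. Not real measurements."""
    rng = random.Random(seed)
    languages = []
    for _ in range(n):
        location = (rng.uniform(-40, 60), rng.uniform(-120, 150))
        speakers = 10 ** rng.uniform(2, 7)
        phonemes = (40 + 2 * math.log10(speakers)
                    - 0.0015 * great_circle_km(true_origin, location)
                    + rng.gauss(0, 1))
        languages.append(
            {"location": location, "speakers": speakers, "phonemes": phonemes}
        )
    return languages


if __name__ == "__main__":
    toy = make_synthetic_languages(CANDIDATES["southern Africa"])
    best, scores = best_origin(toy, CANDIDATES)
    for name, r in sorted(scores.items(), key=lambda item: item[1]):
        print(f"{name:16s} r = {r:+.3f}")
    print("best-supported origin in this toy data:", best)
```

Because the toy data were built with a decline away from the southern African point, the search should recover that point; the interesting part of the real study is that actual languages show a comparable signal. A serious analysis would compare many more candidate origins and use a proper model-selection criterion rather than a raw correlation.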
Atkinson took into account possible complicating factors, like a multi-region origin of language (not really supported by the data), and the idea that phonemic diversity simply results from more contact with speakers of other languages, so you get diverse by absorbing the speech of your neighbors (that wasn’t supported, either). As far as I can see in my linguistic ignorance, Atkinson’s conclusions appear not only provocative but pretty sound. It’s amazing to think that modern languages retain, in their number of phonemes, vestiges of their ancestry and information about early human migration. This will be a surprise to linguists, who, as far as I know, see ancestry of languages decaying completely after a few thousand years.
At the end of his paper, Atkinson suggests that language itself may have fostered the “out-of-Africa” movement of H. sapiens:
Truly modern language, akin to languages spoken today, may thus have been the key cultural innovation that allowed the emergence of these and other hallmarks of behavioral modernity and ultimately led to our colonization of the globe.
I think that’s taking it a bit too far: after all, there could have been many other reasons for colonization, including increased population sizes that mandated movement to avoid competition (these sizes would, of course, be correlated with phoneme number!) or the decimation of big game. Every researcher likes to think that her object of study was the key feature in creating modern human culture. But I can excuse Atkinson’s last speculative sentence, for his analysis is truly creative and remarkable—a pathbreaking study of the relationship between human language and human evolution.
____________
Atkinson, Q. D. 2011. Phonemic diversity supports a serial founder effect model of language expansion from Africa. Science 332:346-349.



