Where on Earth did language begin?

April 16, 2011 • 5:42 am

The short answer: probably southern Africa.

As I note in Chapter 7 of WEIT, it was Darwin who first pointed out, in The Origin, the similarity between the evolution of languages and the evolution of species.  Languages evolve in a straight line, like some lineages of plants and animals, and they sometimes split, so that different languages like French and German nevertheless have a common ancestor.  One can draw a “tree of life” for languages just as one can do for species. (The parallel isn’t perfect, of course: there is “horizontal transmission” of words across languages, and, as Steve Pinker pointed out to me, language “mutations” are not “random” in the sense that they don’t arise irrespective of their utility.)

But the parallel between genetic and linguistic evolution resulted in a really nice paper published this week in Science on the origin of human language.  The author, Quentin Atkinson at the Department of Psychology at the University of Auckland, was inspired by earlier studies that traced the origin of modern Homo sapiens using genetics and morphology.  Those earlier studies showed that both genetic and morphological variation were highest in African populations, and declined with distance from Africa.  To evolutionary anthropologists, this suggested that modern H. sapiens arose from an out-of-Africa migration that began about 60,000 years ago, and that the colonization of the rest of the world occurred through a series of sequential “founder events,” in which smallish groups of humans moved from one place to another.

Atkinson looked at a linguistic analogue of genetic variation: variation in phonemes, defined as “the smallest contrastive unit in the sound system of a language.”  English, for example, has about 44 phonemes, including many consonant sounds like “b” and “p”, and fewer vowel phonemes (the “a” sound in “about” and in “bad” are two distinct phonemes).  It’s also been known for a while that different human languages vary widely in the number of phonemes.  One site says the following:

The total number of English phonemes is about 44, the exact number depending on the speaker’s accent. In terms of the languages of the world, the smallest number of phonemes known to exist is the 11 of Rotokas, an Indo-Pacific language. [JAC: Hawaiian has 13.] The largest is the 141 of !Xu, a language spoken in southern Africa; the average number of phonemes in a language is in fact 31. About 70% of languages have between 20 and 37 phonemes.

And, as Pinker noted when I asked him about this paper (which he likes), “I always noticed that the San [JAC: previously called “Bushmen”] had more than a hundred phonemes, the Polynesians less than a dozen (hence the long, polysyllabic names in Hawaii and New Zealand).”

It’s also well known that the number of phonemes in a language is significantly correlated with the number of its speakers, presumably because phonemes undergo stochastic loss by “phoneme drift” in small populations—just as genetic variation undergoes stochastic loss by genetic drift in small populations.

Atkinson decided to study the geographic distribution of phonemes in languages throughout the world.  He looked at 504 languages for which phoneme number was available, and immediately observed that, like genetic variation itself, phoneme diversity was highest in Africa and lowest in Oceania, with “clinal” (gradual geographic) variation from high to low number in between. Here’s the plot from Figure 1 of his paper:

(He also confirmed that phoneme diversity was indeed correlated with the number of speakers of a language.)

The plot above suggested another hypothesis: that language originated in Africa, where it retains a high number of ancestral phonemes, and then spread through successive founder events to the rest of the world, losing phonemes through “linguistic drift” at each event.  This, of course, would require something that linguists find nearly unbelievable: that modern languages retain vestiges of the structure they had 60,000 to 10,000 years ago, the period when modern humans colonized the globe.  The clinal variation in phonemes in the plot above would then reflect successive loss of “sound units” by successive establishment of populations by small numbers of migrating ancestors.
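To make the serial founder idea concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions, not anything from Atkinson's paper: the starting inventory of 100 phonemes, the 90% chance that each phoneme survives a founder event, and the function name `serial_founder_counts` are all invented for illustration. The model also never lets a language gain a phoneme, which is exactly the simplification that makes the decline automatic.

```python
import random


def serial_founder_counts(n_phonemes=100, n_events=8, keep_prob=0.9, seed=0):
    """Return the phoneme inventory size before and after each founder event.

    Assumption (illustrative only): at every founder event each phoneme is
    passed on independently with probability keep_prob, and no new phonemes
    are ever created.
    """
    rng = random.Random(seed)
    inventory = set(range(n_phonemes))
    sizes = [len(inventory)]
    for _ in range(n_events):
        # A small founding group carries only part of the parent inventory.
        inventory = {p for p in inventory if rng.random() < keep_prob}
        sizes.append(len(inventory))
    return sizes


sizes = serial_founder_counts()
for step, size in enumerate(sizes):
    print(f"after {step} founder events: {size} phonemes")
```

Because phonemes can only be lost in this toy, the counts can never go up from one event to the next. Whether real languages behave this way over tens of thousands of years is the open question.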

To test this, Atkinson made a model that assumed spoken language had originated in one place and spread throughout the globe, and also that phoneme number was correlated with present population size.  He then removed the effect of population size to pinpoint an area where language could have originated. Here’s Figure 2A from his paper, with the most likely area of language origin being the lightest color, and successively darker regions showing the inverse relationship between phonemic diversity and distance.  As you see, Ground Zero for language is southwest Africa:

This mirrors very nicely the path of human migration out of Africa suggested by genetic and phenotypic data:  we moved into Eurasia, then eastern and southeastern Asia, crossed the Bering Strait about 20,000 years ago, and made it to Polynesia only a few thousand years ago.  Compare the Old World part of the map above with this map of human genetic diversity taken from the New York Times.  Both maps show the same pattern: a high-diversity focus in southwest Africa, with diversity decreasing as one moves further from that area.

Another nice result was Atkinson’s observation that (after controlling for population size) phoneme diversity also declines in the Americas with distance from the Bering Strait, as expected if our ancestors hip-hopped southward after the migration from Asia.

Atkinson took into account possible complicating factors, like a multi-region origin of language (not really supported by the data), and the idea that phonemic diversity simply results from more contact with speakers of other languages, so you get diverse by absorbing the speech of your neighbors (that wasn’t supported, either).  As far as I can see in my linguistic ignorance, Atkinson’s conclusions appear not only provocative but pretty sound.  It’s amazing to think that modern languages retain, in their number of phonemes, vestiges of their ancestry and information about early human migration. This will be a surprise to linguists, who, as far as I know, see ancestry of languages decaying completely after a few thousand years.
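For readers who want to see the logic of the origin search, here is a minimal Python sketch. It is not Atkinson's actual model, which is more elaborate. It swaps in a simpler technique: remove the population effect with a one-predictor least-squares fit on log speaker numbers, then pick the candidate origin whose distance has the most negative Pearson correlation with what is left. The six "languages", their speaker counts and phoneme counts, and the two candidate points A and B are all made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def residuals(xs, ys):
    """Residuals of ys after a one-predictor least-squares fit on xs."""
    dx, dy = _centered(xs), _centered(ys)
    slope = sum(x * y for x, y in zip(dx, dy)) / sum(x * x for x in dx)
    return [y - slope * x for x, y in zip(dx, dy)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    dx, dy = _centered(xs), _centered(ys)
    num = sum(x * y for x, y in zip(dx, dy))
    return num / math.sqrt(sum(x * x for x in dx) * sum(y * y for y in dy))


def best_origin(languages, candidates):
    """languages: (lat, lon, speakers, phonemes); candidates: name -> (lat, lon)."""
    log_speakers = [math.log10(lang[2]) for lang in languages]
    # Step 1: take out the part of phoneme count explained by population size.
    adjusted = residuals(log_speakers, [lang[3] for lang in languages])
    scores = {}
    for name, point in candidates.items():
        # Step 2: how strongly does adjusted diversity fall with distance?
        dist = [great_circle_km(point, (lang[0], lang[1])) for lang in languages]
        scores[name] = pearson(dist, adjusted)
    # The most negative correlation marks the most plausible origin.
    return min(scores, key=scores.get), scores


# Hypothetical toy data: six invented languages strung along the equator.
TOY_LANGUAGES = [
    (0.0, 20.0, 100_000, 53),
    (0.0, 40.0, 1_000, 45),
    (0.0, 60.0, 1_000_000, 47),
    (0.0, 80.0, 10_000, 39),
    (0.0, 100.0, 100_000, 37),
    (0.0, 120.0, 1_000, 29),
]
TOY_CANDIDATES = {"A": (0.0, 10.0), "B": (0.0, 130.0)}

best, scores = best_origin(TOY_LANGUAGES, TOY_CANDIDATES)
print("best candidate origin:", best)
print("distance correlations:", scores)
```

In the toy data the phoneme counts were deliberately built to fall with distance from point A, so A wins. The real analysis does the same kind of comparison over many candidate locations worldwide.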

At the end of his paper, Atkinson suggests that language itself may have fostered the “out-of-Africa” movement of H. sapiens:

Truly modern language, akin to languages spoken today, may thus have been the key cultural innovation that allowed the emergence of these and other hallmarks of behavioral modernity and ultimately led to our colonization of the globe.

I think that’s taking it a bit too far: after all, there could have been many other reasons for colonization, including increased population sizes that mandated movement to avoid competition (these sizes would, of course, be correlated with phoneme number!) or the decimation of big game.  Every researcher likes to think that her object of study was the key feature in creating modern human culture.  But I can excuse Atkinson’s last speculative sentence, for his analysis is truly creative and remarkable—a pathbreaking study of the relationship between human language and human evolution.


Atkinson, Q. D. 2011.  Phonemic diversity supports a serial founder effect model of language expansion from Africa.  Science 332:346-349.

65 thoughts on “Where on Earth did language begin?”

  1. Horizontal transmission is similar to “horizontal gene transfer” (although more common in language). I would say that mutations aren’t always random in genes (because of natural selection varying). You could describe words like “night” as being Hox genes and words like “serendipity” as being more similar to, say, a pseudogene (in mutation rate).

  2. What was the “original language” spoken by Adam and Eve, who taught them that language? How did Cain communicate with his new wife in the land of Nod?
    Now these are real questions science needs to answer !!

  3. One thing I’m hoping might one day happen is the reconstruction of ancestral species. We can use genetic analysis to determine a great deal about the archetypal mammal…so is there any chance we can learn enough for Craig Venter to one day grow one in a petri dish?

    It seems to me that Atkinson’s work points the way to doing the same with our grandmother tongue.

    Of course, in neither case would the reconstruction be 100% perfect. Just like with fossilized skeletons, educated guesses will have to be made to fill in missing pieces. However, that no more seems to me to be a valid reason for failing to attempt this sort of reconstruction than it does to paleobiologists.



    1. There is much more difficulty in making credible linguistic reconstructions than there is for biological ones. For a start, the linguist can only work with languages that either had written traditions or that are spoken today, unlike with fossils, present day species, DNA, and so on. Bear in mind, too, that written languages only rarely copy the spoken language perfectly; most of the time, written language retains archaisms not employed in the spoken language, or due to style constraints, only a certain register of the language is preserved. A distorted, idealised language full of archaisms – not the best data for reconstruction.

      Even reconstructing proto-Indo-European, or any proto-language for any attested language family, was and is something of a difficult task. It is also not a language that has ever been actually spoken by anyone, but an heuristic to look at language change. So we can reconstruct a plausible but idealised form of language that bears some relationship to a language from which attested languages diverged, but for accurate reconstruction, you can’t compare proto-languages without a huge leap of faith, and the other method – mass comparison – is not particularly rigorous either, except for extremely broad, analytically-nigh-on-useless conclusions.

      So the closest we can come to scientifically recreating the earliest language of mankind is to state a maxim: “the earliest human language had a very large number of phonemes”, for instance. Beyond that, it’s probably impossible to do anything.

      At least, that is how it appears from here.

      1. Well, the point of Atkinson’s research, it seems to me, is that language did evolve in a manner similar to the way biological species do, which implies that the same sorts of genetic analyses should be possible.

        If we can reconstruct ancestral genomes, we should be able to do the same with languages.

        Again, it won’t be anywhere near perfect. But I don’t think anybody goes to a natural history museum, looks at the T. rex skeleton suspended from the ceiling, and concludes that the Jurassic was dominated by animated skeletons or that the animal on display lived with a plaster-of-paris hip bone and had steel ligaments.



        1. which implies that the same sorts of genetic analyses should be possible.

          Unfortunately, no. Phonemes are very, very far removed from the fundamental elements of the universe – much further removed than genes – and they change in ways that make reconstruction difficult. For instance, in establishing the relationship between the Austronesian and Tai-Kadai languages (both well-attested language groups), linguists have had to make do with a single language that had a small number of proto-Austronesian-like archaisms (Buyang, IIRC) and another small number of potential similarities in core Tai-Kadai and Western Austronesian vocabulary based on *only six sound changes*! (For the record, a single sound change can be the difference between an F and an H in a word [ie, Samoan “fale”, Hawaiian “hale”], or a T and a glottal stop, so six sound changes can be the difference between, well, almost any phonemes.)

          And that’s for a time depth of “only” 6,000 years or so. Unlike genes, phonemes change in ways that, while occasionally predictable, in the long term show no similarities whatsoever, at which point accurate reconstruction is literally impossible. Genes at the very least share some core molecules over enormous time depths and geographical spans. Combined with the much greater range of evidence (fossils, etc), and the story is quite different.

          And at least a plaster dinosaur skeleton is visually identical to an actual dinosaur skeleton. A proto-language, especially one at a time depth of above ten thousand years, let alone 100,000 or more, would have errors of the same magnitude as putting a femur in the jaw.

          It’s a shame, but historical linguistics is restricted in its capability, not because we want it to be but because its evidentiary basis is.

        2. I fail to see how this research shows that “language did evolve in a manner similar to the way biological species do.”

          Languages are not sets of phonemes. Even if we took this paper as evidence that many (and “all” would be going overboard) phonemes have “daughters” that spread throughout the world, that would not be evidence that the languages themselves were genetically related, provided you think the individuation conditions of languages involve, say, their syntax and semantics.

    2. If you were to “clone” an ancestral human, dressed him in a suit and sat him in Congress, nobody would be able to tell the difference.

      Of course, that doesn’t say much about Congress … nevertheless …

  4. AACK! Paper is behind a pay wall! *headdesk*

    Very exciting analysis! Those two maps gave me chills–what an amazing match-up. Dang–I knew there was a reason I should have learned !Hausa at school…

    “English, for example, has about 44 phonemes…” Don’t you wish English had about 44 symbols to go with those 44 phonemes, instead of only 26? Would make life so much easier ;-))

    1. good idea ~ I’ve noticed recently (as one of a few examples) that people are writing “loose” when they mean “lose”. More symbols for the consonants would solve this (The arts classics scholars would be annoyed I guess)

      Looking at ‘clicks’ I think it’s very interesting that this element of speech can be reinvented as in the bolded below from wikipedia on the Khoisan languages:

      English and many other languages may use clicks in interjections, such as the dental “tsk-tsk” sound used to express disapproval, or the lateral tchick used with horses. In Ningdu Chinese, flapped nasal clicks are used in nursery rhymes. In Persian, Greek, Maltese, Turkic and Levantine Arabic as well as southern Italian dialects such as Sicilian, a click accompanied by tipping the head upwards signifies “no”. Clicks occasionally turn up elsewhere, as in the special registers twins sometimes develop with each other, and in onomatopoeic usages

      I wonder if the reinvention of ‘clicks’ is just something that happens because it is in our vocal range or if it is a race memory ?

      1. What do you mean by “race memory”? If you mean cultural transmission from ancient times, then it’s not reinvented. If you’re talking about inherited anatomical capabilities, how is that different from “something that happens because it is in our vocal range”?

        1. Instinct.

          Like how we automatically throw out our arms when we topple ~ this instinct from an arboreal past is often a bad move outside of tree-living

          Like how when you tickle a human baby foot it grasps for the first six (?) months

          When twins invent a private language it can include ‘clicks’ & I wondered if there was more to this than ‘because we can’

      2. The cover term “clicks” includes at least five entirely different articulatory placements; on some you breathe in, on others you breathe out, and more. These sounds are common all over the world, but only in Southern Africa are they considered to be phonemes instead of paralinguistic.

    2. Phonetic spelling has its benefits, but also some significant drawbacks. The thing that bothers me the most about it is that it destroys evidence of the semantic or etymological relationships between words.

      Example: The first vowel sounds in “mage”, “magic”, and “magician” are all different (in American English, at least), which means the “a” would have to be a different letter in each one. Same thing with the “c” in magic and magician. Phonetic spelling would make these words not look so much like each other, even though they are quite related, and the way we spell them now reflects that.

  5. Jerry, in Atkinson’s model Ground Zero of language appears to be from Southwestern Africa, not Southeastern as you stated. A possible origin could be Pinnacle Point Cave, near Mossel Bay in South Africa. Wikipedia has an interesting article at: http://en.wikipedia.org/wiki/Pinnacle_Point#History_of_the_research
    Could the marine diet be the cause of a rapid acceleration of intelligence?
    Ivory Girl, there was no Adam and Eve, so they couldn’t have had children. Easy to answer.

    1. I saw Curtis Marean speak about his research on Pinnacle Point. Very intriguing stuff. And the proximity to the point of linguistic origins adds to it!

  6. Your link goes to a different article in Nature. Atkinson’s found a correlation, not causation, over about 1/10 of the world’s languages, which is interesting, maybe. His hypothesized explanation is very controversial: it seems to assume languages always become simpler, with no mechanisms that create new phonemes, contra the facts.

    Just offhand, Piraha and Mura, in the Amazon, also have 11 phonemes, many of the Canadian languages have large phoneme inventories, as do those in Caucasia. Can anyone who can read the article tell me how he counts phonemes with regards to tones, vowel length, clicks, and such?

    1. I’m familiar in passing with WALS (http://wals.info/feature, where he got his data). It’s unclear whether he just took phoneme ranges, or somehow got access to more exact data. At any rate, I think WALS gets their data from published grammars of various languages, so Atkinson’s analysis is as legitimate as those grammars are.

  7. There is a lot of criticism of the methodology. See here and here.

    Phonemes are an artifact of linguistic analysis, not any part of a language itself. Modern phonology has pretty much left Phoneme Theory in the dust. One problem =
    If language consists of phonemes, and
    phonemes are “the smallest contrastive unit in the sound system”, but
    some languages have no sound system,
    then either the definition is wrong, or some languages are not languages.

  8. Fascinating.

    I’m not sure, though, why you can’t reach an opposite conclusion (ie, fewer phonemes would represent a simpler and therefore more archaic language). With increasing distance from the basal language, why would you lose phonemes and not add them? In fact, why does the number of phonemes actually tell you anything about the distance from the basal language, and not merely be an indicator of local conditions where there is a need for a click or a “th” sound to distinguish “tasty snake” from “dangerous snake”?

    Of course, you’re presented with the evidence in front of you and try to make sense of it in context. But I’m not sure that the study truly tells us anything about language origins that doesn’t require some strong presuppositional biases.

    Not that those biases aren’t correct … I just don’t see reaching an independent conclusion without them.

    Phenomenon or epi-phenomenon, that is the question.

    1. The origin of a proto-language is most likely to be in the area in which there is the greatest diversity of languages in the language family. This is not just a supposition; there is a fair amount of evidence for it, and plenty of reason behind it. Think of the Founder Effect, for instance. This is also applicable not only at the extreme long term, as Atkinson has tried to employ it, but also in the relative short term of proto-language reconstruction.

      1. Well, that’s another statement for which I could argue the exact opposite point of view. Seems to me that a tribe of humans who separate far from congress with other tribes of humans would be more likely to develop a separate language family than those who are joined by proximity, and who might trade with, share mates, war against, etc.

        For example, I would expect that the language of native Americans would be fundamentally different from the Mongolian tribal languages that they carried across the Bering Strait (assuming that scenario is correct). And that indeed is what you see.

        It’s all interesting stuff; but I’m not sure the research isn’t being used to reach conclusions that have already been intuited.

        Again, they could be completely valid … but I haven’t seen anything here that convinces me that other conclusions aren’t at least as valid.

        I don’t see much null-hypothesis disproving in such research. It may well be impossible to do so.

        Gotta run … can’t continue to discuss …

        1. In Taiwan, there is a large amount of linguistic diversity in the indigenous languages, which are all Austronesian. All the Austronesian languages outside of Taiwan show only a subset of this diverse vocabulary, and lots of loanwords from other languages (like Sanskrit or Arabic) that have changed to fit the phonology of the Austronesian language. The languages on Taiwan don’t have the loanwords from Sanskrit and Arabic, but they do have the original core vocabulary, as well as extra vocabulary related to the core vocabulary that has been lost by non-Formosan Austronesian languages.

          I hope that makes sense, because that’s how we can see that the centre of diversity is also the point of origin for the language family.

          So the Austronesian languages on Taiwan are called ‘Formosan’ languages, and the Austronesian languages off Taiwan (Javanese, Hawaiian, Malagasy, Maori, and so on) all belong to a subset known as ‘Malayo-Polynesian’, descended from one language that left Taiwan about five thousand years ago, whose closest relative today is probably Paiwan, spoken on the southern tip of Taiwan, facing the Philippines. Archaeology also bolsters the idea that Austronesian speakers went south from Taiwan. So a number of lines of evidence show that the centre of diversity is probably the point of origin.

          Diversity in genetics is also indicative of a place of origin, again due to the same founder effect.

          For these reasons, Atkinson is correct to assume that the centre of diversity is also likely the place of origin. However, due to the expansion of the Niger-Congo speakers from the northwest (the subset known as the Bantu), the concentration of Khoisan languages (which have lots of phonemes) in the southwest of Africa is a result of a contingent and relatively recent historical process. Before the Bantu expansion, Khoisan languages were almost certainly more widely distributed throughout southern Africa, and placing the origin point in southwest Africa is premature, especially given other evidence.

          Apologies for the long post, but you left a lot to answer!

    2. From the article:
      “If phoneme distinctions are more likely to be lost in small founder populations, then a succession of founder events during range expansion should progressively reduce phonemic diversity with increasing distance from the point of origin, paralleling the serial founder effect observed in population genetics.”

    3. From the article:

      “If phoneme distinctions are more likely to be lost in small founder populations, then a succession of founder events during range expansion should progressively reduce phonemic diversity with increasing distance from the point of origin, paralleling the serial founder effect observed in population genetics.”

      Pretty big ‘if’.

    4. I’m no linguist, but I do know for sure that in the Indo-European megafamily, as you go back in time towards the invention of writing, grammar tends to get MORE complex and precise, not less. English is a right ruddy mess in comparison.

      1. Kevin, I would have thought that once people had begun to realise what they were doing language would have become maximally complex very fast – I mean, within a few generations, not thousands of years. Whether this would have involved an increase in the number of phonemes would depend, I suppose, partly on the way a particular language had evolved in the first place (e.g. tones, moveable stress, more vocalic consonants or more vowels, modification of consonants or vowels, etc), and I suppose partly on the increasingly varied pronunciations that each new generation seems to create and how acceptable these prove to be (I mean, whether they are expressive or useful or confusing, for example).

        Re abadidea’s point, ancient Indo-European accidence is more complex than that of English (Dutch, French, German, modern Greek…) but accidence is only one part of grammar. English grammar is as complicated as you please, but accidence plays only a small role. I suppose if you look at ancient Greek and set it out in charts and tables of conjugations and declensions and indicatives and subjunctives and what have you then it all looks too too amazingly complicated. But it mostly boils down to tacking a limited number of suffixes onto a limited number of word-stems, and the same job is handled perfectly well in English by the use of verbal particles, pronouns and prepositions (which we regard as separate “words”). This makes English look more simple. Part of the difficulty is the survival of older systems, such as vowel-modification, side-by-side with more recently evolved patterns, like “weak” verbs. This no doubt partly accounts for the impression of amazing complexity in ancient Greek and the “right ruddy mess” of English.

        I don’t suppose the average ancient Greek worried too much about it, just as modern people generally fail to see the extraordinary complexity of their own languages. And why shouldn’t a high degree of accidence be a primitive feature? As for the greater precision of the older languages, my only experience is a fair acquaintance with Latin, and it seems to me that Latin is as precise as it needs to be, but it is not free from clumsiness and all its accidence cannot prevent it from sometimes being ambiguous and vague.

  9. We know that languages that use sign instead of phonemes have arisen fairly often, and independently of each other (they usually require a critical mass of deaf people at a young age, such as in a school for the deaf). One example would be Nicaraguan Sign Language, which is fairly new.

    Another thing to remember about languages is the existence of creoles. A crocoduck can’t exist but the linguistics’ equivalent can.

    1. Creoles are crocoducks–LOL.

      There are no languages that use sign instead of phonemes, that’s a mischaracterization of what a phoneme is. What there are is languages that organize manual (as opposed to oral) movements into phonemes.

      As you point out, these languages are proven to originate in many different times and places, so by the basic science principle of Uniformitarianism, there should be no monogenetic origin of spoken language, in Africa or elsewhere.

      1. Actually the “phonemes” of sign languages – the first-level units – are the handshapes (which used to be called cheremes but are now confusingly called “phonemes” as well), which are placed and moved to make words/signs, the second-level units, which are combined in time and space to create utterances/sentences.

        Most sign languages seem to have about 35 handshapes, but there is no correspondence with acoustic phonemes.

        1. Actually, actually the phonemes are minimally contrastive units, just the same as they are with any language. Once they were understood, the term “chereme” was dropped since there is no need for redundant terminology to describe the same thing.

          Also, hand shapes are not ‘first level units’ but one of several parameters that must be expressed simultaneously to make up a single unit, e.g. Location, Manner of Movement, and hand or tongue Shape, Tone.

          It’s the simultaneity that confuses. Counting these simultaneous features is problematic, since a system of forty sounds and four tones could give 44 units or 160, depending, and when you deal with exotic sounds like clicks the problem is even worse. That’s why I ask how Atkinson did it.
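The two totals in that example come from two different counting conventions, which can be written out as a small Python sketch. The function name `count_units` and its flag are invented for illustration; nothing here says which convention Atkinson or WALS actually used.

```python
def count_units(n_sounds, n_tones, tones_combine_with_sounds):
    """Count contrastive units under one of two conventions.

    If tones are counted as separate units alongside the sounds, the totals
    add. If every sound-plus-tone combination is its own unit, they multiply.
    """
    if tones_combine_with_sounds:
        return n_sounds * n_tones
    return n_sounds + n_tones


print(count_units(40, 4, tones_combine_with_sounds=False))  # additive count
print(count_units(40, 4, tones_combine_with_sounds=True))   # multiplicative count
```

With forty sounds and four tones the additive convention gives 44 and the multiplicative one gives 160, so the same language can land in very different places on a phoneme-diversity scale.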

    2. The difference with those examples is you have children who are faced with an incomplete language system to begin with, and their brains naturally create the missing pieces (which is amazing).

      But there’s no reason to think that something like this happens when children are exposed to a complete language from the start. After all, children all over the world are learning their native languages right now without adding new phonemes to them.

      1. I’d guess that before language first began children weren’t being exposed to complete languages. The assumption is that spoken language (in clear contrast to signed ones) originated once or in multiple places. Sproat says in his review
        “To test the possibility of polygenesis, he considers models with a second point of origin. That analysis posits South America as a second point of origin, but this implausible result is argued to be an artefact.”
        Why it argues this he doesn’t say.

        1. No, it’s very annoying. I suspect prejudice. I suspect that that is what you politely didn’t say.

        2. Interesting result.

          Also interesting to note that critics try to tear a successful predictive theory down instead of proposing an alternative that explains the same phenomena. The theory tests well, and what are the chances that is an accident? The theory also coincides with the similar gene theory, and what are the chances for _that_?

          What is needed is an alternate theory and linguistics doesn’t seem to be it, for many reasons that commenters propose. I see that some comments claim that linguistics reveal language relations. (If they predict it or just describe, I don’t know.)

          But this isn’t a lineage but diversity study, it isn’t descriptive but testable. Maybe that explains the differences.

  10. Well, I can’t see how phonemic diversity even starts as a basis. Surely there are far too many variables. (Not that I know anything about it, anyway). There’s a very interesting (sceptical) discussion at http://groups.yahoo.com/group/Indo-Eurasian_research/messages/15030?threaded=1&m=e&var=1&tidx=1

    Johanna Nichols identifies a small number of phonemes that she believes significant in her hypothesis about what can be learnt about ancient human migration out of Africa by linguistic investigation, but the main emphasis of her thesis is on grammatical features, which she believes are sufficiently stable over vast periods of time to be useful. There’s an old review here: http://scicom.ucsc.edu/scinotes/9901/echoes/echoes.htm
    I’ve not heard anything more about this since the review in the New Scientist, back in 2000 (basically the same as the above). Perhaps it got squashed.

    1. That’s the same discussion I linked to above, includes the authors of both the paper itself and the criticism mentioned below.
      They describe in there how Georgian added a new click phoneme by slightly adjusting the timing of the release of the airstream.

  11. http://www.languagehat.com linked to a pretty critical review of Atkinson. http://www.cslu.ogi.edu/~sproatr/newindex/atkinson.html

    He asks how you get a language with a large phoneme set given prehistoric population sizes and patterns, why we should expect African phoneme inventories to remain more stable than those of other regions (it’s not like Africa hasn’t seen migrations or anything), and then raises the point that Atkinson didn’t compare phoneme inventories, but phoneme inventory ranges.

  12. I suppose that if diphthongs are included in the class of phonemes, then such phenomena as the Great English Vowel Shift added appreciably to the number of phonemes in English, but does this relate to the sheer numbers of English-speakers in the 15th century, and how does such a local development say anything about human languages more than 50 000 years ago?

    1. That’s actually something that struck me in the last few minutes. I’m not sure how we can draw conclusions about founder effects and language coming out of Africa by studying modern (or recently attested) languages. English has innovated massively since Indo-European; for instance, our entire vowel system is 7 or 8 times the size of that in some reconstructions of IE. I know not every proto-language is on the most solid ground (or much solid ground at all), but shouldn’t we be drawing these conclusions on the basis of phoneme diversity in the oldest reliably reconstructed languages?

      1. I would have thought so, and I would also have thought that the oldest reconstructible ancient languages are not nearly old enough to be of the slightest use, and most of them seem to be Indo-European. We’re stuck after about 6 000 years, or 10 000 with the Uralic languages (as far as I can tell, but I’m no linguist and I’m going on general reading here) and anything we can be reasonably sure about depends almost entirely on writing, which obviously doesn’t work before the early cities.

        I was also wondering about variations in the number of phonemes between very closely related languages of recent and well known origin, like Italian and French. Italian has a similar number of phonemes to its parent, Latin, but it has simplified consonant groups and nearly halved the number of vowels. French, on the other hand, has more vowels (including some very odd ones by Italian standards) and more consonants, including a native /w/. I can’t believe that this has anything to do with the numbers of speakers of either language, but I am prepared to bet that the influences of Gallic and Germanic are important in French phonology (and this still only takes us back to about 1 AD).

        1. There are a lot of factors contributing to sound change, like nearby linguistic diversity and resultant language contact, as well as a couple of universals grounded in the physiology of speech production and perception. At any rate, language changes way too fast for it to be clear from sound changes alone that Africa is the center of diversity.

        2. Doesn’t sound-change have a social side, too? English has greatly augmented its vocabulary as a result of the most recent invasions (Nordic and Norman) and the continuing influence throughout the Middle Ages of scholastic Latin and courtly French. English survived as a language because it borrowed the foreign words, rather than simply letting the damn things take over completely. The result has been a loss of many native English words, but a great enrichment of the sounds of English, as well as of its general vocabulary. This kind of thing must have happened elsewhere, and to me it suggests that there is a social dimension to phonemic diversity. Only think how people define themselves by the way in which they pronounce their words.

  13. “the Polynesians less than a dozen [phonemes] (hence the long, polysyllabic names in Hawaii and New Zealand).”

    New Zealand Maori has 20 phonemes since the short and long vowels are semantically distinct (15 if you say they are not). The 40 of English are largely redundant because when unstressed the vowels converge on schwa, the “uh” of “about”. (Y.. c.n m.k. y..rs.lf p.rf.ctl. w.ll .nd.rst..d in .ngl.sh .s.ng n. v.w.ls .t .ll!)

    NZ Maori is far more economical of phonemes than English, virtually all CV pairs and CVV triads being meaningful and in use, so there is no semantic need for words to be long. The long names arise because reduplication is a regular and meaningful feature of the languages.

    And are Paekākāriki or Paraparāumu so much longer than Nottinghamshire or Worcestershire?

  14. I hope this paper isn’t claiming that language was in any sense invented, then spread among human populations. Language is an evolved capability, which we are all born with. In other words, language spread automatically as modern humans did.

    It’s also a fact that Homo erectus emigrated from Africa more than a million years before modern humans, so any theory about language driving migration would have to posit that language is that old as well, which I don’t find incredible at all.

  15. “the “a” sound in “about” and in “bad” are two distinct phonemes”
    Not in my accent they aren’t! (Northern English)

  16. Very late to party, but the Atkinson Science link takes me to a Dunn Nature paper where I note that Finnish is conspicuously absent.

  18. Interesting result.

    Also interesting to note that critics try to tear a successful predictive theory down instead of proposing an alternative that explains the same phenomena. The theory tests well, and what are the chances that is an accident? The theory also coincides with the similar gene theory, and what are the chances for _that_?

    What is needed is an alternative theory, and linguistics doesn’t seem to be it, for many reasons that commenters propose. I see that some comments claim that linguistics reveals language relations. (Whether it predicts them or just describes them, I don’t know.)

    But this isn’t a lineage study but a diversity study; it isn’t descriptive but testable. Maybe that explains the differences.

  19. From one of the links:

    “It is important to nip this idea in the bud before it spreads:”

    Really? Methinks it would be a stimulating topic, and “nipping” against the spirit of science. If it is wrong, time will tell.

    IMHO the weakness with the study, without having read it, is the claim from several places that the effect size differs from genetic studies. But then again, there is no strict inheritance, maybe that explains it.
