Human Phylogeography: The lessons learned, 1

June 4, 2019 • 9:10 am

by Greg Mayer

UPDATE. A couple of readers have drawn attention to the website, gcbias, of Graham Coop, a population geneticist at UC Davis. He has excellent discussions, with nice graphics, of issues in genetic genealogy, including calculation of the number of “genetic units” in particular generations. As an example, 8 generations back you have 256 ancestors, but only 286 genetic units produced by recombination, so although, on average, you will have a chunk from each of those 256, it is entirely plausible to have zero (since inheritance is stochastic). It’s well worth browsing, and this and this are good places to start. (Thanks to rich lawler and S. Joshua Swamidass for the pointers.)


In February, I posted the syllabus for a seminar class entitled “Human Phylogeography” that I was teaching with my colleague Dave Rogers. The seminar was based primarily on a close reading of David Reich’s (2018) Who We Are and How We Got Here (published by OUP in the UK). Well, the class has concluded now, and so I thought I’d report back on what happened.

First, I’d like to say that the class was a success. We had 16 students, double the most I’ve ever had in a number of similar seminar courses over the years, and the students were very successful in engaging with the subject in both written and oral contributions to the class. One of the students was a history major, and towards the end of the semester a colleague in computer science mentioned that, quite coincidentally, he was reading the book, so he joined the class for the last few meetings. In many ways, it was what college is supposed to be like (though too often isn’t). I hope the students learned a lot. I did, and here is the first of the three most striking things I learned.

1. Recombination is a lot rarer than you think.

If you think back to the last time you studied genetics, you’ll recall the phenomenon of recombination, one aspect of which is crossing over. Crossing over occurs during meiosis. Chromosomes come in homologous pairs (23 pairs in humans, for 46 total), and in meiosis the homologues can exchange pieces with one another. The chromosomes physically touch and cross one another, which is observable under the microscope, and the points where they cross are called, appropriately enough, chiasmata (chiasma, sing.).

From BioNinja, https://ib.bioninja.com.au/standard-level/topic-3-genetics/33-meiosis/crossing-over.html

Recombination is important for a variety of reasons (for one, it increases genetic variability), but for our current purposes its importance is that it breaks up the nuclear genome from 23 genetic units into more, and smaller, units (as opposed to the mitochondrial genome, which has a number of genes, but all are inherited as a single genetic unit, since there is no recombination in mitochondria). In humans, it turns out, there are only 1-2 crossovers per chromosome per generation (1.2 per chromosome in fathers, 1.8 in mothers).

Now, I’d always thought that crossing over occurred frequently enough that we could think of the genome as essentially infinitely divisible. (There are 3 billion base pairs in the human genome, so, in the limit, there would be 3 billion genetic units, so not quite infinite!) But, it turns out that crossovers occur sufficiently infrequently that there is an appreciable chance that, if you go enough generations back, you share NO genes with your ancestor. This is because the number of ancestors goes up fast (2, 4, 8, 16, 32, 64, 128, 256, etc.), but the breaking up of the genome into smaller units by crossing over isn’t fast enough to ensure that the probability of sharing nothing is near zero.
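To get a feel for how quickly the chance of sharing nothing grows, here is a minimal simulation sketch in Python. It is not Reich's or Coop's calculation, just a toy model under stated assumptions: 23 chromosomes of equal length, crossovers placed uniformly along each chromosome at the per-chromosome rates quoted above (1.2 in fathers, 1.8 in mothers), the sex of the transmitting parent chosen at random each generation, and no inbreeding. The function names `transmit` and `prob_no_dna` are made up for the illustration.

```python
import random

CHROMOSOMES = 23                 # chromosome pairs, as in the text
CROSSOVER_RATES = (1.2, 1.8)     # per chromosome per meiosis: fathers, mothers (from the text)


def transmit(segments, rate, rng):
    """Pass one chromosome through one meiosis.

    `segments` lists (start, end) stretches of ancestor-derived DNA on a
    chromosome of length 1. Returns the stretches that survive in the gamete.
    """
    if not segments:
        return []
    # Crossover positions: a Poisson process with `rate` events expected on [0, 1].
    cuts = []
    pos = rng.expovariate(rate)
    while pos < 1.0:
        cuts.append(pos)
        pos += rng.expovariate(rate)
    bounds = [0.0] + cuts + [1.0]
    # The gamete starts on one of the two homologues at random and switches at each cut.
    on_ancestor_copy = rng.random() < 0.5
    kept = []
    for lo, hi in zip(bounds, bounds[1:]):
        if on_ancestor_copy:
            for start, end in segments:
                a, b = max(start, lo), min(end, hi)
                if a < b:
                    kept.append((a, b))
        on_ancestor_copy = not on_ancestor_copy
    return kept


def prob_no_dna(generations, trials=500, seed=1):
    """Estimate the chance of carrying no DNA from one ancestor `generations` back."""
    rng = random.Random(seed)
    empty = 0
    for _ in range(trials):
        # The ancestor's child carries one complete ancestor-derived copy of each chromosome.
        genome = [[(0.0, 1.0)] for _ in range(CHROMOSOMES)]
        for _ in range(generations - 1):
            rate = rng.choice(CROSSOVER_RATES)   # sex of the transmitting parent (assumed random)
            genome = [transmit(chrom, rate, rng) for chrom in genome]
        if not any(genome):                      # every chromosome has lost the ancestor's DNA
            empty += 1
    return empty / trials


for g in (4, 6, 8, 10, 12):
    print(g, "generations back:", prob_no_dna(g))
```

Under these assumptions the estimate stays at zero for the first several generations and then climbs steeply. The exact values depend on the simplifications (real chromosomes differ in length and crossovers are not uniformly placed), so treat the output as qualitative.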

Here’s a figure from Reich’s book showing how blocks of genes are broken up by recombination.

From Reich, 2018.

You start with an entirely Neanderthal chromosome (dark), which enters the anatomically modern human population by hybridization. A few generations later, the Neanderthal chromosome has been broken up, but it still occurs as largish blocks amongst the anatomically modern sections (gray). Still later, the blocks are smaller and fewer. (We’re assuming continued backcrossing into the anatomically modern population, so the % Neanderthal decreases; there could also be selection causing changes in the frequency of Neanderthal alleles.) Finally, a present-day individual has his Neanderthal DNA broken up into even smaller bits.

Here’s a figure from a talk by Svante Pääbo, showing in the top row for each chromosome (there are 22 listed, from 1-22) the entire genome of “Oase Boy” from 40K years ago in Romania. The green lines are Neanderthal sites in his genome. The five rows below Oase Boy are five modern human individuals; the colored lines are their “Neanderthal bits”. Note that for each chromosome, Oase Boy has the biggest block of Neanderthal genes (green fluorescence):

From Pääbo, 2018.

Because of the age of the Oase sample, some of the black lines are missing data, and so Pääbo infers that there are seven large continuous blocks of Neanderthal genes (yellow bars above the Oase Boy line). Note that the modern individuals have less Neanderthal DNA, and it is not in large blocks.

Because the blocks break up in a statistically predictable fashion, you can get a “recombination clock”, so that based on the size of the blocks you can estimate how many generations ago the hybridization occurred. For Oase Boy, Pääbo estimated that his Neanderthal ancestor lived 4-6 generations back (his great great, or great great great, or great great great great grandfather).
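The logic of the clock can be sketched in a few lines of Python. This uses a common first-order approximation (not necessarily the method Pääbo used): g generations after admixture, the introgressed blocks have a mean length of roughly 1/g Morgans, i.e. 100/g centimorgans, so g is about 100 divided by the mean block length in cM. The function name and the block lengths below are hypothetical, invented purely for illustration. Real analyses also have to deal with missing data, the fraction of admixed ancestry, and blocks too short to detect.

```python
from statistics import mean


def generations_since_admixture(block_lengths_cm):
    """Rough recombination-clock estimate from block lengths in centimorgans.

    Assumes a single pulse of admixture and a mean block length of about
    1/g Morgans after g generations (a first-order approximation).
    """
    if not block_lengths_cm:
        raise ValueError("need at least one block length")
    mean_length_morgans = mean(block_lengths_cm) / 100.0   # 100 cM = 1 Morgan
    return 1.0 / mean_length_morgans


# Hypothetical block lengths in cM (NOT real data from any genome).
long_blocks = [30.0, 22.0, 18.0, 25.0, 15.0, 10.0, 20.0]   # few, large blocks
short_blocks = [0.08, 0.05, 0.03, 0.06, 0.04]              # many generations of breakup

print("long blocks: ", round(generations_since_admixture(long_blocks), 1), "generations")
print("short blocks:", round(generations_since_admixture(short_blocks)), "generations")
```

Large blocks imply a handful of generations; tiny blocks imply well over a thousand. That contrast is the whole idea behind the clock.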

From Pääbo, 2018, showing Oase’s Neanderthal ancestor (red) in the 5th generation (it could also be in the 4th or 6th).

Because the placement and frequency of crossing over is stochastic (random), the situation must be statistically modeled to derive sound estimates, and there will be a range of plausible estimates. And, since some of the fossils are well dated by other means, we can also estimate the long term human generation time, as was done by Priya Moorjani and her colleagues: it’s 26-30 years.
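The arithmetic behind the generation-time estimate can be sketched as follows. This is a simplified illustration, not the published analysis (which, as I understand it, fits many samples statistically): if the recombination clock says the same hybridization event lies a certain number of generations before a fossil and a larger number before people living today, the difference is the number of generations that have elapsed since the fossil lived, and dividing the fossil's independently determined age by that difference gives years per generation. The function name and all input numbers are hypothetical, chosen only so that the answer falls in the reported range.

```python
def generation_interval(fossil_age_years, gens_before_fossil, gens_before_present):
    """Years per generation from one dated fossil and two recombination-clock dates.

    gens_before_fossil:  generations between the admixture event and the fossil
    gens_before_present: generations between the same event and living people
    """
    elapsed_generations = gens_before_present - gens_before_fossil
    if elapsed_generations <= 0:
        raise ValueError("the present-day estimate must exceed the fossil estimate")
    return fossil_age_years / elapsed_generations


# Hypothetical inputs, for illustration only (not values from Moorjani et al.).
print(generation_interval(fossil_age_years=45_000,
                          gens_before_fossil=300,
                          gens_before_present=1_900))
```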

So, the low rate of recombination allows us to construct a “recombination clock”, and to estimate generation times. This is great stuff!

It also solved for me what was a puzzle. You may recall that last year Elizabeth Warren released the results of DNA tests showing that she had American Indian ancestry several generations back. This essentially confirmed what her family’s oral history said. The amount of her Indian ancestry was small (less than 1%), and a range of generations (6-10) was provided by the analysis (as was done by Pääbo for Oase Boy).
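For reference, the expected share of the genome from a single ancestor halves with each generation, so it is 1/2^g for an ancestor g generations back. A short Python sketch tabulates this for the 6-10 generation range mentioned above; the function name is just for illustration. These are expectations only: because inheritance is stochastic, the amount actually inherited scatters around these values, and can be zero.

```python
from fractions import Fraction


def expected_fraction(generations):
    """Expected share of the genome from one ancestor `generations` back."""
    return Fraction(1, 2 ** generations)


for g in range(6, 11):
    f = expected_fraction(g)
    print(f"{g} generations back: 1/{f.denominator} = {float(f):.3%}")
```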

Now, there are a number of ways in which these ancestry tests can be criticized, one of the most difficult for them being that there are very few North American Indian genotypes in the database used, and thus “American Indian” relationship is indicated by relationship to Central and South American Indians. Some critics of Warren, however, made erroneous criticisms. She did not contend, as some accused her of doing, that the results showed she was Cherokee; with few if any Cherokee in the database, the ancestry tests could not determine this. (And tribal membership is a legal matter, anyway, not directly dependent on genetic similarity.)

But some critics said that the data were consistent with her having no Indian ancestry at all. I wondered how they could say that: there are 3 billion bp, and 1% of that is still a very large number. But now I realize my error. There are far fewer genetic units (more than 23, but a lot less than 3 billion!) due to low rates of recombination. And, because of this, if you go back several generations, there is an appreciable probability of sharing no DNA with an indubitable ancestor. I now believe the critics must have looked at the latter fact, and realized Warren may not have DNA from all of her ancestors, and thus suggested she may have no Indian ancestry. But their error is that in saying she may lack DNA from an ancestor, say, 8 generations back, they are invoking an a priori probability. In Warren’s case, her DNA was actually examined, and it showed that she did have Indian ancestry.


Gravel, S. 2012. Population genetic models of local ancestry. Genetics 191:607-619.

Ho, S. Y., Chen, A. X., Lins, L. S., Duchêne, D. A., & Lo, N. 2016. The genome as an evolutionary timepiece. Genome Biology and Evolution 8:3006-3010.

Huff, C.D. et mult. 2011. Maximum-likelihood estimation of recent shared ancestry (ERSA). Genome Research 21:768-774.

Moorjani P, Sankararaman S, Fu Q, Przeworski M, Patterson N, Reich D. 2016. A genetic method for dating ancient genomes provides a direct estimate of human generation interval in the last 45,000 years. Proceedings of the National Academy of Sciences USA 113:5652-5657.

Pääbo, Svante. 3 October 2018. A Neanderthal Perspective on Human Origins. (video: embedded below)

Reich, D. 2018. Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past. Pantheon, New York.

Human Phylogeography

February 23, 2019 • 11:33 am

by Greg Mayer

For the spring semester, my colleague Dave Rogers and I are teaching a seminar class entitled “Human Phylogeography.” Phylogeography is the study of the history of the genetic variation, and of genetic lineages, within a species (or closely related group of species), and in the seminar we are looking at the phylogeography of human populations. DNA sequencing now allows fine-scale mapping of the distribution of genetic variation within and among populations, and, remarkably, it is now possible to sequence ancient DNA from fossil remains (including Neanderthals). The seminar is based primarily on a close reading of David Reich’s (2018) Who We Are and How We Got Here (published by OUP in the UK).

A Krapina, Croatia, Neanderthal woman, photo by Jerry.

Although rarely under that rubric, human phylogeography has been a frequent topic of discussion here at WEIT, by Jerry, Matthew, and myself, including our several discussions of Neanderthals (or Neandertals) and Denisovans. So it may be of interest for WEIT readers to follow along. Below the fold I’ve placed the course syllabus, which includes the readings, and links to many newspaper articles of interest, and online postings, including many here at WEIT, and also from John Hawks Weblog, a site we’ve recommended on a number of occasions when discussing human evolution. (The newspaper links appear as images; just click to go to the story.) We just finished our third meeting, and I’ve been quite impressed by the students’ discussion and writing. We’re fortunate to have some students from anthropology or with some anthro background.

Please read along with us, or browse what seems interesting below. If you have questions or comments, post them here, and I’ll be looking in.


The peopling of the Americas

July 12, 2012 • 9:55 am

by Greg Mayer

The Americas were the last continents to be inhabited, and there has long been controversy about how and when this occurred. There is a general consensus that the earliest Americans arrived from northeastern Asia in the late Quaternary, but the exact peoples involved, the routes taken, when they arrived, and the modes of travel are all much debated. A paper by David Reich and colleagues, in press in Nature, presents evidence on one aspect of the question: did the first inhabitants arrive in one wave of migration, or in several? It has always seemed probable that the Eskimos, culturally and linguistically distinct from the American Indians to the south, and occurring on both sides of Bering Strait, represent a distinct migration, but were the more southern peoples the result of one, two, or more migrations?

Note that Na-Dene (green) and Eskimo-Aleut (red) derive in part from an Asian (black; Yoruba are African) ancestry separate from that of Amerind or First American (blue). (The Na-Dene and Eskimo-Aleut are not a single arrival from Asia; the Han Chinese are too genetically distant from east Siberian peoples to capture the ancestral source in this comparison.)
D Reich et al. Nature, in press, doi:10.1038/nature11258

Here’s the money quote from Reich et al.’s abstract:

[W]e assembled data from 52 Native American and 17 Siberian groups genotyped at 364,470 single nucleotide polymorphisms. Here we show that Native Americans descend from at least three streams of Asian gene flow. Most descend entirely from a single ancestral population that we call ‘First American’. However, speakers of Eskimo–Aleut languages from the Arctic inherit almost half their ancestry from a second stream of Asian gene flow, and the Na-Dene-speaking Chipewyan from Canada inherit roughly one-tenth of their ancestry from a third stream.

The three migrations thus were by 1) a group the authors call First American, that gave rise to almost all of the Indians of North and South America; 2) the Na-Dene, a group also linguistically identified, that occurs in the US Southwest and a few other places in the US and Canada; and 3) the Eskimo-Aleut, who arrived most recently. These three groups had also been identified by the late linguist Joseph Greenberg (who called the first group “Amerind”).

This is actually pretty much the story as I understood it from the viewpoint of a biologist paying casual attention to the anthropological results. Media accounts (NY Times, BBC) make it sound a bit more novel and controversial than I would have thought. This could be due to my not fully grasping the state of the debates within anthropology (quite possible!), or the hyping that tends to accompany reporting of even the best scientific work.

____________________________________________________________

Reich, D. et al. 2012. Reconstructing Native American population history. Nature, in press.