Human and chimp genome comparison: apples and origins

December 7, 2025 • 10:00 am

How much genetic difference separates us from our closest relatives? The conventional wisdom about humans and our closest ape relatives (chimps and bonobos) is that we share 98% of our DNA. That’s a big similarity, and implies that if we lined up our genomes side by side, only about 2 out of 100 DNA bases would differ. This figure is often used to show that we have only a tiny genetic difference from our closest relatives. To quote W. S. Gilbert of Gilbert and Sullivan, “Darwinian man, though well-behaved, at best is only a monkey shaved.”  Well, the differences go farther than mere shaving.

The “98% similarity figure” is wrong. And it’s wrong for several reasons. First, most ape genomes (chimps, gorillas, orangs, etc.) have not been as thoroughly sequenced as was the human genome. A lot of the data that went into the 98% figure was missing.  Second, you can’t just compare genomes by lining them up and looking for differences in base pairs at similar sequences.

Why not? Because the notion of “similar sequences” is ambiguous and, sometimes, meaningless. Since we diverged from our ape ancestors, there have been a lot of changes in every species’ DNA that prohibit us from simply “lining up the genomes”.  Transposable elements have invaded some species but not others, bits of the DNA have been duplicated, so there are species that have sequences that are not homologous. Bits of the genome have been inverted (turned around and reinserted), causing big differences in sequence in previously similar sequences. Further, pieces of the DNA have been moved from one chromosome to another, so DNA sequences previously in the same place are now in another place, leading to a difference in total sequence.

All this leads to a substantially greater DNA divergence between humans and chimps than the 98% figure implies.  These extra genomic differences were sussed out by Yoo et al. in a Nature paper from April of last year that you can read by clicking below (or find the pdf here). They did a much improved job in sequencing six of our ape relatives: the chimp (Pan troglodytes), bonobo (Pan paniscus), Western gorilla (Gorilla gorilla), Bornean orangutan (Pongo pygmaeus), Sumatran orangutan (Pongo abelii), and the siamang (Symphalangus syndactylus), an endangered species of gibbon from SE Asia.

First, the authors give a revised set of divergence times based on DNA differences between living species.  The human vs. chimp/bonobo species, for example, split from their common ancestor about 5.5-6.3 million years ago (mya), roughly in line with previous estimates. The divergence between humans and other African apes (gorillas) occurred between 10.6 and 10.9 mya, and that between humans and orangutans about 18.2-19.6 mya.

There is a ton of genomic information in the paper, including a lessening of the similarity between humans and chimps, but also specific information about what genes and regulatory bits of DNA differ among species. These differences suggest some intriguing future research. I’ll mention just a couple, but will refer you instead to a long tweet below which shows why the human-chimp differences have increased. It’s an excellent tweet that you can read pretty quickly, though it doesn’t detail all the many differences that the researchers describe in the Nature paper, which is exhausting for those outside the field. There are also genes whose sequences changed very rapidly, suggesting that they were acted on by natural selection.

There are a gazillion sequence and structural differences revealed among the species, including 229 bits of ape DNA (all species) that have evolved rapidly and are thus candidates for natural selection. The paper also reveals parts of the DNA that have evolved especially rapidly in the human lineage since we split from chimps/bonobos. These regions are called HAQERs, and could be candidates for the Holy Grail of such work: seeing “what makes us human”. But that question is a bit misguided.

Nevertheless, the authors found one gene, ADCYAP1, that “is differentially regulated in speech circuits.” The implication is that the changes may have something to do with why humans are the only ape with syntactic spoken language, but that gene does a lot of other stuff, too, so I don’t take that implication seriously. The FOXP2 gene, which evolved rapidly in the modern human genome relative to other species, has mutations that impede people’s ability to speak, and I well remember when it was touted as “the language gene” that enabled humans to speak. But further research showed that the accelerated human evolution of the gene was an artifact, and that the normal function of the gene is manifold, so nobody these days takes FOXP2 seriously as the “speech gene”. All claims should be regarded as caveat emptor.

There are also several genes that are not only unique to humans, but are “associated with human evolution of the frontal cortex”, suggesting these account for our big brains. The photo below comes from the tweet shown next, and its caption comes from that tweet. (The average chimp brain is about 400 g in mass—less than a third the mass of the human brain, which weighs in at 1300-1400 g in adults.)  Again, caveat emptor with regard to the two specified genes.

Figure 3. Radiograph illustrating cranial expansion in the human lineage, which is associated with increased neocortical growth – Chimpanzee skull (left), Modern Human skull (right).

Other genes that differ strongly among ape species include those producing immunoglobulins, major histocompatibility complex (MHC) products, and T-cell receptors, but especially the immunoglobulin genes, which are involved in the production of antibodies. Why have these evolved so rapidly within apes? Your guess is as good as mine, but it suggests that reaction to antigens was an important element of ape evolution.

Here is the authors’ summary, and most of the paper will be of interest only to geneticists familiar with the argot (not necessarily me):

The complete sequencing of the ape genomes analysed in this study significantly refines previous analyses and provides a valuable resource for all future evolutionary comparisons. These include an improved and more nuanced understanding of species divergence, human-specific ancestral alleles, incomplete lineage sorting, gene annotation, repeat content, divergent regulatory DNA and complex genic regions as well as species-specific epigenetic differences involving methylation. These preliminary analyses revealed hundreds of new candidate genes and regions to account for phenotypic differences among the apes. For example, we observed an excess of HAQERs corresponding to bivalent promoters thought to contain gene-regulatory elements that exhibit precise spatiotemporal activity patterns in the context of development and environmental response. Bivalent chromatin-state enrichments have not yet been observed in fast-evolving regions from other great apes, which may reflect limited cross-species transferability of epigenomic annotations from humans. The finding of a HAQER-enriched gene, ADCYAP1, that is differentially regulated in speech circuits and methylated in the layer 5 projection neurons that make the more specialized direct projections to brainstem motor neurons in humans shows the promise of T2T genomes to identify hard to sequence regions important for complex traits. Perhaps most notably, we provide an evolutionary framework for understanding the about 10–15% of highly divergent, previously inaccessible regions of ape genomes. In this regard, we highlight a few noteworthy findings.

The importance of the paper for now seems to be the presentation of the sequences and their differences rather than explaining the differences or their significance in ape adaptations—especially in humans—for studying adaptive hypotheses involves a lot of work for each single region that differs among species or evolved quickly. Nevertheless, useful questions have been raised—like why genes involved in the immune response changed so rapidly—that will be subject to future work.

I am not sure who runs the Origins Unveiled site dealing with evolutionary anthropology, but based on the clarity of the tweet below from that site (click on screenshot to see the tweet in situ), it deserves more followers. It’s only about a year old, which may explain the follower issue.

This tweet from September of this year explains why the 98% similarity between humans and chimps drops to 84.7% when you take translocations, inversions, duplications, insertions, and other genomic rearrangements into account. And these rearrangements are not necessarily trivial, for duplications can lead to divergent gene families, and insertions can act to regulate genes in a new way.

Again, click below and read; it’s short and lucid:

I’ve shown one figure from the tweet above: the brain differences. Below is another figure showing how the 99% similarity between humans and chimps has traditionally been calculated, requiring alignment of nearly identical but perhaps slightly different bits of DNA. All captions come from the tweet. This figure shows how they line up chimp and human sequences (you see the gross similarity), but also that here there’s been a single nucleotide substitution in one of the two lineages, rendering this sequence 92.3% similar. (This is a made-up sequence for purposes of illustration.)  When you did that with the whole genome comparison based on earlier data, you got about a 2% difference. The problem, as I said, is that we didn’t have great chimp (or any ape) sequences and there are parts that you simply couldn’t line up this way. And those parts, when compared among species, increase the genetic difference between us and our closest relatives.

Figure 1 — Simplified Mock Alignment Illustrating Nucleotide Sequence Similarity Between Chimpanzee and Human Genomes. Out of 13 positions, one substitution (single-nucleotide variant, circled in red) results in ~92.3% DNA similarity. This example demonstrates the methodology behind the misleading 98–99% human-chimpanzee DNA similarity figures.
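For readers who like to see the arithmetic, here is a minimal sketch of the position-by-position comparison that Figure 1 describes. The two 13-base sequences are hypothetical (they are not the ones in the figure), and the function name `percent_identity` is just an illustrative choice; real genome comparisons first have to construct the alignment, which this sketch takes as given.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two already-aligned,
    equal-length sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 13-base sequences differing by one substitution.
chimp = "ACGTTGCAAGCTA"
human = "ACGTTGCTAGCTA"

print(f"{percent_identity(chimp, human):.1f}% similar")
```

With 12 of 13 positions matching, this gives the same ~92.3% as the mock alignment in the figure.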

Below is another figure showing how various rearrangements, insertions, deletions, and translocations reduce similarity, but I’ll show only four of the six parts of the figure, giving the captions for a-d. You can see how these changes make humans and chimps less genetically similar than previously thought (again, captions come from the tweet; click to enlarge).  These are also “mock alignments” meant for purposes of illustration, but they do show the kind of thing seen in the Yoo et al. paper:

Figure 2 — Simplified Mock Alignments Illustrating Structural Variation Between Chimpanzee and Human Genomes. Note: Structural variants are not taken into account when calculating the 98–99% Chimpanzee-Human DNA similarity figures.
(a) Insertions and deletions contributing to sequence divergence. Out of 34 positions, 3 indels (insertions circled in orange; deletions in yellow) result in ~91.2% DNA similarity. Note: These indels are relative, as without a suitable outgroup (i.e. gorilla), an insertion in one genome appears as a deletion in the other.
(b) Duplication contributing to sequence divergence. Out of 34 positions, a duplication of 12 bases (duplicated segment encircled in blue; original in purple) results in ~64.7% DNA similarity.
(c) Inversion contributing to sequence divergence. Out of 34 positions, an inversion of 11 bases (encircled in green) results in ~67.6% DNA similarity. Note: Although bases may match within the inverted region, they do not contribute to sequence similarity due to misalignment. Without a suitable outgroup (i.e. gorilla), it is unknown whether the inversion occurred on the chimpanzee or human genome.
(d) Translocation contributing to sequence divergence. Out of 34 positions, a translocation of 20 bases (encircled in brown) results in ~41.2% DNA similarity. Note: A translocation is a DNA segment that has been “copy and pasted” or “cut and pasted” from another part of the genome.
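The percentages in panels a-d all follow from the same simple rule: every position touched by a structural change counts as a non-match. The sketch below reproduces that arithmetic from the numbers given in the captions. It assumes, as the mock alignments do, that every affected position is scored as a difference; the function name is hypothetical.

```python
def similarity_after_variant(total_positions: int, affected_positions: int) -> float:
    """Percent similarity when every position touched by a
    structural variant is scored as a mismatch."""
    if total_positions <= 0:
        raise ValueError("total_positions must be positive")
    if not 0 <= affected_positions <= total_positions:
        raise ValueError("affected_positions must lie between 0 and total_positions")
    return 100.0 * (total_positions - affected_positions) / total_positions


# Affected positions per panel, taken from the Figure 2 captions (34 positions each).
PANELS = {
    "a, indels": 3,
    "b, duplication": 12,
    "c, inversion": 11,
    "d, translocation": 20,
}

for label, affected in PANELS.items():
    print(f"({label}) ~{similarity_after_variant(34, affected):.1f}% similar")
```

The point of the exercise is that a single event, such as one translocation, can change the percentage far more than a single base substitution, simply because it spans many positions.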

So, when you hear that we’re nearly genetically identical to our closest relatives, just say, “Wait a tick. Not all that identical.” We have about 15% difference in sequence, which is not trivial.

UPDATE: I’m aware now that creationists and IDers have been using this 85% to cast doubt on human evolution, our place in the ape family tree, and whether evolutionists are honest.  This is bogus: the 85% vs. 98% depends on two different methods of calculating similarity. Whichever method you choose (alignment vs. total genomic similarity), the same family tree of the great apes appears, with chimps/bonobos our closest ancestors, then gorillas a bit more distance, and then orangutans, and then other apes.  The point of this post is not to cast doubt on human or ape evolution, but to show different ways of calculating genetic similarity.
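To make the "two methods" point concrete, here is a rough sketch that scores one toy alignment both ways. The definitions are simplifying assumptions for illustration, not the exact metrics used by Yoo et al.: "substitution divergence" counts mismatches only among columns where both sequences have a base, while "gap divergence" counts columns where one sequence has a base and the other has a gap (written `-`). The sequences and the function name are made up.

```python
def substitution_and_gap_divergence(aln_a: str, aln_b: str, gap: str = "-"):
    """Return (substitution_divergence, gap_divergence) in percent
    for two rows of a pairwise alignment of equal length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have the same length")
    aligned = mismatched = gapped = 0
    for a, b in zip(aln_a, aln_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        if a == gap or b == gap:
            gapped += 1  # base present in one genome, absent in the other
        else:
            aligned += 1
            mismatched += a != b
    if aligned == 0:
        raise ValueError("no columns where both sequences have a base")
    return 100.0 * mismatched / aligned, 100.0 * gapped / (aligned + gapped)


# Toy alignment: one substitution plus a 4-base stretch missing from one row.
row_a = "ACGTACGTACGTACGTACGT"
row_b = "ACGTACGAACGT----ACGT"

subs, gaps = substitution_and_gap_divergence(row_a, row_b)
print(f"substitution divergence: {subs:.2f}%")
print(f"gap divergence:          {gaps:.2f}%")
```

The same pair of sequences yields a small number by one measure and a much larger one by the other, which is why quoting either figure without saying how it was calculated is misleading.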

27 thoughts on “Human and chimp genome comparison: apples and origins”

  1. This is a really excellent piece. I’m pretty sure that I knew that there were various inversions and translocations between (other) apes and humans, but I never really thought about how these differences might manifest in the calculation. So, I often used the 98-99% number in conversation. Now I have no choice but to locate everyone I said that to and correct the record. It will take me the rest of my life.

    It’s great to see this work. Thank you for highlighting it.

  2. Fantastic article Jerry, thank you for putting this together. Definitely going to read several times to fully absorb this information, as I will be unable to keep quiet the next time someone says “we share 98% of our DNA with chimps” and will want to have all of my ducks in a row (by the way, what is our shared DNA % with a common duck)?

  3. Thank you for the memory you triggered. My dad was a big fan of Stanley Holloway and used to recite some of his monologues. One birthday, I bought him a book of the monologues and wrote a quote on the gift tag. I found out years later that he had kept the gift tag too.

    The quote I used was from George and the Dragon by R P Weston and Bert Lee. I kept the book as a memento after he died.

    Here it is:

    “Some folks’ll boast about their family trees,
    And there’s some trees they ought to lop;
    But our family tree, believe me, goes right back,
    You can see monkeys sitting on top!”

    1. Correction, it was from St George and the Dragon. Didn’t realise he had two similarly titled monologues. My dad’s name was George, which is why I thought it was a good quote. Here’s the second verse.

      To give you some idea of our family tree,
      And don’t think I’m boastin’ nor braggin’,
      My great, great, great, great, great, great, great Uncle George,
      Wor the Saint George who slaughtered the Dragon.

  4. I have a question about the way indels and inversions or duplications are weighed. Surely, if there is for example a 10 bp insertion in one line that is not present in the other line, that counts as one difference, as a singular event, and is not weighed as 10 differences, right?

  5. No; look at the figure in the second piece. Insofar as a duplication counts as many nucleotides not present in the other species, it counts towards percentage similarity. The measurement is nucleotide similarity, not “event” similarity. And one could justify this as contributing to overall similarity, though I can see what you’re driving at.

  6. Immunoglobulin genes evolve rapidly through gene duplication, leading to species specific expansions. It is not just a primate thing.
    Humans have about 54 variable heavy chain genes while macaques have about 90. Mice have about 130!
    Relatively speaking, the chimp’s immunoglobulin genes look quite similar (in terms of comparing these types of genes from different species) to humans’. It is only when you look at percentage differences in the nucleotide sequence that they appear to be enormously different compared to other genes.

  7. Another demonstration of why this is such an admirable and invaluable website. The educated public is probably vaguely cognizant of SNPs, but not of indels and rearrangements. Many, many thanks, PCC(e).

  8. What timing. I just finished reading chapter 8 of WEIT this morning where it discusses the similarity between humans and other apes. I avoided reading it for so long because I didn’t need convincing but finally picked it up because I figured it would have many interesting examples, and it does.

    1. Probably the bananas figure is either amino acid similarity in proteins, or something like “50-60% of human genes have ‘the same’ gene/protein in bananas, by which we don’t mean exactly identical, but a homologous protein”. E.g. almost all eukaryotes share a large number of proteins for basic cellular functions, DNA replication, mitochondrial energy production, etc. etc.

      Re: overall – unfortunately, the creationists / ID creationists have used headlines like “Busting the 98% myth” many times over the decades. But (from memory) the response back then is still reasonable now, i.e. that figure was developed for either coding DNA or alignable DNA between genomes (I forget which). That at least is a simple measure which can be done with a reasonably small sample of genes / sequence fragments.

      The question of overall similarity if you sequence absolutely everything and then only count similarity when there is a 1-to-1 relationship between sequences (thus all duplications contribute to the percent divergence) is, I suppose, somewhat interesting to calculate as an exercise, but it doesn’t really change very much. Whatever measure of genomic similarity you use, we would predict that you’d get the same average relationship, i.e. humans and chimps are closest, gorillas next, orangs next after that, etc.

      It should even work within humans. The common statement is that individual humans are ~99.9% identical at a nucleotide level, but once they started sequencing duplications etc. they got a much larger difference. Probably most of these duplications/insertions/deletions are in the junk DNA (the human genome is 1.5% coding DNA, 50%+ various forms of repetitive and nonrepetitive junk and fossil viruses, ~5% regulatory regions), so the “who cares” question is pertinent.

      PS: The dating they got for the human-gorilla divergence, 10+ Ma, seems very old; from memory, dates like that are typically gotten by extrapolating from the human mutation rate, but this ignores that the mutation rate may have slowed down in our lineage. Typically the gorilla vs. human/chimp divergence comes out as just e.g. 7 Ma when chimps are 6 Ma. Or it might just be weird, I don’t see much about the dating analysis in the Supplemental Material, nothing about fossil calibrations, nothing about uncertainty measures for the dates, etc. So I think maybe the TRAILS program they used was aimed more at modeling population size etc and maybe the dating was a secondary product. There are also complex questions of locus divergence times vs. species divergence times, issues of whether or not there was later hybridization, etc.

      PS there’s gotta be a measure of similarity where a duplication of 100 base pairs doesn’t count the same as the insertion of 100 totally novel nucleotides. E.g. imagine dividing the genome into millions of 100-base-pair blocks, then for each, searching the chimp genome and recording the closest percentage match. Compressibility? Gzip each genome individually and then together?

      1. Yes, I should emphasize that similarity of protein nature (i.e. sequence similarity in homologous regions) is not the same thing as DNA similarity across the whole genome: it is apples and oranges. The important thing is to realize that no matter how one calculates similarity, the phylogeny of great apes does not change, because alterations of genetic distance are proportional among all species. I’ve also found articles saying that this difference in how you calculate similarities has been known for some time. I don’t know that literature, but I’ll take the word of the authors. However, the data should not be used to cast doubt on whether humans evolved from ape ancestors, and the phylogenetic relationship between living apes has not changed. And “degree of similarity” depends on what measure you use to calculate it.

        1. Problem is, the creationists will grab this ball and run with it. Biologists knew about the difference between aligned and non-aligned DNA comparisons early on. It’s not news that the different methods yield different results, but the anti-science crowd will claim this validates them.

  9. A good something-to-think-about post and I expect there will be thoughtful contributions in comments. From me, just — thanks!

  10. “[W]ith chimps/bonobos our closest ancestors, then gorillas a bit more distance, and then orangutans, and then other apes.”

    They’re not our closest ancestors; they’re our closest living relatives.

  11. I see you’ve already added a caveat but I think it’s worth elaborating on a bit more. I commented on this on this creationist article (https://crev.info/2025/06/human-and-chimp-similarity-reduced/) and I will copy that here.

    The study used two different metrics of divergence, one with single-nucleotide variants and one with nucleotide gaps. The former method gives the usual 1-2% divergence estimate between humans and chimps and the latter gives the 14-16% estimate. So as Google AI said, different methods give different results and it’s not just that one method is “objectively” better but arguably a semantic issue over the term “difference”. The authors also caution that the gap estimate doesn’t just represent real mutations but also “missing data, or technical problems (e.g., alignment failure due to SVs, repetitive elements, etc.)”. Additionally, the gap difference within humans was estimated at about 3-4%. So with those factors in mind the 14-16% may be an overestimate if we’re interested in only true species differences. I’m sure the value is still higher than 1-2% by any reasonable definition though and I’m sure people will keep doing studies of this sort.

  12. There’s this great post on pandasthumb.org:
    https://pandasthumb.org/archives/2025/07/human_and_chimpanzee.html

    The author explains why this actually isn’t news (scientists have been aware of the different ways to measure sequence similarity for a long time) and also explains how ID-creationists are pushing a lot of misinformation about the different ways of measuring the genetic similarity between chimps and humans because they’re desperate to deny our being related.

    The post also documents how the Discovery Institute’s Casey Luskin doctored a figure to hide inconvenient information. They’ve been running a lot of damage control over on the DI website since this doctoring was discovered.

  13. Here’s a video that considers these arguments in detail, by the media influencer, Erica the “Gutsick Gibbon”:

    (One minor quibble about the video: the quote Erica uses: “Whilst Man, however well-behaved, at best is but a monkey shaved!” is not, as she suggests, attributable to Darwin. As Jerry coincidentally said above, it’s a line from a Gilbert and Sullivan opera)
    Her conclusion supports Jerry’s point that the genetic relatedness between humans and chimpanzees depends on how it is calculated, but she notes that some of the methods that show greater difference also show significant differences within our own species. Whatever method is used, she suggests, the same phylogenetic relationships are supported, as Jerry noted above.
    Ok – deep breath – here’s my spanner in the works thought: Why are we asking this question? On one hand it seems a reasonable enquiry, all we are trying to do is to understand the place of humankind in nature, accepting the risks that creationists will try to exploit it as an important measure of difference from non-human animals, but how important? It matters as an argument for common ancestry of course, and the need to construct a phylogenetic tree, but beyond that, genetic relatedness is just a number. Whatever the As, Ts, Gs, and Cs in my genome say, it has no direct influence on the things that matter to me on a day-to-day basis. I have a strong urge to continue living, I have love for my wife and my family and know the importance of friendship, I care about my status in society. I feel a need for a common identity with my perceived in-group, and a feeling that there is a case for defending when it’s threatened. I have a sense of fairness and of humour. I have the capacity for hate, empathy, shame, guilt, kindness and disgust. And even if the need for spiritualism, or some inner purpose, is not a feature that I share strongly with other people, I do, at least, recognise it as part of the human condition.
    These are things that matter to me and, I think I can confidently say to you too. But what matters to a chimpanzee? Well, as I understand it, with some qualifications and exclusions, pretty much all of the above.
    What exclusions? Chimpanzees do not pair bond, and while love between family members generally appears to be as strong or stronger than it is in humans, they don’t experience romantic love or primary partner attachments. I once asked the late primatologist, Frans de Waal, whether he thought chimpanzees experience guilt, and he said probably not, although he said they certainly experience shame. They have their own form of laughter: if you tickle a young chimp they’ll laugh, and if you try to move away, they’ll try to pull you back to carry on with the fun. Whether they have any spiritual sense is open to question; there is some anecdotal evidence that they do, but I’m already over the word limit. As Jerry said: caveat emptor. Also, their levels of empathy and compassion do not seem to be as well developed as they are in humans. It obviously goes without saying that their level of complexity of thought and communication falls way below that of humans, but if they experience love in the way that we do, then can we reasonably claim human love to be superior to theirs? If all this sounds surprising, then why wouldn’t it in a society that founds its assumptions on a distinction between human and animal which from an evolutionary perspective does not exist and never did?
