Did countries with female leaders have a better coronavirus response? Statistics say “no”.

June 3, 2020 • 12:30 pm
It seems that the coronavirus has motivated many people to use the pandemic to leverage their favorite social-justice issue. So, for example, we’ve seen many discussions that the poor and African-Americans were disproportionately affected by the virus, which is true. Another claim, not so obviously true, was the assertion, widely seen in the liberal media, that countries led by women had a better response to the pandemic (fewer infections and deaths) than did countries led by men. The implication was often that the character of women leaders was different from that of men in a way that led to a more salubrious response. (Some sardonically said this response was basically that “men and women are the same—except when women are better.”)

I wrote about this issue on May 17, and pointed out that the results given were largely anecdotal (invariably involving New Zealand and West Germany), had other explanations (e.g., countries with a more liberal social agenda could both deal with the pandemic more effectively and be more likely to choose women leaders), and that the assertion, of which I gave three instances (the New York Times, Forbes, and The Hill), needed a nonparametric statistical analysis. As I said at the time:

Is it true that countries with women leaders have done better than those with men leaders in fighting the coronavirus? That requires some kind of statistical analysis, for the analyses focus primarily on seven countries with female heads of state: Taiwan, New Zealand, Germany, Denmark, Norway, Iceland and Finland. And indeed, these countries have done better in fighting the virus than many others led by men. But there are other countries led by women as well, which are omitted from the analysis. Ideally, you’d want to do a rank order correlation between some measure of successful mitigation of the pandemic with whether the countries are led by women. Sadly, there are only 29 women-led countries in the world, and many have no data on coronavirus response.

Well, we now have a statistical analysis, done in detail and sent to me by Scott Goeppner, a doctoral student in the Department of Integrative Biology at Oklahoma State University in Stillwater. As he wrote me:

You posted an intriguing article a few weeks ago about whether female leaders have been better at handling the COVID-19 pandemic than male leaders. I have been following the COVID-19 deaths and cases on the ourworldindata website and it occurred to me that per capita measures of the number of COVID-19 cases and deaths provide a metric of how well a country is coping with the pandemic. With this in mind, I decided to try to run a Mann-Whitney test comparing the per capita number of cases and deaths due to coronavirus in countries led by men and women to see if female led countries fared better.

So he did. Now remember that the data are scanty because there are so few countries led by women, that the results are tentative, and that, as Scott notes, there may be better metrics of gender equality than “whether a country is led by a woman.” But the upshot in the article below, which Scott has kindly allowed me to send to inquiring readers along with the supporting data, is that there is no evidence that countries led by women had better responses to the pandemic than countries led by men. In fact, the difference, which is not statistically significant, is in the opposite direction: cases per million and deaths per million are slightly higher in woman-led countries. This lends no support to the assertions in the NYT and other places that having a woman at the helm protects your country better against pandemics.

Here’s the title of Scott’s piece, which is a Word document. I have permission from Scott to post the results and disseminate his analysis to interested readers. Write me if you want them, but don’t ask unless you intend to read them.

I’m not going to repeat the entire analysis, which you can read for yourself. I’ll just give an outline of the analyses and the results.

First, Scott found out which countries were led by women, ensuring that the women leaders actually had power and were not just figureheads. Then he used the coronavirus data from this source (indented paragraphs are Scott’s):

On May 17th, I downloaded a dataset from ourworldindata that included total cases, total deaths, total cases per million and total deaths per million (https://ourworldindata.org/covid-deaths). This dataset includes per day case counts, death counts, cumulative to date case and death counts, and lots of other information. The dataset extends back to when testing started for each country. I truncated the dataset to include only the cumulative cases per million and deaths per million as of May 17th when I downloaded the data.

Since the World Health Organization recommends that for every case deemed “positive”, there should be at least ten tests, he eliminated all countries that didn’t meet this criterion (see paper for methods).
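To make that screening rule concrete, here is a minimal Python sketch. It is not Scott's code (his analysis was done in R and is described in his paper); the country names, counts, and field names are invented purely for illustration, and the handling of a country with zero confirmed cases is an assumption.

```python
# Hypothetical records: names and counts are invented for illustration only.
MIN_TESTS_PER_CASE = 10  # the "at least ten tests per positive case" rule from the text


def has_adequate_testing(total_tests, total_cases, threshold=MIN_TESTS_PER_CASE):
    """Return True when at least `threshold` tests were run per confirmed case."""
    if total_cases == 0:
        # Assumption: with no confirmed cases, any testing at all counts as adequate.
        return total_tests > 0
    return total_tests / total_cases >= threshold


countries = [
    {"name": "Country A", "total_tests": 250_000, "total_cases": 5_000},  # 50 tests per case
    {"name": "Country B", "total_tests": 40_000, "total_cases": 8_000},   # 5 tests per case
    {"name": "Country C", "total_tests": 90_000, "total_cases": 9_000},   # exactly 10 per case
]

# Keep only the countries that meet the testing criterion.
kept = [c["name"] for c in countries
        if has_adequate_testing(c["total_tests"], c["total_cases"])]
print(kept)
```

A country sitting exactly on the threshold is kept here because the rule is stated as "at least ten"; a stricter reading would change `>=` to `>`.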

In the end, he came up with 62 total countries, 12 of which were led in a meaningful way by women.

There are four analyses, none of which showed a significant difference between male-led and female-led nations. Further, the differences that did exist showed that female-led countries did marginally worse on every metric.
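For readers unfamiliar with the test: the Mann-Whitney U test and the Wilcoxon rank sum test are two names for the same rank-based procedure, which compares two groups without assuming the data are normally distributed. The sketch below is a from-scratch Python illustration of the idea, not Scott's R code. It uses the normal approximation with a continuity correction and a tie correction; statistical packages may compute an exact p-value for small samples, so their numbers can differ slightly. The two lists of per-million figures are invented placeholders, not the real data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j share one average (1-based) rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: each group of t tied values reduces the variance.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of any difference
    mean_u = n1 * n2 / 2
    z = max(abs(u1 - mean_u) - 0.5, 0.0) / math.sqrt(variance)  # continuity correction
    p_two_sided = math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))
    return u1, p_two_sided


# Invented placeholder values (cases per million), NOT the real dataset.
female_led = [120.0, 340.0, 560.0, 90.0]
male_led = [200.0, 150.0, 410.0, 80.0, 610.0, 300.0]
u, p = mann_whitney_u(female_led, male_led)
print(f"U = {u}, two-sided p = {p:.4f}")
```

Because the test works on ranks, a single extreme country cannot dominate the result the way it could in a comparison of means.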

1.) Total cases per million for countries with adequate testing. Result: no significant difference using the nonparametric Mann-Whitney U test (p = 0.2891). Here are the results:

2.) Total deaths per million for countries with adequate testing. Result: no significant difference using the nonparametric Wilcoxon rank sum test with continuity correction (p = 0.1368). Here are the results displayed graphically:

3.) Total cases per million for countries with adequate testing and only European countries included. Scott tested this because, as he said,

A potential problem with the previous analysis is that different regions of the world have different numbers of cases and deaths, and may be at different parts of their outbreaks. This would matter less if sex of leader were random with respect to region. However this does not appear to be the case, as 8 of the 11 female leaders of countries with adequate testing were leaders of European countries. Therefore, I truncated the dataset again to only include leaders of European countries. This process resulted in 29 countries remaining, 9 of which were female led.

Result: no significant difference using the nonparametric Wilcoxon rank sum test (p = 0.2948). Here are the graphical results:

4.) Total deaths per million for countries with adequate testing and only European countries included. Result: No significant difference using the Wilcoxon rank sum test (p = 0.2948).  Here’s the boxplot:

Here’s Scott’s conclusion, in which he suggests a metric that might be better, although of course to avoid p-hacking (doing a bunch of different tests until you get the result you want), you need to specify the metrics you’ll use before you do the test.

Summary – I found no evidence that female led countries had lower per capita cases or deaths due to coronavirus, and thus no evidence that female led countries are coping with the pandemic better than male led countries. I don’t think my Mann-Whitney U tests really settles the issue, especially given the very small sample size of women led countries. My guess is that if female empowerment improves a country’s response to coronavirus, it is more likely to show up using gender equity metrics than comparing countries based on the sex of their current leader.
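The p-hacking worry mentioned above can be put in numbers. As a rough illustration, if each test has a 5% chance of a false positive and the tests are independent, the chance that at least one of several tests comes up "significant" by luck alone grows quickly with the number of tests. The independence assumption is a simplification (Scott's four comparisons share countries, so they are not independent), and the function name below is just for this sketch.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests run at level alpha."""
    return 1 - (1 - alpha) ** num_tests


# How the chance of a spurious "hit" grows as more metrics are tried.
for k in (1, 4, 10, 20):
    print(k, round(familywise_error_rate(k), 3))
```

This is why the metrics need to be chosen before looking at the results: trying enough of them after the fact will eventually produce a "significant" one.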

My own take: in view of this analysis, there’s no statistical justification at present for arguing that women leaders dealt better with the coronavirus pandemic than did men leaders. Since the data don’t support that conclusion in any way, there’s thus no reason to speculate why there was a difference, as the New York Times did in the May 15 piece below. Note that the subheadline gives a reason: “a new leadership style.”

49 thoughts on “Did countries with female leaders have a better coronavirus response? Statistics say “no”.”

      1. I have a fine old clock I purchased many years ago which has stamped on it “Made in West Germany.” I was showing it off to one of my young relatives who looked at me with uncomprehending eyes and asked “Was there an East Germany too?”

  1. It is to be hoped that Mr. Goeppner sends a copy of his analysis to the New York Times. But it would not be advisable to hold one’s breath waiting for a report about his finding to appear in the NYT.

    1. Not a chance they’d even acknowledge it. I predict “women leaders saved us from COVID-19” is already on its way to becoming one of those unassailable snippets of truthiness, like “women earn 75% of what men do,” that we are subjected to.

  2. This is a good example of not jumping to conclusions or believing the current meme. Keep your baloney detection kit (statistics) at the ready.

  3. I should think that any statistical analysis of the relative performance of men and women leaders would be skewed by the United States, which while having something less than 5% of the world’s population has well over a quarter of the worldwide confirmed COVID-19 cases (and of the deaths due therefrom).

    Speaking of which, at a press conference on the coronavirus yesterday, the prime minister to our north was asked about Donald Trump’s recent performance in office. What followed was 20 seconds of stone-cold silence, then a halting answer that was about as diplomatic as it could be under the circumstances, I suppose:

    1. I felt for the guy. Sane, responsible, competent leaders have to very carefully weigh what they say because the wrong choice of words, or even tone, can be disastrous. What he wanted to say would have caused MAGA hatters to storm the Canadian embassy, so with more control than I could possibly manage he said something else.

      1. BTW, Ken, I am not sure why you’d think the stats would be “skewed” by the US. We do have one of the highest numbers of infections and deaths, but the US is a large country. If you denominate deaths by million population, six countries (Sweden, Netherlands, France, Spain, Italy, and the U.K.) have worse rates.

        So, I’m not sure what you mean.

          1. Inhabitants per square kilometer (mile) might even be better.
            New York and Montana are quite different. Or the US and the Netherlands or Sweden and New York.

        1. “If you denominate deaths by million population, six countries (Sweden, Netherlands, France, Spain, Italy, and the U.K.) have worse rates.”

          Because of the virus arriving at different times, it is not reasonable to compare deaths/million population of US NOW with European countries NOW. Maybe a comparison of US now with Europe about 4 or 6 weeks ago has some validity.

          At least deaths/million is the best number to look at. I had predicted, and think my US prediction won’t be too bad, so 500 deaths/million on Aug 1 might be close with respect to the reported numbers. That’s about 165,000 reported deaths. I underestimated both Canada and UK for sure. Canada is about 60% better than US, i.e. 200 versus 325 deaths/million. It would be more like 140 if Quebec’s numbers were the same, rather than 3 to 4 times worse, compared to Ontario’s.

          But even those numbers are probably a non-trivial amount too small, as many ‘reliable’ sources seem to say. A recent one (that is, one comparing how much deaths exceed the statistically expected number) makes me think that an additional 30% or 35% might be closer to correct for the US. So adding about ⅓ gives something like 140,000 deaths right now, which would be about 425 deaths/million now for the US.

          For other countries, say European ones, the reported numbers are again probably too small, though not maliciously so anywhere, and the shortfall varies by country.
          You didn’t mention Belgium, which is worse than the ones you did. An earlier article seemed to indicate its number then was about right in the sense above.

          And again, this is not morbidity of the virus but rather premature deaths, for virus plus lesser direct reasons, which would not have happened without the virus occurring. That’s the number with more general significance.

      2. It was not the MAGA hatters storming the embassy he was worried about. He is only concerned about an all too powerful and vindictive, thin-skinned, and capricious President of the United States.

      3. Not so much MAGA hatters who don’t even know where Canada is and don’t care what Canadians think but the Orange Menace himself who would spend every last waking hour trying to destroy the Canadian economy. Remember, he still has a tariff on our steel because he says that we could be an enemy of America — America’s closest ally and friend. So, with friends like those….

    2. He already got caught gossiping with all the world leaders about Trump. If he said anything slightly against him, Trump would probably spend all his time in office skewering the Canadian economy. I especially liked how his jaw moved to the side as he thought about how to phrase his reply. Also, I often see Justin Trudeau as an honorary woman as the press treat him in the same way, often commenting on what he is wearing or his hair & suggesting that he couldn’t possibly be good at his job because he’s attractive.

      1. Yeah, I sure wish we had a recording of the internal dialogue occurring in Trudeau’s mind as he stood there pondering how to respond to that question.

  4. The Columbia University statistician Andrew Gelman often writes about analyses of patterns or questions like this one. The problem with these analyses is that there is no theory or mechanism that would plausibly predict a link between the sex of the chief executive and the national public health response to a pandemic. In particular, there is no mechanism that would link female leadership to better public health in a way that would swamp out all of the other variables that differ among countries and lead to different public health outcomes, but still allow the effect of female leadership to shine through. Instead, these kinds of questions (like the question in the caption to that photo from the NYT) are typically motivated by a few cherry-picked examples (New Zealand, Germany), followed by what Gelman has called a walk through the garden of forking paths, in which many analysis decisions (like whether to analyze just Europe, or all countries; whether a specific female leader is a real leader or just a figurehead; whether a country qualifies as having “adequate testing”) can all be made with a particular goal in mind: to “discover” a pattern consistent with some prior belief (like a belief that female leadership will result in better public health outcomes). I’m sure Scott G. did his best to do this analysis objectively, but without any theory to guide the decision-making about how to analyze the data it’s unlikely that a true believer in the qualities of female leadership will be convinced that his approach is the right one or that his analysis shows no benefit of female leadership.

  5. I’d be interested to know how the distinction between ‘women with power’ and women who are mere ‘figureheads’ was made. And what the results would be like had that distinction not been made.

    I’d also be interested in what the statistics are for those countries designated specifically as democracies. Cut out the rest, have a look at the countries where people elect their leader. Just out of curiosity.

    I agree with Mike above. I see no reason to conclude much of anything from this, there’s so much noise here.

    I also think it’s reasonable to (tentatively) suppose that female leaders are better at dealing with this kind of crisis – not because they have some kind of bullshit ‘female intuition’ (and how sexist is that particular cliche?) but simply because they’re less liable to make the kind of cretinous, dick-waving mistakes that bell-ends like Dominic Cummings and Donald Trump tend to make.

    1. Hi Saul,
      Thank you for your comment. I would encourage you to email Jerry and read the full document I sent him, which discusses some of the points you raise here. I’ll put a few comments below:

      1) Some countries have separate heads of state and heads of government, where the head of state is more of a ceremonial role than one with real power. In these cases, I tried to identify the person running the government, rather than the ceremonial head of state. The leaders and their roles and sex are listed in the datasets I sent Jerry, and if you found cases where I missed a woman leader with power, it would be straightforward to edit the datasets and re-run the R code to see if the result changes.

      2) I did not think to reduce the dataset to just democracies and re-run the analysis this way. It might be interesting to do as non-democracies may be more likely to under-report cases and deaths. However, only 4 of the countries included in the Europe only section are considered non democracies here (https://en.wikipedia.org/wiki/Democracy_Index). I suspect a democracy only comparison, especially one of European democracies only, would be very similar to the only Europe section of this analysis.

      3) There are many variables that could affect the extent of a coronavirus outbreak in a country, including population density, median age of citizens, size of outbreaks in neighboring countries, etc. I don’t agree, however, that this makes it impossible to draw tentative conclusions from this analysis. A better analysis that accounts for these variables could certainly come along and find that female led countries outperformed male led countries. Until then, however, I don’t see a problem with concluding that, at present, there is no evidence that countries with female leaders outperformed those with male leaders.

      Cheers!

      Scott

      1. Thanks, Scott. I hope you understand that I was genuinely curious with my two questions. I have nowhere near the expertise to do an analysis like this and I’m grateful you did it.

        I don’t think this will put the question to bed though.

    2. What we see here is confirmation bias. You, Mr. Sorrell-Till, do everything you can to discount and reject these data, but are amply willing to suppose that female leaders are better with these crises simply because of your intuition about “female intuition” (which you suggest as a possibility even though you recognize it as a “sexist cliche”). Then you cite two people who have nothing to do with this issue.

      Your comment could well serve as a paradigmatic example of confirmation bias. You criticize those data that go against your preconception and offer in their place anecdotes that supposedly support your hypothesis.

      The data may be noisy, but the real noise is in your sand-in-the-eyes comment.

    3. Dominic Cummings is not the leader of any country that I am aware of. The mistake he made was on a personal level and if there is an effect (“sod that, if Cummings isn’t social distancing, neither am I”), it’s unclear because the news broke when the UK lockdown was being relaxed and we’d expect to see a reduction in the slowdown or even a slight increase in cases.

      Your final paragraph seems fallacious to me too. You are comparing current male political leaders with the generality of the entire female population. Political leaders are not selected randomly from the population. Perhaps dick waving and being a bell-end are advantageous traits for a politician. If that’s the case, then the women that make it to the top are just as likely to have those traits as the men.

      1. I didn’t say Cummings was a leader. I used him as an example of the kind of male belief that one’s own fabulous, grandiloquent visions for the world are simply right and correct, and any temperance of said views is unacceptable. That way lies large-scale disaster. No fallacy there.

        My second para is not ‘fallacious’ either. It’s a defense of a tentative supposition that I do not think is unreasonable. If there are psychological gender differences then they will tend to be displayed in the different genders of leaders. Your point about democratic selection effects is interesting, and perhaps it would swamp the psychological differences. But perhaps it wouldn’t. As I’ve said, I don’t know. But I don’t think it’s an unreasonable belief to hold, and as interesting as this analysis is I don’t think this question is settled.

        1. I didn’t say Cummings was a leader.

          Well why bring him up then? Why do you think that a belief that “one’s own fabulous, grandiloquent visions for the world are simply right and correct” is exclusively male and apparently applies to all males? It seems to me your thinking is fallacious and perhaps a “grandiloquent vision for the world [that is] simply right and correct”

          It’s a defense of a tentative supposition that I do not think is unreasonable.

          You’re responding to an article that presents evidence that the hypothesis “female leaders are better at dealing with this kind of crisis” is actually not correct. There are two problems:

          1. you fail to address the evidence already presented.

          2. you assume that traits of the population in general will be mirrored exactly in the traits of the leaders in that population. Your argument boils down to: “women do not indulge in dick waving, therefore female political leaders do not indulge in dick waving”. This is a fallacious argument.

          1. No I don’t assume any of that. I’m simply saying that the opposite assumption, that the theory is falsified by this analysis, is not true.

            “Well why bring him up then?”

            Because he’s a man in a position of enormous political power, and has demonstrated the kind of traits I’m talking about. I’ve told you why.

            “You’re responding to an article that presents evidence that the hypothesis “female leaders are better at dealing with this kind of crisis” is actually not correct.”

            Given the amount of confounding variables in action I don’t think it’s capable of doing that. What it DOES do is sunder the idea that there are statistical data that support the claim that women are better at handling these things. But I would’ve been extremely surprised if there were significant differences, again due to the enormous amount of confounding variables.
            And I doubt that Scott, from what I’ve read, would be comfortable saying that the theory is ‘not correct’ based solely on this analysis.

            Likewise, if someone came forward and handed me an analysis that demonstrated that there WAS actually a significant difference between male and female leaders, in any direction, I’d be suspicious. Again due to the sheer amount of noise.

  6. I agree there is no reason to speculate about a non-existent fact, but as an intellectual exercise it is worth noting that there is a plausible reason why women leaders are more likely to suppress the epidemic in their countries. The reason is the well established fact (I think) that women tend to be more risk averse than men. Thus, in the trade off between virus risk versus economic cost, women leaders may be more prone toward reducing virus risk than protecting economic growth.

    If this has already been mentioned, I apologize.

    1. Yes I agree this is a plausible idea. The problem is that it was not developed before the pandemic, and then used to predict how different countries would fare. Instead it was developed after someone at the NYT had noticed that a couple of countries with female leaders seemed to be less badly affected by the pandemic. A completely different causal explanation (that male leaders are more likely to respond aggressively to a perceived threat, and reduce the effect of the novel coronavirus) based on a similar trope about sex differences in aggression and threat response could be offered to explain the opposite pattern if it had occurred (better outcomes in countries with male leaders). In a sense, in studies like these the data don’t matter: some kind of plausible story could be developed to account for any outcome. Several areas of social science are plagued with this type of theory- and mechanism-free causal reasoning, where observational data are analyzed using seemingly sophisticated quantitative methods, and then interpreted after the fact using a reasonable-sounding stereotype or trope that was chosen after the data were collected and the overall patterns in the data were known. This doesn’t seem like a very reliable way to identify mechanisms or understand how things work.

        1. I’ve never read anything by him, simply because the people who tend to really rave about him have been so dim. You can’t tell a book by its cover but you can tell a book by its fans.

    2. Your hypothesis suffers from the same problem as Saul Sorrell-Till’s above. You take a general trait of the female population and assume it applies to a very small sample of women that cannot possibly be called randomised.

      If risk taking is an advantageous trait for a person in politics, then it is likely that the women who make it to the very top are risk takers as much as the men who make it to the top.

      1. I think it would be true on average that women who make it to the top are likely to be less risk averse than other women but not likely to be as risk loving as men who make it to the top.

          1. Take two normal distributions with the same variances over a variable called risk aversion with each distribution centered over the respective means for men and women. From each population, suppose the 5% who are least risk averse run for political office. That gives you two sub-populations. The women sub-population will be more risk averse than the men sub-population.

          2. “From each population, suppose the 5% who are least risk averse run for political office”

            Why would I suppose that? What if it’s the least risk averse women that run for political office, but the men who fall between the 5% and 10% least risk averse? What if the process of successfully applying for chief executive of a country selects for a particular risk taking profile?

          3. I provide a model of a neutral society where the least risk averse members of the population run for office and show statistically that the women who run for office are more risk averse than the men who run for office. Clearly for any society that departs from the assumptions, the result may not hold and “anything can happen.” I think the onus is on those who would claim otherwise to show how the departures would nullify the result. In other words, as always it is at bottom an empirical issue. But the model may well apply to countries like Norway and NZ.
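The selection model argued over in this sub-thread can be worked through in a few lines of Python. This is only a sketch of the commenter's stated assumptions (two normal distributions of risk aversion with equal variance, and the least risk-averse 5% of each sex running for office); the means and standard deviation below are arbitrary placeholders, not estimates from any study.

```python
from statistics import NormalDist


def least_averse_tail_mean(mean, sd, fraction=0.05):
    """Mean risk aversion among the `fraction` of a normal population that is least risk averse."""
    std = NormalDist()         # standard normal
    z = std.inv_cdf(fraction)  # cutoff in standard units (negative when fraction < 0.5)
    # Mean of a normal distribution truncated above at the cutoff:
    # mean - sd * pdf(z) / cdf(z)
    return mean - sd * std.pdf(z) / std.cdf(z)


# Placeholder parameters: women assumed 0.5 SD more risk averse on average.
MEN_MEAN, WOMEN_MEAN, SD = 0.0, 0.5, 1.0

men_candidates = least_averse_tail_mean(MEN_MEAN, SD)
women_candidates = least_averse_tail_mean(WOMEN_MEAN, SD)

print(f"men who run:   {men_candidates:.3f}")
print(f"women who run: {women_candidates:.3f}")
print(f"gap preserved: {women_candidates - men_candidates:.3f}")
```

Under these assumptions the gap between the two candidate pools equals the gap between the population means. Passing a different `fraction` for each sex, as the objection in the thread suggests might happen in practice, can shrink or reverse that gap, which is why the question is ultimately an empirical one.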

  7. I will be happy if/when we don’t continue to compare/contrast the intelligence, thoughts, beliefs, strengths, behaviors, performances of women vs. men. May we get to the point where the acts are evaluated without reference to femininity vs masculinity.
    Diversity of abilities and talents is spread throughout both sexes. Conversely, incapability is spread liberally throughout both sexes. There are no doubt many men and women who could be a better president than tRump. His idiocy is not specifically due to his maleness.

  8. Here is a possibility…. in Canada health is a provincial responsibility, although significantly funded by the federal government.
    Each province has its own Chief Medical officer, as well as at the federal level, and these are all fully trained doctors. These CMOs have significant powers to the extent that the premiers ignore them at their peril.
    In BC we have Dr. Bonnie Henry. Alberta has Dr. Deena Hinshaw, at the federal level we have Dr. Teresa Tam.
    Metro Toronto has a woman CMO whose name escapes me at this moment….
    Throughout this pandemic, these women doctors have been calm, reassuring, offered the best advice and knowledge available…. and in Alberta at least, everyone tuned in every day at 3:30 pm for Dr. Hinshaw’s latest update.

    So… our PM may be a male, but key people calling the shots are women.

  9. More important may be that these graphs show how much spread the death rate has. [Disclaimer: I’m writing this after the media have gone victim blaming on Sweden, even though it is iffy to compare these rates outside of epidemic models.]

  10. Replying to Torbjorn just above (button no good):

    Rather than Sweden, was it not the man, or government cabinet, which determined to have laws which would produce herd immunity (but did produce up to now a deaths/million number which is 10.5 times worse than Norway’s number) who is being blamed?

    Not a victim or victims as far as I recall.

    Lots of dead elderly Swedes who might have been better off living to the northwest across the border. Surely the public health systems in the two countries are not that much different–and likely near the best in the world–that’s been our experience in Norway and Iceland as foreign visitors over the last 50 years.
