Does the best team win the World Cup?

There’s a dearth of evolution news during the holiday break, so I thought I’d call attention to an oft-discussed problem with the World Cup and a little-known paper that suggests a solution.  The debate (which, as a football neophyte, I won’t attempt here to resolve or even contribute to) centers on whether the arrangement of World Cup matches is the best way to determine the best team.

To win the World Cup, a team need win only six or more games out of seven against a variety of teams, and to clinch victory it need win only a single game against an opponent it hasn’t previously met in the tournament.  This is in strong contrast to American games like basketball or baseball, in which the two final teams are pitted against each other in a series of games, so that you have to win more than one (four in the case of the baseball World Series) to become champion.

There are two interrelated problems here. The first is that the championship is decided with a single game, and the second is that the number of goals that a team must score to win is relatively low—sometimes just one—so a win can reflect luck, or the events of a single day, rather than a persistent and repeated superiority over an opponent. (The American football championship, the Super Bowl, is also decided by a single game, but it typically involves several scores and many points.)  Can we really be sure that the victor in a single World Cup game is the best national team in the world?

This problem was taken up in 1966 by John Maddox, in a piece he wrote for Nature called, “We wuz robbed” (you can download it by going to this page).  If you’re a scientist, you’ll know that Maddox was the plain-spoken and controversial editor of that journal, where he served for two terms (1966-1973 and 1980-1995). He died last year. 1966 was, of course, the only year that Brits ever won the World Cup, in a 4-2 final with Germany that featured the only “hat-trick” (three goals by one man; in this case Geoff Hurst) ever performed in a World Cup final.

Maddox fitted the number of goals among all teams in that year’s World Cup to a Poisson distribution.  This is a statistical distribution that occurs if there is a constant but very low probability of an event (say, a goal) occurring in a small interval (say, one minute of a game).  If the probability is constant, then the distribution of events over a longer interval (say, goals in a 90-minute game) should fit the Poisson.  Here from the article is Maddox’s compilation of each team’s World Cup goals, and the distribution expected under a Poisson whose mean is equal to the average number of goals scored by a team (1.234 in this case).
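For readers who want to see what that expectation looks like, here is a minimal sketch in Python. It assumes the 1.234 figure is the mean number of goals per team per game; the function name `poisson_pmf` is mine, not Maddox's, and the output is only the theoretical Poisson side of the comparison, not his observed counts.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

MEAN_GOALS = 1.234  # mean quoted in the post (assumed to be goals per team per game)

# Expected share of team performances ending with 0, 1, 2, ... goals
expected = {k: poisson_pmf(k, MEAN_GOALS) for k in range(7)}
for goals, share in expected.items():
    print(f"{goals} goals: {share:.3f}")
```

Multiplying each share by the number of team performances in the tournament would give the expected counts to set against the observed ones.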

The fit looks pretty damn good, and I confirmed this by doing a chi-square goodness of fit test, which gave the result χ² = 5.73, df = 6, 0.5 > p > 0.4, which isn’t even close to a significant deviation from the Poisson expectation. (I’m told that Mike Whitlock and Dolph Schluter show a similar Poisson distribution of more recent soccer scores in their statistics book The Analysis of Biological Data.)
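The p-value bracket quoted above can be checked without a statistics package. This is a sketch that relies on the closed-form upper tail of the chi-square distribution, which holds only for an even number of degrees of freedom (as with df = 6 here); the helper name `chi2_sf_even_df` is made up for this example.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

p_value = chi2_sf_even_df(5.73, 6)
print(f"p = {p_value:.3f}")
```

The result is about 0.45, which sits inside the 0.4 to 0.5 bracket given above.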

The fact that the distribution fits so well, as if it were a single team with a fixed probability of scoring goals, led Maddox to say this:

The mere fact that a Poisson distribution can describe so well the distribution of goals scored by individual teams goes a long way to suggest that the teams were much of a muchness in talent and their scores were independent of each other.  From this point of view, the decision that the outcome of a single competition should depend on the outcome of a single game between the two so-called finalists was as much of a farce as a great many West German supporters already know it to have been.  If it is assumed that the goal scoring potentiality of the two teams is equally well described by the Poisson distribution already specified, the chance that the result will be a draw is a mere 0.27. In other words, if two teams are equally matched, the chance that the result will be an active injustice to one of them will be 0.73.  By the same token, a team which is slightly less skilled than its opponent can nevertheless expect a one in three chance of winning the deciding match.

Well, I’m not sure I’d consider the loss to an equally-matched team to be an “injustice,” but Maddox has a point.  There are not many baseball World Series matches in which the losing team has failed to win a single game, and so we might be wary of saying that a team that wins the World Cup has decisively demonstrated its superiority to all other national teams.
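Maddox’s 0.27 figure for a draw is easy to reproduce. The sketch below assumes, as he does, that the two teams’ scores are independent Poisson variables with the same mean of 1.234; the function names and the cutoff of 30 goals per side are my own choices for illustration.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def outcome_probs(mean_a, mean_b, max_goals=30):
    """Return (P(A wins), P(draw), P(B wins)) for independent Poisson scores.

    Scores above max_goals are ignored; their probability is negligible
    for means anywhere near real football scores.
    """
    win_a = draw = win_b = 0.0
    for a in range(max_goals + 1):
        p_a = poisson_pmf(a, mean_a)
        for b in range(max_goals + 1):
            p = p_a * poisson_pmf(b, mean_b)
            if a > b:
                win_a += p
            elif a == b:
                draw += p
            else:
                win_b += p
    return win_a, draw, win_b

win_a, draw, win_b = outcome_probs(1.234, 1.234)
print(f"A wins {win_a:.3f}, draw {draw:.3f}, B wins {win_b:.3f}")
```

With equal means the draw probability comes out near 0.27, and the remaining 0.73 is split evenly between the two sides. Passing two different means to `outcome_probs` shows how often the weaker side still wins.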

The solution would seem to be making the championship depend on winning more than one game.  Maddox suggests a World-Series-style final of several “replicated” games, so that

. . the finalists go on playing against each other either until the superiority of one or the other of them is properly established, or until both parties agree to negotiate a draw.

Such a negotiation is of course out of the question, but a series of matches is not.  Maddox suggests, tongue in cheek, an alternative:

[R]edesign the parameters of the game of football in such a way that a respectable degree of confidence in the outcome of the competition can be acquired in a reasonable interval of time.  If, for example, it were agreed that single cup finals should remain, but that no team should be declared the winner until its score exceeds that of its opponent by three standard deviations of the Poisson distribution, it might be necessary to design the game of football so that it would be practicable for one side to score 100 goals or so within the limits of endurance of the spectators.  This implies that the parameter q [the mean score] would have to be much greater than under the present rules.  Such a change could easily be brought about, possibly by widening the goalposts or by abolishing goalkeepers.

Nobody’s having that, but why not multiple games in the final?  The downside is that this would make the World Cup much longer (especially if multiple games are also held in the earlier stages), and would also eliminate the drama of the championship coming down to a single 90-minute game that the whole world watches.  But really, isn’t the World Cup about determining which country’s team is best, a decision that the winning nation can proudly claim for the next four years? Is there anyone here who would defend the present system against one involving multiple games in the final?
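How much would a series actually help? Here is a rough sketch under assumptions of my own, none of them from Maddox: the two finalists have hypothetical scoring means of 1.4 and 1.1, scores are independent Poisson variables, drawn games are simply replayed (so only decisive games count), and games within a series are independent.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def decisive_win_prob(mean_a, mean_b, max_goals=30):
    """P(A beats B) in a single game, given that the game is not a draw."""
    win_a = win_b = 0.0
    for a in range(max_goals + 1):
        for b in range(max_goals + 1):
            p = poisson_pmf(a, mean_a) * poisson_pmf(b, mean_b)
            if a > b:
                win_a += p
            elif b > a:
                win_b += p
    return win_a / (win_a + win_b)

def series_win_prob(p, n_games):
    """P(A wins a best-of-n series, n odd) if A wins each game with probability p."""
    needed = n_games // 2 + 1
    return sum(
        math.comb(n_games, k) * p**k * (1 - p) ** (n_games - k)
        for k in range(needed, n_games + 1)
    )

# Hypothetical means for a slightly stronger and a slightly weaker finalist
p = decisive_win_prob(1.4, 1.1)
for n in (1, 3, 5, 7):
    print(f"best of {n}: stronger team wins with probability {series_win_prob(p, n):.3f}")
```

The stronger team’s chances rise with the length of the series, but slowly when the teams are close, which is the trade-off against drama that the question above is really about.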


Maddox, J. (writing anonymously).  1966.  We wuz robbed.  Nature 211:670.

h/t: Geoff North

  1. The only true alternative to a system like the one that exists for the World Cup, in which teams are pitted against each other in a series of matches where the “best” team doesn’t always come out on top, is a system like in NCAA D1 football, with a bunch of meaningless matches that no one really cares about. It is about crowning a world champion, the team that can make it through a grueling playoff situation, not the team with the best individual players. I don’t think there is anyone that follows sports that would trade the exciting, phenomenal games that are occurring during this world cup to see the “experts” pick the top two teams to play against each other, while watching a bunch of meaningless matches. Yes, turning it into a best of three championship series instead of one game would find the “better team”, but then, as you mentioned, you lose the drama and excitement. It is comparable to the Super Bowl, which gets a larger following than any single game in the Basketball or Baseball championships because it means more, and the heightened excitement and buildup of a single game is worth having the winner of the World Cup be the underdog more often than in other sports.

    1. There was a lot of griping back in Dec-Jan about revising how the NCAA D1 football (REAL football) BCS bowl game series is done. Similar complaints: most of the top teams never play each other, so those teams from lesser conferences (i.e. not SEC, Big 12, etc.) never have a chance to reach the Nat’l Championship game. Boo-hoo. It is what it is. For football at least, it’s mainly (all?) about $$$. I don’t know enough about World Cup soccer to make that assumption for this case.

      1. The schools make more money in the current system than they ever would in a playoff situation, which is why the school presidents keep voting for it. I am just saying the best and most exciting competition comes when there is a playoff type situation.

  2. I too prefer the drama over the crowning of the “real” (such a thing?) best team. That’s one of many, many things that annoys me about baseball and basketball championships.

    1. I just wish that the people lobbying the NCAA for a college football playoff would use the fervor over the World Cup to champion their cause. Soccer is a sport the US cares about once every four years, and the games consist of long stretches where not very much occurs. The potential that exists in a playoff series for D1 college football, a sport that is as big as any in the US, is huge.

      1. “long stretches where not very much occurs”

        That’s only if you consider scoring as the only possible occurrence in the game. In fact, a lot goes on between goals, but you have to enjoy watching the play, not just watching to see who wins.

        Most popular team sports involve a predictable back and forth between teams. At the end of each of these cycles, either the offense scores or they don’t, and play is turned over to the other side.

        Soccer is much more fluid, turnovers are happening constantly, and thus scoring a goal happens much less frequently. (Ice hockey may be similar in terms of turnovers, but the rink is much shorter.) After watching enough, though, you start to see the intricacies of the play, the different styles, and the building momentum and tension. The difficulty and infrequency of goals makes each one all that more savored (and/or controversial).

        So, it takes a somewhat different perspective to enjoy the game versus your football, baseball or basketball.

        1. Oh, it is a fabulous game, just that is how the majority of Americans perceive the sport.

        2. All true, and yet one must appreciate the fact that the short attention span of the average American makes concentrating on one thing for 45 minutes an ordeal.

        3. Except, of course, that the Poisson model just described falsifies all the drama you just invented.

          (One can as well throw appropriately weighted dice to decide the winner.)

          😀 😀 😀

  3. Technically, the two teams in the finals can have faced each other- in the group stage (the group winner and runner up go to the separate brackets), not to mention the various other tournaments (like Confederations Cup).

    Anyway, keep your fancy statistics to baseball. The alternative to the World Cup is war. 😉

      1. Didn’t misspell “World Series”, no. The World Series is exceedingly boring, but at least there’s a decent chance that the “best” team will win.

        The World Cup is dreadful in every regard.

  4. It’s a knock-out competition, so it’s not really about determining who is actually the best in a scientific sense. That the same cabal of teams keep winning the thing must indicate it does an ok job (Brazil, Italy, Germany)

  5. Isn’t this related to Arrow’s Theorem, which says that given a contest (election) between three or more entries, there is NO perfect way to select the best?

    1. Perhaps, but not in a trivial way. The paper I linked to in another comment shows that it is the teams’ “fitness”, their ability to score, that ranks them. That fitness doesn’t fulfill all of Arrow’s criteria for his social choice theory.

      The “If every voter’s preferences between X and Y remain unchanged when Z is added to the slate, then the group’s preference between X and Y will also remain unchanged” is broken. Introducing new teams means changing average fitness for all the teams.

      You have to reassess the theorem in that light. Off hand I would think that fitness, as a direct measure and not an individual’s voting preference, works fine as an unambiguous ranking.

      [Again I think the problem may be that it is a faulty interpretation of the game. There is no drama, but there is no “contest” either, the outcome is already decided by fitness. Modulo the stochastic element.]

  6. The idea is not to determine who is the best team in a scientific sense – it’s a knock-out competition, after all; who was best on the day. That the same few teams keep winning the thing, must indicate something though.

    On another note, I can never get my head around the low scoring complaint, it seems to be a knee-jerk view more than anything. Plenty of sports have no scores at all, or so many that they are meaningless. Only the end result matters, surely?

    1. I agree with you that the ‘soccer is boring/games are low-scoring’ thing is rarely well thought out, but…
      “Only the end result matters, surely?”
      I think even that’s missing the point. If only the end result mattered, then 0-0 draws would inherently be boring, and Australia’s 31-0 demolition of American Samoa would have been one of the greatest games of all time. (see )

      The point is though, that in football (and even in American football, and ice hockey, and basketball, and this is where I think the people who make the argument that soccer is boring because there’s barely any goals are deluding themselves without realising it) it is not solely the scoring of goals (or touchdowns, or nets, or what have you) that is exciting.

      A goal like this: is nowhere near as enjoyable as a goal like this: , and likewise the first goal linked is nowhere near as exciting to watch as seeing an amazing shot matched by an amazing save by the goalkeeper (see here, for example: ), and so on.

      Are the only interesting parts of a chess match the bits where pieces are captured?

      1. Re the Maradona goal you mention, Wikipedia says this:

        Maradona’s second goal was later voted by FIFA as the greatest goal in the history of the World Cup. He received the ball in his own half, swivelled around, and with 11 touches ran more than half the length of the field, dribbling past five English outfield players (Peter Beardsley, Steve Hodge, Peter Reid, Terry Butcher, and Terry Fenwick) and goalkeeper Peter Shilton. This goal was voted “Goal of the Century” in a 2002 online poll conducted by FIFA. Right after the goal occurred, it left the television commentator “sobbing in joy”, and apologizing for his outburst.

    2. “I can never get my head around the low scoring complaint”

      The real problem with the low scoring nature of the game is that it makes viable the strategy of doing nothing but defending and hoping to get lucky. If one team in a game plays that way, the game is generally going to be dull, simply because if you’re only defending, then it is relatively easy to close down the other team so they can’t put anything together.

      In a sport where you can reasonably expect to have to score 3 times to win (hockey, baseball, American/Canadian football, etc), you can’t rely on just defending and getting the one break and capitalizing. 100 goals in a game (i.e. basketball scores) isn’t necessary but adjustment of the offside rule would help. Things like needing daylight between the bodies of the attacker and 2nd last defender in order to be offside and/or no offside possible if the ball is already in the last 1/3 or 1/4 of the field. That’s on the basis the purpose of the offside rule is to prevent net-hanging (as I believe was the original purpose, now long lost in technicality).

      As others have observed, though, the World Cup and indeed soccer, involved a certain risk of ‘unfairness’ in that being best in some idealistic way isn’t always going to be enough or even being best that day. But then, that’s what makes for upsets and drama.

  7. Odd. Do people also complain that the roulette wheel is completely arbitrary?

    It’s hazard. It’s supposed to be stochastic.

  8. Ya know, if you tweak the rules to ensure that it is all but inevitable that the better team wins, then you might as well call it “measurement” rather than “sport”. (And good luck convincing people to remain interested in it if you do that.)

  9. American football and basketball inflate scores by making goals worth more than one point. A 3-2 world cup score could be equivalent to a 21-14 American football score. Also all of the timeouts and clock stoppages in American football and basketball ruin the flow of the games. The one sport where I can see a need for multiple games is baseball because different pitchers are used in each game. In other sports, you put the same 5 or 11 out each time. One game is enough.

      1. Indeed, but then Americans often refer to their domestic champions as ‘world champions’ including at times the World Series winner. It’s most egregious in American football, a sport played to any extent only in the USA. A bit like if we Canadians claimed the winner of the Grey Cup was the “world champion”.

  10. Football, like life, is sometimes unfair. Sometimes the good guys lose, sometimes the charlatans win (I’m looking at you Luis Suarez). That’s why the World Cup is so captivating.

    A best of ‘x’ series to decide the World Cup winner? Never, I say!

  11. They do face each other multiple times – every four years. The people who’re determined to find the best team as opposed to the winner of the world cup can have their fun with those statistics (and they do).

    In other words: Newsflash – it is about the drama, not about the scientifically correct method to find the best team. It’s about luck and skill and hope – and party. Party for one month, because any longer wouldn’t have the same effect. Let us have our regular one month drama-party. Soccer is a game and games need luck, surprises, skill, tactics, good and bad days, shock and hope. Drag the games out to several more and half of these attributes will disappear.

    The world cup is really not about finding the overall best team, and that’s quite okay.

  12. The purpose of competitions, and tournaments, is to determine who wins the event, not to determine who statistically would win the most games as N, the number of games, becomes large.

    If two teams are in the same “league”, it is understood that on any given day, either team might win, even though one of the teams is better. The point of a tournament is that *any* team might win — the possibility of an upset is “part of the drama”.

    Having said that, I have always thought that Soccer and Hockey are two examples of games in which it is too difficult to score. Both would benefit from widening their respective goals by about 50%. But I say that not being a fan of either. I’m sure that rarity of goals is considered part of the drama/fun of both sports. Each single goal gives players and fans a victory thrill. Compare that to Basketball with dozens of goals scored in a game — each goal by itself is a comparatively ho-hum event, unless it is a close game in the final seconds.

    1. One might argue hockey could stand more goals, but at about 6 goals per game it doesn’t have soccer’s problem of pure defence with only the occasional break as a viable method of play at the top levels (and certainly not if the game is actually called according to the rules).

      1. The style of play at the top levels is different all around the world.
        In Brazil it’s all about individual skill, with mazy runs making it difficult for opposition defenders to mark effectively as they lose their man, leaving openings for others.
        In Italy it’s a very defensive style of play with slow, measured build-up play, probing for good passes whilst trying not to leave any openings for counter-attacks.
        In England it’s a very attacking style of play with lots of counter-attacks and a high amount of total shots. Goalless draws are rare.
        In Spain it’s all about short passes to feet mixed with individual moments of skill like a Lionel Messi or a Cristiano Ronaldo.
        And so on. One of the draws of the Champions League is that it is interesting to see how different styles of football clash. It’s also why many players who are amazing on the continent fail to play well when coming to play in England.
        Look at Diego Forlán, for example, who was a massive disappointment when he came to England to play at Manchester United, one of the best teams in the world. His style just didn’t fit in. He scored just 10 league goals in 63 appearances, which is very poor for a striker.

        Of course, he then went to Villarreal in Spain where in his first season he scored 25 league goals in 38 appearances, winning the Golden Boot (which is an award given to “the leading goalscorer in league matches from the top division of every European national league”, from Wikipedia).

  13. I wonder how the FA cup in the UK (or is it just England?) matches up in this respect? I seem to remember that the top two teams coming out of the league end up playing for it and that this really depends on the long series of games of the league.

    1. Then you remember wrong! The English FA cup is a straight knockout competition open to any club in the top ten levels of the league system, and had 762 entries in the last year. Sadly there are different entry points to the competition so we don’t get the fun of seeing, for example, Arsenal taking on a side on a park pitch in Northumberland.

    2. As Chris said, the FA Cup is a knockout tournament with single leg pairings (replays for ties until the semifinals). Incidentally, it only covers clubs in the FA, which is just England and several Welsh clubs.

      The Premier League championship is awarded to the team with the highest score in the season, each team having played every other twice.

      Some championships (such as the CONCACAF and UEFA Champions Leagues, except for the final in the latter) use dual leg pairings, where the winner is decided by aggregate score across two matches rather than one (typically, one home and one away). I suppose that’s to negate a perceived home field advantage.

  14. Well, as a few others have hinted, It’s not only about the math, but more about the drama and spectacle. This is also why we don’t use cameras etc to have precise judging. We want an event. We want something to talk about. We don’t want mathematically satisfying. Football is for humans.

    The reason the Americans have more games I suspect is about money.

  15. Well, I didn’t know that it was such an old observation. But as I noted in the first World Cup post, there is a recent arxiv paper (with different and better statistics) which comes to the same conclusion.

    It is the team’s season “fitness” (capability to score) that is the essential parameter in a Poisson model for the game, not the individual strategies, players or the play day events. The play is just for show, one can as well throw dice (appropriately weighted) to decide the outcome.

    This putatively generalizes to other games:

    “To a very good approximation scoring goals during a match can be characterized as independent Poissonian processes with pre-determined expectation values. Minor correlations give rise to an increase of the number of draws. The non-Poissonian overall goal distribution is just a consequence of the fitness distribution among different teams. The limits of predictability of soccer matches are quantified. Our model-free classification of the underlying ingredients determining the outcome of soccer matches can be generalized to different types of sports events.”

  16. “it is about the drama”, “but more about the drama and spectacle”

    With all respect, the point of the research is that the drama is _constructed_ – there is no drama as described there. “To and fro”, “strategy”, “best player”, “team spirit”, “hat trick”, “daily form”, … no such things decide the outcome, see the paper I linked to. More than the drama of “the dice throwing” not deciding the fittest team, as the post discusses, which is a constant of the cup construction.

    Now I don’t mean that one shouldn’t enjoy the game, but to call it drama when there is no such thing is a bit over the top. Or “religious”.

    Why not call it enjoyment or show, that is fair I think?

    1. Hari Seldon, is that you?

      If you’re just interested in the final score then I can see why you’d dismiss it but it’s the ups and downs of the match that make it dramatic. Are you one of those people that skip to the end of a novel?

  17. While small sample size (as in single game knockout) allows for a more spread out distribution of results (i.e. the lesser team will have a higher probability of winning), I don’t think Maddox’s analysis says anything about the ‘deservingness’ of the winner of the World Cup. The question is not whether all goals, regardless of team, can be modeled by a Poisson, but whether different teams have different models. What we want to know is if the Poisson parameter ( = the mean # of goals scored) is significantly higher for, say, Brazil, than it is for, say, England. This of course does not exhaust the possibilities for statistical comparison, but, unlike Maddox’s, it’s at least a start in the right direction. (As a side note for evolutionists, Maddox’s analysis reminds me of many misguided and misinterpreted models of random biotic evolutionary change that were popular about 30 years ago.)

    1. Bravo! Yes, that’s exactly the problem with Maddox’s analysis.

      Another analysis I have in my hands of over 30 years of English league soccer (just the main ones) also shows that goal scoring appears to occur at random, following a Poisson distribution. HOWEVER, this doesn’t mean each team’s likelihood of scoring a goal is equal!

      For instance, looking at the same World Cup 1966 figures, the top four teams scored a mean of 2.25 goals per match, whereas the mean (per game) for the bottom four teams was only 0.5!

      Of course, a team’s performance isn’t entirely captured by the number of goals scored. Most obviously, their “ability” to concede goals also has to be factored in. Doing this analysis, the bottom four teams haemorrhaged 2.25 goals per match, whereas the mean for the top four teams was 0.96.

      Now, I’m no statistician, but it seems rather odd to conclude that “teams were much of a muchness in talent and their scores were independent of each other”…

  18. Or as the English say: “Football is a game of 22 people playing and Germany winning!”

    1. And the added pleasure with this German side is they are actually very enjoyable to watch even for non-Germans.

  19. If the best team won, where then would be the fun?! If you want to know who is best look at the FIFA rankings (for what they are worth) – the top ten are generally fairly accurate (though England should be lower than presently rated), but as you go down them teams are likely to go up & down more if only because of the numbers of others at the middle levels of skill/ability. Fortunately there is plenty of chance involved – see this interesting article from New Scientist in 2002 –

  20. Well thanks. Here I was having a great time watching the matches (and it’s now down to the final three games) and you, Mr. Scientist/Statistician, just can’t resist the temptation to inform me that, really, it’s like watching televised poker (except the players run around more).

  21. “To win the World Cup, a team need win only six or more games out of seven”

    In fact, it is possible to win the World Cup by winning just 5 games (2 group games can be lost) and 4 of these could be 0-0 after extra time.

  22. In fact it would be possible to win the World Cup by winning only 4 matches and drawing all 3 Group games, as Italy did in 1982 when they won (there were only 24 teams in the competition then, but there was a second round of group games before the semi-finals).

  23. The ONLY issue with the World Cup (and football in general) is the dreadful refereeing. Mistakes are usually all over the place, especially when a big team (Brazil, Barcelona) is involved, and usually in favor of said big team.

    Besides that – the formula is better than Baseball and Basketball competitions by a million years. Nothing more boring than those endless series of matches. Who cares if David has a shot vs Goliath? That’s the beauty of the sport. It seems US people really don’t get it…

