How I built my 2026 World Cup forecasting model

The key starting point is to assess the relative strength of the competing nations. We all know that France, Spain and Argentina are good. But how good are they compared to Curaçao and Cape Verde?

Unlike club teams, which regularly play competitive matches against teams of comparable quality, international teams are difficult to rate robustly. This is because they don’t play very often, and when they do, they face opponents of varying quality in matches of varying competitiveness. This is especially true for some of the smaller nations at this year’s tournament.

I haven’t created an international team-ratings system that draws on each team’s underlying data (I have a shot-based system for club teams), so I have used third-party ranking systems to provide inputs to my model.

These are:

transfermarkt.com – estimates of the transfer market value of every player. The assumed value of an international team’s squad isn’t a perfect estimate of how good they are, but it gives an indication of their potential. And it’s independent of the team’s actual results.

FIFA rankings – FIFA’s official international team rankings, based on results, using an Elo-type method.

eloratings.net – another Elo rating system (possibly better than FIFA’s).

I standardised the values from the three sources so that each has a mean of zero, then created a combined rating using a weighting that I judged to give a reasonable estimate of each team’s strength (44% Transfermarkt, 28% FIFA, 28% Elo). The combined ratings are then converted to attack and defence ratings for each team.
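
As a rough illustration of the weighting step, here is a minimal Python sketch. It is not the spreadsheet itself. The three teams and all of their input figures are made up, and dividing by the standard deviation is an assumption (the text above only says each source is given a mean of zero). Only the 44/28/28 weights come from the description above.

```python
from statistics import mean, pstdev

# Hypothetical inputs for three made-up teams; these are NOT real figures.
squad_value = {"Team A": 1200.0, "Team B": 400.0, "Team C": 50.0}
fifa_points = {"Team A": 1850.0, "Team B": 1600.0, "Team C": 1400.0}
elo_points = {"Team A": 2050.0, "Team B": 1800.0, "Team C": 1550.0}

# Weights as stated in the text.
WEIGHTS = {"transfermarkt": 0.44, "fifa": 0.28, "elo": 0.28}


def standardise(values):
    """Centre a source on zero. Scaling by the standard deviation is an
    assumption made here so the three sources are on a comparable scale."""
    numbers = list(values.values())
    mu = mean(numbers)
    sigma = pstdev(numbers)
    return {team: (v - mu) / sigma for team, v in values.items()}


def combined_rating(sources, weights):
    """Weighted sum of the standardised sources for each team."""
    z = {name: standardise(vals) for name, vals in sources.items()}
    teams = list(next(iter(sources.values())))
    return {t: sum(weights[n] * z[n][t] for n in sources) for t in teams}


ratings = combined_rating(
    {"transfermarkt": squad_value, "fifa": fifa_points, "elo": elo_points},
    WEIGHTS,
)
for team, value in ratings.items():
    print(f"{team}: {value:+.2f}")
```

Because every standardised source averages zero and the weights sum to one, the combined ratings also average zero. How they map to attack and defence ratings is not covered by this sketch.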

For example, France are ranked the best team, with attack and defence ratings of 2.80 and 0.56, meaning they are expected to score 2.80 goals and concede 0.56 against an average team in a typical match. An average team would have attack and defence ratings of 1.34 each, so a higher attack rating and a lower defence rating are both better.

The derived ratings are the main input to my model. However, they are only my best estimate of a team’s strength at the tournament and remain uncertain. For example, a team could be affected by injuries, form, tactics, internal arguments, and other factors that the ratings haven’t accounted for. So, in the model, each team’s strength varies randomly (based on a normal distribution) across simulations around the central estimate.
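
A minimal Python sketch of that idea, under stated assumptions: the central rating of 1.5 and the standard deviation of 0.25 are placeholder values chosen purely for illustration, and the sketch perturbs a single combined rating. The text does not say how large the spread is or exactly which quantity is varied.

```python
import random
from statistics import mean

CENTRAL_RATING = 1.5  # hypothetical central estimate for one team
RATING_SD = 0.25      # assumed spread; the real value is not given in the text


def sample_strength(central, sd, rng):
    """Draw one simulation's strength from a normal distribution
    centred on the team's central estimate."""
    return rng.gauss(central, sd)


rng = random.Random(2026)  # fixed seed so the sketch is repeatable
draws = [sample_strength(CENTRAL_RATING, RATING_SD, rng) for _ in range(10_000)]

print(f"mean of draws: {mean(draws):.3f}")
print(f"weakest draw: {min(draws):.3f}, strongest draw: {max(draws):.3f}")
```

Each simulation would use one such draw per team, so a team is sometimes stronger and sometimes weaker than its central rating, while averaging out to that rating over many simulations.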

In each simulation, every match score is generated randomly: each team’s goals are drawn from a Poisson distribution whose mean depends on the two teams’ strength ratings.

For example, for France v Iraq, using the central ratings above, the model will calculate the expected score as:

France goals = (2.80/1.34) * (1.61/1.34) * (2.43/2) = 3.05

Iraq goals = (0.74/1.34) * (0.56/1.34) * (2.43/2) = 0.28

Here, 0.74 and 1.61 are Iraq’s attack and defence ratings, 1.34 is the average goals scored and conceded across all teams, and 2.43 is the average number of goals scored in an opening group match (opening World Cup matches tend to be cagey, with fewer goals than in later rounds).
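
The same arithmetic can be written as a small Python function. This is a sketch of the formula above rather than the spreadsheet’s actual implementation, and the function and constant names are my own.

```python
AVERAGE_RATING = 1.34       # average goals scored and conceded across all teams
OPENING_MATCH_GOALS = 2.43  # average total goals in an opening group match


def expected_goals(attack, opponent_defence,
                   match_goals=OPENING_MATCH_GOALS, average=AVERAGE_RATING):
    """Expected goals for one team: its attack and the opponent's defence,
    each relative to the average, scaled by half the typical match total."""
    return (attack / average) * (opponent_defence / average) * (match_goals / 2)


# France: attack 2.80, defence 0.56. Iraq: attack 0.74, defence 1.61.
france_goals = expected_goals(attack=2.80, opponent_defence=1.61)
iraq_goals = expected_goals(attack=0.74, opponent_defence=0.56)

print(f"France goals = {france_goals:.2f}")
print(f"Iraq goals = {iraq_goals:.2f}")
```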

Each simulation will calculate a specific score for each match. For example, France v Iraq could be 4-1, 3-0, 0-0, or 2-1 in different simulations. Over a large number of simulations, France’s goals scored will average 3.05, and Iraq’s will average 0.28. Using the Poisson distribution, France would win 90.9% of the simulations, Iraq would win 1.7%, and 7.4% of the simulations would result in a draw.
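
These percentages can also be checked without simulating, by summing the Poisson probabilities of every scoreline. The Python sketch below assumes the two teams’ goals are independent of each other, which is consistent with the figures quoted above but is an assumption about the model. It also truncates the sum at 20 goals per team, beyond which the probabilities are negligible for these averages.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=20):
    """Return (home win, draw, away win) probabilities, assuming the two
    teams' goal counts are independent Poisson variables."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


p_france, p_draw, p_iraq = outcome_probabilities(3.05, 0.28)
print(f"France win {p_france:.1%}, draw {p_draw:.1%}, Iraq win {p_iraq:.1%}")
```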

In each simulation, every match is modelled in this way. The knockout matches are determined by the simulated results of the previous matches. The complete tournament is modelled each time, and the probabilities are derived by combining all simulations.
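
To show the overall shape of such a simulation, here is a heavily simplified Python sketch of a four-team knockout bracket. It is not the Excel model. The teams and their ratings are hypothetical, level matches are settled by a coin flip (an assumption, as the text does not say how extra time and penalties are handled), and the 2.43 goals-per-match figure is reused as a placeholder even though the text gives it for opening group matches only.

```python
import random
from collections import Counter
from math import exp

AVERAGE_RATING = 1.34
MATCH_GOALS = 2.43  # placeholder; later rounds probably need a different value

# Hypothetical (attack, defence) ratings for four made-up teams.
RATINGS = {
    "Team A": (2.2, 0.8),
    "Team B": (1.6, 1.1),
    "Team C": (1.3, 1.4),
    "Team D": (0.9, 1.8),
}


def poisson_sample(lam, rng):
    """Knuth's method: count uniform draws until their running product
    falls to e^-lam or below."""
    limit = exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def play_knockout(a, b, ratings, rng):
    """Simulate one knockout match and return the winner."""
    att_a, def_a = ratings[a]
    att_b, def_b = ratings[b]
    scale = MATCH_GOALS / 2 / AVERAGE_RATING ** 2
    goals_a = poisson_sample(att_a * def_b * scale, rng)
    goals_b = poisson_sample(att_b * def_a * scale, rng)
    if goals_a == goals_b:
        return rng.choice([a, b])  # assumed coin flip for level matches
    return a if goals_a > goals_b else b


def simulate_bracket(ratings, n_sims, seed):
    """Play two semi-finals and a final n_sims times; return title shares."""
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        finalist_1 = play_knockout("Team A", "Team D", ratings, rng)
        finalist_2 = play_knockout("Team B", "Team C", ratings, rng)
        titles[play_knockout(finalist_1, finalist_2, ratings, rng)] += 1
    return {team: titles[team] / n_sims for team in ratings}


win_probs = simulate_bracket(RATINGS, n_sims=5000, seed=1)
for team, prob in win_probs.items():
    print(f"{team}: {prob:.1%}")
```

The final depends on who won the simulated semi-finals, which is the same dependency described above for the real knockout rounds. The probabilities are simply each team’s share of titles across all simulations.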

What software have I used to build the model?

I’ve used an Excel spreadsheet. I know, I know – Excel has severe limitations for performing Monte Carlo simulations. It takes about 5 minutes to process the simulations across 5000 columns.

I built my original Excel model for the 2018 World Cup, and it has performed pretty well. It’s relatively easy to adapt to different tournament formats, such as the expansion to 48 teams, and to retrieve probabilities for a range of interesting scenarios, including group places and the most likely finalists.
