Editor’s note/warning: A version of this analysis will be published on Jan. 8 by the Harvard Sports Analysis Collective. Andrew has been kind enough to adapt his work for World Soccer Talk ahead of this weekend’s FA Cup matchups. Any precision lost in the editing of this post is unintentional and may not represent Andrew’s conclusions.

This weekend marks the third round of England’s FA Cup – the culmination of a process that begins each August, when every club, professional or amateur, hopes to become one of the 20 teams from the third division or below to be alive come the first weekend in January. Theoretically, and with a little bit of luck, every club has a chance at a glory tie with Manchester United at Old Trafford or Liverpool at Anfield after England’s top two tiers enter the draw.

Thanks to that format, the FA Cup has recorded its fair share of “giant killings” over the years: fourth division Wrexham defeated defending league champions Arsenal in 1992; amateur side Hereford United beat top flight Newcastle United in 1972, courtesy of Ronnie Radford’s memorable goal; conference side Sutton United defeated 1987 winners Coventry City in 1989; and as recently as 2013, fifth division Luton Town defeated a Premier League club, Norwich City.

Based on that history, I decided to try to determine the probabilities of teams advancing in the FA Cup based solely on their divisional status. What are the odds a team gets to the quarterfinals given that it is from, say, the second division? Or that a fourth-tier team makes it all the way to Wembley? What’s the probability that a non-league side can actually lift the Cup? What are the odds of having a true Cinderella?


Unfortunately, because of promotion and relegation, the division of some teams changes from year to year, and I was only able to easily find divisional status covering the last four years. That’s what I used as my sample.

For those unaware of the format, there is no seeding in the FA Cup. In every round, all the participating teams are put into a hat, and matchups are determined by a blind draw. The first team out of the hat plays at home, the next is their opponent, and so on. Sometimes the best teams play each other super early, like when Manchester United and Manchester City met in the third round in 2012, or Tottenham Hotspur met Arsenal at the same stage in 2014. But this can also mean lower-level teams can get the luck of the draw. Championship side Millwall reached the final in 2004 without having to play a single Premier League team (they eventually lost the final 3-0 to Manchester United).

To start, I went through every FA Cup tie over the last four years to determine the probability of specific upsets given specific divisions. For example, in 67 meetings between Premier League and Championship teams, Premier League teams won 52. This is reflected in the table below.

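| | Premier League | Championship | League One | League Two | Non League |
|---|---|---|---|---|---|
| Premier League | – | 52/67 | 19/26 | 10/11 | 7/8 |
| Championship | 15/67 | – | 15/28 | 10/14 | 8/10 |
| League One | 7/26 | 13/28 | – | 3/3 | 1/2 |
| League Two | 1/11 | 4/14 | 0/3 | – | N/A |
| Non League | 1/8 | 2/10 | 1/2 | N/A | – |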

Notes: Any team from below the fourth division was classified as non-league. The lowest team to reach the third round in the last four years was seventh-tier Blyth Spartans, who lost 3-2 to Championship side Birmingham City. Each row is a team’s division, and each fraction is that division’s number of wins over its number of matchups against teams from the division in the column. There were no matchups between League Two and non-league teams in the third round or later.

Over the last four years, the third round has featured an average of 9.75 teams from League One, 6.25 teams from League Two and 4 non-league teams, in addition to the 44 teams from the top two tiers of English football. Based on this, I was able to calculate the expected number of teams from each tier to appear in each round of the cup.

The ratios above create a series of probabilities: the likelihood of a team from one division defeating a team from another. But because the sample size for matchups between teams from League One, League Two, and non-league football was so small, I gave them each a 50% chance of beating each other, regardless of divisional status. This is a reasonable assumption because income disparities level off further down the leagues, and teams are of a much closer caliber. (Also, the probability of beating a team from your own division is 50%. One team has to win, and one team has to lose.)
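For readers who want to follow along, here is a minimal sketch of how the win ratios and the 50% assumption could be turned into a lookup of win probabilities. The names `RECORDS`, `LOWER`, `win_probability` and `WIN_PROB` are hypothetical and chosen for illustration; this is not the spreadsheet or code behind the original analysis.

```python
from fractions import Fraction

DIVISIONS = ["Premier League", "Championship", "League One",
             "League Two", "Non League"]

# (wins, meetings) for the first division against the second,
# copied from the head-to-head table in the article.
RECORDS = {
    ("Premier League", "Championship"): (52, 67),
    ("Premier League", "League One"): (19, 26),
    ("Premier League", "League Two"): (10, 11),
    ("Premier League", "Non League"): (7, 8),
    ("Championship", "League One"): (15, 28),
    ("Championship", "League Two"): (10, 14),
    ("Championship", "Non League"): (8, 10),
}

# Divisions whose matchups against each other are set to 50/50
# (the article's small-sample assumption).
LOWER = {"League One", "League Two", "Non League"}


def win_probability(team, opponent):
    """Chance that a side from `team`'s division beats one from `opponent`'s."""
    if team == opponent or (team in LOWER and opponent in LOWER):
        return Fraction(1, 2)
    if (team, opponent) in RECORDS:
        wins, meetings = RECORDS[(team, opponent)]
        return Fraction(wins, meetings)
    # Otherwise the record is stored from the opponent's point of view.
    wins, meetings = RECORDS[(opponent, team)]
    return 1 - Fraction(wins, meetings)


WIN_PROB = {a: {b: win_probability(a, b) for b in DIVISIONS}
            for a in DIVISIONS}
```

Looking up `WIN_PROB["Championship"]["Premier League"]` then gives the 15/67 figure from the table, while any pairing among the three lower categories returns one half.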


From those probabilities, we can determine the expected number of teams from each level in each round of the FA Cup. For a team in a given division, I multiplied the probability of drawing an opponent from each division by the probability of beating a team from that division, then summed across divisions to get that team’s chance of advancing. I then used the formula for expected value to find the expected number of teams from each division in the next round.

I had to repeat this process at each new round. As the competition progresses, the share of Premier League teams increases, making it more likely that a team from a lower level will be drawn against a first-tier side.
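The round-by-round process can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact calculation used for the article: the function names are hypothetical, a team is assumed unable to draw itself (so its own division counts one fewer possible opponent), and the fractional expected counts are fed straight back into the next round.

```python
DIVISIONS = ["Premier League", "Championship", "League One",
             "League Two", "Non League"]

# Chance that the row division beats the column division, in DIVISIONS order.
# Top-two-tier entries come from the head-to-head table; all matchups among
# the lower three categories, and same-division ties, are set to 0.5.
WIN_PROB = [
    [0.5,   52/67, 19/26, 10/11, 7/8],
    [15/67, 0.5,   15/28, 10/14, 8/10],
    [7/26,  13/28, 0.5,   0.5,   0.5],
    [1/11,  4/14,  0.5,   0.5,   0.5],
    [1/8,   2/10,  0.5,   0.5,   0.5],
]

# Average number of third-round teams per division over the four-year sample.
ROUND_3 = [20, 24, 9.75, 6.25, 4]


def advance_probabilities(counts):
    """Per-division chance of winning a tie, given the teams left in the draw."""
    total = sum(counts)
    probs = []
    for i, row in enumerate(WIN_PROB):
        p = 0.0
        for j, n in enumerate(counts):
            # Assumption: a team cannot be drawn against itself.
            opponents = n - 1 if i == j else n
            p += opponents / (total - 1) * row[j]
        probs.append(p)
    return probs


def next_round(counts):
    """Expected number of teams per division surviving to the next round."""
    return [n * p for n, p in zip(counts, advance_probabilities(counts))]


def simulate(counts, rounds):
    """Expected counts for the starting round and each of `rounds` that follow."""
    history = [list(counts)]
    for _ in range(rounds):
        history.append(next_round(history[-1]))
    return history


def reach_probabilities(history):
    """Chance that a third-round team from each division reaches each later round."""
    start = history[0]
    return [[n / s for n, s in zip(counts, start)] for counts in history[1:]]


if __name__ == "__main__":
    history = simulate(ROUND_3, 5)  # Round 3 through the final
    for name, chances in zip(DIVISIONS, zip(*reach_probabilities(history))):
        print(name, [f"{c:.2%}" for c in chances])
```

Under these assumptions the first step gives each division a chance of reaching Round 4 that is close to the percentages reported in this article. Later rounds may drift from the published figures, since the details of the original calculation are not spelled out.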

Here is the number of teams from each level of English football that we can expect in each round of the FA Cup:

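| | Round 3 | Round 4 | Round 5 | Round 6 | Semifinal | Final |
|---|---|---|---|---|---|---|
| Premier League | 20 | 14.4 | 9.3 | 5.4 | 3 | 1.8 |
| Championship | 24 | 11.1 | 4.4 | 1.5 | 0.7 | 0.2 |
| League One | 9.75 | 4.1 | 1.5 | 0.8 | 0.3 | 0 |
| League Two | 6.25 | 1.7 | 0.5 | 0.3 | 0 | 0 |
| Non League | 4 | 1.1 | 0.2 | 0 | 0 | 0 |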

Using these values, I was able to determine the probability of a team from a specific division reaching a given round of the competition. In percentages:

| | Round 4 | Round 5 | Round 6 | Semifinal | Final |
|---|---|---|---|---|---|
| Premier League | 70.52% | 47.02% | 27.65% | 16.59% | 9.75% |
| Championship | 46.29% | 18.68% | 6.65% | 2.00% | 0.45% |
| League One | 41.31% | 15.83% | 5.43% | 1.71% | 0.39% |
| League Two | 28.85% | 6.90% | 1.24% | 0.19% | 0.02% |
| Non League | 26.67% | 5.98% | 1.05% | 0.18% | 0.02% |

Some oddities do arise from the small sample size. For example, Luton Town’s victory over Norwich City in 2013 creates the illusion that the odds of a non-league side defeating a Premier League side are one in eight, when in reality they are much, much lower (the previous non-league side to beat a top flight team was Sutton United in 1989). It should also be noted that the probabilities for teams in the lower three categories are conditional on their reaching the third round of the competition.

Based on these numbers, with roughly four non-league teams in the third round each year and a 0.02% chance apiece, we should see a non-league team reach the final of the FA Cup about once every 1,250 years. The last non-league team to win the FA Cup? Tottenham Hotspur, then playing in the Southern League, in 1901.

This year, there’s only one non-league team in the third round: Eastleigh, who play at home to Championship side Bolton. Here’s hoping they pull off the remarkable and become a true Cinderella!