The Unlikeliest Final Four
Posted by Neil Paine on March 28, 2011
Note: This post was originally published at College Basketball at Sports-Reference, S-R's College Hoops site, so when you're done reading, go over and check it out!
Just how unlikely is this year's Final Four of Kentucky, UConn, Virginia Commonwealth, and Butler?
Well, going by one measure, the odds of it happening were 0.00003% -- only two entries (out of 5.9 million) correctly picked the four teams in ESPN.com's Bracket Challenge. But I decided to see how this year's improbable group matched up against other inexplicable Final Fours since the tournament expanded to 64 teams in 1985. Here are the Final Fours with the highest average seed number since then:
Year | Team A | Seed | Team B | Seed | Team C | Seed | Team D | Seed | Avg Seed | No. 1 Seeds |
---|---|---|---|---|---|---|---|---|---|---|
2011 | KEN | 4 | CONN | 3 | VCU | 11 | BUTL | 8 | 6.50 | 0 |
2000 | UNC | 8 | FLA | 5 | WISC | 8 | MICS | 1 | 5.50 | 1 |
2006 | GEOM | 11 | FLA | 3 | LSU | 4 | UCLA | 2 | 5.00 | 0 |
1986 | KAN | 1 | DUKE | 1 | LSU | 11 | LOU | 2 | 3.75 | 2 |
1992 | IND | 2 | DUKE | 1 | MICH | 6 | CIN | 4 | 3.25 | 1 |
2010 | MICS | 5 | BUTL | 5 | WVIR | 2 | DUKE | 1 | 3.25 | 1 |
1985 | STJO | 1 | GTWN | 1 | VILL | 8 | MEM | 2 | 3.00 | 2 |
1990 | ARKA | 4 | DUKE | 3 | GEOT | 4 | UNLV | 1 | 3.00 | 1 |
1996 | MIST | 5 | SYRA | 4 | UMAS | 1 | KEN | 1 | 2.75 | 2 |
2005 | LOU | 4 | ILL | 1 | MICS | 5 | UNC | 1 | 2.75 | 2 |
Aside from 2011, two other years stand out at the top of the list: 2000, when two 8-seeds crashed the Final Four, and 2006, when no #1 seeds made it (but George Mason did). In terms of pre-tournament likelihood, how do those years stack up to 2011?
To answer that question, I simulated each tournament from scratch ten thousand times using the seed-based win probability formula I introduced here. In my 10,000 simulations, here's how often each team made the Final Four:
Year | Team | Count | Probability |
---|---|---|---|
2011 | KEN | 1194 | 11.9% |
2011 | CONN | 1631 | 16.3% |
2011 | VCU | 24 | 0.2% |
2011 | BUTL | 174 | 1.7% |
2006 | LSU | 1140 | 11.4% |
2006 | UCLA | 2261 | 22.6% |
2006 | FLA | 1649 | 16.5% |
2006 | GEOM | 50 | 0.5% |
2000 | FLA | 749 | 7.5% |
2000 | UNC | 192 | 1.9% |
2000 | MICS | 3028 | 30.3% |
2000 | WISC | 211 | 2.1% |
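For readers who want to try this kind of simulation themselves, here is a minimal sketch of one 16-team region played out many times. The win probability function below is a hypothetical logistic curve in seed difference with an arbitrary constant `k`; it is an assumption for illustration and not the seed-based formula used for the table above, so its counts will not match those numbers.

```python
import math
import random
from collections import Counter

# Standard first-round ordering of seeds within one 16-team region:
# adjacent entries play each other, and winners meet in the next round.
BRACKET_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def win_prob(seed_a, seed_b, k=0.15):
    """Chance that seed_a beats seed_b.

    ASSUMPTION: a hypothetical logistic model, not the post's formula;
    k = 0.15 is an arbitrary illustrative value.
    """
    return 1.0 / (1.0 + math.exp(-k * (seed_b - seed_a)))


def simulate_region(rng):
    """Play one region down to its champion (the Final Four entrant)."""
    alive = list(BRACKET_ORDER)
    while len(alive) > 1:
        next_round = []
        # Pair up neighbours in bracket order and keep each winner.
        for a, b in zip(alive[0::2], alive[1::2]):
            next_round.append(a if rng.random() < win_prob(a, b) else b)
        alive = next_round
    return alive[0]


def final_four_counts(n_sims=10_000, seed=0):
    """Count how often each seed wins the region over n_sims simulations."""
    rng = random.Random(seed)
    return Counter(simulate_region(rng) for _ in range(n_sims))


counts = final_four_counts()
for s in sorted(counts):
    print(f"seed {s:2d}: {counts[s]:5d} ({counts[s] / 10_000:.1%})")
```

Swapping a different formula into `win_prob` is the only change needed to test another seed-based model.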
Multiplying the four probabilities together (each team comes from a different region, so the simulated outcomes are independent), we find that the 2006 Final Four had a 0.00213% chance of happening based on seeds, the 2000 Final Four had a 0.00092% chance, and the 2011 Final Four had a staggering 0.00008% chance (about 1 in 1,229,650). I think it's safe to say that this year's Final Four is easily the most improbable since the field expanded to 64 teams.
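The multiplication can be reproduced from the simulation counts in the table. This is a small sketch that uses only those counts; the function and variable names are my own, and small differences in the last digit can come from rounding.

```python
from math import prod

N_SIMS = 10_000

# Final Four appearance counts per team, copied from the table above.
counts = {
    2011: {"KEN": 1194, "CONN": 1631, "VCU": 24, "BUTL": 174},
    2006: {"LSU": 1140, "UCLA": 2261, "FLA": 1649, "GEOM": 50},
    2000: {"FLA": 749, "UNC": 192, "MICS": 3028, "WISC": 211},
}


def joint_probability(team_counts, n_sims=N_SIMS):
    """Multiply each team's simulated Final Four frequency together.

    Treats the four regions as independent, which holds here because
    each Final Four team comes out of a different region.
    """
    return prod(c / n_sims for c in team_counts.values())


for year, team_counts in counts.items():
    p = joint_probability(team_counts)
    print(f"{year}: {p * 100:.5f}% (about 1 in {1 / p:,.0f})")
```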
March 28th, 2011 at 3:01 pm
Every possible final four is unlikely.
If every team had an equal chance of winning each game, then every final four combination would have a ~.000015 probability of happening.
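As a quick check on that figure, assuming four regions of 16 teams with one Final Four team drawn uniformly from each region:

```python
teams_per_region = 16
regions = 4

# One team advances from each region, so the number of distinct
# Final Four combinations is 16 * 16 * 16 * 16.
n_combinations = teams_per_region ** regions
p_uniform = 1 / n_combinations

print(f"{n_combinations} combinations, probability {p_uniform:.6f} each")
```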
March 28th, 2011 at 3:37 pm
It's degrees of unlikeliness, which is made clear in the title.
If the Final Four were decided randomly, it would be random. But it's not, and seeding has an established relationship with tournament performance. It's a pretty good start for trying to ballpark the magnitude of this upset.
March 28th, 2011 at 3:47 pm
This reminds me of a radio promo they've been running for Colin Cowherd's show. In an "I hate to burst your bubble" kind of tone, Colin informs us that the NCAA tournament being full of upsets is a myth, it's dominated by the big programs, etc.
Is Colin magic? This guy is the sports radio version of that William H. Macy movie The Cooler. Every point he makes is immediately obliterated by reality: John Wall isn't going to amount to anything, Aaron Rodgers is a dud, and so on.
March 28th, 2011 at 3:47 pm
@Mike,
Your assumption is that the probability of every final four combination occurring is uniformly distributed, i.e. choosing the final four amounts to casting a many, many-sided die.
But we all know that the ultimate combination is not purely random; it is skewed by seeding. In our perception, higher-seeded teams are better and therefore have a better chance of making it to the final four. What that means is that the distribution over final four combinations is weighted toward the top seeds rather than uniform. As a result, the TRUE probability of any single combination depends on the tournament seeds involved.
So in terms of absolute probability? Well, yeah, by your definition based on a uniform distribution, every combination is "unlikely" to happen. But in relative terms, this year's final four is "relatively more unlikely" than the rest. Is it the absolute unlikeliest final four possible? Of course not. Logically, the unlikeliest final four would be all bottom seeds. But as far as modern bracket history goes, the 2011 final four is the unlikeliest combination that we have ever had based on seeds.
March 28th, 2011 at 5:29 pm
What would this look like using LOG5?
March 28th, 2011 at 7:27 pm
Does this include VCU's "play in" game?
I'm also reminded of a thought experiment I read about (the name eludes me, but I'm sure someone here knows it). The premise was that if a lottery exists such that any individual ticket had such infinitesimal odds of winning as to consider them zero, could it still be assumed that SOMEONE was guaranteed to win the lottery? Basically, if no one had a practical shot to win, then is it possible that no one wins? To apply it here, if we assumed that no Final Four was likely, could we conclude that maybe the Final Four just won't happen??? I SURE HOPE NOT!
March 29th, 2011 at 12:00 am
1980 was also an unpredictable final 4. purdue and iowa finished third and fourth in the big 10 that year and made it to the national semis while big 10 champ indiana and runner-up ohio state both lost in the round of 16 (IU to rival purdue).
UCLA (fourth in the pac 10 that year and only 17-9 pre-tournament) made it to the finals despite being the 47th invitee to the 48-team field. only louisville, the eventual champ at 33-3, was an expected final 4 team.
UCLA was an 8 seed, purdue a 6, iowa 5 and louisville a 2.