3rd August 2011
High school recruiting rankings, particularly the historical variety, have long fascinated me. There's something compelling about looking back at them with the benefit of hindsight and comparing a player's actual career trajectory to the one projected for him when he was just 18 years old.
With that idea in mind, I put together this post to see how often players of a given ranking end up with a given type of NBA career. I classified every player into one of six categories:
- Superstar - Either made 1st-team All-NBA or was Top-5 in MVP voting at least once in his career
- All-Star - Made an All-Star roster at least once in his career
- Starter - Finished top-5 on a team in games started at least once in his career
- Regular - Not a starter, but played at least half of a team's games in a season at least once in his career
- Scrub - Not a regular, but played at least 1 NBA game in his career
- Did Not Play - Never played an NBA game
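Because the categories above form a strict hierarchy, classification amounts to checking the most exclusive condition first and falling through. Here's a minimal sketch in Python -- the career-summary field names are hypothetical, purely for illustration, not from any actual data source:

```python
def classify(player):
    """Assign a career category using the hierarchy above.
    `player` is a dict of hypothetical career-summary fields."""
    if player["all_nba_1st_teams"] > 0 or player["top5_mvp_finishes"] > 0:
        return "Superstar"
    if player["all_star_selections"] > 0:
        return "All-Star"
    if player["seasons_top5_in_team_gs"] > 0:
        return "Starter"
    if player["seasons_played_half_team_games"] > 0:
        return "Regular"
    if player["career_games"] > 0:
        return "Scrub"
    return "Did Not Play"
```

Since each check only fires when the ones above it failed, every player lands in exactly one bucket.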
I then looked at the recruiting rankings on this site, gathering the data from 1998-2003 ('03 being the final HS class for which you can reasonably say every player has been given a full chance to reach his NBA potential -- if a guy hasn't made it by now, it's probably never going to happen). Based on their national prospect rankings coming out of high school, how many players ended up in each category in the NBA?
| Rank   | Did Not Play | Scrub | Regular | Starter | All-Star | Superstar |
|--------|--------------|-------|---------|---------|----------|-----------|
| 1-5    | 16%          | 10%   | 16%     | 35%     | 16%      | 6%        |
| 6-10   | 38%          | 10%   | 10%     | 31%     | 7%       | 3%        |
| 11-25  | 46%          | 16%   | 16%     | 19%     | 2%       | 0%        |
| 26-50  | 70%          | 9%    | 7%      | 12%     | 2%       | 0%        |
| 51-100 | 82%          | 5%    | 7%      | 5%      | 1%       | 0%        |
And here's the same data, grouped cumulatively by rank:

| Rank    | Did Not Play | Scrub | Regular | Starter | All-Star | Superstar |
|---------|--------------|-------|---------|---------|----------|-----------|
| Top 5   | 16%          | 10%   | 16%     | 35%     | 16%      | 6%        |
| Top 10  | 27%          | 10%   | 13%     | 33%     | 12%      | 5%        |
| Top 25  | 38%          | 14%   | 15%     | 25%     | 6%       | 2%        |
| Top 50  | 54%          | 12%   | 11%     | 18%     | 4%       | 1%        |
| Top 100 | 68%          | 9%    | 9%      | 12%     | 2%       | 1%        |
This is a sobering reminder of how elite the NBA's talent level really is.
Even if you're one of the 100 best high school players in all of America, there's almost a 70% chance you never play in the NBA, and almost an 80% chance that, at best, you'll be a journeyman scrub who doesn't play regularly. And while top-5 talents have a decent probability of being an NBA starter or better (58%), after that the drop-off is steep: 41% for players ranked 6-10, 21% for #11-25, 14% for #26-50, and only 6% for players ranked outside the top 50 (including just a 1% chance of being an All-Star).
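Those headline odds fall straight out of the cumulative table (keep in mind the table's percentages are rounded to the nearest point, so sums can differ slightly from the unrounded figures quoted above). For a top-100 recruit:

```python
# Outcome distribution for top-100 recruits (1998-2003), taken from the
# cumulative table above (rounded to the nearest percent).
top100 = {"Did Not Play": 0.68, "Scrub": 0.09, "Regular": 0.09,
          "Starter": 0.12, "All-Star": 0.02, "Superstar": 0.01}

p_never_play = top100["Did Not Play"]                       # ~68%
p_scrub_at_best = p_never_play + top100["Scrub"]            # ~77%
p_starter_or_better = sum(top100[k]
                          for k in ("Starter", "All-Star", "Superstar"))
```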
Not to harsh the mellow of any budding BMOCs out there, but the typical top prospect's NBA career is, in the words of Thomas Hobbes, nasty, brutish, and short.
For the full list of recruits used in the study (and the categories they fell into), click here.
Posted in Analysis, NCAA, Prospects | 30 Comments »
27th July 2011
If you've been following the blog recently, I've developed a way to convert a player's Basketball on Paper stats to a Statistical Plus/Minus estimate. I'll spare you the gory details (which you can read about at the bottom of this page), and simply say that this version of SPM is less biased toward any one position and captures defense better than the original edition, making it the superior SPM in my opinion (although, as always, I'm certainly open to critiques).
Read the rest of this entry »
Posted in Analysis, History, Insane ideas, Statgeekery, Statistical +/- | 83 Comments »
25th July 2011
Which players excel against the best defenses, and which ones get their numbers by feasting on the weakest Ds?
To answer those questions, here's the latest installment of a series I started in 2009 and continued in 2010... The concept is simple: I rate each team defensively using the BBR Rankings formula (including regular-season and playoff games), then track how well each player performed offensively against opponents of varying defensive quality.
Read the rest of this entry »
Posted in Analysis, BBR Rankings, SRS, Statgeekery, Statistical +/- | 44 Comments »
9th July 2011
Here's a quick-n-dirty study I ran this morning... The idea is this:
Some stats seem to be more correlated with a player's role than with his actual skill. Take a player out of the role and plug in another, similar player, and the new player produces just like the old one (while the old one can't "take the stats with him" to his new destination).
How can we quantify this, though? Well, let's identify players whose circumstances changed. I took every team since 1978 and assigned its players to 10 "roles" -- primary PG, backup PG, primary SG, etc. -- based on my detailed position file and where the players ranked on the team in terms of minutes played. I then isolated every player in that sample who:
- Played at least 500 minutes in back-to-back seasons
- Was between age 24 and 34 in back-to-back seasons (to filter out potential aging effects)
- Moved to a new "role"
- Was replacing a player who played >= 500 MP in the role and was between age 24 and 34 the previous season
This leaves us with 1,866 player-seasons to look at. For each of those, I need to know which predicts the player's performance in Year Y better -- his own stats from Year Y-1, or the Year Y-1 stats of the player whose role he took?
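The filter above boils down to a single predicate over a player's back-to-back seasons and his predecessor's final season in the role. A sketch in Python, using hypothetical record fields (`_y1` = Year Y-1, `_y` = Year Y):

```python
MIN_MP = 500
MIN_AGE, MAX_AGE = 24, 34

def qualifies(player, predecessor):
    """player: hypothetical dict covering Years Y-1 and Y.
    predecessor: the man whose role the player inherited, in Year Y-1."""
    return (
        player["mp_y1"] >= MIN_MP and player["mp_y"] >= MIN_MP   # 500+ MP both years
        and MIN_AGE <= player["age_y1"] <= MAX_AGE               # aged 24-34...
        and MIN_AGE <= player["age_y"] <= MAX_AGE                # ...both years
        and player["role_y"] != player["role_y1"]                # moved to a new role
        and predecessor["mp_y1"] >= MIN_MP                       # predecessor had 500+ MP
        and MIN_AGE <= predecessor["age_y1"] <= MAX_AGE          # and was 24-34 in Y-1
    )
```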
Read the rest of this entry »
Posted in Analysis, Insane ideas, Statgeekery, Win Shares | 9 Comments »
12th June 2011
As a follow-up to Thursday's post about the best Finals performances according to Statistical Plus/Minus, here's a playoff ranking since 2003 with a few tweaks:
- I finally re-ran the Offensive SPM formula without steals and blocks. Steals in particular were causing certain players to be extremely overvalued offensively, and there's little reason to include those defensive stats in an offensive regression. (DSPM is the same as before -- and yes, it still includes several offensive stats, but DSPM wouldn't explain more than 25% of defense without them, while OSPM's explanatory power was barely affected by dropping steals & blocks.)
- At the request of readers, players are now ranked by per-game "Impact" rather than per-minute SPM; Impact is SPM times the % of team minutes played.
- All of a player's games are weighted by Championship Leverage, which takes into account how much the game will potentially swing the odds of a team winning the NBA title. Leverage is relative to the average playoff game in a given season (which always has a leverage index of 1.00). For instance, Game 1 of the Magic-Hawks 1st-round series had a leverage of 0.44, while Game 5 of the Finals had a leverage of 5.28. This means that, in terms of influence on championship probability, Thursday's game was 12 times as important as Game 1 of a 1st-round series, and the rankings will reflect this.
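That "12 times as important" figure is just the ratio of the two leverage indices quoted in the last bullet; as a quick sanity check:

```python
# Championship leverage indices from the example above
# (the average playoff game in a season is defined as 1.00).
avg_playoff_game = 1.00
game1_first_round = 0.44   # Game 1, Magic-Hawks 1st-round series
game5_finals = 5.28        # Game 5 of the Finals

importance_ratio = game5_finals / game1_first_round  # 12x
```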
Finally, why 2003? Because that was the year the NBA adopted best-of-7 first-round series, allowing me to use the series win probabilities found here.
Anyway, here were the top playoff performers since 2003 according to per-game SPM impact, weighted by the importance of the game (minimum 10 games):
Read the rest of this entry »
Posted in Analysis, Playoffs, Statgeekery, Statistical +/- | 11 Comments »
9th June 2011
With the NBA Finals tied 2-2, it seems like a good time to look at the best Finals performances in our database (which extends back to 1991 for playoff games). The metric of choice is Statistical Plus/Minus, an estimate of the player's contribution to the team's point differential per 100 possessions, using his boxscore stats as inputs. And, as an added twist, I weighted each game of the Finals according to its series leverage (the expected change in series win probability for the game in question, relative to the series' overall average per-game change), meaning that performance counts more in the games that carry the most pressure. Here is every player in the dataset who played a minimum of 24 minutes per team game:
Read the rest of this entry »
Posted in Analysis, Data Dump, History, Playoffs, Statgeekery, Statistical +/- | 26 Comments »
31st May 2011
With Dallas-Miami Part II tipping off tonight (not that it's really a rematch), I wanted to see whether the Mavs' loss in 2006 was the worst Finals collapse of the BBR era. We have linescores for every playoff game since 1992, which means I can calculate the home team's probability of winning at various checkpoints within a game:
| Stage         | p(Home W)                                        |
|---------------|--------------------------------------------------|
| Pregame       | 60.4%                                            |
| After 1st Qtr | 1/(1 + exp(-0.3599755 - 0.1122741 × Home Margin)) |
| After 2nd Qtr | 1/(1 + exp(-0.2895922 - 0.1429087 × Home Margin)) |
| After 3rd Qtr | 1/(1 + exp(-0.2041572 - 0.2117494 × Home Margin)) |
| Before any OT | 52.4%                                            |
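The in-game checkpoints are standard logistic curves in the home team's current margin. A sketch in Python using the coefficients from the table above (which I'm assuming were fit on the 1992-and-later playoff linescore sample described here):

```python
import math

# (intercept, slope) pairs from the table above; p(home win) is a
# logistic function of the home team's current scoring margin.
CHECKPOINTS = {
    "after_q1": (0.3599755, 0.1122741),
    "after_q2": (0.2895922, 0.1429087),
    "after_q3": (0.2041572, 0.2117494),
}

def home_win_prob(stage, home_margin):
    """In-game home win probability at the given checkpoint."""
    a, b = CHECKPOINTS[stage]
    return 1.0 / (1.0 + math.exp(-(a + b * home_margin)))
```

Note that at a margin of zero every checkpoint still favors the home team (about 59% after one quarter), consistent with the 60.4% pregame baseline, and that the slope on margin grows as the game progresses -- a late lead moves the needle much more than an early one.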
Combining those probabilities with the series win probabilities I found here, one can determine each team's probability of winning the series at a given checkpoint. This allows us to rank Finals collapses, pinpointing the moments within games where the eventual loser's series win probability was the highest:
Read the rest of this entry »
Posted in Analysis, History, Playoffs, Statgeekery | 13 Comments »
30th May 2011
Although we like to think "the best team always wins a best-of-7 series", variance plays a much bigger role than we'd care to admit. I found here that the best team in a given season wins the NBA title only about 48% of the time -- and even that is an incredibly high rate compared to other sports like baseball (29%), pro football (24%), and college basketball (34%).
Truth be told, playoffs are mainly designed as entertainment, with "finding the best team" as a secondary goal. And there's nothing wrong with that. If we forced teams to play enough to have statistical certainty, it would require a completely impractical number of games. For the fan's sake, it is necessary to achieve a balance between watchability and the feeling that what we watched wasn't a total fluke. And really, the NBA probably does this better than any other sport.
But we still have to acknowledge that the best team does not always win, nor do the NBA Finals necessarily contain the best teams in each conference. Can we put a number on how probable it is that a given Finals matchup did in fact contain the best team from each conference? Using a very simplified version of Prof. Jesse Frey's method for determining the probability that a given team was the true best team in some particular year (with assists from these posts), I calculated that probability for every Finals matchup since 1984, when the playoffs expanded to 16 teams.
Here are those Finals, ranked from the greatest certainty that the two teams were their respective conferences' best to the least certainty:
Read the rest of this entry »
Posted in Analysis, History, Playoffs, Statgeekery | 10 Comments »
23rd May 2011
Last week, I ran a post (prompted by this post at the Wages of Wins) wherein I tried to determine the offensive impact when a team loses its leading scorer. I found that, since 1986 at least, a team loses about 2 points of offensive rating relative to the league average when its top scorer by PPG doesn't play.
I got a lot of great feedback from that initial post, so I decided to try my hand at a sequel after making a number of improvements to the study:
- One complaint was that I lumped efficient scorers in with inefficient ones in the original study. No one really debates whether losing LeBron James will hurt an offense; one of the core questions is whether losing a Carmelo Anthony or a Rudy Gay has a negative impact as well. To that end, I'm now isolating only teams with inefficient leading scorers -- a team's PPG leader (minimum 1/2 of team games played) whose Dean Oliver Offensive Rating or True Shooting % was at or below the league average that season.
- Another complaint was that I looked at offense alone, rather than the total impact of the player's loss. So now I'm looking at the change in team efficiency differential (offensive efficiency minus defensive efficiency) when a player is in and out of the lineup.
- While I accounted for strength of opponent in the last study, I didn't account for home-court advantage. Now I have added an HCA term to what we would predict an average team to put up vs. a given opponent (+4 pts/100 of efficiency differential to the home team), in addition to an SOS term (the opponent's efficiency differential in all of its other games).
What follows is a massive table that shows the results of this new study. The outcome (the bottom-right cell) is the average change in efficiency differential when an inefficient leading scorer plays vs. when he does not play, weighted by possessions without the leading scorer. If it is positive, it is evidence that even inefficient scoring is an attribute that teams find difficult to replace in a salary-capped economic system; if it is negative, it is evidence that scoring is overrated if it's not done efficiently, and that inefficient #1 options can be replaced with relative ease.
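As a rough sketch of the machinery just described -- the SOS/HCA expectation baseline and the possession-weighted bottom line -- here's some illustrative Python. All field names are hypothetical, and treating the road team's expectation as the mirror-image -4 is my assumption, not something stated above:

```python
HCA = 4.0  # home-court advantage, pts/100 poss of efficiency differential

def expected_diff(opp_diff_other_games, is_home):
    """What an average team would be expected to post against this
    opponent: the SOS term (opponent's differential in its other
    games, flipped in sign) plus the home-court term."""
    return -opp_diff_other_games + (HCA if is_home else -HCA)

def weighted_avg_change(rows):
    """The bottom-right cell: average change in efficiency differential
    (relative to expectation) with the leading scorer in vs. out,
    weighted by possessions played without him."""
    total_poss = sum(r["poss_without"] for r in rows)
    return sum((r["adj_diff_with"] - r["adj_diff_without"]) * r["poss_without"]
               for r in rows) / total_poss
```

A positive result from `weighted_avg_change` would mean teams got worse, on net, when their inefficient leading scorers sat.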
To the data dump (mouse over column headers for descriptions):
Read the rest of this entry »
Posted in Analysis, BBR Mailbag, History, Statgeekery | 33 Comments »
19th May 2011
Bill Simmons and BS Report HoF guest Chuck Klosterman are discussing Larry Bird vs. Dirk Nowitzki in a podcast. Simmons says that the advanced stats place Dirk in the same category as Bird, perhaps even giving Dirk the edge, and he's not sure how he feels about this.
I wasn't sure how I felt, either, so I looked up the numbers. Here is a monster table with their advanced stats -- each has played exactly 13 years:
Read the rest of this entry »
Posted in Analysis, History, Statgeekery, Statistical +/-, Totally Useless, Win Shares | 224 Comments »