Quantifying the NBA’s Reggie Cleveland All-Stars
Posted by Neil Paine on February 21, 2011
Everyone who has spent time studying historical player stats knows this phenomenon: You've seen a player's name for years, but you only know him as a series of numbers on a page. He retired before you were born, and you haven't even seen so much as a trading card with his picture on it... yet, instinctively wanting to humanize him, you imagine what he was like. You know his height, weight, all of the vital stats, everything except his ethnicity. So you make an educated guess based on his name. You now have an idealized picture in your mind's eye of the player in his prime, a man to go alongside the numbers.
The only problem comes when you do see him for the first time -- and he looks quite a bit different from the imaginary version you created years ago.
This is the concept behind Bill Simmons' Reggie Cleveland All-Stars, a "list of sports figures whose names would seem to indicate that they are of a different race or ethnicity than they actually are". Its namesake? Cleveland, a 1970s-era pitcher who Simmons just assumed was black until learning otherwise when Cleveland joined the Red Sox.
For me, that player was Dick "Richie" Allen, most notably of the Phillies and White Sox. He retired 8½ years before I was born, so all I knew of Allen was his impressive stat sheet, including 1972 & 1964 campaigns that were among the best ever in my primitive homebrewed version of Win Shares. Shamefully, I always assumed he was white because of his name, and was very surprised when I finally saw his picture and learned of my mistake. The error is especially ironic in retrospect because Allen was outspoken in his rebellion against MLB's conservative power structure, and also suffered a great deal of racism throughout his career. In other words, I couldn't have been more wrong about Dick Allen.
Of course, this phenomenon doesn't always have such trivial consequences. Studies have found that similar assumptions frequently lead to discrimination in job hiring, causing some to even disguise their ethnicity in resumes to make themselves more attractive to employers. So it's a serious topic -- but hopefully one that we can also learn more about by examining our own ways of thinking, as well as analyzing some data. To that end, I took John Grasso's biographical database (which lists players as "white", "black", or "other" -- i.e., Asian) and calculated the expected probability of being a certain ethnicity based on the demographic trends of every player in NBA history. Based on Simmons' logic, the players least likely to be their actual race should be considered "Reggie Cleveland All-Stars".
Take for example the aptly-nicknamed Jason "White Chocolate" Williams (most recently of the Memphis Grizzlies). In NBA history, 14 players went by the first name "Jason". Four were considered "white" in Grasso's data and 10 were considered "black", so the probability of an NBA player named Jason being white is 28.6%. Meanwhile, 57 players have had the surname "Williams" and, shockingly, Jason Williams is the only one to be considered "white" in Grasso's database, meaning the probability of an NBA player named Williams being white is 1.8%. Using Bayes' Theorem, we find that the probability of Jason Williams being white is 2.3%, making him the most unlikely player in NBA history to be his actual ethnicity.
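One way to read the Bayes' Theorem step above is as a naive-Bayes combination of the first-name and last-name evidence. The sketch below is an illustration under that assumption, not a reconstruction of the exact calculation behind the table: the function name `combine_name_evidence` is hypothetical, and the `prior` argument (the overall share of white players) is not given in the post, so the 50% used here is only a placeholder. Because of that, the output is not expected to match the 2.3% figure.

```python
def combine_name_evidence(p_first: float, p_last: float, prior: float) -> float:
    """Posterior P(race | first name, last name), assuming the two names are
    conditionally independent given race (a naive-Bayes assumption).

    p_first -- P(race | first name), e.g. 4/14 for "Jason" being white
    p_last  -- P(race | last name),  e.g. 1/57 for "Williams" being white
    prior   -- overall P(race); NOT stated in the post, so this is an assumption
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Each conditional already contains the prior once, so divide one copy out.
    support = p_first * p_last / prior
    against = (1.0 - p_first) * (1.0 - p_last) / (1.0 - prior)
    if support + against == 0.0:
        raise ValueError("the two pieces of evidence contradict each other")
    return support / (support + against)


# Jason Williams, using the counts quoted in the post.
p_first = 4 / 14   # 4 of 14 players named Jason listed as white
p_last = 1 / 57    # 1 of 57 players named Williams listed as white

# Placeholder prior of 50% (an assumption, not a figure from the post).
posterior = combine_name_evidence(p_first, p_last, prior=0.5)
print(f"{posterior:.1%}")  # 0.7%
```

With an equal prior the two names together point to well under 1%; a different base rate shifts the result, which is why the prior is left as an explicit parameter.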
Repeating this for all players, we get this list of the quintessential NBA "Reggie Cleveland All-Stars":
Player | Race | First Name W | First Name B | Last Name W | Last Name B | p(actual) |
---|---|---|---|---|---|---|
Jason Williams | W | 4 | 10 | 1 | 56 | 2.3% |
Bobby Jones | W | 7 | 8 | 2 | 41 | 7.1% |
Terry Thomas | W | 3 | 12 | 1 | 15 | 9.8% |
Ralph Johnson | W | 7 | 5 | 4 | 45 | 11.7% |
Dick Barnett | B | 23 | 3 | 1 | 1 | 12.7% |
Herb White | W | 2 | 1 | 1 | 11 | 12.9% |
Ron Johnson | W | 12 | 16 | 4 | 45 | 13.3% |
Michael Smith | W | 3 | 23 | 7 | 38 | 13.8% |
Phil Jackson | W | 6 | 9 | 2 | 20 | 15.0% |
Stan Brown | W | 5 | 3 | 3 | 26 | 15.5% |
Michael Bradley | W | 3 | 23 | 4 | 4 | 15.8% |
Adrian Smith | W | 1 | 4 | 7 | 38 | 15.9% |
William Smith | W | 1 | 3 | 7 | 38 | 16.1% |
Jason Smith | W | 4 | 10 | 7 | 38 | 17.7% |
Tony Bennett | W | 4 | 19 | 1 | 4 | 17.8% |
Steve Green | W | 19 | 13 | 1 | 19 | 18.0% |
Marshall Hawkins | W | 1 | 1 | 1 | 6 | 18.7% |
Luke Jackson | W | 5 | 1 | 2 | 20 | 19.5% |
David Wood | W | 3 | 15 | 2 | 3 | 19.9% |
Bobby Smith | W | 7 | 8 | 7 | 38 | 20.2% |
Travis Knight | W | 3 | 5 | 1 | 7 | 20.5% |
Ken Murray | W | 3 | 12 | 1 | 3 | 20.9% |
Don Chaney | B | 23 | 6 | 1 | 1 | 22.0% |
Eddie Miller | W | 2 | 13 | 5 | 5 | 22.0% |
Josh Davis | W | 2 | 6 | 7 | 25 | 22.5% |
Mickey Davis | W | 1 | 2 | 7 | 25 | 22.6% |
Aaron Gray | W | 1 | 6 | 3 | 5 | 23.0% |
Freddie Lewis | W | 1 | 3 | 2 | 7 | 23.0% |
Tim Young | W | 4 | 7 | 1 | 6 | 24.3% |
Jake Carter | W | 5 | 2 | 1 | 9 | 24.5% |
Richard Anderson | W | 2 | 7 | 5 | 14 | 24.9% |
Walt Davis | W | 4 | 5 | 7 | 25 | 25.6% |
David Lee | W | 3 | 15 | 6 | 6 | 25.7% |
Greg Butler | W | 7 | 17 | 1 | 5 | 25.8% |
Kenny Rollins | W | 4 | 15 | 2 | 1 | 25.9% |
Scott May | B | 12 | 3 | 1 | 2 | 25.9% |
Ed Smith | W | 16 | 12 | 7 | 38 | 26.0% |
Guy Sparrow | W | 1 | 4 | 1 | 1 | 26.1% |
Don Adams | B | 23 | 6 | 1 | 3 | 26.5% |
Steven Hill | W | 1 | 2 | 2 | 6 | 27.0% |
According to this data, a future generation of fans will list David Lee as a member of their Reggie Cleveland All-Star team. We probably consider that unthinkable today... but I suppose older fans would never have imagined such confusion could exist for someone like Dick Allen, either.
February 21st, 2011 at 3:26 am
Come on now, Neil. You're better than this. You can't simply multiply first name probability times last name probability to get full name probability. Using your Jason Williams example, if you wanted to figure out the probability that someone with that name was BLACK, you'd do (10/14) x (56/57) =~ 70%. The problem with that is that if black and white are the only two options on the table in this scenario, the probability of a full name being white plus the probability of that same full name being black need to add up to 100%. (Unless first white x last black and last white x first black would give us the probability of a mixed-race NBA player?)
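The arithmetic behind this objection can be checked directly. The sketch below uses the Jason Williams counts quoted in the post; the helper `share` is a hypothetical name, and the renormalization at the end is just one simple repair shown for illustration, not necessarily what the corrected post does.

```python
from fractions import Fraction

# Counts quoted in the post for "Jason" and "Williams".
first = {"white": 4, "black": 10}
last = {"white": 1, "black": 56}


def share(counts: dict, race: str) -> Fraction:
    """Fraction of players with this name who are listed as `race`."""
    return Fraction(counts[race], sum(counts.values()))


# Straight multiplication of the first-name and last-name shares.
naive = {race: share(first, race) * share(last, race) for race in ("white", "black")}
total = sum(naive.values())

print(f"white: {float(naive['white']):.1%}")  # white: 0.5%
print(f"black: {float(naive['black']):.1%}")  # black: 70.2%
print(f"sum:   {float(total):.1%}")           # sum:   70.7%

# One simple repair (an assumption): rescale so the two options sum to 100%.
normalized = {race: p / total for race, p in naive.items()}
print(f"white, rescaled: {float(normalized['white']):.1%}")  # white, rescaled: 0.7%
```

The two products cover only about 70.7% of the probability, which is the gap the comment points out.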
February 21st, 2011 at 3:41 am
"Shamefully, I always assumed he was white because of his name"
Shamefully? Why? I've done that before, and I don't feel any shame over it. Some names "sound" as though they belong to a certain ethnicity. Do people know who Scott Fujita is? I don't think it's any kind of bigoted stretch for people who don't know him to assume he's Japanese.
If I hadn't seen him on TV, I would have assumed Jeff Green is white.
February 21st, 2011 at 5:32 am
#1 - You're 100% right. That is what I get for posting at crazy hours in the morning after the night of the All Star Game. This is actually a Bayes' Theorem problem, I believe. I think I have it fixed now, but somebody who knows this stuff better than me should double check just to be sure.
February 21st, 2011 at 9:18 am
#2 - I think it's sort of shameful just because of Allen specifically -- once you find out who he was and what kind of struggles he went through (a great deal of which was due to racism), you feel like an idiot for ever assuming he was white.
But for somebody like Don Adams, I agree, there's nothing to be ashamed of. Honestly, up until researching this post, I pictured him looking like a 6'6" version of this guy.
February 21st, 2011 at 10:49 am
Nice. As a Black man, it's nice to see any form of candid discussion on race when too often we tend to avoid the topic altogether out of indifference, or fear of insult. Makes it particularly fitting being Black History Month...
There's no question that we ALL tend to assume people's ethnicity based on nomenclature. It doesn't make us villains, it makes us human. No one is color blind, nor should they be. But we should be tolerant, respectful, and curious in positive ways about other cultures.
Unfortunately, it has been understood for some time amongst the Black community that there is racial bias associated with traditionally Black names. At the same time there is a desire to anoint your child with a name that does have historically strong Black connotations. This is the ultimate "Catch-22". As parents, sadly, we have to take this into consideration.
As far as JWill having the most Black name of all the White brothers, it's pretty fitting isn't it? I really can't think of a White dude who seems more "similar", if you will, in personality, and even game style, to a Black NBA player than Williams...
February 21st, 2011 at 11:03 am
Stunned that Sarunas Jasikevicius failed to make this list.
Bayes Theorem is obviously incorrect.
February 21st, 2011 at 11:24 am
By coincidence (?), AP ran this story today:
http://news.yahoo.com/s/ap/20110221/ap_on_re_us/us_the_blackest_name
February 21st, 2011 at 11:27 am
Scott Fujita was adopted by a Japanese-American man and caucasian woman, FYI.
Neil-
Are you using derivations of name? For instance, do Bill, Billy, Will, Willy, and William all get lumped together? I'm just curious if there are ways in which derivations might have ethnic or racial associations.
It's natural for us to categorize based on available knowledge. Not only is it an innate human way of organizing our world, but we're specifically trained to do this (not necessarily along racial or ethnic lines, but in general). I think the idea of such behavior being shameful, especially nowadays, is when we rely on those assumptions when it would be so easy to figure out the truth. I can understand this being more of a problem in the '70s, before the interweb and satellite TV and 24/7 sports reporting. But nowadays, there really isn't a reason to not know more about a guy than what the statline says. There are still going to be guys we're wrong about, as there needs to be a moment when we learn definitively about a given person. But there doesn't seem to be good reason to know about a guy for several years and never know his race/ethnicity.
Part of the problem, as DWarner touched on, is a complete refusal to engage conversations about race and ethnicity intellectually and honestly. We're so afraid to talk about it, which only leads to more confusion and misinformation.
February 21st, 2011 at 11:42 am
#7 - That is an amazing coincidence. I had been working on this for the past week in honor of the All-Star Game. But, sure enough, all 10 players in NBA history named "Washington" are considered black in Grasso's database.
#8 - I'm just using the designated Basketball-Reference first name, which is typically what the player "went by" in his career. The question of alternate versions of the same name is interesting, because you're right, each seems to automatically carry a different connotation. So in that sense, using the first name they "went by" was the correct choice, because it inadvertently captures how each version of a name is most frequently used.
February 21st, 2011 at 11:51 am
Neil-
That makes sense, since it also involves a certain amount of self-selection. If a guy goes by Willie instead of William or Bill, it's likely that he had some power in making that determination (though it's certainly possible it was done by the parents and never changed). If so, he may have done it to assert a certain racial or ethnic link. Of course, the inverse is possible as well. Regardless, it is helpful to know. The only benefits I can think of to lumping them together are that it gets more to the heart of how names are bestowed in the first place and it increases the sample size, but it doesn't really get at the perception issue. If I know a guy goes by Willie, that is what is going to shape my perception of him... not that his parents called him William when he was an infant.
February 21st, 2011 at 11:56 am
Neil-
Is there any reason to question the accuracy of Grasso's database? I haven't seen it, so I don't mean to criticize its usage. Only that the three categories offered (white, black, other) seem quite limited and if they are based on perception rather than self-identification, they might have their own biases factored in.
February 21st, 2011 at 12:00 pm
(Sorry for all the posts...)
Lastly, to expand on a comment in DWarner's post, it would be interesting to see if there was any association between playing style and race. I know a lot of people would balk at the notion, but it'd be interesting to see if there really was a "black" way of playing vs a "white" way of playing (something many people assert, often based on completely subjective analysis and with bothersome agendas at play). I don't know if the numbers available, even the advanced ones, are enough to get at style, but perhaps they are. And, if there are any trends, which players buck them?
Bill Simmons and Keith Law, among others, have talked about the general refusal to compare players across racial lines. We compare white guys to other white guys and black guys to other black guys, even if the most apt comparison is with someone in another racial group. This is most typically done when comparing prospects to current players or comparing current players to historical players. It could also illuminate some of the major issues with this approach.
All-in-all, fascinating stuff!
February 21st, 2011 at 12:12 pm
Everyone knows that white folks go by "Bill" and black folks are called "Willie."
February 21st, 2011 at 12:13 pm
#11 - I think the biggest thing to question is what you mentioned, the limited 3-category system (and "O" was used for only about a dozen players -- your Yao Mings, Wang Zhizhis, etc.) The rest were considered either "black" or "white", which is often inadequate to identify someone's heritage. But within that (admittedly flawed) framework, I have no reason to doubt the accuracy of the data. The APBR, and Grasso in particular, have always been extremely committed to the facts of NBA history, impressively so given that this data isn't really available anywhere else.
February 21st, 2011 at 12:46 pm
Out of curiosity, what was Mike Bibby classified as? Biracial guys like him might give a good indication on how the data is determined.
February 21st, 2011 at 12:57 pm
Bibby is considered black in the data; ditto Jordan Farmar, Kris Humphries, Deron Williams, Matt Barnes, Shane Battier, etc.
February 21st, 2011 at 12:58 pm
Neil-
I didn't mean to question the veracity of the researchers or their work... just the inherent limitations of a system with only three categories and one based upon a third party's understanding of a person's race. I'm confident that Grasso and APBR used self-identification data whenever possible, but going back historically, there are likely players for whom this wasn't available and at least some guesswork was involved.
How was Blake Griffin identified? What about the Lopez boys? If we are limited to the three categories (which may just be the reality at present time), we'd at least want to see them applied consistently. Are international players of African descent (either via the Caribbean, Africa, or elsewhere) considered black? What about Manu Ginobili, who I understand to be of European ancestry but Argentinian nationality? How about Nene Hilario? The fact that there were only 12 players in other makes me think that guys like Blake and Nene are deemed black, Manu is probably white, and the Lopez boys might be two of the 12 others? Do you have a link to the Grasso database? Was that in the post and I missed it?
Again, my attempt is not to discredit your work here, Grasso's, or that of anyone else. I'm just trying to remain aware of the inherent limits of this work.
February 21st, 2011 at 12:58 pm
Took too long to post... I see you already addressed bi-racial players. I'm still curious about Hispanic/Latino players and those with more complicated classifications like Manu or Nene.
February 21st, 2011 at 1:07 pm
Found the link to the data (I'm moving *REAL* slow on this holiday morning)...
Looks like Brook and Robin Lopez are listed as white (in reality: white mother, Cuban father, though I don't have info on his father's ethnicity specifically).
Ginobili is listed as white.
Nene is black.
It seems that O was essentially a stand in for Asian and all others were put into either black or white.
February 21st, 2011 at 1:11 pm
Some very amateur sleuthwork seems to indicate that the Lopez twins' father might have had at least some African ancestry in him. I can't find much info on him, but his cousin played Major League ball and photos seem to indicate at least some African ancestry. Again, this doesn't necessarily mean anything for the Lopez twins, but I'm just throwing stuff at the wall now.
February 21st, 2011 at 1:34 pm
Hmm. Sort of sounds like they are either going by self-ID or the single-drop theory.
February 21st, 2011 at 2:06 pm
Neil, I made the same mistake with Dick Allen... Had to laugh at myself when I did a Google Image Search two years ago.
And as for basketball, I always assumed Bob Love and Norm Van Lier were white.
February 21st, 2011 at 3:05 pm
Jordan Farmar is black?? I thought he was Jewish... not that you can't be both, of course. Anyway, it annoys me to no end when guys like Dirk and Peja who never pass get compared to Larry Bird just because they are all tall white guys who can shoot. I've been saying for years that LeBron reminds me of Bird; but people look at you funny when you say something like that.
February 21st, 2011 at 3:19 pm
Nick Says:
February 21st, 2011 at 12:46 pm
Out of curiosity, what was Mike Bibby classified as? Biracial guys like him might give a good indication on how the data is determined.
Although there are a lot of bi-racial cats in the league, I'm pretty sure both of Mike Bibby's parents are actually light skinned Black folks...
February 21st, 2011 at 4:14 pm
AYC-
I'm with you on the superficiality of our comparisons. The problem is that the bias is not limited to fans or even the media. Executives are just as guilty of it and, unfortunately, some players (of all races) are adversely affected by such nonsense.
February 21st, 2011 at 5:52 pm
BTW, if someone mentioned the names Brook, Robin, and Blake to me without context, my mind would conjure three blonde high school girls giggling their way through the mall.
February 21st, 2011 at 10:50 pm
I demand to know why Darius Songaila is not at the top of the list.
February 22nd, 2011 at 12:05 am
Whoa, whoa, whoa... Dick Barnett is black? My mind has just been blown.
February 22nd, 2011 at 9:25 am
I once was at an event involving kids where I had a list of names I had to call out. I got to the O's and the next name was Peter O'neil who was not white. Hilarity ensued as all of the grown-ups tried to mask their surprise at his ethnicity.