On the NBA: Statistical Responsibility

Q: When can we expect Moneyball 2: Moreyball?

A: this would be a terrible movie. that said, we did have a 22 game winning streak and apparently a long winning streak is all you need for a movie. 

- Daryl Morey, during a Reddit AMA chat.

Coverage of the NBA in the modern age is dominated by statistics. Where the dominating monolith of the box score once stood, there are now a myriad of useful and interesting methods for analyzing basketball, in scopes ranging from the play of an individual player in a specific play type to the historical strategic choices of the league as a whole across time. This is a tremendous boon to anyone who wants to talk about the game: come up with an argument, and you can go away and find the numbers that prove or disprove it. But it’s not all good news. 47% of statistics are made up. Lies, damn lies and statistics! You know the clichés. The more information there is, the more it can be twisted to better accentuate your argument or to be downright misleading. There is an excellent example doing the rounds at the moment, and it has been driving me nuts to the extent that I felt like writing a blog post about it: The Heat are 45-3 in their last 48. How can we expect them to lose four in a row to anybody?

There are some classes of statistics that create brilliant, fragile curios – great for soundbites, but not particularly useful for providing an accurate picture of what’s actually happening on the court. One factor that’s particularly relevant in such cases is when the sample from which the numbers are taken has been chosen in such a way that it twists the result. What we have here is a textbook example. On what basis have we chosen 48 games as the period of time over which past form is to be considered? Pretty sure the answer is “Because that’s when the Heat started winning”. But this is cherry-picking of the worst sort. A win streak is emphatic, but by its very nature it implies that the areas at either end of the streak are dotted with losses. Ironically, the last game before this selection begins was a loss against the team the Heat happen to be taking the court against tonight – the Indiana Pacers. In using this stat as a tool to prop up an argument about how the Heat are going to win, the implication is that the game on Friday 1st February that the Pacers won 102-89 is somehow much less important than the games that followed it. Otherwise why wouldn’t it be in the sample?
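As a rough illustration of the cherry-picking problem, here is a minimal Python sketch. The game log is entirely made up (it is not the Heat's real schedule), and the helper names `win_pct` and `best_window` are hypothetical. It slides a fixed-length window across a list of results and compares the best-looking stretch with the full record.

```python
from typing import List, Tuple


def win_pct(results: List[int]) -> float:
    """Fraction of games won, where 1 is a win and 0 is a loss."""
    return sum(results) / len(results)


def best_window(results: List[int], size: int) -> Tuple[int, float]:
    """Return (start index, win fraction) of the best stretch of `size` games."""
    wins = sum(results[:size])
    best_start, best_wins = 0, wins
    for start in range(1, len(results) - size + 1):
        # Slide the window: add the game entering, drop the game leaving.
        wins += results[start + size - 1] - results[start - 1]
        if wins > best_wins:
            best_start, best_wins = start, wins
    return best_start, best_wins / size


# Hypothetical 20-game log, invented for illustration only.
season = [1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1]

overall = win_pct(season)
start, streak_pct = best_window(season, size=8)

print(f"full record:        {overall:.3f}")
print(f"best 8-game window: {streak_pct:.3f} (starts at game {start + 1})")
print(f"game just before the window was a loss: {season[start - 1] == 0}")
```

In this invented log the team wins 70% of its games overall, yet the hand-picked eight-game window is perfect, and the game immediately before the window is a loss. That is the same shape as the 45-3 claim: the window boundaries were chosen by the results, not independently of them.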

Of course, there is a very easy-to-spot counter-argument for why this sort of factoid isn’t all that useful – last year’s playoffs! Conveniently, 2012 featured the exact same storyline, as the Spurs had amassed a 20-game win streak two games into their Western Conference Finals match-up with Oklahoma City. As we saw then, the past record of the Spurs did not matter in the slightest as the Thunder proceeded to reel off four straight wins and progress to the NBA Finals proper. The only things the Heat’s current achievement has over last year’s Spurs are that it’s a bigger sample size and that it doesn’t have an event to terminate it with yet. But when the samples have been selected to accentuate statistics, I’m not sure they can be treated with the same respect they normally would be.

There’s one other thing worth bearing in mind when talking about numbers like these, and that’s strength of opponent. If I’ve done my maths right, the Heat’s last 48 games have featured opponents with a combined winning percentage of 42.7%. To put that in perspective, the closest comparison to that record this season would be the Philadelphia 76ers or Toronto Raptors, neither of whom were particularly strong. There were certainly some quality wins for LeBron and co. in the section of the schedule in question, but there was also a large number of games against the lottery-bound who were unlikely to put up much resistance. The remaining playoff teams excluding Miami have a 66.2% regular season winning percentage and should provide much stiffer opposition.
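The working behind a figure like the 42.7% above is not spelled out, so the following is only a sketch of one plausible method: pool the opponents' wins and games, counting each opponent once per meeting. The function name `combined_opponent_pct` and the three opponent records are hypothetical placeholders, not real 2012-13 data.

```python
from typing import List, Tuple

# Each entry: (opponent wins, opponent losses, times faced in the sample).
# These numbers are invented placeholders, not real team records.
Schedule = List[Tuple[int, int, int]]


def combined_opponent_pct(schedule: Schedule) -> float:
    """Pooled opponent winning fraction, weighted by number of meetings."""
    wins = sum(w * n for w, _, n in schedule)
    games = sum((w + l) * n for w, l, n in schedule)
    return wins / games


example: Schedule = [(60, 22, 1), (41, 41, 2), (20, 62, 3)]
pct = combined_opponent_pct(example)
print(f"combined opponent winning percentage: {pct:.1%}")
```

Weighting by meetings matters: facing a weak team three times drags the pooled figure down more than facing a strong team once pulls it up, which is exactly why a schedule heavy on lottery teams flatters a streak.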

At the end of the day, the Miami Heat are still an excellent basketball team and are rightly considered favourites to win it all. But there are plenty of arguments and numbers to support that fact that aren’t built on horribly mangled statistics. Just as we quested for better ways of looking at the world than points per game, so should we choose firmer ground to base our positions on than selectively defined sample spaces. Isn’t the Heat’s full regular season record of 66-16 impressive enough? Or perhaps instead their 30-12 record against playoff teams? Neither is quite the statistical canard of a 45-3 stretch, but you don’t have to distort the truth to make Miami look like a good team.
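For a quick side-by-side, the three records quoted in this post can be turned into winning percentages. This is a small Python sketch; the win-loss figures come from the text above, while the helper name `pct_from_record` and the dictionary labels are just illustrative choices.

```python
def pct_from_record(wins: int, losses: int) -> float:
    """Winning fraction for a wins-losses record."""
    return wins / (wins + losses)


# Records as quoted in the article.
records = {
    "last 48 games": (45, 3),
    "full regular season": (66, 16),
    "against playoff teams": (30, 12),
}

for label, (wins, losses) in records.items():
    print(f"{label}: {wins}-{losses} -> {pct_from_record(wins, losses):.2%}")
```

That works out to roughly 94%, 80% and 71% respectively. The two unselected samples are lower, but both still describe a very good team, and neither depends on where someone chose to start counting.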

With all the information available in today’s NBA, it’s more important than ever that we exercise judgement in the tools we use to interpret what we see. There is a time and a place for statistics that produce ‘interesting’ results, but they almost always come with caveats and biases that make them unsuitable for any rigorous statistical analysis. And if you’re using numbers to inform your basketball perspectives, then that is what you’re doing. So just as you should be getting rid of points per game when talking about offensive performance, so should you be getting rid of streaks when talking about how good a team is.

Total comments: 23
  • 2016Champions says 1 YEAR ago

    The different mentalities in discussion forums? And my philosophy on what the ideal mentality is and how I sometimes contradict that philosophy (i.e. the pot calling the kettle black)? Sorry about branching off. We can go back to talking about statistical responsibility now :)

  • thejohnnygold says 1 YEAR ago

    I don't think I know what we're talking about anymore... :unsure:

  • 2016Champions says 1 YEAR ago

    Has there ever been a philosopher who was perfect? I bet if you did some digging you could even come up with dirt on Gandhi :D


  • thejohnnygold says 1 YEAR ago

    This is interesting...I think I would liken it more to "salesmanship" than lawyering--of course, those two things aren't all that different.

    I'd also say we've got a case of the pot calling the kettle black :rolleyes:

  • rockets best fan says 1 YEAR ago

    OH! we getting philosophical :lol:

  • 2016Champions says 1 YEAR ago

    RBF, I think you're missing my point.

    I believe Rocketrick was the one who once said "I enjoy finding the flaws in arguments" and JD expressed empathy in that. I think 99% of people here would express empathy in that. This is what I'm calling the "lawyer's mentality".

    The "writer's mentality" is what I would call someone who just looks to share his perspective, and is willing to explore multiple perspectives without fully narrowing their vision down one scope.

    I think we need less "lawyer's" and more "writer's" in sports discussion forums in general.

  • rockets best fan says 1 YEAR ago

    2016

    I see you have a complex on the subject of stats. I have attacked you on some quotes of stats when they have been taken out of context. however believe it or not I look at stats too. the only difference is the weight we think stats should be given. I agree stats are a tool. 1 tool in a tool box of many tools. however if it becomes the only tool..........then it's out of context. I never meant to attack your idea or lead you to believe that stats are useless imo.....they are useful along side many other tools in a GM's tool box, but they aren't the most or least important tool......just a tool to help evaluate players. you probably think I feel about stats like McHale when morey hands them to him, but I'm not. it's just that I believe all information must be weighed in an evaluation.

  • 2016Champions says 1 YEAR ago

    Yeah, that's the point I was trying to make in the article. It applies to this instance specifically but also to general statistical usage - it's often tempting to shave some boundary conditions here and sample size there to present a stronger argument, but at the end of the day that's damaging both to your own argument (assuming you get called on it) and to the participants' understanding of the game as a whole.

    (BTW, it's killing me how often Zach Lowe likes to pull this 46-3 [as it is now] stat out. He's pretty much my favourite NBA writer, but dude, please stop using it to try and big up the Heat - it just sounds bad!).

    ST

    Zach Lowe is my favorite writer too, he doesn't use alot of statistics though which I like, he's so incredibly in-depth and detailed when he breaks things down. The average fan would say something like "Miami move the ball really well" but he will break down the exact way they move the ball, the sets they run, with pictures and videos. I wish there were more writers like him. I haven't noticed this 46-3 stat he's throwing around but I gotta say it's a pretty impressive stat.

    I think some people (like me for instance) who post stats aren't always trying to make an argument though, sometimes they're just posting something they find interesting or impressive and felt like sharing it. Then everyone with a "lawyer's mentality", for the lack of a better term, just jumps all over the guy who posted the stat. I think this mentality is counter-productive, especially when the person isn't insinuating anything or drawing conclusions from the stat.

  • rockets best fan says 1 YEAR ago

    Yeah, that's the point I was trying to make in the article. It applies to this instance specifically but also to general statistical usage - it's often tempting to shave some boundary conditions here and sample size there to present a stronger argument, but at the end of the day that's damaging both to your own argument (assuming you get called on it) and to the participants' understanding of the game as a whole.

    (BTW, it's killing me how often Zach Lowe likes to pull this 46-3 [as it is now] stat out. He's pretty much my favourite NBA writer, but dude, please stop using it to try and big up the Heat - it just sounds bad!).

    ST

    you're RobDover ?

  • Sir Thursday says 1 YEAR ago

    Nobody likes it when statistics are mis-used, I'm sure everyone agrees on that. However, if someone is trying to provide perspective on one side of a coin, it's only natural that they might feel that it's counter-productive or contradictory to also provide perspective on the other side of the coin. Many writers don't feel this way, but not everyone who posts here has a writer's mentality. If anything, active members in sports discussion forums generally have more of a lawyer's mentality where they will twist whatever they can, whether it be statistics or observations, to win what they perceive as an argument/debate, and they will do whatever they can to dismiss conflicting evidence/statistics as inadmissible, misleading, or inconclusive.

    Yeah, that's the point I was trying to make in the article. It applies to this instance specifically but also to general statistical usage - it's often tempting to shave some boundary conditions here and sample size there to present a stronger argument, but at the end of the day that's damaging both to your own argument (assuming you get called on it) and to the participants' understanding of the game as a whole.

    (BTW, it's killing me how often Zach Lowe likes to pull this 46-3 [as it is now] stat out. He's pretty much my favourite NBA writer, but dude, please stop using it to try and big up the Heat - it just sounds bad!).

    ST

  • 2016Champions says 1 YEAR ago

    Too bad we have to pay a lot of money to get full access to Synergy stats. I bet there's a whole world of advanced stats that aren't really available to the public.

  • Steven says 1 YEAR ago

    Stats are overrated. Advance stats are amazing.

  • 2016Champions says 1 YEAR ago

    Nobody likes it when statistics are mis-used, I'm sure everyone agrees on that. However, if someone is trying to provide perspective on one side of a coin, it's only natural that they might feel that it's counter-productive or contradictory to also provide perspective on the other side of the coin. Many writers don't feel this way, but not everyone who posts here has a writer's mentality. If anything, active members in sports discussion forums generally have more of a lawyer's mentality where they will twist whatever they can, whether it be statistics or observations, to win what they perceive as an argument/debate, and they will do whatever they can to dismiss conflicting evidence/statistics as inadmissible, misleading, or inconclusive.

  • rockets best fan says 1 YEAR ago

    Alot of general fans frown upon the use of advanced statistics in forum discussions.

    I don't........I just don't like when posters try to bend the stats to make their point as if there are no other factors to be weighed

  • 2016Champions says 1 YEAR ago

    Alot of general fans frown upon the use of advanced statistics in forum discussions.

  • rockets best fan says 1 YEAR ago

    who's calling for the boycott of statistics? I think any nba fan realizes the potential benefit of stats to help judge the game........also this is a sports forum......would be best to leave political subjects in political arenas.

  • 2016Champions says 1 YEAR ago

    I think it's okay to use statistics the way you see them used in scouting reports, for example the scouting reports at draft express often cite Synergy Stats but only within context. They're not drawing conclusions from the Synergy stats, they're just using them to reinforce their observations.

    The stance alot of people here are taking on statistics reminds me of how alot of people are saying guns should be banned, there are too many cases where people use them in the wrong way. The difference is that when you use guns the wrong way people die. If people use statistics the wrong way they just come to wrong conclusions.

    Statistics are tools, if they're used the wrong way it's the users who should be frowned upon--not the statistics. I think the answer is not to boycott the use of statistics, but rather educate people better on which statistics should be used and how to use them.

  • timetodienow1234567 says 1 YEAR ago

    With James, it's possible but improbable

  • ale11 says 1 YEAR ago

    Of course that streak has to be put in perspective. I'm 99.9% sure that Miami wouldn't have been able to accomplish that streak if they were playing in the Western Conference. I bet OKC would love to play Charlotte and Orlando 4 times a year (I'm not including Cleveland because they really make them sweat for those wins), not even considering that Dallas (West's 10th seed) would have made the playoffs in the East and Memphis (West's 5th seed) would have been 2nd seed in the East.....5 50+ wins teams against only 2.

    Some numbers do matter and show that Miami, during the regular season, doesn't have much competition anyway. I'm not trying to undermine what they accomplished, it was awesome, but it wouldn't have been possible in this conference.

  • rocketrick says 1 YEAR ago

    Are you factoring in the way Wade has looked since his knee injury?

    D-Wade looked pretty good to me tonight, I don't know about you. Is he 100%? Doubtful. But tell that to the Pacers team. Bogus 6th foul called on Wade in my opinion putting Paul George on the line for 3 at the end of the game. That's what kills me, the inconsistency of the refs. Players get killed driving to the hoop on the last play of the game and usually the refs swallow their whistles. Wade exhales on George and gets called for a foul.
  • 2016Champions says 1 YEAR ago

    Are you factoring in the way Wade has looked since his knee injury?

  • rocketrick says 1 YEAR ago

    New post: On the NBA: Statistical Responsibility
    By: Rob Dover



    The Miami Heat are deeper than they've been since LeBron and Bosh came aboard to join D-Wade. Plus 3 seasons together including playoff runs through Conference Finals and NBA Finals (extrapolating for the current series). It's gonna be hard to beat these guys no matter what the statistics do or don't say (eyeball test for whatever that is worth, right?)
  • rockets best fan says 1 YEAR ago

    I love this article. it should be read by everyone who comes to this forum. stats do have their place, but not as a know all see all measurement