I am excited to introduce Ben Heller as the newest contributor to the Red94 team. Ben will be sharing his insight in the form of statistical analysis.
A little bit about Ben, in his own words:
I graduated from Washington and Lee University two years ago with a B.S. in Business Administration. My areas of focus were finance and investments, with some statistics and economics as well. My basketball career began at Lanier Middle School, then transitioned to Lamar High School, where I played on the varsity team for three years. I played at Washington and Lee for four years, leading the team in rebounding and scoring and earning the ODAC Scholar Athlete of the Year award my senior year.
While I was still young during the days of Clutch City, I can remember waiting in line for hours to get a Rockets ’93-’94 championship T-shirt. It is often fun to revel in the past, but I am generally forward-looking, using our league’s history only as a series of lessons, rewarding those with the patience and commitment to learn them. Statistics in basketball have made much-needed progress in the last decade, and I am proud to know that our Houston Rockets are still on the cutting edge.
I look forward to exploring the depths of NBA statistics, challenging our collective intuition, and learning from you all as well.
Ben’s first post ensues after the jump. – Ed.
This is a two-part post. First, I would like to introduce my general line of thinking with regard to basketball and the role of statistics, as well as a general model. That part is unique to this post. Next, and for all future articles, I will explore different biases and intuitive concepts around the NBA and compare them to measurable statistics, as well as take a look at interesting facts, ideas, and topics.
As the role of advanced statistical analysis in the NBA has expanded in both mainstream basketball chatter and internal management decisions, we are seeing a convergence between intuition and measurable data from the games played. During the great Durant-Oden debate, statistical analysis within the Trail Blazers organization projected Durant as the superior player, yet the inescapable NBA notion that “you can’t pass up a chance at a legit 7-footer” was ultimately too powerful. During this fiasco, which was just over three years ago, the difference between statistical analysis and the visceral conclusions of scouts and management (referred to here, perhaps derogatorily, as “intuition”, but certainly based on experience and basketball knowledge that go beyond the casual fan’s) was put on the greatest stage. Both players are still young and hopefully their careers are far from over, but it is becoming increasingly obvious that Kevin Durant is the superior player by almost any definition.
The takeaway here is not that “statistics are smarter” or that all teams should fire their scouts (many of whom can still make successful decisions based on basketball intuition). The point is that a market inefficiency existed during these player evaluation periods, and that teams with more and better information were simply in an advantaged decision-making position. More importantly, the appropriate use of this information is what will differentiate GMs, coaches, and scouts in the future. The amount of data available now is, even to a more serious fan, completely overwhelming. Just scanning through, for example, Basketball Reference’s glossary, it is hard to look at any given definition vs. another and say that one is definitively more important. Do you want a player with a higher versatility index or effective field goal percentage? Can you put up with a higher turnover percentage if it means an increase in offensive rebounding percentage?
For these and other questions, the relative importance of a basketball statistic must be correlated with some sort of definitively desirable (or undesirable) outcome, such as wins in a season or average point differential. I originally wanted to equate the team points per game differential, expressed as a ratio (average ppg / opponent average ppg), to a series of ratios that could be broken down further into more ratios. The general concept is to have a model that, when broken down into enough moving parts, can show differences in various team performance measures over time, as well as compare them to the best and worst teams. It would allow us to measure the relative importance of the various sub-categories, which is a crucial tool when making important decisions within an organization. Well, that all sounds nice, but I couldn’t quite make it work. The good news is I came out alive and with a general formula nonetheless. Skip to Part II if you want to take my word for it.
There are two basic equations: offensive points per game and defensive points per game. The ratio of one over the other can be viewed as one of the most predictive statistics related to wins other than wins themselves, and may be a better predictor of future wins. As the defensive equation is structurally the same as the offensive equation, I will focus on explaining just the offensive equation for now. Here is the general equation before any decomposition:
PPG = pts/pos * pos / game
The norm right now is to define a possession as FGA + .44*FTA + TOs – OREB. In this form, both teams generally will have the same number of possessions each game. I think that causes the possession statistic to be a misnomer. It misclassifies “hustle” (OREB) as “efficiency” by allowing multiple FGAs per possession, allowing for a greater points per possession number without increased scoring efficiency. In that light, I prefer defining possessions as FGA + .44*FTA + TOs.
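To make the difference between the two possession definitions concrete, here is a minimal Python sketch. The function names and the single-game box score are hypothetical, made up purely for illustration; only the two formulas come from the text above.

```python
def possessions_standard(fga, fta, tov, oreb):
    """Conventional definition: FGA + .44*FTA + TOs - OREB."""
    return fga + 0.44 * fta + tov - oreb


def possessions_no_oreb(fga, fta, tov):
    """Definition preferred in this post: FGA + .44*FTA + TOs."""
    return fga + 0.44 * fta + tov


# Hypothetical single-game box score (not real data).
pts, fga, fta, tov, oreb = 100, 80, 25, 14, 11

pos_std = possessions_standard(fga, fta, tov, oreb)
pos_alt = possessions_no_oreb(fga, fta, tov)

# Same points, different denominators: the conventional definition
# reports a higher points-per-possession figure for the same offense.
ppp_std = pts / pos_std
ppp_alt = pts / pos_alt

print(f"standard: {pos_std:.1f} pos, {ppp_std:.3f} pts/pos")
print(f"no OREB:  {pos_alt:.1f} pos, {ppp_alt:.3f} pts/pos")
```

In this made-up game, the 11 offensive rebounds shrink the conventional possession count, so the same 100 points look more "efficient" per possession even though no shot was made at a higher rate.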
So, what is next? To see anything meaningful, the formula must be broken down further. There are several routes to take from here, but I chose to break down major scoring categories: 2 point field goals, 3 point field goals, and free throws. The next level of decomposition is as follows:
PPG = (2Pts/pos + 3Pts/pos + FTPts/pos) * pos/game
2Pts/pos – points from 2 point field goals per possession
3Pts/pos – points from 3 point field goals per possession
FTPts/pos – points from free throws per possession
The calculation of 2Pts, 3Pts, and FTPts is made by a weighted average of attempts multiplied by the percentage:
PPG = (2ptFG%*W2ptFGA*2 + 3ptFG%*W3ptFGA*3 + FT%*WFTA*2.2727 + 0*WTO) * pos/game
The weights are the percent of times that a play ends in a given category, and are calculated by attempts/pos (for example, W2ptFGA = (fga-3fga)/(fga+.44*fta+to)).
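The decomposition can be checked with a short Python sketch. One assumption is labeled in the code: the free-throw weight is taken to be .44*FTA/pos, which is what the 2.2727 multiplier (1/.44) implies but which the text does not spell out. The function name and the box-score totals are hypothetical.

```python
def decompose_ppg(fgm, fga, fg3m, fg3a, ftm, fta, tov, games=1):
    """Split points per game into 2pt, 3pt, and FT contributions per possession."""
    pos = fga + 0.44 * fta + tov          # possessions, without the OREB term

    # Weights: share of possessions ending in each category.
    w2 = (fga - fg3a) / pos
    w3 = fg3a / pos
    wft = 0.44 * fta / pos                # ASSUMPTION: FT weight uses the .44 factor
    wto = tov / pos

    # Shooting percentages for each category.
    pct2 = (fgm - fg3m) / (fga - fg3a)
    pct3 = fg3m / fg3a
    pct_ft = ftm / fta

    # Points per possession by category; 1/.44 = 2.2727 FTA per FT "possession".
    pts2 = pct2 * w2 * 2
    pts3 = pct3 * w3 * 3
    pts_ft = pct_ft * wft * (1 / 0.44)

    pos_per_game = pos / games
    ppg = (pts2 + pts3 + pts_ft + 0 * wto) * pos_per_game
    return {
        "weights": (w2, w3, wft, wto),
        "pts_per_pos": (pts2, pts3, pts_ft),
        "pos_per_game": pos_per_game,
        "ppg": ppg,
    }


# Hypothetical one-game totals: 30 twos, 7 threes, 19 free throws = 100 points.
result = decompose_ppg(fgm=37, fga=80, fg3m=7, fg3a=20, ftm=19, fta=25, tov=14)
print(result["ppg"])           # should recover the points actually scored
print(sum(result["weights"]))  # the four weights should sum to 1
```

Because each percentage times its weight collapses back to makes per possession, the formula is an identity: it redistributes the same points across categories rather than estimating anything.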
I chose to stop here, but the different percentages could be further broken down by locations on the court or situations (such as time left in the shot clock). Of course, a different direction could be taken that tries to incorporate other important factors, such as rebounding, but I will leave that for another day.
Let’s take a look at the power of this equation in the pro basketball context. I broke down the different regular season offensive values for the past 10 NBA championship teams (’00-’09) and the teams with the worst record in each of those years. The idea is to see the sharpest contrast possible (at the expense of a large sample size…).
Anything jump out? There is an obvious difference in team PPG. That tells us nothing other than good teams score more points. Duh. But how do they get their points? If you look at possessions per game, the numbers are almost identical, meaning that pace has little influence on wins. Even subtracting TOs from pos/game (leaving just offensive attempts per game: FGA + FTA*.44), the Champs had 91.25, while the Losers had 90.40. All that means is that, through the course of a game, both winning teams and losing teams will, on average, have a very similar number of attempts at scoring.
What are the implications? This next statement is going to drive my old W&L coach crazy: turnovers are not important. Alright that was a little dramatic, but they are less important than we think. It’s hard to believe, because conventional basketball wisdom will tell you that taking care of the ball is a key ingredient to an effective game plan. However, statistically a turnover is only slightly worse than a missed shot (slightly worse because missed shots have a chance for offensive rebounds). Because only about 13-14% of possessions result in turnovers, the real area for variability (the other 86-87% of possessions) is scoring efficiency. The Champs averaged .943 points per possession, while the Losers averaged .876. To expand on that, let’s take a look at defensive statistics:
*Note: this second table shows the ratio of each group’s offensive value to its defensive value, along with the difference between the groups, to highlight the most drastic differences between the Champs and Losers.
The Champs are slightly better than the Losers with turnovers, but not by much. The Champs turned the ball over 0.4 percentage points less often than their opponents (13.2% – 13.6%), while the Losers turned it over 0.8 percentage points more often (14.3% – 13.5%). However, it’s time we start thinking about turnovers in terms of opportunity cost. The Champs averaged 13.13 points per game lost due to turnovers (13.2% o_WTO *.943 o_PPP *105.16 o_pos). However, their opponents averaged a smaller loss despite having a higher turnover rate: 12.56 ppg (13.6% d_WTO *.881 d_PPP *105.22 d_pos). Considering opportunity cost, the Champs were actually slightly worse off as a result of turnovers than their opponents because they score so much more efficiently. Scoring efficiency, in a nutshell, is the single most important factor in basketball, by a long shot.
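The opportunity-cost arithmetic can be reproduced in a few lines of Python. The function name is hypothetical; the inputs are the rounded figures quoted above, so the results land near, but not exactly on, the 13.13 and 12.56 in the text (presumably because those were computed from unrounded values).

```python
def turnover_cost(to_rate, pts_per_pos, pos_per_game):
    """Points per game forgone to turnovers: TO rate * points/pos * pos/game."""
    return to_rate * pts_per_pos * pos_per_game


# Rounded inputs taken from the paragraph above.
champs_cost = turnover_cost(0.132, 0.943, 105.16)     # Champs' own offense
opponents_cost = turnover_cost(0.136, 0.881, 105.22)  # Champs' opponents

print(f"Champs:    {champs_cost:.2f} ppg lost")
print(f"Opponents: {opponents_cost:.2f} ppg lost")
print(f"Gap:       {champs_cost - opponents_cost:.2f} ppg")
```

With these inputs the Champs give up roughly 13.1 points per game and their opponents roughly 12.6, so the direction of the conclusion holds: the lower turnover rate still costs more points because each lost possession was worth more.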
Looking back at the Champs vs. Losers one more time, some of the biggest differences were related to three-point shooting. Better teams take, and make, significantly more three-pointers than bad teams. After three games, the Rockets are shooting 27.3% on 55 attempts, while our opponents are shooting almost 40% on 68 attempts. Coming into the season, we were worried about our defense on the perimeter. After all, Aaron Brooks and Kevin Martin are not exactly a menacing defensive duo. So far, all of our fears are coming true, and we essentially have to be perfect offensively to stay in games. It is still too early to hit the panic button, but something needs to change for our team to be competitive.
Brooks may not be the problem and Carmelo is definitely not the answer, but how much success can we really have long-term by trying to outscore everyone? Judging by the playing styles of the Celtics, Lakers, Spurs, Pistons, and Heat in the past decade, our current team structure is simply not built for contention. Against the Hornets on Wednesday, pay attention to our perimeter defense. If Paul can consistently penetrate past our guards, and either get to the free throw line or open it up for Belinelli, Ariza, and Thornton to shoot open threes all game, it’s officially time to hit the panic button.