LEBRON Introduction

LEBRON: The man, the myth, the metric?

See the LEBRON data itself here or explore our free interactive application here!

Want to listen to an explanation? Check out our pod with Ben Taylor discussing LEBRON here.

 

Introduction

LeBron James the man needs no introduction. LeBron, one of the world’s most recognized athletes, has inspired people across the world through his play for nearly two decades. He’s a player who does it all, whether it be scoring, passing, rebounding, or even making key defensive plays. He’s always been a high-impact player who has contributed to winning at the highest levels, and the perfect role model for an impact metric.

In the constant pursuit of better data to evaluate player impact, Krishna Narsu and I (Tim/Cranjis McBasketball) of the BBall Index team teamed up to create a new metric for the NBA community, using some of the best practices and techniques we could find from the community, infused with our own flavor.

After running the numbers and while seeking a catchy but sensical name, LeBron came to mind. Not just as the embodiment of what impact metrics attempt to measure, but also because the new metric indicates he’s been the most impactful player from 2009-20 (the span of our database), and because his name aligns well with the specific components of the stat we wanted its name to outline.

 

Luck-adjusted player

Estimate using a

Box prior

Regularized

ON-off

 

As far as forced acronyms go, I’d say this one works out fairly well.

Put simply, LEBRON evaluates a player’s contributions using the box score (weighted using boxPIPM’s weightings stabilized using Offensive Archetypes) and advanced on/off calculations (using Luck-Adjusted RAPM methodology) for a holistic evaluation of player impact per 100 possessions on-court.

LEBRON is broken up into LEBRON (overall impact), O-LEBRON (offensive impact), and D-LEBRON (defensive impact). It is a measure of impact, not talent. As with our talent grades, it has an age growth curve: we expect players to improve over time (more rapidly when younger) and then decline later in their careers (more rapidly the older they get).

 

The Box Score

We use the box score (including points, rebounds, assists, etc.) to calculate a box score “prior” for each player, which will be a datapoint used in Regularized Adjusted Plus Minus (RAPM) calculations.

Instead of the RAPM calculations knowing nothing about a player (and pushing them towards a zero/average value), they’ll have this prior to regress towards, which increases accuracy.
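Here’s a minimal sketch of what regressing towards a prior (rather than towards zero) can look like, assuming the prior enters as a ridge penalty on each player’s distance from their prior. The function names, toy stints, prior values, and penalty strength `lam` are all illustrative assumptions, not our production code or settings.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_with_prior(x, y, prior, lam):
    """Minimize sum((y - x @ beta)^2) + lam * sum((beta - prior)^2)."""
    n = len(prior)
    # Normal equations: (X'X + lam * I) beta = X'y + lam * prior
    lhs = [[sum(row[i] * row[j] for row in x) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * target for row, target in zip(x, y)) + lam * prior[i]
           for i in range(n)]
    return solve(lhs, rhs)


# Hypothetical toy data: each row marks which of three players were on court,
# and each margin is that stint's point differential. Player 3 has one stint.
stints = [[1, 1, 0], [1, 1, 0], [1, 0, 1]]
margins = [10.0, 8.0, 6.0]
box_prior = [4.0, 2.0, -1.0]  # made-up box score priors

toward_zero = ridge_with_prior(stints, margins, [0.0, 0.0, 0.0], lam=5.0)
toward_prior = ridge_with_prior(stints, margins, box_prior, lam=5.0)
```

The less data there is on a player, the more the penalty term dominates and the closer their estimate sits to the prior. A player with no stints at all gets exactly their prior back.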

BoxLEBRON

For these calculations we’ll be using weightings from the box score component of Player Impact Plus-Minus (boxPIPM), a stat created by Jacob Goldstein. He and PIPM (formerly with BBall Index) are now with the Washington Wizards, Go-Go, and Mystics.

The coefficients for the box score component are listed in that PIPM writeup page.

Through our stabilization we modify this box score prior for the next steps of our LEBRON calculations.

Stabilization

To better handle low-minute players from previous years, and to better process data for future seasons, we need to determine whether high performance on a small sample is noise or real. To do that, we’ll be stabilizing/padding data by combining a technique outlined by Kostya Medvedovsky here with our Offensive Archetypes, which label players based on their jobs on offense.

What this approach does is determine the volume at which each box score statistic stabilizes (and becomes a good indicator of performance rather than noise). A tiny sample of outlier performance won’t get the math’s full buy-in, but sustained performance over a higher sample will be respected by the math.

Incorporating role allows us to treat the expected values component of that math with a bit more common sense. An Off-Screen Shooter, operating via pin downs and flare screens often for 3-point looks, won’t use the same average value in their calculations as a Roll & Cut Big, who does their work at the rim and rarely (if ever) takes 3-point shots.

Through these techniques, we’ll end up with a stabilized and role-adjusted version of boxPIPM as our box score prior.
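To make the padding idea concrete, here’s a rough sketch, assuming padding takes the common form of adding a fixed number of “phantom” attempts at the role-average rate to a player’s real sample. The function name, the role averages, and the padding amount below are hypothetical placeholders, not the values used in LEBRON.

```python
def stabilize(makes, attempts, role_rate, padding):
    """Blend an observed rate with a role-average rate.

    Adds `padding` phantom attempts at `role_rate` to the real sample, so small
    samples stay near the role average and large samples dominate it.
    """
    return (makes + padding * role_rate) / (attempts + padding)


# Made-up inputs: both players hit 6 of 10 threes in limited minutes.
ROLE_3PT_RATE = {"Off-Screen Shooter": 0.39, "Roll & Cut Big": 0.28}  # hypothetical
PADDING = 240  # hypothetical number of phantom attempts

for role, rate in ROLE_3PT_RATE.items():
    print(role, round(stabilize(6, 10, rate, PADDING), 3))
```

The same hot 10-shot sample lands in a different place for each archetype, because each is padded with what we’d expect from that role rather than from the league as a whole.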

 

The On-Off Calculations

That box score prior is then used during Regularized Adjusted Plus Minus (RAPM) calculations to derive LEBRON values. You can read about RAPM calculations here or here. Here’s the best excerpt from the Nylon article to help outline the approach RAPM takes:

“The idea behind adjusted plus-minus is that to get an accurate feel for a player’s value, we need to control for the presence of other players, both on offense and on defense. Before we get into the nitty gritty details, consider the general idea. Say you have three players and they are playing in a 2 on 2 basketball game. The plus-minus splits look like this:

P1 + P2 on the court: +10 points

P1 + P3 on the court: +8 points

P2 + P3 on the court: +4 points

Just from looking at this, you might reasonably guess that P1 is the best player on the court, but let’s do the math. This is a system of linear equations in three variables, so we can solve it algebraically to decide who contributed most to the team’s success:

P3 is a +1 player, P2 is a +3 player, and P1 is a +7 player. This is a simple example, but what I’ve done is parsed out each player’s contribution, controlling for the other players on the court. I’ve left minutes out of this but imagine that P1 and P3 play together a lot. This will make P3 look good even though P1 is doing most of the work.”
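The algebra in that excerpt can be checked in a few lines. This only works through the toy 2 on 2 example and is not RAPM itself; the variable names are just for illustration.

```python
# Pairwise plus-minus totals from the excerpt.
p1_p2, p1_p3, p2_p3 = 10, 8, 4

# Each player appears in exactly two pairs, so the three totals sum to
# twice the combined value of all three players.
everyone = (p1_p2 + p1_p3 + p2_p3) / 2

# Subtract the pair a player is absent from to isolate that player.
p1 = everyone - p2_p3
p2 = everyone - p1_p3
p3 = everyone - p1_p2

print(p1, p2, p3)  # 7.0 3.0 1.0
```

With real NBA data there are far more stints than players and the lineups overlap heavily, which is why a regularized regression is used rather than an exact solve like this one.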

We utilized Ryan Davis’ tutorial script for our RAPM calculations, and would recommend you do the same if you have interest in calculating these types of values.

 

Luck-Adjustments

That LEBRON on-off data is also variance/luck adjusted, a concept you can read more about here; credit for it goes to both Nathan Walker and Jacob Goldstein.

The idea behind adjusting for luck is that when evaluating players’ on-off court impact, it’ll be more accurate to adjust for factors in that on/off data that we can say with confidence are due to luck/variance rather than individual players’ abilities.

For example, my teammates shooting better on free throws when I’m in the game than they do when I’m out of the game has nothing to do with me but will help my +/- data. So this adjusts for that.
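As a rough sketch of that free throw example, one way to strip out the luck is to swap the points teammates actually scored at the line for the points we’d expect from their usual accuracy. This is an assumed simplification for illustration, not the exact adjustment used in LEBRON; the function name, player labels, and numbers are made up.

```python
def luck_adjusted_ft_points(attempts_by_shooter, expected_ft_pct):
    """Expected free throw points given who shot and their usual accuracy."""
    return sum(attempts * expected_ft_pct[shooter]
               for shooter, attempts in attempts_by_shooter.items())


# Made-up sample: teammates' free throws while one player was on the court.
attempts = {"Teammate A": 40, "Teammate B": 20}
actual_points = 52  # they went 52 for 60 in those minutes
usual_pct = {"Teammate A": 0.75, "Teammate B": 0.70}  # hypothetical career rates

expected_points = luck_adjusted_ft_points(attempts, usual_pct)
luck = actual_points - expected_points  # points attributed to variance, not the player
```

Using `expected_points` in place of `actual_points` in the on-off data keeps the credit for drawing the fouls while removing the hot or cold shooting that the on-court player had nothing to do with.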

 

Wins Added

Since LEBRON values are indicative of impact per 100 possessions on the court, we’ve also added calculations of aggregate impact for players. These values are calculated to show the total wins a player has added for that season. Thus, Wins Added.

The math behind these values leverages the weights from Jacob Goldstein’s formula, which looks at minutes and total impact, applied to LEBRON instead of PIPM.

 

Wrap Up

After all of this, we end up with a per 100 possession estimate of impact that is role-adjusted, stabilized, and utilizes luck-adjusted values along with RAPM calculations.

We have LEBRON (overall impact), O-LEBRON (offensive impact), and D-LEBRON (defensive impact) calculated.

Much like several other impact metrics, 0 is average and -2.7 is replacement level (estimated LEBRON for a G-League replacement added to the roster).

This combination, which would not be possible without the great work done by so many members of the NBA analytical community, should yield a value-add impact metric that’s pretty damn good. We will post results as future tests are conducted to further evaluate the accuracy of LEBRON.

Along with the release of this metric, we’ll also be releasing several interactive tools so you can have some fun with what’s been produced. For now, a Google Spreadsheet has been created to house those tools. In production are Shiny Apps of a higher production quality, which will allow you to explore this new metric on the site itself.

Among the tools you can utilize now in that spreadsheet are:

  • Player Lookup Tool: select a player and see their LEBRON, O-LEBRON, and D-LEBRON values for each season of their career charted out and in table form, as well as the percentiles for each of those values.
  • Player Comparison Tool: select 2 players and see their LEBRON, O-LEBRON, and D-LEBRON values for each season of their careers.
  • Player Lookup & Forecast Tool: select a player and see all the info from the Player Lookup Tool, as well as future projected values for that player.
  • Career Projection Tool: select a player and see their forecasted future LEBRON, O-LEBRON, D-LEBRON and Wins Added values until their projected retirement, as well as their estimated contract worth (adjusting for cap inflation) and the odds their Wins Added reaches specific levels of impact (Rotation Player vs Starter vs All Star, etc.)
  • Query Tool: utilize any combination of filters for age, season, position, minutes, and team to query a specific search within our database, as well as produce values indicating the minimum/10th percentile/25th percentile/50th percentile/weighted average/75th percentile/90th percentile/maximum values for that filtered group.
    • For example, you can see only 20-21 year old PGs from the past 3 seasons who have played 500+ minutes and see who pops up, as well as see data to help set expectations for LEBRON values for players within that group.

 

What’s Next?

Moving forward, LEBRON will also play a role in a more advanced metric that utilizes tracking data that we’re working on at BBall Index. Keep an eye out for that later this upcoming season.

I also feel obligated to share that we also have a less developed version of LEBRON (without the luck-adjustment) calculated but not published. We call that metric our Box prior Regularized ON-off Numerical plaYer estimate, or BRONNY.

 

Frequently Asked Questions

If your question isn’t answered here, feel free to reach out on Twitter to Tim or Krishna.

 

Where’s the Data?

Visit our LEBRON Database page or our interactive App to see the data.

 

Which players have the best LEBRON seasons in your database?

For some context, our database currently covers the 2009-20 seasons, so we won’t be seeing any Michael Jordan seasons here (yet).

Giannis has a season in 1st (2019-20) and in 3rd (2018-19), with LeBron (2009-10) in the middle. LeBron has the best overall career LEBRON data, and his name pops up frequently at the top of the leaderboard for the metric.

For O-LEBRON, LeBron James (twice), James Harden (4 times), Steph Curry (twice), Kevin Durant, and Nikola Jokic can each be found in the top 10.

For D-LEBRON, Dwight Howard (twice), Rudy Gobert (3 times), Andrew Bogut (twice), Giannis Antetokounmpo, Kevin Garnett, and Larry Sanders lead the pack.

 

Is this a measure of player talent?

No! LEBRON, like all other impact metrics, is an estimate of the total impact a player has on their team. This is a function of their talent, role/usage/deployment, team scheme, and fit within lineups.

Because of this, we may see players higher or lower than their true talent levels based on their individual circumstances. We might see role players perfectly optimized show up higher than they would on another team. We might also see players used poorly show up lower than their talent might suggest they should be.

Change the context even within the same season and you can impact LEBRON values. Hassan Whiteside, for example, becomes a far less positively impactful defender in a playoff environment where he’s targeted more in ball screens and in isolation. But for his 2019-20 season as a whole, his defense was positively impactful at a high rate given his role and situation.

At BBall Index we have a separate set of talent metrics to evaluate talent.

 

How does LEBRON differ from other impact metrics?

LEBRON is the only impact stat utilizing the full bevy of techniques identified above, with role-adjusted, stabilized, and luck-adjusted values utilizing actual RAPM calculations (not estimates of RAPM calculations).

There are other impact metrics out there, but here are notes on just a few commonly used ones:

Box Plus Minus, for example, is limited to just the box score and will miss out on the luck-adjustment, the stabilization, and the value that on/off data provides.

ESPN’s Real Plus-Minus has been mostly a black box, so it’s hard to comment on how our metric differs from theirs.

538’s Robust Algorithm using Player Tracking and On/Off Ratings utilizes tracking data that LEBRON does not (but our next gen metric will). It does not leverage the same luck-adjustments, role-adjustments, or stabilization as LEBRON, and it uses an alternative approach to LEBRON’s RAPM calculations for the on/off component of the metric.

Jacob Goldstein’s Player Impact Plus-Minus, which is no longer in the public realm, was structured similarly to LEBRON but utilized estimated RAPM calculations and is constructed without the role-adjustments and stabilization that LEBRON utilizes.

These are fairly high level comparisons. For a deeper dive, check out the writeup pages for those metrics.

Here’s a look at how LEBRON compares with some other impact metrics in terms of error rates and success at predicting future game outcomes.

 

What does it tell us about a player’s future if their current LEBRON values are poor?

The LEBRON values only tell you what they tell you, which is what has happened. Players grow over time (and also decline with older playing age), so young players with poor LEBRON values aren’t necessarily going to be bad in the future. In fact, many successful NBA players start their careers with negative impact data that improves as they age and develop.

For example, the average LEBRON value for 19-year-olds, weighted by minutes played, is -1.53. Being able to see contextual data like this in our Query Tool allows for a fairer evaluation of players relative to their age.
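For anyone wanting to reproduce that kind of minutes-weighted average on their own export of the data, here’s a small sketch. The player-seasons below are made up purely to show the arithmetic (they are not from our database), and the function name is just illustrative.

```python
def minutes_weighted_average(rows):
    """Average LEBRON across (lebron, minutes) rows, weighted by minutes."""
    total_minutes = sum(minutes for _, minutes in rows)
    return sum(lebron * minutes for lebron, minutes in rows) / total_minutes


# Made-up player-seasons: (age, LEBRON, minutes played).
seasons = [(19, -2.0, 1500), (19, -0.5, 500), (19, -3.0, 200), (24, 1.2, 2400)]

# Filter to the age group of interest, then weight each season by its minutes.
teenagers = [(lebron, minutes) for age, lebron, minutes in seasons if age == 19]
print(round(minutes_weighted_average(teenagers), 2))  # -1.75
```

Weighting by minutes keeps a handful of very low-minute seasons from dragging the group average around.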

Within our interactive spreadsheet, our forecasting tools project future LEBRON values based on age growth curves we’ve calculated. Those are the expected average values based on what’s happened in the past, and better/worse optimization of a player or high/low development rates can impact what the future holds.

Does LEBRON account for the playoffs?

We have separate Playoff LEBRON values that use multiple seasons of data due to the small samples we get from individual playoffs. You can find that data here. These values should be used in a reflective way rather than interpreted as predictive, due to the small sample and confounding variables of unbalanced strength of schedule and how impactful tactics from an individual series can be.