Expected Points: Providing Context

This is perhaps only indirectly related to the strategic post I’ve been looking to expand upon, but it’s important nonetheless, and, in my mind, more immediately relevant.  The topic, as the title suggests, is Expected Points, the concept developed by Brian Burke at AdvancedNFLStats.com.

I’ll spare you the full explanation because I think most readers here are aware of it, but basically it assigns a point value to each combination of down, distance, and field position to provide a measure of how valuable each situation is.  For example, 1st and 10 at the 50 yard line is worth MORE than 1st and 10 at a team’s own 20 yard line.  Simple enough.  However, many people (myself included) have been a bit too cavalier in using the Expected Points concept to evaluate in-game strategy.

Today, I want to clear that up.  There is a major limitation to the Expected Points concept; that is, it’s an AVERAGE, and it doesn’t account for the relative strengths and weaknesses of each team.  Logically, a certain field position is worth more to the Broncos than it is to the Buccaneers, right?  Similarly, a particular field position is worth LESS against the Chiefs defense than it is against the Eagles.  Unfortunately, that’s not accounted for, limiting the usefulness of the Expected Points analysis.

I’m not saying it’s worthless, far from it in fact; I’m just saying that we need to remember that the EP analysis for a given situation reflects average teams, and therefore must be adjusted when applying it to real-world situations.

Allow me to demonstrate (as you knew I would).

Here is the chart of EP value for 1st downs.

[Chart: average EP value by yard line on 1st down]


The problem, as I mentioned above, is adjusting for relative strength.  To explain, we need to pick 2 teams.  For the sake of clarity, and consistency, we’ll use Denver and Jacksonville (FO’s best and worst teams by DVOA).

Let’s just look at Denver’s offense.  To adjust, we need to know how the Broncos offense compares to average.  Luckily enough, Football Outsiders provides us with a measure of just that.  Note that, for now, I’m knowingly glossing over the fact that FO’s DVOA might not be the best measure here.

Anyway, Denver’s offense, according to FO, rates 40.9% better than league average.  Well that makes things easy, right?  All we have to do is increase the average EP value at each yard line by 40.9% and re-graph.  Well here’s that graph:

[Chart: 1st-down EP values scaled up by Denver’s 40.9% offensive rating]

I’m guessing you all see the problem, but if not, hang in there, I’ll get to that in a second.  We ALSO have to account for Jacksonville’s relative strength/weakness on defense.   Just as I did above, I can just use FO’s rating (17.9%) to adjust again, right?

Well here it is:

[Chart: 1st-down EP values adjusted for both Denver’s offense and Jacksonville’s defense]


Perfect… now we have the adjusted EP value for the Broncos versus the Jaguars.  We can do the same calculations as before, using these values, to determine the “optimal” play-call (where optimal means maximizing expected points).

Only…there’s still that problem I mentioned above, which by now EVERYONE has noticed.  Looking at the graph above, it’s pretty clear that we’ve made a mistake in our analysis.  Touchdowns are only worth 6 points, which means no field position can be worth more than that (violated above).  Moreover, since, regardless of relative strength, scoring a TD can never be 100% assured, we shouldn’t even see a value of 6 anywhere on the chart.
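To make the violation concrete, here is a minimal sketch of the naive adjustment in Python. The two ratings (40.9% and 17.9%) and the 6-point ceiling come from the discussion above; the EP values themselves are made-up placeholders for illustration, not numbers read off Burke's chart, and the function name is hypothetical.

```python
# Hypothetical average 1st-down EP values, keyed by yards from the
# opponent's end zone. Illustrative placeholders only, NOT Burke's data.
AVERAGE_EP = {90: 0.4, 70: 1.5, 50: 2.6, 30: 3.8, 10: 5.0, 1: 5.9}

OFFENSE_RATING = 0.409  # Denver offense vs. league average (from the post)
DEFENSE_RATING = 0.179  # Jacksonville defense adjustment (from the post)
TD_POINTS = 6.0         # ceiling used in the post (PAT not counted)


def naive_adjust(ep, offense=OFFENSE_RATING, defense=DEFENSE_RATING):
    """Scale an average EP value by both ratings, as done for the charts."""
    return ep * (1 + offense) * (1 + defense)


adjusted = {yards: naive_adjust(ep) for yards, ep in AVERAGE_EP.items()}

# Any yard line whose adjusted value reaches the TD ceiling is impossible.
violations = {yards: ep for yards, ep in adjusted.items() if ep >= TD_POINTS}

for yards, ep in adjusted.items():
    flag = "  <-- exceeds a touchdown" if yards in violations else ""
    print(f"{yards:>2} yards out: {AVERAGE_EP[yards]:.2f} -> {ep:.2f}{flag}")
```

Because the two ratings compound to a multiplier of roughly 1.66, any average value above about 3.6 gets pushed past 6 under straight multiplication, whatever the real chart values are.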

Basically, as the mismatch becomes more and more favorable to the offense, the line should approach a limit of 6, but never reach it.  Unfortunately for us, that complicates our plans.  How do we account for this?

I’m not sure, but I do have one potential out.  

We can ignore everything I did above (ugh) and go back to using the average values for EP.  Rather than accounting for the relative strength here, we can instead adjust the expected success rates to account for relative strength.  That raises its own issues, but it seems to be more intuitive.

HOWEVER, that only addresses the problem when we’re using a combination of EP and Success Rates to game out the Optimal Value of a certain situation.  It does not address the issue when we are only using EP (which is how many analysts are using it.)
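A rough sketch of that alternative, in Python: leave the average EP values alone and adjust only the success rate. Everything numeric here except the two ratings is a hypothetical placeholder (the 50% base conversion rate and the EP values for success and failure), and scaling the rate by the ratings and then clamping it is just one crude assumption about how the adjustment might work.

```python
def clamp_probability(p):
    """Keep a probability inside [0, 1]."""
    return max(0.0, min(1.0, p))


def adjust_success_rate(base_rate, offense_rating, defense_rating):
    """Scale a league-average success rate by both ratings (an assumption)."""
    return clamp_probability(base_rate * (1 + offense_rating) * (1 + defense_rating))


def expected_value(success_rate, ep_success, ep_failure):
    """Weight the average EP of each outcome by its probability."""
    return success_rate * ep_success + (1 - success_rate) * ep_failure


# Hypothetical 4th-down situation: all three inputs are placeholders.
BASE_RATE = 0.50    # league-average conversion rate
EP_SUCCESS = 3.0    # average EP after converting
EP_FAILURE = -1.5   # average EP after turning the ball over on downs

average_value = expected_value(BASE_RATE, EP_SUCCESS, EP_FAILURE)
adjusted_rate = adjust_success_rate(BASE_RATE, 0.409, 0.179)
adjusted_value = expected_value(adjusted_rate, EP_SUCCESS, EP_FAILURE)

print(f"average teams: {average_value:.2f}")
print(f"adjusted:      {adjusted_value:.2f} (success rate {adjusted_rate:.3f})")
```

The appeal of this version is that the result is a weighted average of two ordinary EP values, so it can never exceed the larger of them, no matter how lopsided the matchup.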

That brings me back to the main point: be careful when using Expected Points to justify in-game strategic decisions.  It can be done (and EP remains the best model for quantifying such situations), but you have to remember that the actual results will be distorted by the relative strength of the teams.

If that reminds you of this E = R ((60 – T) / 60) + C then I’ve done my job.  I apologize for the slap-dash way I’m addressing the overall concept, but it is what it is…I don’t have the luxury of taking the time to plan these things in advance.

I’ll try to return to this soon, hopefully using a real situation from Sunday to game out the options using our adjusted success rates.

Next up, though, we need to talk about Bayes and reconciling New Information (what happens during the game) with what we already “knew” (R before the game starts).

Eventually, I do believe it will be possible to create an algorithm that essentially tells you what the “optimal” decision is in every Go/Kick/Punt situation.  We’re kind of there already, on an average basis.  The key, of course, is to adjust for the teams involved in order to allow actual reliance and use (not that NFL coaches would ever admit that a computer can make better decisions than they can).  At the very least, it’ll provide a valuable guide with which to grade coaches.

This should probably be a permanent end note, but all comments are encouraged.  As I said, I’m trying to develop a useful model here, so suggestions are always welcome.


7 thoughts on “Expected Points: Providing Context”

  1. I love this stuff!

    Three thoughts: Does the chart approach 6, not 7, because even 1st and goal from the 1 has a net EP of 6 due to fumbles and FGs and stuff, or because we’re not counting the PAT as part of the EP? The latter would seem dubious to me. All of us football fans easily learned our 7 times tables because that’s what happens most frequently. If kickers are 98% accurate on PATs, let’s call the TD worth 6.98, but not 6. (Chip and the 2-point-conversion revolution could create an EP over 7, but I’m still waiting for him to go for it on more fourth downs.)

    Second: Your point about the TD never being inevitable made me think of something. What if you could find a way to apply the Denver offensive bonus to the difference between Optimal (say 6.98) and any given EP? So if Denver gains a first down at their opponent’s 35, EP=3, Optimal is 6.98. Denver will get more of those 3.98 points than anyone else. For fun, 40% of the 3.98, added back to the 3, gives an EP of 4.6. The Jags, conversely, get a larger number than 3.98 to be subtracted from 6.98. 17% more than 3.98 is roughly .7 of a point. So the Jags are about 4.7 away from optimal (2.3).

    I’m not sure that you can just use the DVOA 40%. Somehow that math doesn’t seem correct. To test any formula, you’d need a lot of data from teams with various DVOAs and then check how they performed relative to EP. But if someone told me they measured all year and the Broncos EP from the 35 was 4.6 and the Jags was 2.3, I’d find it believable.

    Last: EP is powerful. I should think it would be a requirement of any modern coach to understand the concept and have a basic idea of what down, distance and field positions were worth. Or have a sideline coach who could say something along the lines of, “A punt here needs to be inside the three to maintain EP”.

    • First, let me agree that just slapping FO’s % on there isn’t necessarily a great method. I mentioned that in the post, but I just used it here because it was the clearest way to illustrate the bigger point. I’m not sure yet where I come out on actually using it for the calculations (I think it might be less objectionable when applied to success rates rather than EP).

      Your point about % of Optimal is interesting. I’ll have to think it through, but that could be a good way of adjusting it while maintaining our limits.

      Lastly, agree 100% on the value of the concept. Should be used more often, but as I was trying to show, it’s just not as simple as punching it into the EP calculator and going with that.

  2. I do not see a problem with using EP unadjusted: no team, however strong, can expect more than a TD on any play, and even the best offense against the worst defense very rarely scores if it starts at its own 1-yard line.

    • So your argument is basically that the absolute difference in EP won’t be significant after adjusting. I think that’s half true. It’ll be small in most cases, especially very far from and very close to the end zone, but in between I think it will be large enough to affect the “optimal” calculations. That area, obviously, is where most of the close decisions get made.

      However, we won’t know for sure until we find a way to adjust the calcs with some degree of accuracy.

  3. Burke’s EP model doesn’t make any adjustment for contextual information, so the point you’re trying to make is valid, but the way you’re making it is a pretty blatant strawman.

    I’m not sure what kind of manipulation was used to change the chart, but it’s pretty clear that a strong offense against a weak defense should have a flatter EP slope, and that a weak offense against a strong defense should have a steeper EP slope. The presented charts exhibit the opposite trend.

    Michael Beuoy did a more sophisticated and sensible analysis about a year ago:

    One of the fundamental issues that appears almost immediately is that there’s no clear consensus on the relative strength of teams, and the data for Burke’s original model is already quite sparse, so it’s really difficult to have any confidence in anything.

    > Touchdowns are only worth 6 points, which means no field position can be worth more than that (violated above).

    In Burke’s model, touchdowns are worth 7 points, but there’s a discount for giving up possession to the other team, so the net expected points are less. In a gross mismatch, the weaker team may well have negative expected points when it receives the kickoff, and that would push the expected net points of a touchdown above 7. (In a video game world, a team could also have a high expectation of recovering an onside kick, or of converting 2-point tries, and that would also increase net expected points past 7.)

    Burke presents information about his model at his site:

    • Agreed, the slope should definitely be different. The question is: what’s an accurate way to do that? Maybe the link you included answers that, I haven’t read it but will definitely check it out.

      I see your point regarding the max value of expected points; however, I think that’s a slightly different view of things than what I was going for. If a heavily mismatched team is on its own 1 yard line, its expected points will be negative. However, that doesn’t mean that a TD from there is then worth more than 7 points.

      Basically, I think we’re each talking about different values. You’re looking at potential performance ABOVE expectation, which can be greater than 7 (the situation I mentioned above). The discrete EP value, though, can’t be greater than 7. I can probably say that more clearly, but let me read the article you linked to first, maybe that’ll show me something I’m missing.

      • “…The question is: what’s an accurate way to do that?…”
        My first order approximation would be to think of EP as geometric, so the slope will change more further from the goal line, but I can really only think sensibly in generalities like that.

        “…Basically, I think we’re each talking about different values. You’re looking at potential performance ABOVE expectation, which can be greater than 7 (the situation I mentioned above). The discrete EP value, though, can’t be greater than 7….”
        Actually, the situation I was referring to is when a team is favored enough that it has a positive expectation, even when it’s kicking off to the other team. Then a touchdown is going to be worth 7 expected points plus whatever that positive expectation is. Therefore a situation with an almost certain TD (like 1st and goal from the other team’s 1) can plausibly have a value of more than 7 expected points.
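
The “% of Optimal” idea from the first comment can be sketched in a few lines of Python. The idea is to apply a team's rating to the gap between the current EP and the ceiling rather than to the EP itself, so the result cannot cross the ceiling for any rating of 100% or less. The function name is hypothetical; the inputs (an EP of 3 at the opponent's 35, a ceiling of 6.98, and ratings of +40% and -17%) are the commenter's own example figures, and whether a DVOA-style percentage is the right rating to plug in remains an open question.

```python
def adjust_toward_ceiling(ep, rating, ceiling=6.98):
    """Shrink (or widen) the gap between EP and the ceiling by `rating`.

    A positive rating closes part of the gap; a negative rating widens it.
    """
    gap = ceiling - ep
    return ceiling - gap * (1 - rating)


AVERAGE_EP_AT_35 = 3.0  # the commenter's example value, not a measured one

strong_offense = adjust_toward_ceiling(AVERAGE_EP_AT_35, 0.40)
weak_offense = adjust_toward_ceiling(AVERAGE_EP_AT_35, -0.17)

print(f"strong offense: {strong_offense:.2f}")
print(f"weak offense:   {weak_offense:.2f}")
```

Computed without rounding along the way, the two results are about 4.59 and 2.32. Nothing in this sketch stops a weak enough team from dropping below the lowest value on the average chart, so the low end would need its own limit.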
