The last post raised a couple of issues (through the comments and Twitter) with the Yards Allowed stat. Clearly it’s not a perfect measure of defense. As a result, I am pulling together some other defensive metrics and will run the same analysis, in hopes of shedding some light on the relative importance of defense.
In the meantime though, let’s look at why Yards Allowed is a flawed measure. Hint: it’s a problem we have faced before. Teams that are losing by a lot will throw more often, gaining more yards and distorting the defensive ranking of the winning team.
Before we get to that though, we need to examine just how big the potential problem is. In my previous post on Pass Play %, I hypothesized that there is some skew, but that when looking at every play run in the NFL every year, relatively few of them are done by teams focused on anything except maximizing points scored, which should minimize the effects on the overall data set.
To get a better look, I put together a new analysis, comparing points scored to the share of yards allowed that came through the air. The hypothesis is that teams that score a lot of points will force their opponents to pass more, resulting in a positive correlation between points scored and % of Pass yards allowed. I expect a positive result, as there’s a pretty logical case for correlation. The real question is how strong the result is. If it’s weak, then my prior statement regarding overall skew effects may still stand; if it’s strong, then we have to re-evaluate (and do a lot more work).
Here is the chart. Note the Y axis is just Pass Yards Allowed/Total Yards Allowed.
Dammit. The correlation value is 0.40, so moderate, but definitely a little higher than I expected. Unfortunately, this means we’ve got some work to do if we want to remove the noise resulting from this issue. That will be my goal for the next couple of weeks, interspersed with other topics for which I’ve found interesting data.
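For anyone who wants to run this kind of check themselves, here is a minimal sketch of the calculation in Python. The rows in `seasons` are made-up placeholder numbers, not the data behind the chart, and the helper names `pass_share` and `pearson_r` are just mine for illustration. It also assumes Total Yards Allowed is simply pass yards plus rush yards allowed.

```python
from math import sqrt

# Hypothetical team-season rows, invented purely for illustration:
# (points scored, pass yards allowed, rush yards allowed)
seasons = [
    (450, 4000, 1500),
    (380, 3700, 1700),
    (300, 3400, 1900),
    (250, 3300, 2100),
    (410, 3900, 1800),
]


def pass_share(pass_yards, rush_yards):
    """Pass yards allowed as a fraction of total yards allowed.

    Assumes total yards allowed = pass yards + rush yards.
    """
    return pass_yards / (pass_yards + rush_yards)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


points = [pts for pts, _, _ in seasons]
shares = [pass_share(p_yds, r_yds) for _, p_yds, r_yds in seasons]

r = pearson_r(points, shares)
print(f"Correlation between points scored and pass share allowed: {r:.2f}")
```

Swap in real team-season totals for the placeholder rows and the printed value is the same kind of number quoted above.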
We’ll attack it from two angles. First, as I mentioned at the top, we can use other defensive metrics that indicate strong defense but aren’t as susceptible to the same problem.
Second, we can try to use a better metric for overall defense. There are some interesting stats out there on sites like Football Outsiders that we might be able to use. I’ve got a few in mind but let me know if you have any particular favorites. Let me caution you though: stats that look really complicated may be subject to over-fitting, which would produce really powerful results in analyses like the one above but actually tell us very little useful information.
We can also try to create our own, though I’ll only do that if people are really interested. The idea of a relatively simple stat that incorporates some major defensive measurables sounds intriguing though. If you’ve seen something like this, let me know so I don’t waste time duplicating it.