One of the most random events in football is the interception. We can say that not just because of tipped balls or freak gusts of wind, but because we know that interception rates for teams and for individual quarterbacks vary widely from season to season. They also vary greatly from one half of the season to the next, which suggests that more is at play than player ability.
In a recent post I estimated the proportion of team win-loss records that can be attributed to sample error in an effort to demonstrate the technique.
I looked at the interception rates (interceptions per attempt) for all ‘qualified’ NFL quarterbacks from 2002 through 2009. To qualify, a QB must have thrown a certain proportion of his team’s pass attempts. The overall average interception rate is 2.9%. The standard deviation, from QB to QB, is 0.94%, which means about 2/3 of the QBs will have interception rates that fall somewhere between 2% and 4%.
To calculate the variance due to randomness, I couldn’t directly use the formula from the binomial distribution. Unlike team seasons, which all have 16 games, QBs have varying numbers of attempts during a season. Instead, I used a random simulation to estimate the random variance. I started with the premise that all interceptions were completely random. What if every one of a quarterback’s pass attempts had a 2.9% chance of being intercepted, regardless of who he is or who the opponent is? What would the distribution look like then? How different would this purely random distribution be from the actual distribution we observe in the real NFL?
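To illustrate why a single binomial formula does not fit, here is a minimal Python sketch of how the purely random spread in interception rate depends on the number of attempts. The 2.9% baseline comes from the text; the attempt counts and the function name `binomial_rate_sd` are hypothetical, chosen only for illustration.

```python
import math

BASELINE = 0.029  # league-wide interception rate quoted in the text


def binomial_rate_sd(p: float, attempts: int) -> float:
    """Standard deviation of an observed rate after `attempts` independent trials."""
    return math.sqrt(p * (1 - p) / attempts)


# Hypothetical season workloads, not real QB data.
for attempts in (250, 400, 600):
    sd_pct = binomial_rate_sd(BASELINE, attempts) * 100
    print(f"{attempts} attempts: random SD of about {sd_pct:.2f} percentage points")
```

Because the spread shrinks as attempts grow, a group of QBs with different workloads has no single binomial standard deviation, which is one reason a simulation is a convenient alternative.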
This is similar to a study Chase Stuart did last year at Footballguys. He created 25 notional QBs, each with an identical interception rate. He demonstrated that after 500 attempts each, there could be a wide range (from 9 to 22) in total interceptions due to sample error alone.
I did things differently. I re-ran the past eight NFL seasons’ worth of pass attempts as a random simulation. Each real QB had the same number of attempts as he did in the real seasons, but his simulated number of interceptions was a purely random function based on the baseline 2.9% interception rate. After simulating hundreds of seasons, I looked at the resulting distributions. The mean interception rate was, by construction, the same as the actual rate at 2.9%, but the standard deviation was smaller, as we’d expect. The purely random simulation had an average SD of 0.85%, compared to an SD of 0.94% for the actual observed seasons.
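The simulation described above can be sketched roughly as follows. This is not the original code: the attempt counts are made-up placeholders standing in for the real 2002-2009 QB seasons, and the function names are hypothetical. Only the 2.9% baseline is taken from the text.

```python
import random
import statistics

BASELINE = 0.029  # league-wide interception rate quoted in the text


def simulate_season_rates(attempts_per_qb, p=BASELINE, rng=random):
    """Return one simulated interception rate per QB-season.

    Every attempt is intercepted with the same probability `p`,
    regardless of who the quarterback is.
    """
    rates = []
    for attempts in attempts_per_qb:
        picks = sum(1 for _ in range(attempts) if rng.random() < p)
        rates.append(picks / attempts)
    return rates


def average_random_sd(attempts_per_qb, n_runs=200, seed=0):
    """Average the QB-to-QB SD of interception rate over many simulated runs."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    sds = [
        statistics.pstdev(simulate_season_rates(attempts_per_qb, rng=rng))
        for _ in range(n_runs)
    ]
    return statistics.mean(sds)


# Hypothetical attempt counts; the real study used each QB's actual attempts.
attempts = [320, 410, 455, 480, 510, 530, 560, 600]
print(f"average random SD: {average_random_sd(attempts) * 100:.2f}%")
```

With real attempt counts in place of the placeholder list, the printed figure would be the counterpart of the 0.85% random SD reported above.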
The proportion of randomness of interceptions is therefore:
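The formula itself is not reproduced here. As a hedged sketch, assuming the usual approach of dividing the random variance by the observed variance (which in turn assumes that skill and luck are independent, so their variances add), the two standard deviations quoted above give:

```python
# Standard deviations in percentage points, as quoted in the text.
random_sd = 0.85    # purely random simulation
observed_sd = 0.94  # actual QB seasons

# Assumption: share of randomness = random variance / observed variance.
random_share = random_sd ** 2 / observed_sd ** 2
skill_share = 1 - random_share

print(f"random share of variance: {random_share:.0%}")
print(f"remaining share: {skill_share:.0%}")
```

Under that assumption, a little over four-fifths of the season-to-season variance in interception rate would be attributable to sample error.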
In the graph below, you can see what I'm talking about. If interceptions were purely random, we'd expect to see a binomial distribution centered near an average rate of 2.9%. The red line is just one set of 8 seasons of randomly generated interception rates. The blue series represents the actual distribution of interception rates over the past 8 seasons. They're not much different. The random distribution is taller and narrower. The fact that the actual distribution is wider and flatter indicates the degree of skill involved.
We can see from the graph that some coaches may tend to have a boiling point--a limit beyond which they will not tolerate any more interceptions. Once a QB gets to just under a 4% interception rate, he'll be replaced. If true, this would create a selection bias in the actual distribution. Without this bias, the underlying actual distribution would be slightly wider, increasing its variance.
Certainly interceptions are due to skill at some level. If you put me in at safety on an NFL team, chances are I'm not going to reel in a single pick. Or if you put me in at quarterback, I'd throw 10 interceptions per game. Obviously, speed, experience, and skill all play a part. The thing is, average Joes like me aren't out there playing in the NFL. The skill levels of the participants are relatively equal, even for the very best and very worst of active players, and that's part of why the noise of randomness is so large compared to the signal of skill.