Statistical Chatter: Qualifying the value of NFL head coaches

Jonas Trostle/The Miscellany News

There have been 60 National Football League head coaches since 2014. Considering 28 of them have had to seek different employment in the intervening years, we might ask: How can we judge a head coach to be good or not?

In my attempt to tackle this question, I used two multilevel Bayesian models: one with 60 varying intercepts for rushing, and another with 60 varying intercepts and 60 varying slopes for passing. The end goal was to provide a very rough approximation of the value of each head coach. I chose the multilevel technique because it offers a middle ground between fitting a separate model for each coach and putting them all into the same model without accounting for the variation between them; the varying slopes and intercepts allowed me to adjust for this.

The first model was used to produce the average expected points added (EPA) from a running play under each head coach, with no extra variables controlling for running back skill, because running back skill has a minimal effect. The second model examined the average EPA from a pass play, but with a wrinkle: Some head coaches have better quarterbacks. So, how do we mitigate the effect of better quarterback play when we really want to look at coaching? One method, which I use in this analysis, is to condition on some proxy of quarterback skill so that its effect on our head coach output is dampened.

The most broadly useful proxy for quarterback skill is completion percentage over expectation (CPOE), which measures how good a quarterback is at completing passes compared to average after taking into account things like pass length. In using this measure, we have to make two assumptions: one, if a quarterback’s EPA is different from what CPOE would predict, that difference is wholly attributable to the play design (and thus the coach), and two, play design has no effect on CPOE. These are both wrong (wide receiver skill may explain some of the difference, and plays can be designed to have wide-open receivers), but they are useful enough so long as we go in with the understanding that we are not perfectly eliminating quarterback skill, and that some elements of play design may not be captured.
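To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not the Bayesian model used for this article. It swaps in two simpler stand-ins: a shrinkage estimate that pulls each coach's average toward the league average (the "middle ground" that partial pooling provides), and an ordinary least-squares fit of EPA on CPOE whose leftover residuals are credited to the coach. The coach names, the play values, the function names and the `pseudo_count` setting are all made up for illustration, and the sketch uses a single league-wide CPOE slope rather than one slope per coach.

```python
from statistics import mean


def partial_pool(groups, pseudo_count):
    """Shrink each group's mean toward the grand mean.

    groups: dict mapping a coach name to a list of per-play values.
    pseudo_count: assumed strength of the pull toward the league average;
    0 means no pooling, a very large value means complete pooling.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    pooled = {}
    for name, values in groups.items():
        n = len(values)
        pooled[name] = (n * mean(values) + pseudo_count * grand_mean) / (n + pseudo_count)
    return pooled


def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cpoe_adjusted_residuals(plays):
    """Group by coach the part of pass EPA that CPOE does not predict.

    plays: list of (coach, cpoe, epa) tuples.
    """
    intercept, slope = ols_fit([p[1] for p in plays], [p[2] for p in plays])
    residuals = {}
    for coach, cpoe, epa in plays:
        residuals.setdefault(coach, []).append(epa - (intercept + slope * cpoe))
    return residuals


# Hypothetical rushing EPA per play; Coach B has far fewer plays.
rush_epa = {
    "Coach A": [0.10, -0.30, 0.05, -0.20, 0.15, -0.10],
    "Coach B": [0.40, 0.60],
}
rush_estimates = partial_pool(rush_epa, pseudo_count=4)

# Hypothetical pass plays as (coach, CPOE, EPA).
pass_plays = [
    ("Coach A", 2.0, 0.30), ("Coach A", -1.0, 0.05), ("Coach A", 0.5, 0.20),
    ("Coach B", 3.0, 0.10), ("Coach B", 1.0, -0.05), ("Coach B", -2.0, -0.30),
]
pass_estimates = partial_pool(cpoe_adjusted_residuals(pass_plays), pseudo_count=4)
```

In this toy data, Coach B's two rushing plays are pulled much further toward the league average than Coach A's six, which is the behavior a multilevel model gives to coaches with short tenures. A full model would estimate the amount of shrinkage from the data instead of fixing it by hand.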

Combining the outputs of the two models, we get the above chart. On the vertical axis is per-play pass EPA, and on the horizontal is per-play rush EPA. The higher up a coach is, the better passing efficiency they have had during their tenure(s); the further to the right, the better the rushing efficiency. Top right is good at both, bottom left is bad at both. Notice the entire horizontal axis is negative; rushing, except in particular circumstances, is less efficient than passing.

Turning to some of the labeled points in the upper right rectangle, we have a healthy mix of the surprising, interesting and expected. At the very top is the Rams’ Sean McVay, the “whiz” who, after being hired for the 2017 season, crafted some of the NFL’s best offenses, with Jared Goff and Todd Gurley as the focal points. Goff is average to below-average when it comes to completing passes, and Gurley was supplanted by C.J. Anderson and later cut, so it’s no surprise to see McVay take the top spot after making two mediocre players look like stars.

Next to McVay is Chiefs head coach Andy Reid, who earns an unsurprising spot for a known offensive guru, but it’s certainly higher than expected if you think that Patrick Mahomes is the best quarterback in the league. Take a moment to ponder what it would look like if Mahomes were instead a Falcon, and Matt Ryan were starting for Kansas City. Would Ryan be the one coming off a Super Bowl victory just a year after a historic 2018 season?

Rounding out our top-right corner trio is Bill Belichick. Of the three, he’s the hardest to separate from his quarterback. Goff spent a season without McVay, and it was awful; Reid had Alex Smith and Chase Daniel before Mahomes; but Belichick, outside of 21 games since 2001, has only had Tom Brady. However, if we judge Brady on how well he would have performed based on his accuracy alone, we can pretty safely say that he was buoyed substantially by the system around him, at least since 2014.

The rushing efficiency of these coaches is interesting in another respect. If we now include Sean Payton and Jason Garrett, we can see that these five are situated pretty closely horizontally. What’s obscured is how these teams got there. McVay is the most distinctive, since his rushing efficiency comes from running into light boxes and using his receivers as additional ball carriers. Reid has also shown a tendency to put his runners in the best position, while also having had the privilege of working with Jamaal Charles, one of the most efficient running backs of this generation. Garrett, Belichick and Payton all share a similar reason for why they rush efficiently: quarterback sneaks. Quarterback sneaks, while rare, usually occur at the highest-leverage moments when it comes to EPA: third and fourth and short, or on the goal line. This means that they can greatly influence the overall efficiency. Brady, Brees and Romo were all masters of this art, and Dak Prescott has done well enough too. Coupled with very strong offensive lines, it’s no wonder that their efficiency was so much higher than average.

Before turning to the poster boy for bad coaches, we should ask ourselves why Dan Quinn, a defensive-minded coach from Pete Carroll’s tree, is so high in pass efficiency. The answer can be found immediately to the right of Quinn. Kyle Shanahan is currently doing wonders for the 49ers offense, but before he was head coach there, he was the offensive coordinator under Quinn in Atlanta. It seems far off now, but there was a point in time when Kyle Shanahan stood on the sideline and watched Julio Jones stride into the end zone to put the Falcons up 28-3 in the Super Bowl. This does highlight a problem with this chart of head coaches, as it’s impossible to separate them from their offensive coordinators, but over time successful coordinators drift to other teams and their effect is mitigated.

We won’t dwell too long in the basement of disappointment, but we need to talk about Jets Head Coach Adam Gase. There are 32 NFL head coaches, and somehow Gase is still one of them. His tenure with the Jets has not been easy, with a string of subpar quarterbacks and oft-injured first-round pick Sam Darnold. Despite the odds, however, the bad quarterbacks under Gase have played worse than would be expected given their natural abilities. And not a little worse either, but much, much worse. Ryan Tannehill, who spent time languishing under Gase in Miami, only bloomed once he went to Tennessee. To drive the point home, going from Gase to average is as big an improvement as going from average to someone like McVay or Reid, the aforementioned offensive masterminds.

Once again, I stress that this is not a perfect way to evaluate head coaches. But the fact that it broadly matches my intuition and general consensus lends some weight to its evaluations. This multilevel model is only a starting point, but it allows us to better quantify in what ways coaches are different and what makes a good coach good.

One Comment

  1. Thanks for the article. I like the fact that you adjusted for CPOE.
    That said, I am thinking that another way to look at this is in conjunction with the percentage of attempts that are passes versus runs.
    For example, Caldwell is efficient when dialing up a pass. How often is he passing? Of course, this raises another question, but I think it makes for a good discussion. Assume he passes just 50% of the time, and it is the third lowest in the league. On the one hand, it suggests he should pass more. On the other hand, it might suggest that since he passes less than average, he is able to take advantage of teams stacking the box, etc., and therefore his pass efficiency is high. Either way, great article. Thanks.
