Trying To Quantify The Impact Of Bad Calls

Every fan of a football team, college or pro, has complained about officiating and bad calls against their team at some point. Fans of the Cleveland Browns are no different. Many Browns Backers are absolutely convinced that the refs simply don’t call the game fair and square and that the bad calls–or the no-calls, in the case of the stubborn refusal of game officials to call the obvious holds on Myles Garrett–always seem to go against the Browns.

Now a frustrated Browns fan has performed a data-oriented analysis of penalties and, not surprisingly, he has concluded that the Browns have been hurt more than any other team in the NFL by flags. He’s used the information about the number and circumstances of penalties to calculate the impact in terms of EPA, for expected points added, with respect to each penalty assessed against a team. His analysis indicates that the Browns–who have been whistled for 64 penalties, compared to only 45 against the Browns’ opponents–have lost a net of 3.5 EPA per game due to the yellow flags. To make matters worse, the analysis indicates that the team that has benefited the most from penalties is the Pittsburgh Steelers, who have a positive EPA impact of almost 3.

Does this prove that the striped shirts have it in for the Browns? Not so fast! Some of the penalties against the Browns–like the three lining up offside penalties against the defense in the game against the Bengals–are clearly correct calls, and no one should be heard to complain about those. It’s also possible that the Browns are just undisciplined, and fans can definitely think of times when players lost their cool and made stupid plays. The issue is whether the refs are making more bad calls against the Browns than they do against other NFL teams, and that is really hard to quantify objectively.

The EPA analysis is interesting, but I don’t think it proves that the refs are biased against the Browns–although some Browns fans clearly will argue that it does. In my view what it does show is that the Browns need to specifically focus on avoiding the dumb penalties and the undisciplined penalties, because the number of penalties they are racking up is really hurting them. If they can do that, I’ll take my chances on a bad call now and then.

Trying To Make Sense Of The Data

One of the frustrating things about the coronavirus is the lack of a meaningful context in which to make sense of the data.  Statistics are breathlessly reported by the news media without any way to assess what the statistics actually mean for those of us out in the world at large.  It’s a good way for news sites to increase their clicks and visits — the New York Times has reported that news sites have experienced significant usage surges, as readers seek the latest information about COVID-19 — but what’s the right takeaway from the flood of information?

Consider the reports about increases in the number of confirmed coronavirus cases in the U.S.  Are reports about the number of confirmed cases in a particular location doubling in a week, for example, bad news . . . or just a product of the fact that as tests become more available and more people are tested, we’re inevitably identifying more people who have the virus?  We know that many people who are infected with coronavirus experience only mild symptoms, and increased testing is going to identify an increasing number of those mild cases.  And, as we identify more people who are walking around with the virus, we’ll inevitably see a decline in the mortality rates, because we know from fifth-grade math that an increase in the denominator of confirmed cases, by adding in more newly discovered mild cases, will necessarily result in a decline in the overall death rate percentage.
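To make the fifth-grade math concrete, here is a minimal sketch in Python.  The numbers are entirely made up for illustration (they are not real COVID-19 figures), and the function name is just my own label.  It also assumes the number of deaths stays the same while testing uncovers more mild cases, which is the scenario described above.

```python
def death_rate_percent(deaths: int, confirmed_cases: int) -> float:
    """Deaths as a percentage of confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# Hypothetical numbers, chosen only to show the arithmetic.
deaths = 50
cases_before_testing = 1_000   # confirmed cases when tests are scarce
newly_found_mild_cases = 4_000  # extra mild cases found by wider testing
cases_after_testing = cases_before_testing + newly_found_mild_cases

rate_before = death_rate_percent(deaths, cases_before_testing)
rate_after = death_rate_percent(deaths, cases_after_testing)

print(f"Before wider testing: {rate_before:.1f}%")
print(f"After wider testing:  {rate_after:.1f}%")
```

Same number of deaths, bigger denominator, smaller percentage.  If deaths were also rising as fast as confirmed cases, the rate would not fall, so the reported rate alone does not tell us which situation we are in.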

And speaking of the mortality rate, is reporting on people who have had the coronavirus and died really a meaningful measure, in the abstract, or does it tell us something only if we know more about the circumstances of the deaths?  Take Italy, for example.  The reported mortality rates in Italy are among the worst in the world — worse, even, than the reported rates in China (and I emphasize “reported” for a reason).  But as this article from the Telegraph points out, the high Italian death rates appear to be the product of several factors that seem to undercut the ability to draw meaningful inferences from the Italian statistics.  For example, Italy has one of the oldest populations in the world, the  vast majority of the individuals who have died have been older and dealing with other health issues, and Italian doctors record coronavirus in the cause of death records even if the individuals were suffering from other, significant health problems that contributed to their death.  Given those factors, how should we react to Italian statistics?

And finally, I’ve seen reports that China has closed the hospitals it built to deal with its coronavirus cases, and is reporting a decline in coronavirus cases.  But should we credit anything China has to say about COVID-19?  It’s pretty clear that China wasn’t exactly transparent with the world when the coronavirus was first discovered in Wuhan, and I’m skeptical about trusting anything that government says about COVID-19 at this point.

The need to put some context to the data is important not only for those of us who are scratching our heads about how to deal with the issues presented by the coronavirus, but also for the decisionmakers who are weighing when to open businesses, schools, and restaurants.  Our daily lives always involve some form of risk calculation, and most of the risks (whether it is the risk of death in a traffic accident, from a choking incident, from a falling tree limb, or from an operation gone bad) are risks that we are willing to accept.  If the increased testing produces a surge in the number of reported cases and a correspondingly steep drop in the mortality rate, at what point do the authorities conclude that it is okay for us to leave our houses and go back to work?  If the death rate from COVID-19 is twice that, or four times that, of H1N1, and we compare that risk to the damage that would be done to the economy and to individuals who live paycheck to paycheck from a prolonged shutdown, do we accept that risk?

And finally, how do the publicized cases of sports and entertainment figures who report having the coronavirus affect the public perception of the risk equation?  If all of the NBA figures, football coaches, and movie and recording stars who have contracted COVID-19 survive the experience, will that put pressure on authorities to let us get on with our lives?

When you are talking about data, context is so important.  Mark Twain was right about lies, damn lies, and statistics.  I feel that the news media is letting us down, and focusing on the sensational, click-bait headlines while forgoing the nuts-and-bolts reporting that really would be useful during this period.