

Type 1 Error, Analytics, and the NHL


I channel my inner curmudgeon and look into the future for Analytics.

Kyle Dubas and Brendan Shanahan
by Ernest Doroszuk July 22 at 1:07 PM

So 2014 is the year of Analytics in the NHL.  It's "Revenge of the Nerds". I predict that 2015 will be the year of the backlash.

As teams embrace Analytics (even the Maple Leafs!), I feel safe in predicting that sometime soon, probably within the next year, it will blow up. Some team will do something (trade a player, change strategy, something) based on Analytics. It will go horribly wrong. Cue the backlash. I'm even going to predict how it will go wrong. But first, a little basic Statistics.

When you do a statistical analysis, you have a "Null Hypothesis". Usually the Null Hypothesis is that everything is the same. If we are looking at goaltenders and PK save percentage, the Null Hypothesis is that "All goaltenders have the same PK save percentage". The "Alternative Hypothesis" is the, uh, alternative: "Not all goaltenders have the same PK save percentage". The analysis then asks how likely it would be to see data at least as extreme as ours if the Null Hypothesis were true and the only differences came from random variation. This probability is called the "p-value". If the data is reasonably likely under the Null Hypothesis (the p-value is high enough), we accept the Null; strictly speaking we only "fail to reject" it, because the data hasn't given us grounds to believe the Alternative. If the data is unlikely under the Null Hypothesis (the p-value is low enough, conventionally below 0.05), we reject the Null and accept the Alternative.
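To make the p-value idea concrete, here is a minimal sketch that estimates one by simulation. The shot and save totals are made up for illustration, and the function names (`chi_square_stat`, `simulated_p_value`) are my own, not from any library. It pretends every goalie stops shots at the same pooled rate, replays the "season" many times, and counts how often the pretend goalies look at least as spread out as the real ones.

```python
import random

# Hypothetical PK totals for four goalies (invented numbers, not real data).
PK_SHOTS = [300, 280, 320, 310]
PK_SAVES = [264, 240, 282, 270]


def chi_square_stat(saves, shots):
    """Measure how far each goalie's saves and goals sit from a shared save rate."""
    pooled = sum(saves) / sum(shots)
    stat = 0.0
    for s, n in zip(saves, shots):
        expected_saves = n * pooled
        expected_goals = n * (1 - pooled)
        stat += (s - expected_saves) ** 2 / expected_saves
        stat += ((n - s) - expected_goals) ** 2 / expected_goals
    return stat


def simulated_p_value(saves, shots, n_sims=1000, seed=0):
    """Share of simulated 'all goalies identical' datasets that are at least
    as spread out as the observed one."""
    rng = random.Random(seed)
    pooled = sum(saves) / sum(shots)
    observed = chi_square_stat(saves, shots)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Under the Null, every shot is stopped with the same probability.
        fake_saves = [sum(rng.random() < pooled for _ in range(n)) for n in shots]
        if chi_square_stat(fake_saves, shots) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


if __name__ == "__main__":
    print(simulated_p_value(PK_SAVES, PK_SHOTS))
```

With these invented numbers the save percentages run from about .857 to .881, and the simulated p-value comes out well above 0.05, so the Null survives. Simulation is just one way to get a p-value; a textbook chi-square test would be the usual shortcut.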

There are two kinds of mistakes you can make: Type 1 Error and Type 2 Error. Type 1 Error is rejecting a Null Hypothesis that is actually true. In essence, it is saying that players differ when you should conclude that they are the same. Type 2 Error is accepting a Null Hypothesis that is actually false. This would be saying players are the same when in reality they are different.

So what is going to go wrong? Type 1 Error. Somebody is going to say "Player A has a value of X. Player B has a value of Y. Those are different! B is better!" They aren't going to test whether the difference is significant; they're just going to assume that it is. That assumption will be wrong. It will play out very, very badly. The mainstream media (MSM) will howl with delight.
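For what it's worth, checking whether X and Y really differ takes only a few lines. The sketch below assumes the "value" is a simple rate, such as goals per shot, and uses a standard two-proportion z-test; the player totals are hypothetical and `two_proportion_p_value` is a name I made up for the example.

```python
import math


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value for the Null 'A and B share the same underlying rate'."""
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # both rates are 0% or 100%, so there is nothing to tell apart
    z = (rate_b - rate_a) / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical: Player A scores 20 on 200 shots, Player B scores 26 on 200.
    print(two_proportion_p_value(20, 200, 26, 200))
```

In this made-up case, 13% against 10% looks like a real gap, but the p-value lands well above 0.05. Declaring B the better shooter on that evidence is exactly the Type 1 Error waiting to happen.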

Don't get me wrong. Type 2 Error happens all the time too. Hell, I'm pretty sure I've made one. When I conclude that "Goalies do not differ on the Penalty Kill", that's probably a Type 2 Error. There probably is some small difference in skill from goalie to goalie. The difference is small enough that 5 years of data isn't enough to show it. From a practical standpoint, if you can't see a difference over 5 seasons, you certainly won't see one over the next season.
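Here is a rough sketch of why a small real difference can hide for 5 seasons. Every number in it is an assumption for illustration: two goalies with true PK save rates of .880 and .870, each facing 1,500 PK shots (roughly 300 a season for 5 seasons). It simulates that stretch many times and counts how often a two-proportion z-test fails to flag the difference, which is the Type 2 Error rate. The function names are mine.

```python
import math
import random


def two_sided_p_value(saves_a, saves_b, shots):
    """Two-proportion z-test for two goalies who faced the same number of shots."""
    pooled = (saves_a + saves_b) / (2 * shots)
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / shots)
    if std_err == 0:
        return 1.0
    z = (saves_a - saves_b) / shots / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def miss_rate(rate_a, rate_b, shots, n_sims=500, alpha=0.05, seed=0):
    """Fraction of simulated careers in which a real difference goes undetected."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sims):
        saves_a = sum(rng.random() < rate_a for _ in range(shots))
        saves_b = sum(rng.random() < rate_b for _ in range(shots))
        if two_sided_p_value(saves_a, saves_b, shots) >= alpha:
            misses += 1  # Type 2 Error: the goalies differ, but the test says "same"
    return misses / n_sims


if __name__ == "__main__":
    # Assumed true rates and shot totals, chosen only to illustrate the point.
    print(miss_rate(0.880, 0.870, shots=1500))
```

Under these assumptions the test misses the difference far more often than it catches it, even though the difference is real by construction.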