FanPost

A blast from the past: Sunny Mehta on goalies

This article originally appeared on Gabe Desjardins' blog, behindthenethockey.com (http://www.behindthenethockey.com/2010/4/13/1419530/new-jersey-devils-vs-philadelphia). BehindtheNet is now part of Arctic Ice Hockey here at SBNation, but the original article is no longer linked. I pulled this from web.archive.org.

New Jersey Devils vs. Philadelphia Flyers Preview

by sunnymehta.com on Apr 13, 2010 4:40 PM EDT 50 comments

The New Jersey Devils and Philadelphia Flyers have had a history of epic playoff battles. This season's installment of the "Turnpike Series" should be no different. While the standings (i.e., total points and playoff seed) seem to imply a fairly significant discrepancy in skill, a closer examination indicates otherwise.

Even Strength

One of the best measures of prowess at even strength is "Shots + Missed Shots." The stat has good predictive value, and further filtering it by when the score is tied on the road removes a lot of the situational and scorer bias. Here's how these two teams stacked up by that metric in '09-'10:

NJ: 533 For, 543 Against

Phi: 456 For, 467 Against

While the Devils had higher shots and misses on both sides of the ledger (presumably because they happened to play more minutes with the score tied), viewing the numbers as percentages (NJ: 49.5% Phi: 49.4%) shows that the two teams were almost identical in terms of meaningful territorial advantage.
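The percentages above are simply each team's share of all shot attempts in those situations. A minimal sketch of the arithmetic, using the figures quoted in this section (the function name `shot_share` is just an illustrative label, not anything from the original post):

```python
def shot_share(shots_for: int, shots_against: int) -> float:
    """Return the fraction of all shot attempts that belonged to the team."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return shots_for / total


# Shots + missed shots, score tied on the road, '09-'10 (figures quoted above).
teams = {"NJ": (533, 543), "Phi": (456, 467)}

for name, (shots_for, shots_against) in teams.items():
    print(f"{name}: {shot_share(shots_for, shots_against):.1%}")
```

Working with the share rather than the raw totals is what removes the difference in minutes played with the score tied.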

You will undoubtedly hear the media drone endlessly about the differences in individual shooters, hall-of-fame versus non-hall-of-fame goaltenders, and of course, intangibles ("Kovalchuk is playing with the 'determination' and 'intensity' of a 15th-century warrior."). But I promise you none of that matters nearly as much as what was illustrated in the previous two paragraphs.

Shooting percentage (i.e., a measure of "Shot Quality") at the team level has zero predictive value. Save percentage (i.e., a measure of "Shot Quality Against" and "Goaltending") has slightly more than zero predictive value. In other words, way less than most people think. Comparing Martin Brodeur to Brian Boucher, most will consider goaltending a huge advantage for New Jersey in this series. I don't. No one has conclusively shown a meaningful difference in skill between NHL goaltenders. I'm not saying it doesn't exist, I'm just saying no one's really proved it, and that all signs point to goaltending differences being far less important than everyone thinks. (For example, the distribution of road save percentages with the score tied over the course of the full '09-'10 season was narrower than what we'd expect by chance alone. If that doesn't make sense to you, ask about it in the comments section and I'll explain it further.)

Now, given the latter disclaimers, if you had to bet on one goalie versus the other of course you would bet on Brodeur given their histories. And seeing as the territorial advantage is almost a dead heat, I'm willing to use goaltending as a "tiebreaker" to give the slight nod to the Devils in terms of advantage at even strength.

Special Teams

Shots on goal is the important thing to look at with regards to power play and penalty kill situations. It has more predictive value than goals or conversion rate. Teams control special teams shots by either generating/stifling shots when on the PP/PK, or by actually drawing and taking more/fewer penalties. It doesn't matter how a team does it, the important thing is that they do it. For example, year after year people see the Devils at the top of the list in Goals Against Average and assume it's due to their team defense or their goaltending, yet in many seasons when you break down the numbers by man-strength you see that the Devils were merely average in goaltending/defense but had a low overall GAA simply because they took far fewer penalties than the rest of the league.

In any case, here are the numbers for '09-'10:

NJ: 353 5-on-4 shots, 286 4-on-5 shots against

Phi: 471 5-on-4 shots, 379 4-on-5 shots against

Once again, while we see some difference in total events, viewing the shots as percentages (NJ: 55.2% Phi: 55.4%) tells the story of two extremely similar teams. Philly did a better job of generating shots within their power plays (56.8 shots per 60 minutes compared to the Devils' 50.4 shots per 60), so if they can manage to keep the number of penalty calls equal, they will have an advantage. However, it's very difficult to play the Devils in a seven game series and take as few penalties as they do.
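The special teams percentages work the same way, and the per-60 rates can be turned around to back out roughly how much 5-on-4 time each team had. This is a back-of-the-envelope sketch: it assumes the quoted rates were computed from the same shot totals listed above, and the function names are illustrative only.

```python
def special_teams_share(pp_shots_for: int, pk_shots_against: int) -> float:
    """Share of all special teams shots (5-on-4 for plus 4-on-5 against) taken by the team."""
    return pp_shots_for / (pp_shots_for + pk_shots_against)


def implied_minutes(shots: int, shots_per_60: float) -> float:
    """Minutes of ice time implied by a shot total and a shots-per-60 rate."""
    return 60 * shots / shots_per_60


# '09-'10 figures quoted above: (5-on-4 shots, 4-on-5 shots against, 5-on-4 shots per 60)
teams = {"NJ": (353, 286, 50.4), "Phi": (471, 379, 56.8)}

for name, (pp_for, pk_against, rate) in teams.items():
    share = special_teams_share(pp_for, pk_against)
    minutes = implied_minutes(pp_for, rate)
    print(f"{name}: {share:.1%} of special teams shots, about {minutes:.0f} minutes at 5-on-4")
```

Because the rates are rounded, the implied minutes are only approximate.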

Individual Matchups

While both teams ended up with similar numbers in the important categories, how they got those numbers differed a bit. Corsi is the term we use around these parts to measure how many total shots were directed at net while a particular player is on the ice. If we look at each team's top forwards (as determined by the number and quality of minutes played), the Devils' Parise and Zajac had a better Corsi +/- than the Flyers' Richards and Gagne by almost double! In fact, looking at each team's next four, it's hard not to give a slight edge to the Devils' Elias, Kovalchuk, Langenbrunner, and Zubrus (or Rolston) over the Flyers' Carter, Hartnell, Briere, and Giroux, particularly given Carter's foot injury. However, the Flyers' depth guys like Leino, Pyorala, and van Riemsdyk handled their assignments better than the likes of the Devils' Pandolfo, Clarkson, and Niedermayer.

On the back end, the Flyers are quite a bit stronger than the Devils. Pronger, Coburn, Carle, and Timonen all posted a Corsi +/- better than any Devils defenseman. Well, with one exception: Paul Martin. Martin was hurt for most of the season, but he is back now and supposedly healthy. He is far and away the Devils' best defenseman, and he is arguably the best defenseman on the ice for either team.

Final Verdict

This series is much closer than most people think. Both teams are very similar in terms of even strength territorial play. The Devils have an edge in goaltending, though I question the overall significance of that edge. Both teams have very similar penalty kills. The Flyers have a better power play, but the Devils are better at avoiding penalties. Home-ice advantage makes the Devils a slight favorite, though I'm not sure how much the likely abundance of Flyer fans at The Rock affects that.

Prediction: Devils in seven.

Comments

"No one has conclusively shown a meaningful difference in skill between NHL goaltenders. I’m not saying it doesn’t exist, I’m just saying no one’s really proved it, and that all signs point to goaltending differences being far less important than everyone thinks."

There’s a very interesting contrast between this statement and the work of Alan Ryder ("the Hart should go to a goalie every year") or Tom Awad’s GVT metric (which litters the top ranks with goaltenders). It just seems to me there’s apparently a huge disconnect between the player-contribution folk and the shots-metrics folk with regard to the importance of goaltending.

TBH, I am really becoming increasingly wary about the use of shot metrics as the be-all end-all measure of team ability (and everything else being consigned to the "luck" pile) — that seems to be a long way from Corsi’s original view as a proxy for puck possession. I accept that they are strong indicators with strong correlations, but that doesn’t necessarily mean everything else is chance. Dismissing goaltending entirely as a difference-maker triggers my smell test alarms most of all.

At the very least, goaltending as being irrelevant is certainly a radical idea — NHL teams should really stop paying them much more than a million or two if none of them are difference-makers!

by MathMan on Apr 13, 2010 6:09 PM EDT

The difference in goaltender talent across the NHL is pretty small and the difference in their performance over say the life of a six-year contract is even smaller. There are certainly better goalies, but the talent difference between the best guy in the league and the median starter is about 1.5 wins.

What I think is the confusing issue here is the value of single-season goaltender performance. The top goalie is worth a ridiculous number of wins, even if that performance is unlikely to continue in the playoffs or carry over to the next season.

Does that make sense? Luongo’s talent is worth 3-4 wins over replacement ($7-$10M) but his performance in a single season could be worth 8 wins if things break right for him.

by Hawerchuk on Apr 13, 2010 6:30 PM EDT


GVT is a measure of observed performance, not implied talent, so it will be affected by the random spread in goaltenders’ results. I think the statement "No one has conclusively shown a meaningful difference in skill between NHL goaltenders" is far too bold, but I agree that the spread is smaller than most people think, especially when they base their judgment on recent history or who’s "hotter". We have enough seasons to say that Brodeur is better than Boucher.

I agree that using purely shot-based metrics as proxies for team ability is at the very least incomplete. On the positive side, it’s the only aspect of team ability that’s consistently repeatable over the short term.

by Tom Awad on Apr 13, 2010 6:39 PM EDT

Tom,

Noting a difference between Boucher and Brodeur is totally a moot point if we see no evidence of skill in the population. (Which I’m not necessarily claiming is the case.) If we get 100 people to flip coins a thousand times and record the number of heads, and I show you one particular guy who comes up with a vastly different number of heads than one other particular guy, does that mean the difference was due to coin-flipping skill?
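The coin-flipping thought experiment is easy to try. A minimal sketch, assuming fair coins and an arbitrary random seed (neither detail is specified in the comment): 100 people each flip 1,000 times, and we look at the gap between the luckiest and unluckiest flipper.

```python
import random


def flip_counts(people: int, flips: int, rng: random.Random) -> list[int]:
    """Number of heads each person gets from `flips` tosses of a fair coin."""
    return [sum(rng.random() < 0.5 for _ in range(flips)) for _ in range(people)]


rng = random.Random(2010)  # arbitrary seed so the run is repeatable
heads = flip_counts(people=100, flips=1000, rng=rng)

print("fewest heads:", min(heads))
print("most heads:  ", max(heads))
print("gap:         ", max(heads) - min(heads))
```

Nobody in this simulation has any coin-flipping skill, so whatever gap shows up between the two extremes is luck by construction.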

by sunnymehta.com on Apr 14, 2010 1:24 AM EDT

Sunny,

You’re correct. My argument was that the 10-year stat difference between Brodeur, Boucher and a few others was probably enough to imply some evidence of skill in the population. Obviously, you’d be correct that given a large enough population, there will be outliers, and you can’t use the outliers as an argument in and of themselves if there is a huge bulge in the middle of your population.

by Tom Awad on Apr 14, 2010 11:06 AM EDT

Strictly speaking, isn’t the entire NHL an outlier of the population as a whole? It could even be considered an outlier among the smaller hockey-playing portion of the population. Especially regarding goaltending – including injuries and such we may see 80 goalies in NHL action during a full season, as compared to ten times that many skaters.

(According to hockey-reference.com, those numbers this year are 83 keepers and 879 skaters with at least one game in the NHL this season.)

Given how scarce an NHL-quality keeper is, it stands to reason that there isn’t a terrific amount of difference between them. Brodeur may not be all that much better than Boucher – remember, Boucher is the all-time shutout-streak record holder! – but the difference between Boucher and, say, me, is preposterous.

IOW, there is definitely goaltending skill in the population. It’s harder to see in the highest level. I see it a lot more clearly in my own rec league. More importantly, one of the big differences between the highest level and mine is how repeatable the skill is. I have made NHL-quality saves. I have been beaten by perfect shots. The difference is, I may pull a highlight-reel save once a month; the NHL guys do it so regularly it’s ridiculous: in fact it becomes routine. It had better be, because the guys shooting at them are going to make far better shots far more often than the guys I see every Tuesday night! And likewise, we’ve all seen stupendously bad goals – but we see them out of the professionals so rarely that we have a whole different definition of "soft goal" for an NHL regular as compared to me. Half the stuff that I surrender would be considered soft by that standard.

Of course I'm an expert, I've seen Slap Shot eleven times!

by mikb on Apr 14, 2010 12:07 PM EDT

you are correct, but by "population" i meant "population of NHL goaltenders"

by sunnymehta.com on Apr 14, 2010 12:14 PM EDT

Should have picked up on that, thanks.

Of course I'm an expert, I've seen Slap Shot eleven times!

by mikb on Apr 14, 2010 12:42 PM EDT

MathMan,

As I mentioned, and as you quoted, I am not saying the difference in goaltender skill doesn’t exist, I’m just saying that imo the jury is very much out right now on how significant it is. And the signs point to it being overrated.

Also, as Gabe mentions below and elsewhere, saying the relative value of goaltenders is not all that important is not the same as saying the goaltender position is unimportant. If we are in the grocery store and I say, "MathMan, that bottle of water you’re about to buy is not worth twice the price of the one next to it, because there’s basically no difference between any of the brands." Does that mean I’m saying "water is not important"?

As for Ryder’s PC, it’s basically a linear weighted ranking system where he chooses the weights, very contentiously and somewhat arbitrarily, imo. Vic Ferrari has driven holes through the shot quality/PC/miscellaneous hogwash metrics big enough for a zamboni to fit through. If this is the "huge disconnect" you are referring to, I can explain the difference quite simply: one party understands (or at least acknowledges) the role of randomness in sports and in life, and the other doesn’t.

And it’s not about consigning everything to chance. Acknowledging absence of evidence is not the same as claiming evidence of absence.

by sunnymehta.com on Apr 14, 2010 1:12 AM EDT

Also, in all fairness, I think some of Ryder’s other work is very good. I’m just not a big fan of his PC inputs. (Though his season recap papers are usually well written and interesting.)

by sunnymehta.com on Apr 14, 2010 2:07 AM EDT

> Also, as Gabe mentions below and elsewhere, saying the relative value of goaltenders is
> not all that important is not the same as saying the goaltender position is unimportant. If
> we are in the grocery store and I say, "MathMan, that bottle of water you’re about to buy is
> not worth twice the price of the one next to it, because there’s basically no difference
> between any of the brands." Does that mean I’m saying "water is not important"?

Of course not. But if there’s basically no difference, it also means I’d be a complete idiot to buy the more expensive water. Which means there are a lot of GMs in the league who obsessively waste good money and cap space on ‘elite goaltenders’ that are not difference-makers.

What your reasoning should logically lead to is that Vancouver is foolish to pay Luongo several million dollars a year, because there’s no significant difference between him and your average NHL starter. And Philly has been wise all these years not to spend a lot of money on goaltending, because, really, it’s not a big difference-maker.

I can buy that goaltending may matter less than we think, but saying that any NHL starter is basically just as good as any other is pushing a bit far. If there’s an absence of evidence that quality of goaltending is a difference-maker, it seems rather foolish to spend any money and effort trying to acquire higher-quality goaltending above the basic level. Odds are that money and effort would be better-spent on improving parts of the roster for which evidence that it will make a difference does exist.

by MathMan on Apr 14, 2010 12:10 PM EDT

What’s the correlation between team wins and goaltender salary? Pretty poor. There are 18 goalies paid over $3M next year: Lundqvist, Ward, Miller, Backstrom, Giguere, Kiprusoff, Vokoun, Huet, Luongo, Brodeur, Thomas, Fleury, DiPietro, Hiller, Bryzgalov, Leclaire, Khabibulin, Rinne.

Who will we see in the first round of the playoffs? Miller, Luongo, Brodeur, Fleury, Bryzgalov and Rinne. Lundqvist, Vokoun and Hiller are also very good goalies, but the rest? (I’m not a Backstrom believer.)

There are 18 GMs who think you should pay a lot for a goalie. Half of them are getting a backup goaltender for those big bucks. So, yes, there are a lot of GMs obsessively wasting good money.

by Hawerchuk on Apr 14, 2010 2:00 PM EDT

Making the playoffs is a team accomplishment — from which you cannot infer that highly-paid goaltenders on those teams that missed the playoffs aren’t a good investment individually (Lundqvist for example was a major reason why the Rangers missed out so narrowly).

I’ll readily agree that some goaltenders are overpaid (hello, Mr. Huet), just like some forwards and defensemen are overpaid. That’s a far cry, though, from saying that it’s not worth paying an elite goaltender should you find one, which is the logical conclusion of what Sunny is saying.

Under that philosophy, the Flyers, with their cheapskate approach to goaltending, are the wisest team in the league with regard to goaltending. Certainly, if playing their third-string goaltender against Martin Brodeur does not represent a significant edge for New Jersey, then clearly the Devils are wasting their money. (Team effects notwithstanding, Brodeur is at least a very good starter.)

Likewise, the Canucks should try to find a sucker to take Luongo’s contract, sign Martin Biron for a million (again, there’s no evidence that there’s a meaningful difference in skill between the two), and spend the money on someone who will actually make a difference. And so on: New York should try to ditch Lundqvist at the earliest opportunity to free up money to spend on difference-makers. Unless those are all outliers, but that’s already a significant proportion on a small population of around a hundred.

I’m willing to admit that goaltending may not be as much of a difference-maker as we think (though, actually, I doubt that). Going from there to the notion that there’s no meaningful difference in the goaltender population (or no evidence of it, which in practical NHL management terms amounts to the same thing) leads to some pretty radical conclusions.

by MathMan on Apr 14, 2010 3:59 PM EDT

We both think it’s correct for GMs to spend big bucks on truly elite goaltenders.

But you’re really putting words in my mouth. What I’ve written time and time again is that there are a handful of truly elite goaltenders out there who are worth more than you can even pay them under the CBA. Luongo and Vokoun, in particular, and a few others.

But Cam Ward? Seriously. There are 2-3 times as many goaltenders being paid as though they’re elite as there are elite goaltenders. And the difference between many of these goaltenders and the ones available for a lot less money, like Craig Anderson, is tiny.

by Hawerchuk on Apr 14, 2010 5:13 PM EDT

I think it might be meaningful to look at goaltender stability as a stat. How many times has a goalie given up goals one after another? How many times has the goalie shut down further scoring by the opponent after the first couple of bad goals?

I think the confidence of the goalie plays a big part in the game. Players will play better if they are confident that their goalie will stop the puck, and coaches will put out more offensive lines if they think the goalie will be able to stop any kind of odd-man situation.

Overall, goalies are different, but the difference is not just physical, but mental as well.

by MoonDragn on Apr 14, 2010 1:10 PM EDT

> The Flyers have a better power play, but the Devils are better at avoiding penalties. Home-ice advantage makes the Devils a slight favorite, though I’m not sure how much the likely abundance of Flyer fans at The Rock affects that.

You know one of my most favorite playoff memories was back in 2006 for Game 6 between the Flyers-Sabres at an overflowing Wachovia Center, everyone wearing Orange. If there were any Sabres fans there, they were very few and far between – at least that’s how it looked on VERSUS. The sound provided no alternate evidence; the crowd was almost literally electric before it started.

The Flyers then proceeded to defecate themselves all over their home ice and lost 7-1.

My point: I wouldn’t worry about the make-up of the crowd affecting home ice. How Lemaire uses that last change for match-up purposes will matter more to how the game goes.

Devils in my heart! Devils in my mind! Devils in my eyes! Devils until I die!
In Lou We Trust - The New Jersey Devils SBN Blog

by John Fischer on Apr 13, 2010 6:24 PM EDT

John,

Good point, and you’re very likely correct. I almost didn’t include the second half of that sentence because it’s pretty much pure conjecture, but for some reason I left it in as a note of interest.

We know home ice seems to be worth about 4.5% in win probability, but to my knowledge no one has dissected how much of that is due to having the last line change versus feeling the adrenaline of your home crowd versus sleeping in your own bed versus sleeping with your own wife, etc. I have no idea how much the "home crowd" is worth.

by sunnymehta.com on Apr 14, 2010 1:20 AM EDT

Great article, but is the author Sunny Mehta the poker player and poker writer? That guy’s from Jersey- figures he’d pick the Devils. Mehta is like the Hawk Harrelson of hockey analysis. Flyers in 6.

by Josh Wexler on Apr 13, 2010 8:31 PM EDT

Hi Josh,

Yes I am the poker player/writer, yes I am from Jersey, and yes I am predicting the Devils to beat the putrid, inappropriately named city "of brotherly love" in seven games. Put it on the board……YES!

by sunnymehta.com on Apr 14, 2010 1:27 AM EDT

@Tom Awad: Tom, can you link to studies that DO show a meaningful difference in goaltender skill at the NHL level?

by Josh Wexler on Apr 13, 2010 8:35 PM EDT

It seems to be a pretty straightforward ANOVA, and I for one am certain there is a difference. I guess the issue may not be statistical significance but practical significance.

by DoctorMyBrainHurts on Apr 13, 2010 11:21 PM EDT

Hi Josh,

I don’t have a published analysis handy. I can summarize the main points for you: the variance in goals against on an individual goaltender or team basis is roughly twice what would be expected from luck alone. This means that the talent spread in the league is roughly equal to the single-season luck spread. Over several seasons, the luck will tend to balance out, and save percentages do not regress 100%, as they would if goaltending skill were completely evenly distributed. If you look at post-lockout data, for example, Tomas Vokoun’s or Roberto Luongo’s save percentage will be conclusively higher than Chris Osgood’s or Vesa Toskala’s (actually, with Toskala, maybe one season is enough).
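The step from "twice the variance" to "talent spread equals luck spread" can be sketched in a few lines. This assumes talent and luck are independent, so their variances add, and that every season has the same workload; the luck SD of 0.008 is a hypothetical placeholder rather than a measured value, and the function names are illustrative.

```python
import math


def talent_variance(observed_var: float, luck_var: float) -> float:
    """Variance left over once the luck-only variance is subtracted out."""
    return max(observed_var - luck_var, 0.0)


def retained_fraction(talent_var: float, luck_var: float, seasons: int = 1) -> float:
    """Share of an observed gap from league average attributed to talent.

    Averaging over `seasons` equal-sized seasons divides the luck variance
    by `seasons`; the talent variance is unchanged.
    """
    return talent_var / (talent_var + luck_var / seasons)


luck_sd = 0.008                  # hypothetical single-season luck SD
luck_var = luck_sd ** 2
observed_var = 2 * luck_var      # "roughly twice" the luck-only variance
t_var = talent_variance(observed_var, luck_var)

print(f"talent SD: {math.sqrt(t_var):.4f}")
for k in (1, 3, 5):
    print(f"{k} season(s): keep {retained_fraction(t_var, luck_var, k):.0%} of the gap")
```

Under these assumptions a single season regresses halfway to the mean, and the regression shrinks as more seasons are pooled, which is the sense in which save percentages "do not regress 100%".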

The counter-argument you’ll often hear is that it’s wrapped up in shot quality, that Vokoun faces easier shots, or the scorer records more shots, and thus he’s actually not better. Eventually it becomes impossible to prove beyond an absolute shadow of a doubt, so people continue to claim that it hasn’t been "conclusively proven" even though it’s absurd to think otherwise.

by Tom Awad on Apr 14, 2010 12:11 AM EDT

Tom,

First, as I mentioned, I never said a difference in goaltender skill definitely doesn’t exist. I am simply taking a stance of skepticism on the relative significance until someone shows something one way or another. I’m sure someone will – probably Vic.

Secondly, the general shot quality thing has been fairly well debunked, the scorer bias has been pretty conclusively shown, and the playing-to-the-score effect has been empirically noted by many (including this very blog). I can link you to multiple sources for all of those.

Do you think the significance of goaltender skill has been shown to that same degree (everything up to "beyond an absolute shadow of a doubt"), such that we are simultaneously unable to link to a single study but it’s "absurd" to think it hasn’t been conclusively proven?

by sunnymehta.com on Apr 14, 2010 1:43 AM EDT

I’ll have the study up within a few days. To be honest, you’re the only person I know who has expressed doubts there could be ANY skill variation within the population. Most intelligent people, including Vic, yourself, Gabe and me understand that the season-to-season variation exaggerates the true skill discrepancy (if it exists).

As for shot quality, I was mentioning it IN SUPPORT of your position. Even if goaltender save percentages vary beyond randomness, shot quality could be part of the reason. Hard to say unless we can objectively extract it.

I don’t dispute either the scorer bias or the playing-to-the-score effect, both of which I’ve written about myself and have been shown quite conclusively by others.

by Tom Awad on Apr 14, 2010 11:23 AM EDT

Cool. Looking forward to your study. If I can make a suggestion – use ES road tied data only.

by sunnymehta.com on Apr 14, 2010 11:32 AM EDT

I think it’s obvious that there’s variation within the population. Hasek, for example, was a vastly better goalie than we’ve seen in years. And there are plenty of guys on the other end of the distribution who are terrible.

Playoff population is different. Playoff teams tend to have good goalies, so the true skill difference between say the #3 playoff goalie and the #14 playoff goalie might be 0.003 in save percentage.

by Hawerchuk on Apr 14, 2010 11:46 AM EDT

But you could easily get outliers at the low end — weak goalies that got carried by very potent teams, or goalies that are backups being thrust into front-end positions by injuries, after a stronger goaltender handled most of the starting duties to get their team in the playoffs.

by MathMan on Apr 14, 2010 12:14 PM EDT

Most playoff goalies have a five-year track record with 100+ starts.

by Hawerchuk on Apr 14, 2010 2:00 PM EDT

Length of track record doesn’t enter my point. Most playoff teams presumably have fairly strong goalies that have helped them or at least haven’t hindered them in compiling a record in the upper 50% of their conference; otherwise they wouldn’t be very likely to be playoff teams in the first place. But there are some circumstances (strong team/weak goalie, late injuries) where a team’s starting playoff goalie may end up being an outlier at the low end.

by MathMan on Apr 14, 2010 4:07 PM EDT

Cool. Looking forward to your study. If I can make a suggestion – use ES road tied data only.

by sunnymehta.com on Apr 14, 2010 11:26 AM EDT

Would anyone be able to post a link to some of the stuff debunking shot quality? I’d love to read it.

by Phil23 on Apr 14, 2010 9:27 PM EDT

http://vhockey.blogspot.com/2009/07/shot-quality-fantasy.html

by sunnymehta.com on Apr 14, 2010 11:07 PM EDT

Incidentally, shot quality does exist in a deterministic way:

http://behindthenet.ca/blog/2009/08/defensive-systems-and-their-impact-on.html
http://www.puckprospectus.com/article.php?articleid=247
http://behindthenet.ca/blog/2009/07/defensemans-impact-on-offense.html

It’s just such a small effect that it’s not the first thing we should be arguing about.

by Hawerchuk on Apr 15, 2010 2:20 AM EDT

Thanks for the links guys, much appreciated.

by Phil23 on Apr 15, 2010 2:40 PM EDT

As a complete aside, wouldn’t it be nice if the NHL decided to put microchips into each player’s skates so we could have a virtual "hockeyF/X" and literally be able to know exactly how long they were on the ice for, how far they skated, etc. Sure seems like it wouldn’t be that difficult a thing to do and would make the inputs we use for all advanced metrics far more accurate.

by Phil23 on Apr 14, 2010 9:28 PM EDT

A Study Of One Season

For those interested, I just did the following experiment…

I grabbed the numbers for every goaltender’s ‘09-’10 Even Strength shots faced and saves (minimum 5 games played). I then simulated 10,000 seasons using the assumption that there is no goaltender skill. To do this, I simply gave each of the 68 goaltenders his own "coin" weighted to .919 (which was the league average ES sv%). For each simulated season, every goaltender got to flip his coin the number of times equal to his actual shots faced (for example, since Stephen Valiquette actually faced 99 shots in ‘09-’10, he got to flip his coin 99 times in each simulated season). Then I recorded how many "heads" (i.e. saves) and percentage of heads (i.e. save percentage) each goaltender came up with in a given season. And then I looked at the spread of these "head percentages" by measuring the standard deviation amongst the goaltenders. As mentioned, I ran 10,000 of these simulated seasons, and I recorded the standard deviation every time.

The largest standard deviation in any of the 10,000 simulated seasons was .0187. The smallest was .0073. The average was .0120.

Getting back to real life, the actual standard deviation of even strength save percentages in ‘09-’10 was .0129. That means if I showed you a bunch of graphs of the simulated seasons, you would have absolutely no idea if and when I snuck the actual season in with them, because it doesn’t look any different.
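For anyone who wants to try this at home, the procedure can be sketched roughly as follows. The shots-faced list here is a hypothetical stand-in (68 workloads spread evenly from 99 shots upward) rather than the real '09-'10 data, the run uses 100 seasons instead of 10,000 to keep it quick, and the population standard deviation (`statistics.pstdev`) is an assumption about how the spread was measured.

```python
import random
import statistics

LEAGUE_ES_SV = 0.919  # league-average even-strength save percentage quoted above


def simulate_season(shots_faced, rng, sv=LEAGUE_ES_SV):
    """Save percentage for each goalie when every shot is stopped with probability `sv`."""
    percentages = []
    for shots in shots_faced:
        saves = sum(rng.random() < sv for _ in range(shots))
        percentages.append(saves / shots)
    return percentages


def simulated_spreads(shots_faced, seasons, seed=0):
    """Standard deviation of save percentages across goalies, one value per simulated season."""
    rng = random.Random(seed)
    return [statistics.pstdev(simulate_season(shots_faced, rng)) for _ in range(seasons)]


# Hypothetical workloads for 68 goalies; swap in actual shots-faced totals to repeat the study.
shots_faced = [99 + 21 * i for i in range(68)]

spreads = simulated_spreads(shots_faced, seasons=100)
print(f"smallest SD: {min(spreads):.4f}")
print(f"average SD:  {statistics.mean(spreads):.4f}")
print(f"largest SD:  {max(spreads):.4f}")
```

The comparison is then between the actual season's standard deviation and the range these no-skill seasons produce. With made-up workloads the numbers will not match the ones reported above.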

In other words, looking at any goaltender’s save percentage for a single season is completely and utterly meaningless. We absolutely cannot know if it was the result of anything other than pure fucking luck. It is absurd to even give out an award called the Vezina Trophy for Best Goaltender In A Given Season. This is the equivalent of giving out an award called Best Coinflipper.

Note that if we run this type of study for certain other sport skills, we do not see this effect. For example, instead of using goaltender save percentage, if I ran the same study for NHL forwards’ ability to direct shots at net, or MLB batters’ ability to hit home runs, the simulated seasons look nothing like the actual seasons. If I were showing you graphs of simulated seasons of pitcher strikeout percentages, as soon as I snuck in the actual season you would know immediately, because of how different it looks.

by sunnymehta.com on Apr 15, 2010 12:26 AM EDT

Maybe I’m dense, but I don’t get how comparing the real-world standard deviation of a single season to that of a number of distributions where each goalie is a .919 talent demonstrates that there’s no such thing as goaltending talent and it’s all luck. The distribution around the mean may be similar, but that tells you nothing about whether the guys that end up on the high side of the curve are the same ones year after year (possibly because their coin is weighted differently).

What would be more meaningful, I think, is finding out whether the outliers at either end are the same year after year and where goaltenders fall in each year. That would be indicative of a consistent level of success being due to skill — IOW, see whether individual goaltenders regress to the mean over multiple seasons. If the same cluster of 5-6 guys each post numbers of .930 over 10 seasons with a small deviation between the seasons, that’s a strong clue that these guys are not .919 goaltenders in general, for whatever reason. If everyone regresses to the mean over time, then it’s indicative that it’s mostly chance (or factors outside of the goaltender’s skill, at the very least).
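One rough way to frame that kind of persistence check is to compare the year-to-year correlation of save percentages in a world with no skill against a world with some skill. Everything in this sketch is hypothetical: the 60 goalies, the 1,500 shots per season, and the 0.007 talent spread are placeholder assumptions, not measured values.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def season_sv(true_sv, shots, rng):
    """One season's save percentage for a goalie whose true talent is `true_sv`."""
    saves = sum(rng.random() < true_sv for _ in range(shots))
    return saves / shots


def year_to_year_correlation(talents, shots, rng):
    """Correlation between two independently simulated seasons for the same goalies."""
    year1 = [season_sv(t, shots, rng) for t in talents]
    year2 = [season_sv(t, shots, rng) for t in talents]
    return pearson(year1, year2)


rng = random.Random(919)   # arbitrary seed so the run is repeatable
GOALIES, SHOTS = 60, 1500  # hypothetical league size and workload

no_skill = [0.919] * GOALIES                                    # everyone has the same coin
some_skill = [rng.gauss(0.919, 0.007) for _ in range(GOALIES)]  # hypothetical talent spread

r_no_skill = year_to_year_correlation(no_skill, SHOTS, rng)
r_some_skill = year_to_year_correlation(some_skill, SHOTS, rng)
print(f"no-skill world:   r = {r_no_skill:+.2f}")
print(f"some-skill world: r = {r_some_skill:+.2f}")
```

With real data, the measured year-to-year correlation could be compared against simulated values like these. A single pair of seasons is noisy, so repeated runs would be needed before drawing any conclusion.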

by MathMan on Apr 15, 2010 2:25 AM EDT


No, re-doing the study with a larger sample size (i.e. several years’ worth of data) would be good only to re-check for population effects. If we see no population effects, finding a "consistent" goalie is meaningless. If you get a bunch of dudes flipping coins, even for large samples, someone is supposed to get really lucky and end up on the way right tail of variance. Most people won’t believe that though – confirmation bias is tough to shake. "WTF, you’re telling me it’s a coincidence that Luongo is consistently in the top half of save percentages?!"

"If the same cluster of 5-6 guys each post numbers of .930 over 10 seasons with a small deviation between the seasons, that’s a strong clue that these guys are not .919 goaltenders"

No, not if .930 is well within acceptable range of variance for .919. People seem to forget the fact that if a guy (or group of guys) is really a .930 goaltender, we should also see seasons where he ends up on the good side of variance for .930 and posts like a .950 save% or whatever. But we just don’t observe that irl.

by sunnymehta.com on Apr 15, 2010 2:49 AM EDT

Someone is bound to get lucky. Several someones though? 4-5 ‘elite’ or ‘very lucky’ guys over the span of several seasons, with 4-5 ‘poor’ or ‘unlucky’ guys staying well below the norm over the same number of seasons? On such a small population that would be rather a large proportion of unusual sequences.

How the distribution is laid out between goaltenders over the span of a single season is not particularly germane. Skill is about being persistently above or below average. What the variation is for an individual goaltender year over year would be more meaningful. If goaltending skill is truly random, you will have guys that will be consistently lucky, guys that will be consistently unlucky, most of the guys will consistently cluster in the middle, and some guys that will swing wildly from very good to very bad from year to year. Does that latter group exist in significant quantity? Does the variance year-over-year for individual goaltenders match what we’d expect of a .919 goaltender?

by MathMan on Apr 15, 2010 7:50 AM EDT


Your approach is good, Sunny, but I would do the analysis weighted by shots against rather than just save percentage. In other words, instead of looking for the standard deviation of save percentage, look for the standard deviation of goals above/below average. The standard deviation for a guy who faces 2000 shots and has a 0.920 expected sv% should be 12 goals, or 0.006 of save %, not 0.012, while the standard deviation for backups is higher, etc. I think you’ll find a measurable difference if you do this.

And if there’s no such thing as goaltender skill, then there’s no selection bias so you don’t have to worry about that. It doesn’t matter if you pick the same guy to flip the coin 2000 times.

by Tom Awad on Apr 15, 2010 8:55 AM EDT
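
One possible reading of the weighting suggestion above is to express each goalie's result as goals saved above average and divide by the luck-only binomial SD for his own shot count. The sketch below does that. The records are invented for illustration, and the function name `standardized_gsaa` is an assumption, not anything from the thread.

```python
import math
import statistics

def standardized_gsaa(shots, goals_against, league_sv):
    """Goals saved above average, divided by the luck-only binomial SD for that workload."""
    expected_goals = shots * (1 - league_sv)
    luck_sd_goals = math.sqrt(shots * league_sv * (1 - league_sv))
    return (expected_goals - goals_against) / luck_sd_goals

# Hypothetical (shots faced, goals against) records, invented for illustration.
records = [(2000, 140), (1800, 155), (1500, 118), (600, 52), (150, 14)]

total_shots = sum(shots for shots, _ in records)
total_goals = sum(goals for _, goals in records)
league_sv = 1 - total_goals / total_shots

z_scores = [standardized_gsaa(shots, goals, league_sv) for shots, goals in records]

# If results were luck alone, these standardized scores would have an SD near 1.
print(f"league save%: {league_sv:.4f}")
print(f"SD of standardized goals saved above average: {statistics.pstdev(z_scores):.2f}")
```

Because each score is scaled by its own workload, a 150-shot call-up and a 2000-shot starter sit on the same yardstick. With only five made-up records the printed SD means little; the point is the mechanics.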

Tom,

I don’t follow. For each simulated season, I measured the sd of the population’s save percentage. Why would I measure the sd of any raw counting number when each goalie faced a different number of shots? (For example, let’s say we have a population of two goalies. Each gave up one goal. If we look at the sd of "goals" it will be zero. But now let’s say I tell you that one goalie faced 4 shots and the other goalie faced 40. Viewing the sd of raw goals is meaningless.)

Are you suggesting I give each goalie the same number of shots? In that case measuring a raw counting number would work, but the exercise has little practical use seeing as we’d have no observed distribution to compare it to (since the population of goalies did not face the same number of shots). We want our luck model to mimic the ability/observed model.

I may be misunderstanding you.

by sunnymehta.com on Apr 15, 2010 10:18 AM EDT
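
The two-goalie example in the comment above is small enough to check directly. This sketch only restates that example (one goal each, on 4 and 40 shots) and uses the population SD from Python's standard library.

```python
import statistics

goals_against = [1, 1]   # each goalie allowed one goal
shots_faced = [4, 40]    # but on very different workloads
save_pcts = [1 - goals / shots for goals, shots in zip(goals_against, shots_faced)]

# Raw goals show no spread at all; save percentages show a large one.
print("SD of goals against:", statistics.pstdev(goals_against))
print("save percentages:", save_pcts)
print("SD of save%:", statistics.pstdev(save_pcts))
```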

Hi Sunny,

I’m saying to compare goals allowed vs. average (i.e. shots faced * average save %). The reason is that your technique adds a lot of goaltenders with very few games / shots faced. Obviously, for these players, 99% of their results will be luck (even Hasek is indistinguishable on 100 shots), so obviously the final result looks like it’s all luck.

For example, say my population is 5 guys with a "true" 0.920 who face 2000 shots, 5 guys with a "true" 0.880 who face 2000 shots, 50 guys with "true" 0.920 who face 100 shots and 50 guys with "true" 0.880 who face 100 shots (it’s an extreme case just for illustration). Your technique will say there is no measurable skill because the 100 guys who face few shots pollute the data and can’t be separated from randomness, while in fact there is a lot of skill difference in this population. If you weight with shots faced, then you’ll give more weight to the 10 guys who faced a lot of shots and the skill will show up.

by Tom Awad on Apr 15, 2010 3:42 PM EDT
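
A rough way to see the point of the extreme example above is to ask how large each group's true-talent edge is compared with the luck-only SD for its workload. In this sketch the group labels and the function `expected_z` are made up for illustration. The 0.900 league average is an assumption that follows from the example having equal shot volume at each talent level.

```python
import math

LEAGUE_SV = 0.900  # midpoint of the two "true" talents in the example

# (label, true save%, shots faced, number of goalies); labels are illustrative only
groups = [
    ("good starters", 0.920, 2000, 5),
    ("bad starters", 0.880, 2000, 5),
    ("good call-ups", 0.920, 100, 50),
    ("bad call-ups", 0.880, 100, 50),
]

def expected_z(true_sv, shots, league_sv=LEAGUE_SV):
    """Expected goals saved above average, in units of the luck-only binomial SD."""
    expected_gsaa = shots * (true_sv - league_sv)
    luck_sd_goals = math.sqrt(shots * league_sv * (1 - league_sv))
    return expected_gsaa / luck_sd_goals

for label, true_sv, shots, count in groups:
    z = expected_z(true_sv, shots)
    print(f"{count:2d} {label:14s} {shots:5d} shots  expected z = {z:+.2f}")
```

On 2000 shots the same 20-point talent gap is worth about three luck SDs; on 100 shots it is worth well under one, which is why the low-shot goalies are hard to tell apart from noise.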

Oh, I see what you’re saying. But my issue would be that simply removing the low shot guys brings in way too much selection bias. (Rookie comes up from the AHL and gets shelled in his first two NHL games and we never see him again, but if he does well he plays the rest of the season, etc.)

Also, I’m still not clear on your "goals above average" solution.

by sunnymehta.com on Apr 16, 2010 12:09 PM EDT


If there truly is no skill in the goaltending population (the initial premise), then selection bias almost doesn’t apply. Every goaltender is performing randomly, so as long as you don’t select just based on performance (hey, if I just eliminate all the guys below 0.900 and above 0.920, it looks like luck!) then you should be OK. Doing your calculation just with goaltenders who played 40 or more games should be fine. In effect, over multiple seasons, it should give the result Gabe arrived at (in the long-term, 45% luck).

by Tom Awad on Apr 16, 2010 3:09 PM EDT


I tried a different experiment – I used four-year even-strength save percentages for goalies who faced 1500+ shots. Their observed stdev was 7.73 while the simulated stdev was 4.9. That would suggest goaltending is 55% talent in the long run, with the proviso that goaltenders who got really unlucky lost their jobs. I think there’s some real skill there – Vokoun is legitimately 4 wins better than replacement… The big problem is that most guys aren’t.

by Hawerchuk on Apr 15, 2010 12:28 PM EDT
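
The comment above does not say how the 55% figure was computed. One common split, assumed here, treats observed variance as talent variance plus luck variance. The sketch below applies that assumption to the two quoted standard deviations; the function name is made up. On those numbers it gives a talent share of roughly 60%, in the same neighborhood as the quoted figure but not identical to it.

```python
def talent_share(observed_sd, luck_sd):
    """Share of observed variance not explained by luck, assuming variances add."""
    if observed_sd <= 0:
        raise ValueError("observed_sd must be positive")
    share = 1 - (luck_sd / observed_sd) ** 2
    # A luck-only SD larger than the observed SD implies no measurable talent.
    return max(share, 0.0)

share = talent_share(observed_sd=7.73, luck_sd=4.9)
print(f"talent share of variance: {share:.2f}")
```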


That sounds right. I’m not arguing that Cam Ward is worth 6M$! I’m not even saying that you can tell a guy’s real skill with anything less than, say, 2000 – 3000 shots against. But the skill exists, it’s there, and it shows up in the data.

by Tom Awad on Apr 15, 2010 3:44 PM EDT

I must be dense, so let me lay out my problem with this methodology with an illustration and maybe someone can explain to me where I’m going wrong.

Suppose that there were absolutely no luck effects in goaltending, and that the ES save percentage for the 2009-2010 season not only represented each goaltender’s true talent, but was a fixed proportion that was 100% repeatable. (The goaltenders were secretly replaced with elaborate forcefields with computers carefully programmed to allow a very specific proportion of pucks through, and they are 100% infallible.)

Now as far as I can tell, in this hypothetical scenario, the graphs would be the same as the 2009-2010 season, the standard deviation among the goaltenders would be the same as the 2009-2010 season, but in reality there are no luck effects and if we continued in this vein in the 2010-2011 and 2011-2012 seasons, the graphs would look exactly the same as well. But here we’re not looking at other seasons; we’re only looking at the spread of save percentages for the 09-10 season and comparing it to the 10,000 coin-flipping seasons.

In this scenario, how would your methodology detect that there were no luck effects involved? How would it come to a different conclusion than when looking at data with some amount of luck involved?

That’s what I don’t get — how do you decide that there are luck effects in goaltending without looking at how much variance there is with any individual goaltender’s performance?

by MathMan on Apr 15, 2010 10:46 AM EDT


You don’t have to look at any individual goaltender’s performance to know how much luck there is. Assume a binomial distribution. For a goaltender with a true 0.900 skill and 2000 shots against, the variance on his goals against is 2000 * 0.9 * 0.1 = 180, so his standard deviation of goals against is sqrt(180) = 13.4 and his standard deviation of save percentage is 13.4 / 2000 = 0.0067. Therefore, if you had a population of goaltenders who were all of the same "true" talent level and all faced 2000 shots, the standard deviation of sv% should be 0.0067. If it’s more, then there’s skill involved. In practice, it’s more.

By looking at the distribution of save percentages versus those expected if it was merely luck, you can tell how much is luck and how much is skill. Typically, over a single season, goaltending is roughly 55% luck and 45% skill. As Hawerchuk pointed out above, in the long run skill becomes more important.

by Tom Awad on Apr 15, 2010 3:49 PM EDT
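
The binomial arithmetic in the comment above can be reproduced in a few lines. This is a sketch of that calculation only; the function name is an assumption. It uses the same inputs as the comment (a true 0.900 goalie facing 2000 shots).

```python
import math

def binomial_luck_sd(shots, true_sv):
    """Return (variance of goals against, SD of goals against, SD of save percentage)."""
    variance = shots * true_sv * (1 - true_sv)
    sd_goals = math.sqrt(variance)
    return variance, sd_goals, sd_goals / shots

variance, sd_goals, sd_sv = binomial_luck_sd(2000, 0.900)
print(f"variance = {variance:.0f}, SD = {sd_goals:.1f} goals, {sd_sv:.4f} save%")
# prints: variance = 180, SD = 13.4 goals, 0.0067 save%
```

Calling `binomial_luck_sd(2000, 0.920)` gives about 12 goals and 0.006 of save percentage, matching the figures quoted earlier in the thread for a 0.920 goalie.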

MathMan,

Basically you’re asking "how do we know that it’s fair to assume goaltenders are subject to binomial variability the way coins are?" And the answer is: we don’t. We are making an assumption, but it’s an assumption that has empirically worked in modeling other sport skills. (Hitters in baseball, etc.)

Think of it this way: if we look at the distribution of a particular sport skill and find that it looks exactly like a chance distribution, it may not prove that the skill is all luck. However, if the distribution looks nothing like a chance distribution, that does prove that it can NOT be explained only by binomial randomness.

This gets into the philosophical realm i.e. Popper’s falsification theory that one can never prove something to be true but only prove something to be false. I don’t know, I guess at a certain point we all have to come up with our own baseline of what constitutes enough "proof" to accept something as being true.

by sunnymehta.com on Apr 16, 2010 12:18 PM EDT
