I originally wrote some of this last summer but never posted it, so the correlations below use the Blues' 2011-12 data.
Hopefully, most people are familiar with Corsi. Corsi On is another name for Corsi. It's the Corsi generated when a player is ON the ice. You can also calculate Corsi Off, which is the Corsi that happens on the ice while the player is OFF the ice. Corsi Rel is Corsi On – Corsi Off.
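Written out as formulas, these definitions look roughly like the following. This is a sketch that assumes the usual per-60-minute rate convention; the symbols CF and CA (shot attempts for and against) and T (ice time in minutes) are notation introduced here for illustration.

```latex
\begin{aligned}
\text{Corsi On}  &= 60 \cdot \frac{CF_{\text{on}} - CA_{\text{on}}}{T_{\text{on}}} \\
\text{Corsi Off} &= 60 \cdot \frac{CF_{\text{off}} - CA_{\text{off}}}{T_{\text{off}}} \\
\text{Corsi Rel} &= \text{Corsi On} - \text{Corsi Off}
\end{aligned}
```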
Silver Sevens said, "Corsi Rel is a relatively simplistic way of measuring how effective a player is in driving possession relative to the rest of his team. At its most basic level, the stat reflects a player's raw EV Corsi relative to the raw EV Corsi to the rest of the team when he is not on the ice." Pension Plan Puppets said, "Corsi Rel is another statistic that is meaningful here, as it indicates how solid a player's possession statistics are relative to those of their team-mates."
As an estimator of player ability, Corsi Rel has problems. It doesn't actually measure what it is supposed to measure. It depends on ice time. Its average is not zero. It's probably not the best estimator for player effect.
It doesn't measure what it is supposed to measure
"(Corsi Rel) reflects a player's raw EV Corsi relative to the raw EV Corsi to the rest of the team when he is not on the ice." While it is defined that way, that's not quite what it measures. The main driving force behind Corsi Rel is actually a relatively small number of teammates, in particular the ones who replace the player directly. I looked at the Blues 2011-2012 data and computed the correlation between each player's Corsi Rel and the Corsi On of each of his teammates, limited to players with 20 or more games. Logically, if the correlation between the two players is positive, they tend to be on the ice together. Conversely, if the correlation is negative, they tend not to be on the ice together. Patrik Berglund had 6 teammates who made a significant difference in his Corsi Rel. All have negative correlations.
| Teammate | Corr | p-value | signif |
| --- | --- | --- | --- |
| ARNOTT | -0.4081413 | 0.0003724 | *** |
| BACKES | -0.3956563 | 0.000234 | *** |
| COLAIACOVO | -0.146209 | 0.249 | |
| DAGOSTINI | 0.02325233 | 0.8662 | |
| JACKMAN | -0.1523873 | 0.1744 | |
| LANGENBRUNNER | -0.3800199 | 0.001175 | *** |
| MCDONALD | -0.1835629 | 0.3798 | |
| NICHOL | -0.2926929 | 0.008421 | *** |
| OSHIE | -0.08341926 | 0.4619 | |
| PIETRANGELO | -0.05893433 | 0.6012 | |
| POLAK | -0.05671474 | 0.6242 | |
| REAVES | -0.1837845 | 0.1598 | |
| SHATTENKIRK | -0.09566291 | 0.3956 | |
| SOBOTKA | -0.08237294 | 0.4884 | |
| STEEN | -0.3861834 | 0.01053 | *** |
| STEWART | 0.16631 | 0.143 | |
| HUSKINS | -0.1488693 | 0.4776 | |
| GRACHEV | -0.1452893 | 0.4788 | |
| PORTER | -0.4142457 | 0.003795 | *** |
| COLE | -0.1598588 | 0.4353 | |
| RUSSELL | -0.1231274 | 0.4315 | |
| PERRON | 0.2456533 | 0.06549 | |
| CROMBEEN | -0.2325226 | 0.1488 | |
Roman Polak had 5. Kris Russell was positive. The other four defensemen were negative.
| Teammate | Corr | p-value | signif |
| --- | --- | --- | --- |
| ARNOTT | -0.1193429 | 0.3361 | |
| BACKES | -0.1619794 | 0.1593 | |
| BERGLUND | -0.04074518 | 0.725 | |
| COLAIACOVO | -0.2784425 | 0.03273 | *** |
| DAGOSTINI | 0.02740952 | 0.8425 | |
| JACKMAN | -0.3232292 | 0.004399 | *** |
| LANGENBRUNNER | -0.148144 | 0.2389 | |
| MCDONALD | -0.3058454 | 0.1371 | |
| NICHOL | -0.1094956 | 0.3497 | |
| OSHIE | -0.1261966 | 0.2807 | |
| PIETRANGELO | -0.3003548 | 0.008384 | *** |
| REAVES | -0.1167986 | 0.3913 | |
| SHATTENKIRK | -0.4595753 | 2.964e-05 | *** |
| SOBOTKA | -0.1700962 | 0.1655 | |
| STEEN | -0.054706 | 0.7374 | |
| STEWART | 0.0343095 | 0.7717 | |
| HUSKINS | -0.06416549 | 0.7881 | |
| GRACHEV | -0.09853764 | 0.632 | |
| PORTER | -0.007140809 | 0.9629 | |
| COLE | -0.01482572 | 0.9452 | |
| RUSSELL | 0.3188754 | 0.04491 | *** |
| PERRON | -0.01604327 | 0.9101 | |
| CROMBEEN | -0.03349883 | 0.8396 | |
Most players on the Blues have between 3 and 6 teammates that matter. David Backes had the most at 8. Andy McDonald, who only played 26 games, had 2 (Nichol and Stewart, both negative associations). Kent Huskins, who like McDonald played 26 games, and Vladimir Sobotka each had 1 (for Huskins, a positive association with Crombeen; for Sobotka, a negative association with McDonald). Matt D'Agostini, despite playing 56 games, actually had none. Note that this is not symmetric: Sobotka's Corsi Rel was correlated with McDonald's Corsi On, but McDonald's Corsi Rel was not correlated with Sobotka's Corsi On.
So a player's Corsi Rel generally is not "relative to the rest of his team." It's relative to a small subset of the rest of the team. As a result, replacing a single player (due to injury or trade) may not make a substantial impact on team Corsi, but it could have a huge impact on a given player's Corsi Rel.
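The kind of pairwise check described in this section can be sketched in a few lines. This is only an illustration, not the code behind the tables: it assumes the correlation is a Pearson correlation over per-game values, and the numbers in `player_corsi_rel` and `teammate_corsi_on` are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical per-game values, invented purely for illustration.
# Each position is one game in which both players dressed.
player_corsi_rel = [4.0, -2.0, 7.0, 1.0, -5.0, 3.0]
teammate_corsi_on = [-3.0, 2.0, -6.0, 0.0, 5.0, -1.0]

r = pearson_r(player_corsi_rel, teammate_corsi_on)
t = t_statistic(r, len(player_corsi_rel))
print(f"r = {r:.3f}, t = {t:.2f}")
```

A p-value like the ones in the tables would come from comparing the t statistic to a t distribution with n - 2 degrees of freedom.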
Corsi Rel depends on ice time
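Imagine two players. Both are Corsi +6. Both play on teams that are Corsi 0. Player A plays 25% of his team's minutes, Player B plays 50% of his team's minutes. Because team Corsi is the ice-time-weighted average of Corsi On and Corsi Off, Player A's Corsi Off must be -2 and Player B's must be -6. Player A will have a Corsi Rel of +8. Player B will have a Corsi Rel of +12. This seems like undesirable behavior: two players with identical on-ice results get different ratings.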
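The arithmetic behind those two numbers can be checked with a short sketch. It assumes team Corsi is the ice-time-weighted average of Corsi On and Corsi Off; the function names are hypothetical.

```python
def corsi_off_from_team(corsi_on, corsi_team, ice_share):
    """Solve corsi_team = share * on + (1 - share) * off for off."""
    return (corsi_team - ice_share * corsi_on) / (1.0 - ice_share)

def corsi_rel(corsi_on, corsi_team, ice_share):
    """Corsi Rel = Corsi On - Corsi Off, with Corsi Off implied by the team total."""
    return corsi_on - corsi_off_from_team(corsi_on, corsi_team, ice_share)

# Both players are +6 on teams that are 0; only the share of ice time differs.
for name, share in [("Player A", 0.25), ("Player B", 0.50)]:
    print(name, corsi_rel(6.0, 0.0, share))
```

The same Corsi On and team Corsi produce a larger Corsi Rel as the ice-time share grows, because the player's own minutes are removed from a smaller and smaller "off" sample.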
The average Corsi Rel is not 0.0
I was trying to run a Monte Carlo using Corsi Rel QOC. I expected to get an average of 0.0, but I kept getting answers that were not. I rewrote my simulation a couple of times before I decided the problem wasn't the simulation but the input. Sure enough, when I looked at Gabe's tables, I got a weighted average for Corsi Rel of about 0.3. Mathematically, the average Corsi Rel isn't constrained to any particular value. If you set up the equation for Corsi Rel, multiply by Time On Ice, and sum over all players, you eventually get a term that looks like:
It took me a while to figure out the problem here. ForOff is not quite equal to AgOff. In some games the number of players on the benches is not equal between the two teams, so each on-ice event creates a different number of ForOff and AgOff events. Both terms total up to over a million, and the difference is less than 1000 total. However, that is still roughly 1 per player. The ratio of TimeOn/TimeOff is about 0.30, which (I suspect) then gives us the observed result.
As a result, any time you see a player utilization table using Corsi Rel, the zero point is not 0.0. It's more like 0.3.
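A sketch of how such a term can arise, worked from the definition of Corsi Rel rather than taken from the original figure. The names ForOn and AgOn are assumed by analogy with ForOff and AgOff, the index i runs over players, and the constant per-60 factor is dropped.

```latex
\begin{aligned}
\text{CorsiRel}_i &= \frac{\text{ForOn}_i - \text{AgOn}_i}{\text{TimeOn}_i}
                   - \frac{\text{ForOff}_i - \text{AgOff}_i}{\text{TimeOff}_i} \\
\sum_i \text{TimeOn}_i \cdot \text{CorsiRel}_i
  &= \sum_i \left(\text{ForOn}_i - \text{AgOn}_i\right)
   - \sum_i \frac{\text{TimeOn}_i}{\text{TimeOff}_i}\left(\text{ForOff}_i - \text{AgOff}_i\right)
\end{aligned}
```

If the first sum cancels across the league, what remains is the second sum, which is zero only if the ForOff and AgOff events balance.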
Corsi Rel is not the best estimator of player effect
Corsi Rel is trying to estimate the impact of a player on Corsi. So how would you actually estimate this? In essence, this is a one-way ANOVA. Your players constitute categorical independent variables. If you did the ANOVA, the average of each categorical variable would be compared to the global mean. Here the global mean is CorsiTeam. The best estimate of player effect is therefore CorsiOn – CorsiTeam. I call that Corsi Delta. If someone else has described this metric and called it something else, I'm sorry. I searched but could not find it described anywhere else.
Corsi Rel is a "biased" estimator of player effect (in the statistical sense of the word "bias"). Corsi Delta is "unbiased". Corsi Delta will always have a magnitude that is less than the magnitude of Corsi Rel.
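The relationship between the two magnitudes can be written out directly. This is a sketch that assumes team Corsi is the ice-time-weighted average of Corsi On and Corsi Off; the symbol f for the player's share of team ice time (between 0 and 1) is notation introduced here.

```latex
\begin{aligned}
\text{CorsiTeam}  &= f \cdot \text{CorsiOn} + (1 - f) \cdot \text{CorsiOff} \\
\text{CorsiDelta} &= \text{CorsiOn} - \text{CorsiTeam}
                   = (1 - f)\left(\text{CorsiOn} - \text{CorsiOff}\right)
                   = (1 - f) \cdot \text{CorsiRel}
\end{aligned}
```

For the two players in the ice-time example, this gives 0.75 × 8 = 6 and 0.50 × 12 = 6: the same Corsi Delta for the same on-ice results.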
Corsi Delta versus Corsi Rel
Corsi Delta does sum to 0.0 when weighted by time on ice. Corsi Delta is independent of ice time. Corsi Delta, despite being defined relative to the team as a whole, does still depend on a small number of teammates.
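A toy check of the first claim, under loud assumptions: the stints below are invented, each has only two skaters on the ice to keep the example short, rates are per 60 minutes, and `player_metrics` is a hypothetical helper rather than anything from a real stats package.

```python
# Hypothetical stints: (minutes, skaters on ice, Corsi for, Corsi against).
# Every stint has the same number of skaters, as at even strength.
STINTS = [
    (10.0, {"A", "B"}, 12, 8),
    (20.0, {"C", "D"}, 15, 18),
    (10.0, {"A", "C"}, 9, 9),
    (20.0, {"B", "D"}, 14, 20),
]

def player_metrics(stints):
    """Return per-player TOI, Corsi On, Corsi Off, Corsi Rel and Corsi Delta."""
    team_time = sum(minutes for minutes, _, _, _ in stints)
    team_diff = sum(cf - ca for _, _, cf, ca in stints)
    corsi_team = 60.0 * team_diff / team_time
    players = sorted(set().union(*(on_ice for _, on_ice, _, _ in stints)))
    metrics = {}
    for p in players:
        on_time = sum(m for m, on_ice, _, _ in stints if p in on_ice)
        on_diff = sum(cf - ca for _, on_ice, cf, ca in stints if p in on_ice)
        # Everything the team did without the player counts as "off".
        # (Assumes no player is on the ice for every stint.)
        off_time = team_time - on_time
        off_diff = team_diff - on_diff
        corsi_on = 60.0 * on_diff / on_time
        corsi_off = 60.0 * off_diff / off_time
        metrics[p] = {
            "toi": on_time,
            "on": corsi_on,
            "off": corsi_off,
            "rel": corsi_on - corsi_off,
            "delta": corsi_on - corsi_team,
        }
    return metrics

metrics = player_metrics(STINTS)
for name, m in metrics.items():
    print(name, m["toi"], m["rel"], m["delta"])

# TOI-weighted sum of Corsi Delta across the team.
weighted_delta = sum(m["toi"] * m["delta"] for m in metrics.values())
print("TOI-weighted Corsi Delta sum:", weighted_delta)
```

In this toy data the TOI-weighted Corsi Delta sums to zero, and each player's Corsi Delta is smaller in magnitude than his Corsi Rel.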