We know a lot about the Blues players without using WOWY. Looking at the Blues' Corsi results for last season, we see:
| Player | Corsi% |
|---|---|
| TARASENKO | 58.1 |
| STEEN | 57.0 |
| SOBOTKA | 56.9 |
| SCHWARTZ | 56.7 |
| BACKES | 55.5 |
| SHATTENKIRK | 55.3 |
| PIETRANGELO | 54.9 |
| BERGLUND | 54.6 |
| OSHIE | 54.5 |
| BOUWMEESTER | 53.9 |
| OTT | 53.7 |
| PAAJARVI | 53.4 |
| ROY | 53.2 |
| LEOPOLD | 53.1 |
| JACKMAN | 52.9 |
| COLAIACOVO | 52.4 |
| JASKIN | 52.3 |
| POLAK | 49.4 |
| CRACKNELL | 48.8 |
| STEWART | 48.4 |
| MORROW | 48.4 |
| COLE | 48.4 |
| PORTER | 47.2 |
| REAVES | 46.3 |
| LAPIERRE | 45.2 |
Tarasenko is the best Blues player, with Steen, Sobotka, and Schwartz not far behind. Lapierre is the worst Blues player, with Reaves and Porter also at the bottom of the table.
Without resorting to WOWY, we know that Tarasenko had to have a tendency to make the other Blues players better. His numbers were better than everybody else's, so your shifts with Tarasenko had to be better than your shifts without him (on average). Similarly, your shifts with Lapierre had to be worse than your shifts without him. Remember that WOWY isn't really With Or Without You. It's With You or With Some Other Guy (SOG). Tarasenko is, by definition, better than SOG, whoever that may be. The only place where WOWY would be of value is if there is some unsuspected synergy between two players.
For example, if you look at David Backes With/Without Patrik Berglund, it looks like there may be something there. Backes with Berglund is 68.2%. Backes without Berglund is 55.4%. Maybe Bergy has some synergy with Backes.
Card Shuffling Model
Let's use a card shuffling model for WOWY. David Backes had 1388 5v5 shifts in 2013-14. He had 1010 offensive Corsi events and 810 defensive Corsi events. Imagine that we write the results of each shift on cards, one shift to a card. We would have a ton of cards that say "Offense: 0 Defense: 0", a lot that say "Offense: 1 Defense: 0", a lot that say "Offense: 0 Defense: 1", etc., on up to 4 cards that say "Offense: 5 Defense: 0", and one card that says "Offense: 1 Defense: 7".
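As a quick check on those totals, here is a minimal sketch of the Corsi% arithmetic. It assumes Corsi% is simply offensive events divided by all events (the usual definition, which the text doesn't spell out), and the function name `corsi_pct` is my own.

```python
def corsi_pct(events_for: int, events_against: int) -> float:
    """Share of all Corsi events that went in the player's favor.

    Assumption: Corsi% = for / (for + against).
    """
    total = events_for + events_against
    if total == 0:
        raise ValueError("no Corsi events recorded")
    return events_for / total


# Backes' 2013-14 5v5 totals quoted above: 1010 offensive, 810 defensive.
backes = corsi_pct(1010, 810)
print(f"{backes:.1%}")  # 55.5%
```

That lines up with the 55.5% shown for Backes in the Corsi table.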
If Bergy has no effect on Backes, then when we shuffle the cards and deal them randomly, he should get some good cards and some bad cards. If Bergy has a positive effect on Backes, he should get more good cards than we can account for simply by random shuffling. If Bergy has a negative effect on Backes, he should get more bad cards than predicted.
It looks like Backes had about 35 shifts with Berglund and therefore about 1353 without. (The numbers are a little inexact because sometimes their shifts didn't begin and end at the same time.) To simulate Backes with and without Berglund, we can take the 1388 Backes cards and shuffle them up. Put 35 cards in the "With" pile and the other 1353 in the "Without" pile. Total up the Corsi events "With" and "Without". Repeat 10,000 times. What we get for Berglund is the following:
The lowest "With" we see is 24.4%. The highest "With" is 86.7%. The 95% Confidence Interval is 38.1% to 72.7%. Bergy isn't magic! The 68.2% we saw could easily just be randomness.
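The shuffle-and-deal procedure can be sketched roughly as follows. The real deck would hold the recorded result of each actual Backes shift, which isn't reproduced here, so this sketch builds a hypothetical stand-in deck by scattering his season totals over 1388 shifts at random. That is an assumption on my part, as are all the function names and the seed, so the numbers it prints will not match the ones quoted above.

```python
import random


def make_stand_in_deck(n_shifts, events_for, events_against, rng):
    """Hypothetical deck: scatter the season totals over shifts at random.

    The real deck would be one card per actual shift.
    """
    deck = [[0, 0] for _ in range(n_shifts)]
    for _ in range(events_for):
        deck[rng.randrange(n_shifts)][0] += 1
    for _ in range(events_against):
        deck[rng.randrange(n_shifts)][1] += 1
    return [tuple(card) for card in deck]


def shuffle_with_pct(deck, n_with, n_iter, rng):
    """Deal n_with random cards to the "With" pile, n_iter times.

    Returns the sorted "With" Corsi% of every deal; deals whose pile
    contains no events at all are skipped (Corsi% is undefined there).
    """
    results = []
    for _ in range(n_iter):
        pile = rng.sample(deck, n_with)
        cf = sum(card[0] for card in pile)
        ca = sum(card[1] for card in pile)
        if cf + ca > 0:
            results.append(cf / (cf + ca))
    results.sort()
    return results


def percentile(sorted_values, q):
    """Nearest-rank style percentile on an already sorted list."""
    idx = round(q * (len(sorted_values) - 1))
    return sorted_values[idx]


rng = random.Random(2014)  # fixed seed so the run is repeatable
deck = make_stand_in_deck(1388, 1010, 810, rng)
sims = shuffle_with_pct(deck, n_with=35, n_iter=10_000, rng=rng)
lo, hi = percentile(sims, 0.025), percentile(sims, 0.975)
print(f"min {sims[0]:.3f}  2.5% {lo:.3f}  97.5% {hi:.3f}  max {sims[-1]:.3f}")
print("observed 68.2% inside the interval:", lo <= 0.682 <= hi)
```

Only the "With" pile needs to be dealt explicitly, since the "Without" pile is just whatever cards are left over.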
Next up, Backes and Steen: 990 shifts/cards "With" and 398 "Without". Then Backes and Oshie: 880 shifts/cards "With" and 508 "Without". Repeat the process throughout the lineup. We get:
| Player | With/Without | Actual With | 0% | 2.5% | 97.5% | 100% |
|---|---|---|---|---|---|---|
| BACKES | BERGLUND | 0.682 | 0.244 | 0.381 | 0.727 | 0.867 |
| BACKES | BOUWMEESTER | 0.567 | 0.505 | 0.529 | 0.582 | 0.603 |
| BACKES | COLAIACOVO | 0.558 | 0.326 | 0.440 | 0.671 | 0.794 |
| BACKES | COLE | 0.506 | 0.396 | 0.470 | 0.639 | 0.731 |
| BACKES | CRACKNELL | 0.500 | 0.000 | 0.200 | 0.889 | 1.000 |
| BACKES | JACKMAN | 0.539 | 0.465 | 0.507 | 0.603 | 0.658 |
| BACKES | JASKIN | 0.778 | 0.000 | 0.273 | 0.824 | 1.000 |
| BACKES | LAPIERRE | 0.545 | 0.071 | 0.310 | 0.800 | 1.000 |
| BACKES | LEOPOLD | 0.562 | 0.379 | 0.456 | 0.651 | 0.709 |
| BACKES | MORROW | 0.468 | 0.293 | 0.407 | 0.703 | 0.830 |
| BACKES | OSHIE | 0.562 | 0.512 | 0.534 | 0.576 | 0.598 |
| BACKES | OTT | 0.481 | 0.315 | 0.444 | 0.664 | 0.811 |
| BACKES | PAAJARVI | 0.500 | 0.138 | 0.342 | 0.765 | 0.936 |
| BACKES | PIETRANGELO | 0.572 | 0.503 | 0.530 | 0.580 | 0.613 |
| BACKES | POLAK | 0.509 | 0.431 | 0.501 | 0.609 | 0.658 |
| BACKES | PORTER | 0.615 | 0.118 | 0.321 | 0.786 | 0.950 |
| BACKES | REAVES | 0.396 | 0.262 | 0.392 | 0.717 | 0.900 |
| BACKES | ROY | 0.624 | 0.325 | 0.438 | 0.674 | 0.790 |
| BACKES | SCHWARTZ | 0.596 | 0.474 | 0.510 | 0.600 | 0.640 |
| BACKES | SHATTENKIRK | 0.548 | 0.466 | 0.508 | 0.601 | 0.639 |
| BACKES | SOBOTKA | 0.450 | 0.207 | 0.372 | 0.737 | 0.893 |
| BACKES | STEEN | 0.554 | 0.523 | 0.538 | 0.573 | 0.589 |
| BACKES | STEWART | 0.468 | 0.397 | 0.479 | 0.633 | 0.698 |
| BACKES | TARASENKO | 0.710 | 0.277 | 0.406 | 0.702 | 0.827 |
In this table, 0% is the lowest value seen in the Monte Carlo, 2.5% is the lower limit of the 95% Confidence Interval, 97.5% is the upper limit of the 95% Confidence Interval, and 100% is the highest value seen. So when it comes to their effect on Backes, Stewart is just below the lower limit of what you would expect to get from random shuffling and Tarasenko just above the upper limit.
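The comparison against the interval is simple enough to sketch in a few lines. The rows below are copied from the Backes table; the function name `classify` and the choice to treat a value exactly on a boundary as "inside" are my own assumptions.

```python
def classify(actual: float, lower: float, upper: float) -> str:
    """Place an observed "With" Corsi% relative to the shuffled 95% interval."""
    if actual < lower:
        return "below"
    if actual > upper:
        return "above"
    return "inside"


# (Actual With, 2.5%, 97.5%) for a few of Backes' teammates, from the table.
rows = {
    "BERGLUND": (0.682, 0.381, 0.727),
    "STEEN": (0.554, 0.538, 0.573),
    "REAVES": (0.396, 0.392, 0.717),
    "STEWART": (0.468, 0.479, 0.633),
    "TARASENKO": (0.710, 0.406, 0.702),
}

for teammate, (actual, lower, upper) in rows.items():
    print(f"BACKES with {teammate}: {classify(actual, lower, upper)}")
```

Of these five, only Stewart and Tarasenko land outside the interval, and Reaves only just stays inside it.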
If we do this for all 25 Blues players and look at all 25x24 = 600 WOWY interactions, we get 38 pairs where the observed value is above the upper boundary of the 95% Confidence Interval and 45 pairs where the observed value is below the lower boundary of the 95% CI. (By chance alone we would expect a total of about 30 pairs, 5% of 600, to fall outside the CI.) The players who made their teammates better:
| Players | Number of Pairs |
|---|---|
| Berglund | 3 |
| Bouwmeester | 1 |
| Colaiacovo | 1 |
| Ott | 1 |
| Paajarvi | 1 |
| Pietrangelo | 2 |
| Roy | 3 |
| Schwartz | 3 |
| Shattenkirk | 3 |
| Sobotka | 6 |
| Steen | 7 |
| Tarasenko | 7 |
20 of the 38, slightly more than half, come from Tarasenko, Steen, and Sobotka. The players who made their teammates worse:
| Players | Number of Pairs |
|---|---|
| Backes | 1 |
| Bouwmeester | 1 |
| Cole | 5 |
| Cracknell | 1 |
| Jackman | 2 |
| Lapierre | 10 |
| Leopold | 1 |
| Morrow | 3 |
| Ott | 1 |
| Polak | 5 |
| Porter | 3 |
| Reaves | 5 |
| Stewart | 6 |
| Tarasenko | 1 |
A little less than half (21 of 45) come from Lapierre, Stewart, and Reaves. Add in Polak and they explain more than half. Amusingly, Tarasenko makes Lapierre worse.
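For anyone who wants to check the tallies, here is a small sketch that adds up the "made their teammates worse" counts and sets the total number of flagged pairs next to what chance alone would produce. The counts are copied from the table above; the variable names are mine, and the 5% figure treats the 600 pairs as if they were independent, which is only a rough assumption.

```python
# Pairs below the lower 95% boundary, by the player who was "With".
worse = {
    "Backes": 1, "Bouwmeester": 1, "Cole": 5, "Cracknell": 1,
    "Jackman": 2, "Lapierre": 10, "Leopold": 1, "Morrow": 3,
    "Ott": 1, "Polak": 5, "Porter": 3, "Reaves": 5,
    "Stewart": 6, "Tarasenko": 1,
}

total_worse = sum(worse.values())
top_three = worse["Lapierre"] + worse["Stewart"] + worse["Reaves"]
top_four = top_three + worse["Polak"]

n_players = 25
n_pairs = n_players * (n_players - 1)   # ordered pairs: 600
expected_outside = n_pairs * 5 / 100    # 5% outside a 95% interval
observed_outside = 38 + total_worse     # 38 above plus those below

print(f"worse pairs: {total_worse}")
print(f"Lapierre/Stewart/Reaves share: {top_three / total_worse:.0%}")
print(f"adding Polak: {top_four / total_worse:.0%}")
print(f"expected outside CI: {expected_outside:.0f}, observed: {observed_outside}")
```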
Sample size issues
Bouwmeester makes Cracknell "better" based on 72 events (41m 27s). Paajarvi makes Roy "better" based on 132 events (75m 37s). Bouwmeester made Jackman "worse" based on 54 events (28m 38s). Tarasenko and Lapierre had 50 events together in 29m 12s. Pretty much all of the "significant" effects are based on very limited sample sizes.
Talent is persistent
If Bouwmeester truly makes Cracknell better, it ought to happen year after year. Let me define WDiff as (Corsi With) – (Corsi Without). In the Backes/Berglund example above, WDiff is 68.2% - 55.5% = 12.7%, or 0.127. Looking back a year at the available pairs:
| Player | With/Without | 2013-14 WDiff | 2012-13 WDiff |
|---|---|---|---|
| COLAIACOVO | BERGLUND | 0.160 | NA |
| OTT | BERGLUND | 0.096 | NA |
| SHATTENKIRK | BERGLUND | 0.060 | 0.000 |
| CRACKNELL | BOUWMEESTER | 0.173 | -0.095 |
| BERGLUND | COLAIACOVO | 0.108 | NA |
| BERGLUND | OTT | 0.078 | NA |
| ROY | PAAJARVI | 0.114 | NA |
| BOUWMEESTER | PIETRANGELO | 0.095 | 0.099 |
| PORTER | PIETRANGELO | 0.141 | 0.036 |
| LAPIERRE | ROY | 0.156 | NA |
| PAAJARVI | ROY | 0.120 | NA |
| STEEN | ROY | 0.169 | NA |
| JACKMAN | SCHWARTZ | 0.061 | -0.003 |
| MORROW | SCHWARTZ | 0.172 | NA |
| POLAK | SCHWARTZ | 0.088 | -0.055 |
| BERGLUND | SHATTENKIRK | 0.076 | 0.099 |
| JACKMAN | SHATTENKIRK | 0.056 | 0.030 |
| ROY | SHATTENKIRK | 0.084 | NA |
| COLE | SOBOTKA | 0.120 | 0.147 |
| MORROW | SOBOTKA | 0.125 | NA |
| POLAK | SOBOTKA | 0.099 | 0.063 |
| SHATTENKIRK | SOBOTKA | 0.067 | 0.007 |
| STEWART | SOBOTKA | 0.114 | 0.026 |
| TARASENKO | SOBOTKA | 0.070 | 0.081 |
| BERGLUND | STEEN | 0.135 | No events |
| BOUWMEESTER | STEEN | 0.061 | 0.131 |
| OSHIE | STEEN | 0.064 | -0.058 |
| PIETRANGELO | STEEN | 0.061 | -0.023 |
| POLAK | STEEN | 0.072 | -0.215 |
| ROY | STEEN | 0.206 | NA |
| TARASENKO | STEEN | 0.141 | -0.021 |
| BACKES | TARASENKO | 0.160 | 0.042 |
| BERGLUND | TARASENKO | 0.062 | 0.101 |
| JACKMAN | TARASENKO | 0.105 | 0.097 |
| PIETRANGELO | TARASENKO | 0.060 | 0.033 |
| SHATTENKIRK | TARASENKO | 0.091 | 0.036 |
| SOBOTKA | TARASENKO | 0.092 | -0.093 |
| STEEN | TARASENKO | 0.152 | 0.057 |
So a lot of these "effects" look to be merely random fluctuations.
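To make the year-over-year comparison concrete, here is a minimal sketch of WDiff and a crude persistence check: did the effect at least point in the same direction the season before? The function names are mine, the same-sign test is just one simple way to look at it and not something the analysis above relies on, and the rows are a handful copied from the table, with `None` standing in for NA.

```python
from typing import Optional


def wdiff(corsi_with: float, corsi_without: float) -> float:
    """WDiff = (Corsi With) - (Corsi Without), as fractions."""
    return corsi_with - corsi_without


def same_direction(current: float, previous: Optional[float]) -> Optional[bool]:
    """True if last season's WDiff has the same sign; None if there is no data."""
    if previous is None:
        return None
    return (current > 0) == (previous > 0)


# The Backes/Berglund figures used in the WDiff definition.
print(round(wdiff(0.682, 0.555), 3))  # 0.127

# (Player, With/Without, 2013-14 WDiff, 2012-13 WDiff) from the table.
pairs = [
    ("CRACKNELL", "BOUWMEESTER", 0.173, -0.095),
    ("POLAK", "STEEN", 0.072, -0.215),
    ("BOUWMEESTER", "PIETRANGELO", 0.095, 0.099),
    ("COLE", "SOBOTKA", 0.120, 0.147),
    ("COLAIACOVO", "BERGLUND", 0.160, None),
]

for player, partner, current, previous in pairs:
    print(player, "with", partner, "->", same_direction(current, previous))
```

A real talent effect should show up as `True` far more often than not. Pairs like Cracknell/Bouwmeester and Polak/Steen flip sign entirely.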
A problem with this model
An issue I have with this model is that I think it substantially underestimates the variability in these systems. As a result, this model suggests that some player A/player B pairs show a statistically significant effect when really there is nothing there. I will get into why this model underestimates the variability in part 4 (so you have that to look forward to!).
So let's take a different approach. Doctors like placebos. ("Placebo. So powerful, it's the standard by which medications are tested.") In part 2, I look at WOWY using a form of placebo.