WOWY Is Worthless: Part 1

WOWY is pretty worthless. It's things we know from Corsi plus a big dose of statistical noise. In Part 1, I use a card shuffling model to look at the amount of randomness in WOWY.

Pick a card, any card
Dennis Galloway

We know a lot about the Blues players without using WOWY. Looking at the Blues Corsi results for last season we see:

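| Player | Corsi% |
| --- | --- |
| TARASENKO | 58.1 |
| STEEN | 57.0 |
| SOBOTKA | 56.9 |
| SCHWARTZ | 56.7 |
| BACKES | 55.5 |
| SHATTENKIRK | 55.3 |
| PIETRANGELO | 54.9 |
| BERGLUND | 54.6 |
| OSHIE | 54.5 |
| BOUWMEESTER | 53.9 |
| OTT | 53.7 |
| PAAJARVI | 53.4 |
| ROY | 53.2 |
| LEOPOLD | 53.1 |
| JACKMAN | 52.9 |
| COLAIACOVO | 52.4 |
| JASKIN | 52.3 |
| POLAK | 49.4 |
| CRACKNELL | 48.8 |
| STEWART | 48.4 |
| MORROW | 48.4 |
| COLE | 48.4 |
| PORTER | 47.2 |
| REAVES | 46.3 |
| LAPIERRE | 45.2 |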

Tarasenko is the best Blues player, with Steen, Sobotka, and Schwartz not far behind. Lapierre is the worst Blues player, with Reaves and Porter also at the bottom of the table.

Without resorting to WOWY, we know that Tarasenko had to have a tendency to make the other Blues players better. His numbers were better than everybody else's, so your shifts with Tarasenko had to be better, on average, than your shifts without him. Similarly, your shifts with Lapierre had to be worse than your shifts without him. Remember that WOWY isn't really With Or Without You. It's With You or With Some Other Guy. Tarasenko is, by definition, better than SOG, whoever that may be. The only place where WOWY would be of value is if there is some unsuspected synergy between two players.

For example, if you look at David Backes With/Without Patrik Berglund, it looks like there may be something there. Backes with Berglund is 68.2%. Backes without Berglund is 55.4%. Maybe Bergy has some synergy with Backes.

Card Shuffling Model

Let's use a card shuffling model for WOWY. David Backes had 1388 5v5 shifts in 2013-14. He had 1010 offensive Corsi events and 810 defensive Corsi events. Imagine that we write the results of each shift on cards, one shift to a card. We would have a ton of cards that say "Offense: 0 Defense: 0", a lot that say "Offense: 1 Defense: 0", a lot that say "Offense: 0 Defense: 1", etc., on up to 4 cards that say "Offense: 5 Defense: 0", and one card that says "Offense: 1 Defense: 7".
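As a quick sanity check on the deck, the cards have to add up to Backes's season Corsi. Here is a minimal Python sketch, assuming Corsi% is simply offensive events divided by all events (the function name `corsi_pct` is mine, purely for illustration):

```python
def corsi_pct(events_for, events_against):
    """Share of all Corsi events that went in the player's favor."""
    total = events_for + events_against
    if total == 0:
        raise ValueError("no Corsi events recorded")
    return events_for / total


# Backes, 5v5, 2013-14: the totals quoted in the text
backes = corsi_pct(1010, 810)
print(f"{backes:.1%}")  # prints 55.5%
```

That matches the 55.5 next to Backes in the Corsi table, so the full deck and the season number agree.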

If Bergy has no effect on Backes, then when we shuffle the cards and deal them randomly, he should get some good cards and some bad cards. If Bergy has a positive effect on Backes, he should get more good cards than we can account for simply by random shuffling. If Bergy has a negative effect on Backes, he should get more bad cards than predicted.

It looks like Backes had about 35 shifts with Berglund and therefore about 1353 without. (The numbers are a little inexact because sometimes their shifts didn't begin and end at the same time.) To simulate Backes with and without Berglund, we can take the 1388 Backes cards and shuffle them up. Put 35 cards in the "With" pile and the other 1353 in the "Without" pile. Total up the Corsi events "With" and "Without". Repeat 10,000 times. What we get for Berglund is the following:

The lowest "With" we see is 24.4%. The highest "With" is 86.7%. The 95% Confidence Interval is 38.1% to 72.7%. Bergy isn't magic! The 68.2% we saw could easily just be randomness.
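For anyone who wants to try the shuffle themselves, here is a rough Python sketch of the procedure. One loud caveat: I don't reproduce the real shift-by-shift cards here, so `make_hypothetical_deck` just scatters the 1010 offensive and 810 defensive events at random across 1388 shifts. That stand-in deck, the function names, and the random seed are all assumptions for illustration, so the interval it prints will not match the numbers above exactly.

```python
import numpy as np


def make_hypothetical_deck(n_shifts, n_for, n_against, rng):
    """Build a stand-in deck: one row per shift, columns = (offense, defense).

    ASSUMPTION: events are scattered uniformly at random over shifts.
    The real deck would come from actual shift data.
    """
    p = np.full(n_shifts, 1.0 / n_shifts)
    offense = rng.multinomial(n_for, p)
    defense = rng.multinomial(n_against, p)
    return np.column_stack([offense, defense])


def shuffle_with_pile(deck, n_with, n_trials, rng):
    """Deal n_with random cards into the 'With' pile, n_trials times.

    Returns the 'With' Corsi fraction for each trial (NaN if the pile
    happens to contain no events at all).
    """
    results = np.full(n_trials, np.nan)
    for i in range(n_trials):
        pile = deck[rng.choice(len(deck), size=n_with, replace=False)]
        cf, ca = pile[:, 0].sum(), pile[:, 1].sum()
        if cf + ca > 0:
            results[i] = cf / (cf + ca)
    return results


rng = np.random.default_rng(0)
deck = make_hypothetical_deck(1388, 1010, 810, rng)
with_pct = shuffle_with_pile(deck, n_with=35, n_trials=10_000, rng=rng)

low, high = np.nanpercentile(with_pct, [2.5, 97.5])
print(f"min {np.nanmin(with_pct):.3f}, max {np.nanmax(with_pct):.3f}")
print(f"95% interval for 'With': {low:.3f} to {high:.3f}")
```

Swapping in a different `n_with` (990 for Steen, 880 for Oshie) shows how quickly the interval tightens as the "With" pile grows.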

Next up, Backes and Steen: 990 shifts/cards with Steen and 398 without. Then Backes and Oshie: 880 shifts/cards "With" and 508 "Without". Repeat the process throughout the lineup. We get:

| Player | With/Without | Actual With | 0% | 2.5% | 97.5% | 100% |
| --- | --- | --- | --- | --- | --- | --- |
| BACKES | BERGLUND | 0.682 | 0.244 | 0.381 | 0.727 | 0.867 |
| BACKES | BOUWMEESTER | 0.567 | 0.505 | 0.529 | 0.582 | 0.603 |
| BACKES | COLAIACOVO | 0.558 | 0.326 | 0.440 | 0.671 | 0.794 |
| BACKES | COLE | 0.506 | 0.396 | 0.470 | 0.639 | 0.731 |
| BACKES | CRACKNELL | 0.500 | 0.000 | 0.200 | 0.889 | 1.000 |
| BACKES | JACKMAN | 0.539 | 0.465 | 0.507 | 0.603 | 0.658 |
| BACKES | JASKIN | 0.778 | 0.000 | 0.273 | 0.824 | 1.000 |
| BACKES | LAPIERRE | 0.545 | 0.071 | 0.310 | 0.800 | 1.000 |
| BACKES | LEOPOLD | 0.562 | 0.379 | 0.456 | 0.651 | 0.709 |
| BACKES | MORROW | 0.468 | 0.293 | 0.407 | 0.703 | 0.830 |
| BACKES | OSHIE | 0.562 | 0.512 | 0.534 | 0.576 | 0.598 |
| BACKES | OTT | 0.481 | 0.315 | 0.444 | 0.664 | 0.811 |
| BACKES | PAAJARVI | 0.500 | 0.138 | 0.342 | 0.765 | 0.936 |
| BACKES | PIETRANGELO | 0.572 | 0.503 | 0.530 | 0.580 | 0.613 |
| BACKES | POLAK | 0.509 | 0.431 | 0.501 | 0.609 | 0.658 |
| BACKES | PORTER | 0.615 | 0.118 | 0.321 | 0.786 | 0.950 |
| BACKES | REAVES | 0.396 | 0.262 | 0.392 | 0.717 | 0.900 |
| BACKES | ROY | 0.624 | 0.325 | 0.438 | 0.674 | 0.790 |
| BACKES | SCHWARTZ | 0.596 | 0.474 | 0.510 | 0.600 | 0.640 |
| BACKES | SHATTENKIRK | 0.548 | 0.466 | 0.508 | 0.601 | 0.639 |
| BACKES | SOBOTKA | 0.450 | 0.207 | 0.372 | 0.737 | 0.893 |
| BACKES | STEEN | 0.554 | 0.523 | 0.538 | 0.573 | 0.589 |
| BACKES | STEWART | 0.468 | 0.397 | 0.479 | 0.633 | 0.698 |
| BACKES | TARASENKO | 0.710 | 0.277 | 0.406 | 0.702 | 0.827 |

In this table, 0% is the lowest value seen in the Monte Carlo, 2.5% is the lower limit of the 95% Confidence Interval, 97.5% is the upper limit of the 95% Confidence Interval, and 100% is the highest value seen. So when it comes to their effect on Backes, Stewart is just below the lower limit of what you would expect to get from random shuffling and Tarasenko just above the upper limit.
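Reading the table this way is mechanical, so here is a small Python sketch of the check, using three rows copied from the table above (the helper name `classify` is mine, for illustration):

```python
def classify(observed, lower, upper):
    """Compare an observed 'With' Corsi to the shuffled 2.5%-97.5% band."""
    if observed > upper:
        return "above"
    if observed < lower:
        return "below"
    return "inside"


# (Actual With, 2.5%, 97.5%) for Backes with each teammate, from the table
rows = {
    "BERGLUND": (0.682, 0.381, 0.727),
    "STEWART": (0.468, 0.479, 0.633),
    "TARASENKO": (0.710, 0.406, 0.702),
}

flags = {name: classify(*values) for name, values in rows.items()}
for name, flag in flags.items():
    print(f"BACKES with {name}: {flag}")
```

Berglund lands inside the band, Stewart below it, and Tarasenko above it, which is the same reading given in the paragraph above.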

If we do this for all 25 Blues players and look at all 25x24 = 600 WOWY interactions, we get 38 pairs where the observed value is above the upper boundary of the 95% Confidence Interval and 45 pairs where the observed value is below the lower boundary of the 95% CI. (By chance alone, we would expect about 30 pairs, 5% of 600, to fall outside the CI.) The players who made their teammates better:

| Players | Number of Pairs |
| --- | --- |
| Berglund | 3 |
| Bouwmeester | 1 |
| Colaiacovo | 1 |
| Ott | 1 |
| Paajarvi | 1 |
| Pietrangelo | 2 |
| Roy | 3 |
| Schwartz | 3 |
| Sobotka | 6 |
| Steen | 7 |
| Tarasenko | 7 |

20 of the 38, slightly more than half, come from Tarasenko, Steen, and Sobotka. The players who made their teammates worse:

| Players | Number of Pairs |
| --- | --- |
| Backes | 1 |
| Bouwmeester | 1 |
| Cole | 5 |
| Cracknell | 1 |
| Jackman | 2 |
| Lapierre | 10 |
| Leopold | 1 |
| Morrow | 3 |
| Ott | 1 |
| Polak | 5 |
| Porter | 3 |
| Reaves | 5 |
| Stewart | 6 |
| Tarasenko | 1 |

A little less than half (21 of 45) come from Lapierre, Stewart, and Reaves. Add in Polak and they explain more than half. Amusingly, Tarasenko makes Lapierre worse.
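The bookkeeping in this section is easy to check. Here is a short Python sketch that tallies the "made their teammates worse" table and compares the number of flagged pairs with what a 95% interval would flag by chance. The variable names are mine, and the 5% figure is the nominal rate only.

```python
n_players = 25
n_pairs = n_players * (n_players - 1)  # ordered pairs: A with/without B

expected_outside = 0.05 * n_pairs  # nominal false-alarm count for a 95% CI
observed_outside = 38 + 45         # pairs above + pairs below, from the text

# Counts copied from the "made their teammates worse" table
worse = {
    "Backes": 1, "Bouwmeester": 1, "Cole": 5, "Cracknell": 1,
    "Jackman": 2, "Lapierre": 10, "Leopold": 1, "Morrow": 3,
    "Ott": 1, "Polak": 5, "Porter": 3, "Reaves": 5,
    "Stewart": 6, "Tarasenko": 1,
}
total_worse = sum(worse.values())
bottom_three = worse["Lapierre"] + worse["Stewart"] + worse["Reaves"]
with_polak = bottom_three + worse["Polak"]

print(f"pairs: {n_pairs}, expected outside: {expected_outside:.0f}, "
      f"observed outside: {observed_outside}")
print(f"Lapierre/Stewart/Reaves: {bottom_three} of {total_worse} "
      f"({bottom_three / total_worse:.0%})")
print(f"adding Polak: {with_polak} of {total_worse} "
      f"({with_polak / total_worse:.0%})")
```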

Sample size issues

Bouwmeester makes Cracknell "better" based on 72 events (41m 27s). Paajarvi makes Roy "better" based on 132 events (75m 37s). Bouwmeester made Jackman "worse" based on 54 events (28m 38s). Tarasenko and Lapierre had 50 events together in 29m 12s. Pretty much all of the "significant" effects are based on very limited sample sizes.

Talent is persistent

If Bouwmeester truly makes Cracknell better, it ought to happen year after year. Let me define WDiff as (Corsi With) - (Corsi Without). In the Backes/Berglund example above, WDiff is 68.2% - 55.5% = 12.7%, or 0.127. Looking back a year at the available pairs:

| Player | With/Without | 2013-14 WDiff | 2012-13 WDiff |
| --- | --- | --- | --- |
| COLAIACOVO | BERGLUND | 0.160 | NA |
| OTT | BERGLUND | 0.096 | NA |
| SHATTENKIRK | BERGLUND | 0.060 | 0.000 |
| CRACKNELL | BOUWMEESTER | 0.173 | -0.095 |
| BERGLUND | COLAIACOVO | 0.108 | NA |
| BERGLUND | OTT | 0.078 | NA |
| ROY | PAAJARVI | 0.114 | NA |
| BOUWMEESTER | PIETRANGELO | 0.095 | 0.099 |
| PORTER | PIETRANGELO | 0.141 | 0.036 |
| LAPIERRE | ROY | 0.156 | NA |
| PAAJARVI | ROY | 0.120 | NA |
| STEEN | ROY | 0.169 | NA |
| JACKMAN | SCHWARTZ | 0.061 | -0.003 |
| MORROW | SCHWARTZ | 0.172 | NA |
| POLAK | SCHWARTZ | 0.088 | -0.055 |
| BERGLUND | SHATTENKIRK | 0.076 | 0.099 |
| JACKMAN | SHATTENKIRK | 0.056 | 0.030 |
| ROY | SHATTENKIRK | 0.084 | NA |
| COLE | SOBOTKA | 0.120 | 0.147 |
| MORROW | SOBOTKA | 0.125 | NA |
| POLAK | SOBOTKA | 0.099 | 0.063 |
| SHATTENKIRK | SOBOTKA | 0.067 | 0.007 |
| STEWART | SOBOTKA | 0.114 | 0.026 |
| TARASENKO | SOBOTKA | 0.070 | 0.081 |
| BERGLUND | STEEN | 0.135 | No events |
| BOUWMEESTER | STEEN | 0.061 | 0.131 |
| OSHIE | STEEN | 0.064 | -0.058 |
| PIETRANGELO | STEEN | 0.061 | -0.023 |
| POLAK | STEEN | 0.072 | -0.215 |
| ROY | STEEN | 0.206 | NA |
| TARASENKO | STEEN | 0.141 | -0.021 |
| BACKES | TARASENKO | 0.160 | 0.042 |
| BERGLUND | TARASENKO | 0.062 | 0.101 |
| JACKMAN | TARASENKO | 0.105 | 0.097 |
| PIETRANGELO | TARASENKO | 0.060 | 0.033 |
| SHATTENKIRK | TARASENKO | 0.091 | 0.036 |
| SOBOTKA | TARASENKO | 0.092 | -0.093 |
| STEEN | TARASENKO | 0.152 | 0.057 |

So a lot of these "effects" look to be merely random fluctuations.
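To put a number on that impression, here is a Python sketch that takes the pairs from the table with a WDiff in both seasons and counts how many flipped sign. The `wdiff` helper and the variable names are mine, and the data is copied straight from the table (rows marked NA or "No events" are left out).

```python
def wdiff(corsi_with, corsi_without):
    """WDiff as defined in the text: (Corsi With) - (Corsi Without)."""
    return corsi_with - corsi_without


# The worked example from the text: 68.2% - 55.5%
example = wdiff(0.682, 0.555)

# (player, with/without, 2013-14 WDiff, 2012-13 WDiff)
pairs = [
    ("SHATTENKIRK", "BERGLUND", 0.060, 0.000),
    ("CRACKNELL", "BOUWMEESTER", 0.173, -0.095),
    ("BOUWMEESTER", "PIETRANGELO", 0.095, 0.099),
    ("PORTER", "PIETRANGELO", 0.141, 0.036),
    ("JACKMAN", "SCHWARTZ", 0.061, -0.003),
    ("POLAK", "SCHWARTZ", 0.088, -0.055),
    ("BERGLUND", "SHATTENKIRK", 0.076, 0.099),
    ("JACKMAN", "SHATTENKIRK", 0.056, 0.030),
    ("COLE", "SOBOTKA", 0.120, 0.147),
    ("POLAK", "SOBOTKA", 0.099, 0.063),
    ("SHATTENKIRK", "SOBOTKA", 0.067, 0.007),
    ("STEWART", "SOBOTKA", 0.114, 0.026),
    ("TARASENKO", "SOBOTKA", 0.070, 0.081),
    ("BOUWMEESTER", "STEEN", 0.061, 0.131),
    ("OSHIE", "STEEN", 0.064, -0.058),
    ("PIETRANGELO", "STEEN", 0.061, -0.023),
    ("POLAK", "STEEN", 0.072, -0.215),
    ("TARASENKO", "STEEN", 0.141, -0.021),
    ("BACKES", "TARASENKO", 0.160, 0.042),
    ("BERGLUND", "TARASENKO", 0.062, 0.101),
    ("JACKMAN", "TARASENKO", 0.105, 0.097),
    ("PIETRANGELO", "TARASENKO", 0.060, 0.033),
    ("SHATTENKIRK", "TARASENKO", 0.091, 0.036),
    ("SOBOTKA", "TARASENKO", 0.092, -0.093),
    ("STEEN", "TARASENKO", 0.152, 0.057),
]

# Every 2013-14 value is positive, so a negative 2012-13 value is a sign flip
flipped = [(a, b) for a, b, now, prior in pairs if prior < 0]
mean_now = sum(now for _, _, now, _ in pairs) / len(pairs)
mean_prior = sum(prior for _, _, _, prior in pairs) / len(pairs)

print(f"{len(flipped)} of {len(pairs)} pairs had a negative WDiff a year earlier")
print(f"mean WDiff: {mean_now:.3f} (2013-14) vs {mean_prior:.3f} (2012-13)")
```

If the effects were real and persistent, the earlier season's values should look like this season's rather than shrinking toward zero or changing sign.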

A problem with this model

An issue I have with this model is that I think it substantially underestimates the variability in these systems. As a result, this model suggests that some player A/player B pairs show a statistically significant effect when really there is nothing there. I will get into why this model underestimates the variability in Part 4 (so you have that to look forward to!).

So let's take a different approach. Doctors like placebos. ("Placebo. So powerful, it's the standard by which medications are tested.") In Part 2, I look at WOWY using a form of placebo.