Welcome to the Blueshirts Breakaway Hockey Lexicon! Its purpose is to serve as a one-stop resource for anyone looking to learn more about hockey analysis in general or specific “fancy stats.” We created this comprehensive resource for individuals looking to understand the key concepts behind hockey analytics, as well as the specific metrics and models available to the public. This resource also provides information and links to key analytical resources you can access for free online, as well as a list of extremely smart individuals that everyone should follow on Twitter to learn more.
Our goal with this resource is not only to provide definitions for the many advanced stats and concepts available, but also to give context on why the stats are important and how you can use them. All of the specific stats used in this piece are courtesy of Corsica, unless otherwise stated.
We also created a Metrics Glossary, which serves as an abbreviated version of this resource and contains only the terms and definitions. The Metrics Glossary is also incorporated throughout the site, so that any term used in an article will be linked to a little flyout section containing the definition.
The content of the Hockey Lexicon and Metrics Glossary was created and is maintained by Drew Way. The design and implementation of the resources throughout the site are courtesy of Dean Chutis.
Key Concepts
5V5 PRODUCTION
5v5 statistics, quite simply, are statistics amassed while both teams have five skaters on the ice. Hockey analysts don’t agree on much, but one thing that most tend to acknowledge is the fact that 5v5 production is much more representative of a player’s true ability than total statistics. Statistics that represent all situations can easily become bloated (or depressed) based on a player’s usage on the power play and penalty kill. Further, we specify 5v5 as opposed to all even strength play in order to strip away empty net goals and situations where both teams are a man down, which can pad a player’s statistics.
This is not to say that you should just throw away non-5v5 data; any good analyst will tell you that you never throw away good data. However, for the purposes of player analysis, particularly when discussing shot quality and possession metrics, it is best to attempt to remove any factors that may be artificially skewing the data one way or another.
Michael Grabner’s 2017-2018 season serves as an excellent test case for why 5v5 production is important. Through the first 44 games of the 2017-2018 season, Grabner led the team with 19 goals in all situations, which is approximately a 36-goal pace! However, when you look at the 5v5 data, you quickly see that his goal total was being inflated by his proclivity to score empty net goals (which any Ranger fan knows is not something the Rangers have been particularly adept at the past few seasons, so it is a valuable skill). Grabner still led the team in 5v5 goals, but his total drops all the way down from 19 to 12 due to the fact that he had scored 6 empty netters and one at 4v4.
This isn’t to say that Grabner’s non-5v5 production isn’t important, because it most certainly is. However, when you are evaluating league goal scorers, it would be inappropriate to think Grabner is an equivalent goal scorer to the likes of Auston Matthews or Vladimir Tarasenko, both of whom also had 19 goals at the time of writing this.
DEPLOYMENT
Whenever you analyze a player, whether you are using the eye test or advanced analytics, it is important to understand the role in which the player is being used by the coach. Is the player a top-6 scoring wing, whom the coach throws onto the ice for as many offensive zone faceoffs as possible? Is he a shutdown center that the coach throws onto the ice against the opposing team’s top line? Is he a third-pairing defenseman that gets sheltered by the coach, or in layman’s terms, a player who is primarily used in the offensive zone and is only on the ice against weaker opposition lines?
Understanding a player’s role on the team and how he is deployed by the coach is critical to understanding how well he is actually playing. If a forward is commonly used in a defensive role, and his primary responsibility is to prevent the opposing top line from scoring, it would be irresponsible for us as fans to expect him to put up lofty point totals. Conversely, a defenseman who is primarily used against weaker opponents and in the offensive zone (i.e. sheltered) is likely to put up inflated shot metrics and point totals compared to a defenseman tasked with shutting down the opposition’s best players.
There are a number of components that must be considered to fully understand how a player is being deployed:
Zone Starts – Simply put, a zone start is any time a player is on the ice for a faceoff. We all know hockey is a fluid sport, with line changes happening all the time during play. This makes it difficult to track how exactly a player is deployed, particularly due to the fact that line changes that occur during an active play often are only partial changes, so the lines get temporarily scrambled and a player might be out in a situation that the coach didn’t necessarily plan. Because of this, and the general chaotic nature of line changes at times, it isn’t all that valuable to track player deployment in terms of the situation every time they touch the ice.
The reason analysts use zone starts to measure deployment is that, with the exception of faceoffs following icings (when the team that iced the puck is not allowed to change its players), a coach has the ability to specify exactly which players he wants on the ice for the faceoff. Coaches take a number of variables into account when deciding whom to deploy on a faceoff, including the state of the game and the zone the puck is in (offensive, neutral or defensive). Further, the home team has the benefit of “last change,” which means they can wait for the away team to choose whom to deploy before they choose, allowing the home team to also consider the opponents on the ice when making their own choice.
When analyzing zone starts, data sites such as Corsica provide statistics that show the number of zone starts or percentage of zone starts a player takes in each of the three zones: offensive, neutral or defensive. Typically, if a player has a higher percentage of offensive zone starts than defensive, it means the coach views him as a player he wants on the ice in offensive situations, most likely because he has a relatively strong ability to score or playmake compared to his other teammates. Conversely, if a player has a disproportionately large number of defensive zone starts, it means the coach views that player as a defensive-oriented player, whom he trusts to prevent the opposition from scoring after a defensive zone faceoff.
Using the 2016-2017 Rangers as an example, when looking at the zone starts, we see that head coach Alain Vigneault viewed Brady Skjei and Chris Kreider as players he wanted on the ice in more offensive situations, as they were the only two players with a minimum of 500 minutes played with over 35% of their zone starts coming in the offensive zone. Conversely, Josh Jooris, Kevin Hayes and Nick Holden were all viewed by the coach as defensive-first players, as each had over 37% of their zone starts in the defensive end of the ice.
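If you want to compute zone start percentages yourself, the arithmetic is simple. Below is a minimal Python sketch; the function name and the faceoff counts are made up for illustration and are not Corsica’s code or any real player’s data.

```python
def zone_start_shares(offensive: int, neutral: int, defensive: int) -> dict:
    """Return the percentage of a player's zone starts taken in each zone."""
    total = offensive + neutral + defensive
    if total == 0:
        raise ValueError("player has no recorded zone starts")
    return {
        "offensive": 100.0 * offensive / total,
        "neutral": 100.0 * neutral / total,
        "defensive": 100.0 * defensive / total,
    }


# Hypothetical faceoff counts, not real data for any player.
shares = zone_start_shares(offensive=380, neutral=340, defensive=280)
for zone, pct in shares.items():
    print(f"{zone}: {pct:.1f}%")
```

A player with this split would look offensively deployed: a noticeably larger share of his starts come in the offensive zone than in the defensive zone.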
Quality of Teammates – The reason why looking at the quality of teammates that a player shares large amounts of ice time with is simple: good players make those around them better. This is a very simple concept that we hear across all team sports. There is much debate within analytical communities across all sports regarding how to quantify just how much a player can improve the play of those around him, but nearly all analysts agree that the quality of the players that an individual plays with will certainly have an impact on his play. In fact, Connor Tompkins wrote a Hockey-Graphs article a few years back that concluded that “quality of teammate effects are observable in a full season sample size.” In layman’s terms, his study concluded beyond a reasonable doubt that the data shows that the quality of teammates an individual plays with has a measurable impact on his play and production.
WOWY analysis is one way you can view this impact. Using the Mika Zibanejad example from the WOWY Analysis section below, you can clearly see Mika’s impact on his teammates. Mika Zibanejad is no doubt a high quality player for the Rangers, and the chart shows that nearly every single player he shares considerable ice time with benefits from his presence on the ice. Because of these measurable impacts, we can conclude that teammate quality is certainly important, and should be considered when discussing player analysis and deployment.
Quality of Competition – There is considerably more debate within the analytics community regarding the impact that quality of competition has compared to quality of teammates. The Connor Tompkins article I shared above offers a second conclusion, with Tompkins stating that he “did not find evidence that coaches can choose the quality of competition their players face over a full season of play.” He goes on to discuss numerous reasons for why he did not find any evidence of this, most notably sample size issues.
He also explicitly states that just because the data shows no evidence of the impact doesn’t mean that quality of competition doesn’t have an impact, and he went on to share that Garret Hohl (who is the co-founder of both Hockey-Graphs and a data company that professional hockey teams pay for information) has research showing that quality of competition has a greater impact on an individual game or playoff series than across an entire season.
There have been numerous other studies on the topic, but long story short, while many agree that quality of competition does matter, there is little consensus over exactly how it matters and how to measure its impact. Various sites, including Corsica, provide statistics that are weighted to demonstrate the quality of competition a skater plays against, and the quality of teammates he plays with, which I will discuss in greater detail within their dedicated sections below.
One thing that is important to note is that most of these models use an opposing player’s ice time as the weight used to account for quality. By this I mean that a metric such as Corsi Quality of Competition uses the time on ice the opponents received as the weight, with the methodology being that better players receive more ice time. All Ranger fans know the relatively limited ice time received by Pavel Buchnevich to this point in his career is a perfect example that this is not necessarily the best barometer of the quality of a player. However, it is still a step in the right direction, and more often than not, the best players on a team receive the most ice time.
Now that we’ve defined all of the important components behind deployment, let’s look at a test case from the 2016-2017 season to see how deployment can influence a player’s statistics: Brady Skjei. During the 2016-2017 season, Skjei was most certainly sheltered by head coach Alain Vigneault, which is a common tactic used by coaches looking to develop young defensemen.
Of the seven NYR defensemen that logged at least 500 minutes last year, Skjei comfortably had the largest offensive zone start rate at 38.37%. Skjei also had the lowest defensive zone start rate at only 27.6%. For context, Kevin Klein had the second highest offensive zone start rate at 33.77% and the second lowest defensive zone start rate at 30.09%. Further, when you look at the time on ice numbers weighted for quality of competition, Skjei received the easiest minutes of any NYR defender with respect to the average competition faced.
We all know the point production Skjei accumulated last year (5 goals and 33 assists). He also finished the season first amongst defensemen with at least 500 minutes in 5v5 Corsi for % (50.38%) and second only to Ryan McDonagh in 5v5 expected goals for % (49.47%). While I believe Brady Skjei is an excellent player and will progress to be a top pairing defenseman, I think it is important to recognize that some of Skjei’s impressive production and analytics from the 2016-2017 season were, in part, due to the fact that he was clearly sheltered by AV. When you start an abnormally high percentage of shifts in the offensive zone, and primarily play against weaker opposition, it’s easier to put up more impressive figures compared to a player with comparable ability in a less sheltered role.
Player Usage Charts – Rob Vollman recently added a dynamic Player Usage Charts tool to his website, which is an excellent resource for those looking to understand how teams deploy their players and how well the players perform in these roles. Individuals can customize the parameters listed on the left (season, games played, TOI, position and team) in order to view the exact information they are looking for.
The chart on the right displays the results and contains five critical elements for interpreting it. The x-axis charts the offensive deployment of each player; the further to the right the player appears, the higher their offensive zone start percentage is. The y-axis charts the quality of competition the player faces on average; the higher a player appears, the stronger the competition he typically faces is. In this tool, the benchmark used to assess quality of competition is the metric relative Corsi quality of competition, which is the weighted average relative Corsi for percentage of the opponents that an individual faces over a specified period of time. In layman’s terms, if a player is high on the chart, it means they often are on the ice against opponents who have a large positive impact on their team’s ability to control the shot attempt battle.
You will notice, in addition to the axis placement of players, that each player has a circle of varying size and color. The size of the circle represents the time on ice per-game that each player receives; the larger the circle, the more ice time the player receives. The color designates the player’s relative Corsi for percentage, with dark blue indicating a strong relative Corsi for percentage, and dark red indicating a very poor relative Corsi for %. In layman’s terms, if a player’s circle is dark blue, that means he has a strong positive impact on the share of shot attempts his team takes when he is on the ice, compared to when he is off, and dark red means the player has a strong negative impact.
The last element of the chart is the set of little data flyout windows that appear if you hover over a particular player, which contain the exact data applicable to the chart. Each data flyout contains the following information: games played, total minutes, minutes per-game, quality of competition relative Corsi for %, offensive zone start %, Corsi for %, Corsi for per-60, Corsi against per-60, goals for %, goals for per-60 and goals against per-60. All of the Corsi and goals data is in relative terms (which compares how the team does when the player is on vs. off the ice), and it is all also listed in a sortable table beneath the chart.
Lastly, each flyout also lists the player’s role based off of the data. So, for example, Tony DeAngelo is listed as an “effective sheltered defenseman” at the time of writing this (February 6, 2018). This is because he has a very high offensive zone start rate (indicated by his placement on the far right side of the graph) and low average quality of competition faced (indicated by his placement low on the graph), which combined mean that DeAngelo is “sheltered” by head coach Alain Vigneault. Sheltering young defensemen is a common tactic to aid development, and it is something we saw from the Rangers last year with Brady Skjei. DeAngelo also has a deep blue but small circle, indicating he has a strong relative Corsi for %, but minimal ice time. So, when you combine all of this, you come to the conclusion that DeAngelo has performed well in a sheltered role, AKA an “effective sheltered defenseman.”
PREDICTIVE ABILITY
One thing that I will reference throughout this glossary is the fact that certain stats, particularly shot quantity and quality-related metrics, have greater predictive ability than goals. Noted hockey statistician Rob Vollman illustrated in his book, Hockey Abstract 2017, that goal differential is 90% correlated with same-season win percentage, but only 25% correlated with next-season win percentage. In other words, goal differential tells us a lot about what has happened this season, which makes sense given that the team that scores more goals wins the game, but has little predictive power for the future. There is a litany of reasons for this, many of which are outlined in Rob’s book, but much of it comes down to the simple fact that hockey is a sport where luck plays a significant role, and goals can be very fluky. I know, that’s not the most analytical answer in the world, but it’s simply an objective fact.
However, certain advanced stats, particularly expected goals, have been shown to have stronger predictive ability than goal differential. If you don’t believe me, check out this article on Hockey-Graphs, where Dawson Sprigings (better known as DTMAboutHeart on Twitter) provides all of the data supporting this claim. Because of the sort of information discussed in Dawson’s article, advanced stats have really caught on. At the end of the day, the reason why advanced stats are so important is not because of their ability to tell us what has happened, or what is currently happening. The true value of advanced stats comes in their ability to tell us what is likely to happen in the future, aka their predictive ability.
PRIMARY POINT PRODUCTION
The concept of primary points is simple. A player is awarded a primary point if he either scores a goal or receives the primary assist. The primary assist is awarded to the player of the same team who last touched the puck before the goal scorer. The secondary assist goes to the player of the same team who touched the puck before the primary assister. Another way to look at this is that primary points are total points (the standard statistic all fans know and love) minus secondary assists.
The reason for using primary point production for player analysis instead of total points is that more often than not, the primary assist served a larger role in the goal being scored than the secondary assist. Further, multiple leading analysts have proven beyond a shadow of a doubt that primary assists are far more indicative of a player’s talent level and that they are also more repeatable. In layman’s terms, if a player racks up a high primary assist total, it is far more likely that this is due to his abilities and more likely to occur again in the future, compared to a player that only racks up a lofty secondary assist total.
When analyzing players, using primary assists instead of total assists is a good way to understand which players truly possess high levels of playmaking ability. Further, you can use secondary assists as a way to identify a player whose point total may be bloated compared to his actual talent level. If you are a fantasy hockey player, you can use this to your advantage by selling high on a guy that has padded his assist totals with secondary assists, as these are far less repeatable across the remainder of the season than primary assists.
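To make the bookkeeping concrete, here is a small Python sketch of the primary point calculation. The class, its field names and both stat lines are hypothetical, invented only to show how two players with identical point totals can look very different once secondary assists are stripped out.

```python
from dataclasses import dataclass


@dataclass
class ScoringLine:
    goals: int
    primary_assists: int
    secondary_assists: int

    @property
    def total_points(self) -> int:
        return self.goals + self.primary_assists + self.secondary_assists

    @property
    def primary_points(self) -> int:
        # Same as total points minus secondary assists.
        return self.goals + self.primary_assists

    @property
    def secondary_share(self) -> float:
        """Fraction of total points that came from secondary assists."""
        if self.total_points == 0:
            return 0.0
        return self.secondary_assists / self.total_points


# Hypothetical stat lines: identical point totals, different makeup.
driver = ScoringLine(goals=10, primary_assists=25, secondary_assists=5)
passenger = ScoringLine(goals=10, primary_assists=10, secondary_assists=20)

for name, line in [("driver", driver), ("passenger", passenger)]:
    print(name, line.total_points, line.primary_points, f"{line.secondary_share:.0%}")
```

Both players finish with 40 points, but half of the second player’s total comes from secondary assists, which is the kind of profile to be skeptical of going forward.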
RATE STATISTICS
Rate statistics, also referred to as per-60 minute or per-hour statistics, are a player’s or team’s statistics per-60 minutes of play. Most of the data-driven analysts in both the basketball and hockey worlds agree that rate statistics provide a more accurate depiction of a player’s talent than raw stat totals. In addition to the obvious advantage of adjusting for games missed due to injuries, a prominent reason for using rate statistics is that they help account for the effects of a coach who makes terrible deployment decisions. While cumulative stats are easier to digest for most people, they can severely over or underrate players’ true values due to them getting more or less ice time than deserved.
It is important to note that despite being per-60 minute stats and there being 60 minutes in a regulation hockey game, rate stats are not per-game stats. This is for multiple reasons. First, when analyzing certain metrics such as Corsi, you are often looking at 5v5 data, and it is rare for a single game to have 60 minutes of 5v5 play. Second, overtime games obviously run longer than 60 minutes. The gap is even larger at the individual level, as it takes over 2 games for even the most heavily used skaters to accumulate 60 minutes of ice time, and over 3 games for most skaters. To repeat and make it perfectly clear, rate stats ARE NOT a per-game statistic at either the team or skater level.
One thing to note when discussing rate stats is that sample size is critical. A player who is called up for one game and puts up a nice performance in only a few minutes of ice time will have tremendous rate stats, so it is important to eliminate players who haven’t garnered enough ice time. It is also imperative to understand player deployment (discussed above) when discussing rate statistics. A player’s rate statistics can quickly become inflated if they are sheltered by the coach, and therefore often play in the offensive zone and/or against weak opposition.
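A quick way to convince yourself of this is to work out how many games it takes a skater to log 60 minutes. The Python sketch below does just that; the function name and the ice time figures are hypothetical round numbers, not real player data.

```python
def games_to_reach_60(toi_per_game_minutes: float) -> float:
    """Number of games needed to accumulate 60 minutes of ice time."""
    if toi_per_game_minutes <= 0:
        raise ValueError("time on ice per game must be positive")
    return 60.0 / toi_per_game_minutes


# Hypothetical average ice times, chosen only for illustration.
for label, toi in [("workhorse defenseman", 25.0), ("typical forward", 16.0)]:
    print(f"{label}: {games_to_reach_60(toi):.2f} games")
```

Even a skater averaging 25 minutes a night needs more than two full games to reach 60 minutes, so his per-60 numbers describe well over two games’ worth of play.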
Pavel Buchnevich serves as the most obvious example on the Rangers for the value of rate statistics. One-third of the way through the 2017-2018 season, Buchnevich had 2.48 primary points per-60 at all strengths, good for the most on the team and 25th in the NHL. However, due to his relatively low usage to that point in the season, he ranked just 62nd in the NHL with just 16 primary points on the season, which was still tied with Zibanejad for the team lead.
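For anyone who wants to calculate rate stats from raw totals, here is a minimal Python sketch of the per-60 conversion with a minimum ice time filter. The function names, the 200-minute cutoff and all three stat lines are assumptions made for the example, not values used by Corsica or any other site.

```python
def per_60(stat_total: float, toi_minutes: float) -> float:
    """Convert a raw total into a per-60-minute rate."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return stat_total / toi_minutes * 60.0


def qualified_rates(players, min_toi_minutes=200.0):
    """Return {name: rate} for players who meet the ice-time cutoff.

    `players` is a list of (name, stat_total, toi_minutes) tuples. The
    200-minute default cutoff is an arbitrary choice for this sketch.
    """
    return {
        name: per_60(total, toi)
        for name, total, toi in players
        if toi >= min_toi_minutes
    }


# Hypothetical primary-point totals and ice time, not real player data.
players = [
    ("heavy-minute regular", 20, 600.0),
    ("lightly used winger", 16, 400.0),
    ("one-game call-up", 1, 8.0),
]
rates = qualified_rates(players)
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.2f} per 60")
```

The lightly used winger trails in raw totals but leads in rate, while the call-up (whose single point in eight minutes would otherwise top the list) is dropped by the ice time filter.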
SCORE ADJUSTED STATISTICS
Score adjusted statistics are those that are weighted in accordance with the game score in order to account for the fact that teams play differently in various game states (e.g. when winning or losing). The reason for this is that, when a team is trailing, they are likely pressing to try to tie it up, and vice versa. To account for this, a formula is applied to weight shot attempts in accordance with the game state; a higher weight is applied to shot attempts taken by the team with the lead and vice versa, and different weights are applied based on how many goals a team is leading or trailing by.
Score adjusting statistics, particularly shot-based metrics, have been proven to better represent how a team/player performed. For example, the Rangers are a team that, when looking at data for a specific game, have a large difference between regular and score adjusted Corsi in games in which they have a lead. The Rangers are notorious for “turtling” (becoming extremely conservative in their play, collapsing into their own zone and trying desperately to prevent an opposing goal) late in games when they have the lead, and because of this, they get peppered by shot attempts by the opponent. Adjusting the shot data to account for the impact of the game score helps account for the fact that the Rangers play the NHL-equivalent of the prevent defense, so the stats don’t reflect quite as poorly upon the players.
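The general idea can be sketched in a few lines of Python. To be clear, the weights below are placeholder values I invented to show the mechanics; they are not the weights used by Corsica or any published score adjustment model, and the function names are likewise hypothetical.

```python
# Placeholder weights keyed by the shooting team's lead (negative = trailing).
# These numbers are invented for illustration, not taken from a real model.
HYPOTHETICAL_WEIGHTS = {-2: 0.88, -1: 0.94, 0: 1.00, 1: 1.06, 2: 1.12}


def weight_for(shooter_lead: int) -> float:
    """Weight for one shot attempt, given the shooting team's lead."""
    clamped = max(-2, min(2, shooter_lead))  # treat 3+ goal leads like 2
    return HYPOTHETICAL_WEIGHTS[clamped]


def raw_cf_pct(attempts) -> float:
    """Unadjusted Corsi for %; attempts is a list of (is_for, team_lead)."""
    if not attempts:
        raise ValueError("no shot attempts recorded")
    cf = sum(1 for is_for, _ in attempts if is_for)
    return 100.0 * cf / len(attempts)


def score_adjusted_cf_pct(attempts) -> float:
    """Corsi for % with each attempt weighted by the score state."""
    if not attempts:
        raise ValueError("no shot attempts recorded")
    adj_for = adj_against = 0.0
    for is_for, team_lead in attempts:
        if is_for:
            adj_for += weight_for(team_lead)
        else:
            # The opponent's lead is the negative of our team's lead.
            adj_against += weight_for(-team_lead)
    return 100.0 * adj_for / (adj_for + adj_against)


# Hypothetical stretch protecting a one-goal lead: 10 attempts for, 20 against.
attempts = [(True, 1)] * 10 + [(False, 1)] * 20
print(f"raw CF%: {raw_cf_pct(attempts):.2f}")
print(f"score adjusted CF%: {score_adjusted_cf_pct(attempts):.2f}")
```

Because the leading team’s attempts count for more and the trailing team’s count for less, the adjusted figure comes out higher than the raw one, which is the “turtling” correction described above.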
SHOT QUALITY
This is a concept that all fans are already aware of, the simple fact that some shots are of “higher quality,” or more likely to lead to a goal, than others. Shot quality, however, is of particular importance to a number of the key advanced analytics currently permeating hockey discussions. From ex-Rangers goalie and current MSG studio analyst Steve Valiquette, to hockey statisticians such as Dawson Sprigings (DTMAboutHeart) and Emmanuel Perry (proprietor of Corsica), shot quality plays a vital role in analysis, particularly when it comes to discussing expected goals and adjusted save percentage (both of which are discussed in full in their own dedicated sections later).
Before we get ahead of ourselves and discuss specific stats that incorporate shot quality, we must first understand how shot quality serves as the foundation for these stats. A few different analysts have their own models for stats that incorporate shot quality analysis, but at the core of all of them, they assign weightings to shot attempts based on the quality of the shot. Shot quality is determined by numerous things, including distance from the net, angle of the shot, type of shot (slap shot, one-timer, wrist shot, rebound etc.), whether the shot was on a breakaway and more.
Typically, for simplicity’s sake when sharing shot quality data with the public, analysts and models organize shots across three buckets: high danger, medium danger and low danger shots. Different models have different specifications for what qualifies for each shot classification, and it is not a hard and fast science (and all of the analysts will tell you that themselves). The model I use most frequently is Corsica’s, which has the following definitions for each shot quality bucket:
- Low Danger – Shots with a Fenwick shooting percentage (shooting percentage on all unblocked shot attempts) of less than 3%.
- Medium Danger – Shots with a Fenwick shooting percentage of equal to or greater than 3% and less than 9%.
- High Danger – Shots with a Fenwick shooting percentage of 9% or greater.
We will discuss Fenwick in greater detail in its own section, but it should be quickly noted that the reason we use Fenwick shooting percentage instead of standard shooting percentage is because Fenwick shooting percentage serves as a far more accurate representation of a player’s actual shooting percentage. This is because standard shooting percentage only accounts for shots on goal and ignores all of the times the player shot the puck and missed the net.
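The bucket definitions listed above translate directly into code. The Python sketch below applies the 3% and 9% thresholds to a Fenwick shooting percentage; the function names and the goal and shot counts are hypothetical, and how a real model arrives at the percentage for a given shot is outside the scope of this example.

```python
def fenwick_shooting_pct(goals: int, unblocked_attempts: int) -> float:
    """Goals as a percentage of all unblocked shot attempts."""
    if unblocked_attempts <= 0:
        raise ValueError("need at least one unblocked shot attempt")
    return 100.0 * goals / unblocked_attempts


def danger_bucket(fenwick_sh_pct: float) -> str:
    """Classify a shot type using the thresholds quoted in the text."""
    if fenwick_sh_pct < 3.0:
        return "low"
    if fenwick_sh_pct < 9.0:
        return "medium"
    return "high"


# Hypothetical goal and unblocked-attempt counts for three shot locations.
for label, goals, attempts in [
    ("point shot", 2, 100),
    ("top of the circle", 5, 100),
    ("slot", 12, 100),
]:
    pct = fenwick_shooting_pct(goals, attempts)
    print(f"{label}: {pct:.1f}% -> {danger_bucket(pct)} danger")
```

Note that the boundaries follow the definitions exactly: a shot type at exactly 3% is medium danger, and one at exactly 9% is high danger.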
The image below is from the Corsica article discussing Manny Perry’s expected goals model, and it shows the general areas of the ice where each shot quality type comes from.
TEAM RELATIVE STATISTICS
Team relative statistics, which the vast majority of the time are what people are referring to when they mention “relative statistics,” illustrate the difference between a team’s performance when a certain player is on the ice compared to when he is off. A crude example of this is if the Rangers post a Corsi total of +5 for a game while Rick Nash is on the ice, and log a Corsi of -5 throughout the entire game while Nash is off the ice, then Nash’s relative Corsi would be a +10. A relative statistic does not illustrate how the player did in a vacuum, per se, but instead illustrates how the player did relative to his teammates.
Relative statistics help demonstrate a player’s impact on their team, and somewhat mitigate the impact that the team has on the player. A quick example of this is that a generally poor possession team (such as the Rangers) will generally have a large number of players below 50% in Corsi For Percentage. However, when you look at the relative Corsi figures, the split between players in the positives and players in the negatives will be much more even.
For example, of the 22 Rangers skaters who amassed at least 400 minutes of ice time throughout the 2016-2017 season (some across multiple teams), only six finished with a CF% above 50%: Adam Clendening, Chris Kreider, Mats Zuccarello, Derek Stepan, Brady Skjei and Pavel Buchnevich. However, using relative statistics removes the impact of a poor overall possession team on a player’s numbers, and instead displays the numbers in terms of how much better the team was with a certain player on the ice, compared to when he was off the ice. It isn’t perfect, as players who play the majority of their minutes with the same players (such as Girardi being anchored to McDonagh for most of the 2016-2017 season) will still be impacted by those teammates, but it is a significant step in the right direction towards isolating a player’s impact on his team.
Chris Kreider, for example, posted a CF% of 53.75% throughout the 2016-2017 season in 5v5 play, an admirable total. However, his relative CF% was 7.81, meaning that over the course of the season the team’s share of shot attempts was 7.81 percentage points higher while Kreider was on the ice than when he was off.
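Both flavors of team relative Corsi (the raw differential and the percentage version) boil down to “on-ice minus off-ice.” Here is a short Python sketch; the function names are my own, the first call reuses the crude +5 and -5 example from this section, and the shot attempt counts in the second are hypothetical.

```python
def corsi_for_pct(attempts_for: int, attempts_against: int) -> float:
    """Share of all shot attempts taken by the player's team, as a percent."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


def relative_corsi(on_ice_diff: int, off_ice_diff: int) -> int:
    """Team shot attempt differential with the player on vs. off the ice."""
    return on_ice_diff - off_ice_diff


def relative_cf_pct(on_for, on_against, off_for, off_against) -> float:
    """On-ice CF% minus off-ice CF%, in percentage points."""
    return corsi_for_pct(on_for, on_against) - corsi_for_pct(off_for, off_against)


# The crude single-game example from the text: +5 on the ice, -5 off it.
print(relative_corsi(5, -5))

# Hypothetical season totals: 54% of attempts on the ice, 46% off it.
print(f"{relative_cf_pct(540, 460, 920, 1080):+.2f}")
```

The percentage version is expressed in percentage points, which is why a player on a sub-50% team can still post a healthy positive relative number.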
TEAMMATE RELATIVE STATISTICS
Teammate relative statistics (abbreviated Rel TM) go a step beyond team relative metrics and attempt to further isolate a player’s performance by benchmarking his numbers against all of his individual teammates, instead of against his entire team in aggregate. Teammate relative statistics accomplish this by combining principles used in calculating team relative statistics and WOWY analysis (discussed in the next section). Teammate relative statistics were initially made publicly available by David Johnson, who ran the popular hockeyanalysis.com site, before he was hired by the Calgary Flames in the summer of 2017.
The key to calculating relative teammate statistics is including the player’s on-ice performance as well as the average of all of his individual teammates’ on-ice performance when the player we are analyzing is not on the ice. So, for example, if we are discussing Corsi for per-60 (CF/60), the calculation would take the total on-ice CF/60 of the player we are analyzing, and then subtract the average of all his individual teammates’ on-ice CF/60s without the player on the ice. Another key point to the calculation is that the teammates’ portion of the calculation is weighted by individual teammate time on ice percentage (TOI%) with the player being analyzed. Each individual teammate is assigned a weight relative to their TOI% with the player being analyzed in order to properly account for how much of a potential impact that teammate may have on the player.
Noted hockey statisticians and Hockey-Graphs contributors The Solberg Twins (Luke and Josh, better known as EvolvingWild on Twitter) recently wrote a fantastic two-part article series for Hockey-Graphs that discusses the various relative shot metrics (team, teammate and WOWY) and highlights the pros and cons of each of the metrics. The piece also includes a highly detailed explanation of their relative teammate calculation, and it breaks down all of the initial components to highlight exactly how the stats are formulated.
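As a rough illustration of the calculation just described, here is a Python sketch of an unadjusted Rel TM figure. It is my own simplified reading of the description in this section, not the code behind any published model, and it does not include any further adjustments; the function name, the CF/60 values and the TOI shares are all hypothetical.

```python
def rel_tm(player_on_ice_rate: float, teammates) -> float:
    """Unadjusted teammate relative rate.

    `teammates` is a list of (toi_share_with_player, rate_without_player)
    tuples, one per teammate. Each teammate's rate away from the player is
    weighted by the share of the player's ice time spent with that teammate.
    """
    total_weight = sum(weight for weight, _ in teammates)
    if total_weight <= 0:
        raise ValueError("teammate weights must sum to a positive number")
    weighted_avg = sum(weight * rate for weight, rate in teammates) / total_weight
    return player_on_ice_rate - weighted_avg


# Hypothetical CF/60 numbers: the player's on-ice rate is 60.0, and his three
# teammates post 52.0, 56.0 and 58.0 when they are on the ice without him.
teammates = [(0.50, 52.0), (0.30, 56.0), (0.20, 58.0)]
print(f"Rel TM CF/60: {rel_tm(60.0, teammates):+.2f}")
```

The teammate he plays with most (half of his ice time) pulls the benchmark toward 52.0, so the weighted average lands at 54.4 rather than the simple average of about 55.3, which is exactly what the TOI% weighting is meant to do.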
In the piece, the Solbergs note that the biggest issue with both team relative stats and standard teammate relative stats (such as the “RelT” data available on Corsica), simply put, is that it is difficult to isolate the performance of a player who is often deployed with the same teammate(s).
For example, for much of Ryan McDonagh’s tenure with the Rangers, he was paired with Dan Girardi; thus their team relative metrics are both greatly impacted by the fact that they are so frequently on the ice together. In very large sample sizes, this issue isn’t as problematic, but in small sample sizes (which includes a full season), this can be a huge issue when evaluating a player’s performance.
An additional issue pointed out by the twins is that team strength has an impact on both forms of relative statistics. They note that, “players on the worst teams appear better and players on the best teams appear worse relative to the league.”
In order to account for the impact that both of these issues have on relative teammate statistics, the Solbergs created adjustments that they discuss at length in their article. This isn’t the place to get into the nitty gritty calculations of how specifically they made the adjustments; if you wish to know the mathematical specifics, please check out their Hockey-Graphs article. The important thing to note, however, is that they explicitly state that they believe the adjustments “do a very good job dealing with the innate problems the Rel TM method poses,” and they provide ample evidence to back up this claim.
The point of my explaining all of the calculations, weightings and adjustments that are involved in their relative teammate statistical model is to lay the foundation for this claim, which they made in the conclusion of their piece: “for every type of long-term player evaluation, I feel the adjusted Rel TM method is vastly superior to the Rel Team method.” They also state that, “In general, I feel the Rel TM method – when adjusted for its inherent issues – is one of the best single-number “pen and paper” methods we have at our disposal for player evaluation.”
In other words, relative teammate statistics serve as a more reliable way to isolate and analyze single player performance than WOWY analysis or relative team analysis. Now, relative team and WOWY analysis certainly still are valuable evaluation techniques in their own right, but the new adjusted relative teammate versions serve as better long-term analysis techniques for individual players.
One good example of how we can use the Solberg twins’ adjusted Rel TM model is by comparing J.T. Miller to Vladislav Namestnikov, both of whom were involved in a blockbuster deal between the Rangers and the Lightning at the 2018 trade deadline. A point that many fans brought up with regard to comparing the two players was the fact that Namestnikov may have been a beneficiary of playing on a better team (Tampa Bay) and on a far superior line to any that J.T. Miller ever played on. During the 2017-2018 season, Namestnikov’s most common linemates were MVP candidates Steven Stamkos and Nikita Kucherov, which obviously had a positive impact on Namestnikov’s point production. Conversely, J.T. Miller had a rotating cast of linemates throughout the season, with his most common partners being Mats Zuccarello and Michael Grabner; not bad, but a far cry from Kucherov and Stamkos.
We can use Luke’s data, which he made publicly available here, to gain a better understanding of the impact that J.T. Miller and Vladislav Namestnikov had relative to their respective teammates. Of all the data shared in the Google Drive doc linked above, what stands out the most to me is what Luke refers to as “relative teammate total impact” statistics. I urge you to read Luke’s Hockey-Graphs piece for a full breakdown, but in layman’s terms, the statistics attempt to measure a player’s total impact on his team, relative to his teammates, for a respective metric (such as Corsi or expected goals).
The impact statistics take into account all 5v5 data and include the adjustments we discussed above. They work similarly to any differential number (plus-minus style), which is important to note because players with more ice time can have a greater variance in their numbers. As much as we all like to isolate per-60-minute production, total impact statistics are also vitally important, because a player who can produce at a high level while receiving high usage is obviously more valuable and has a larger team impact than a player who produces at a high level but receives minimal ice time. When I reached out to Luke on Twitter to confirm I was interpreting these statistics correctly, he added, “I would probably describe that [an impact stat] as an *estimated* net total contribution.” So, for example, if a player has an adjusted relative teammate expected goal total impact of 15, that means that his estimated net total contribution to the team’s expected goals total is +15 over the course of the season.
Now, let’s finally look at the data for Miller and Namestnikov, shall we? As of the most recent update Luke made to his data (March 2, 2018), Vladislav Namestnikov has an adjusted expected goal total impact of 0.9, while J.T. Miller sits at 0.4. As a reminder, these total impact statistics are differentials (plus/minus, with 0 equating to no impact positive or negative), so these numbers by Namestnikov and Miller indicate that both have had a slightly positive impact on their teams’ expected goal differential while they are on the ice, relative to their teammates. In terms of the NHL ranks in the statistic, Namestnikov is 275th while Miller is 299th out of 600. For context, Dougie Hamilton leads the NHL at 16.1, and Brooks Orpik is the worst in the NHL at -13.4. It is worth noting that this data uses Luke’s proprietary expected goal model, and not the Corsica version (which is Manny Perry’s model), which is likely the most common expected goal model you see referenced.
The Corsi impact numbers more strongly favor Namestnikov to Miller, as Namestnikov has an adjusted Corsi total impact of 43.8, while J.T. Miller sports a -33.9. Using this data, we can conclude that Namestnikov has had a positive impact on his team’s shot attempt share rate while he is on the ice, relative to his teammates, while J.T. Miller has had a negative impact relative to his teammates. Namestnikov ranks 204th out of 600 in terms of Corsi total impact, while Miller is 425th, so roughly one-third of the qualified players fall between the two. For context, Dougie Hamilton also leads the NHL in this statistic with a 337.2, while Justin Braun brings up the rear at -348.4.
Finally, we get to the conclusion: what does all of this actually mean, and how does it help us with the J.T. Miller versus Vladislav Namestnikov debate that was prevalent on social media after the trade? Using this relative to teammate data, we can reasonably state that even when one accounts for the fact that Namestnikov was playing on a far superior team with far superior linemates than J.T. Miller, he still impacted the game in a positive way more than J.T. Miller did, at least in terms of expected goal and shot attempt differentials.
WOWY ANALYSIS
WOWY (With or Without You) analysis attempts to help us understand how specific players impact one another on the ice. In essence, WOWY analysis looks at pairs of players and examines how they perform together versus apart. A common misconception, and one that I fell victim to when I first started examining advanced stats, is that WOWY analysis and relative statistics are the same thing. The key difference is that WOWY analysis examines the impact that one specific player has on another, while relative statistics attempt to illuminate how the entire team does with a specific player on the ice compared to when he is off the ice.
The primary reason why WOWY analysis is important is because it helps us understand how a player performs with or without another player. This sort of analysis helps us to quantify some (true) clichés that you hear bandied about the hockey world such as “good players make others better” and that two players have “chemistry” with one another. With WOWY analysis, we can dig into the data and see that player A in fact always plays better when he is paired with player B. We can also look at how a player impacts each individual player on a team, and if the player usually raises the play of his teammates, you can pretty safely conclude he in fact is an effective player.
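As a rough illustration of the arithmetic behind a WOWY comparison, the sketch below computes the share of shot attempts a team controls when two players are together and when each is on the ice without the other. The `ShotTotals` class, the `wowy` function and every number in it are hypothetical; this is not how HockeyViz or any other site necessarily builds its charts.

```python
from dataclasses import dataclass


@dataclass
class ShotTotals:
    """Shot attempts for and against over some stretch of ice time."""
    attempts_for: int
    attempts_against: int

    def share(self) -> float:
        """Percentage of all shot attempts that belonged to the player's team."""
        total = self.attempts_for + self.attempts_against
        if total == 0:
            raise ValueError("no shot attempts recorded")
        return 100.0 * self.attempts_for / total


def wowy(together: ShotTotals, a_without_b: ShotTotals, b_without_a: ShotTotals) -> dict:
    """Return the shot attempt share for each of the three WOWY situations."""
    return {
        "together": together.share(),
        "A without B": a_without_b.share(),
        "B without A": b_without_a.share(),
    }


# Made-up totals for two hypothetical players, A and B.
result = wowy(
    together=ShotTotals(120, 100),
    a_without_b=ShotTotals(90, 110),
    b_without_a=ShotTotals(150, 140),
)
for situation, pct in result.items():
    print(f"{situation}: {pct:.1f}%")
```

In this invented case, player A drops from roughly 54.5% with B to 45.0% without him, while B stays above 50% on his own, which is the pattern you would expect if B is the one driving play.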
HockeyViz is an excellent resource brought to us by Micah Blake McCurdy that, among its many capabilities, offers graphical depictions of WOWY analysis. The graph below represents how every player on the Rangers performs with and without Mika Zibanejad in terms of score-adjusted Corsi (minimum 32 minutes played together). The y-axis depicts shot attempts against, while the x-axis shows shot attempts for. The further up you look on the graph, the fewer shot attempts against occur while the pair is on the ice, and the further to the right you go, the more shot attempts for. In other words, top-right = good, bottom-right = shots for everyone, bottom-left = bad and top-left = shot attempts for nobody.
The black boxes represent how the player whose jersey number is listed within the box performs WITH Mika, the red boxes depict the player WITHOUT Mika, and the blue boxes illustrate how Mika does without the player. The legend to the right lists the amount of time the player has played with and without Mika, and how much time Mika has played without the player.
As you can see, nearly all of the black boxes, which represent how the player does with Mika, are towards the top-right portion of the graph, relative to the red boxes. In translation, the vast majority of players on the Rangers perform better in terms of shot differential WITH Mika Zibanejad on the ice with them, compared to without Mika. All of the blue boxes are near the top-right as well, showing that Mika does just fine without most of these players. The translation to all of this is that, with a few small exceptions, nearly every player on the Rangers performs better when they are on the ice with Mika Zibanejad. In other words, Mika Zibanejad is very good at doing that hockey.
Statistics
Corsi
Corsi is simply a fancy word for shot attempts. By shot attempts, we mean all shots directed by a team at a net, including those that miss the net and those that are blocked by the opponent. The reason it is called Corsi, and not just shot attempts, is because goaltending coach Jim Corsi was the first to start tracking it, and he felt the stat served as a much better measure for how much work his goalies have to put in each night than standard shots on goal, because a goalie has to react to all opponent shot attempts, regardless of whether they barely miss the net or get blocked.
The primary reason why Corsi is so highly regarded within the analytics community is because it has been proven beyond any reasonable doubt to have more predictive power than standard shots and especially goals. This is because shot attempts are more repeatable, and in a sport such as hockey where all goals are precious, the ability to consistently generate chances will drive scoring far more than anything else.
Unfortunately, certain individuals within the hockey universe treated Corsi as a catch-all statistic for a stretch of time. For those unfamiliar with catch-all statistics such as WAR in baseball (or hockey, which I will get to later), they are single metrics based on complex formulas that attempt to encapsulate the overall value of a player, and convey the value against that of an average “replacement player.” Let me make this clear: CORSI IS NOT MEANT TO ILLUSTRATE THE OVERALL VALUE OF A PLAYER.
Corsi comes in many forms, and to understand how to use Corsi in player or team analysis, it is important to learn all of its versions.
CORSI FOR (CF)
The amount of shot attempts a team takes, including shots on goal, shots that miss the net (or hit the post) and blocked shots. Corsi in this context can be applied to illustrate how a whole team does, or the impact a specific player has on the team’s ability to generate shot attempts. Specifically, you can use CF to simply state how many shot attempts a team had over a specified timeframe (period, game, season etc.), or you can use it to convey how many shot attempts a team takes while a specific player is on the ice.
For example, during the Rangers game against the New Jersey Devils on December 9, 2017, the Rangers as a team amassed a Corsi for during 5v5 play of 46. This means that when both teams had 5 skaters on the ice, the Rangers attempted 46 shots, which include shots on goal, shots that missed the net and shots that were blocked by the Devils. At an individual level, the team generated the most 5v5 shot attempts while Brendan Smith was on the ice, with 19 shot attempts. This does not mean Brendan Smith himself took 19 shot attempts during the game, it means the entire New York Rangers team took 19 shot attempts during his 17.8 minutes of 5v5 ice time.
CORSI AGAINST (CA)
The amount of shot attempts a team allows, including shots on goal, missed shots (including those that hit the post) and blocked shots. Similar to Corsi for, Corsi against can be used to show how many shot attempts an entire team allows over a specified period, or it can be used to illustrate the impact a specific player has on the shot attempts allowed by a team.
INDIVIDUAL CORSI FOR (ICF)
The amount of shot attempts an individual player takes himself. Using the same example from the Corsi for definition, while the Rangers logged 19 total shot attempts during 5v5 play while Brendan Smith was on the ice, Brendan Smith personally took only 2 shot attempts, giving him an iCF of 2 for the game. There is no individual Corsi against measurement, as it is extremely difficult in many cases to accurately state whether a specific shot attempt was given up by a specific player.
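To make the difference between CF, CA and iCF concrete, here is a minimal sketch that counts each of them from a list of shot attempts. The `ShotAttempt` record, the function names, and the teams and players in the sample log are all made up for illustration; real play-by-play data from a stats site will be structured differently.

```python
from typing import NamedTuple, Optional


class ShotAttempt(NamedTuple):
    team: str           # team that took the attempt
    shooter: str        # player who took the attempt
    on_ice: frozenset   # every skater (both teams) on the ice at the time


def corsi_for(attempts, team: str, player: Optional[str] = None) -> int:
    """Attempts taken by `team`; if `player` is given, only those with him on the ice."""
    return sum(
        1 for a in attempts
        if a.team == team and (player is None or player in a.on_ice)
    )


def corsi_against(attempts, team: str, player: Optional[str] = None) -> int:
    """Attempts taken by the opponent of `team`, with the same optional on-ice filter."""
    return sum(
        1 for a in attempts
        if a.team != team and (player is None or player in a.on_ice)
    )


def individual_corsi_for(attempts, player: str) -> int:
    """Attempts the player took himself, regardless of who else was on the ice."""
    return sum(1 for a in attempts if a.shooter == player)


# Invented 5v5 log: HOME skaters are H1-H3, the AWAY skater is A1.
log = [
    ShotAttempt("HOME", "H1", frozenset({"H1", "H2", "A1"})),
    ShotAttempt("HOME", "H2", frozenset({"H1", "H2", "A1"})),
    ShotAttempt("AWAY", "A1", frozenset({"H2", "A1"})),
    ShotAttempt("HOME", "H3", frozenset({"H3", "A1"})),
]

print(corsi_for(log, "HOME"))        # team CF
print(corsi_for(log, "HOME", "H2"))  # on-ice CF for H2
print(individual_corsi_for(log, "H2"))
```

Here the team has three shot attempts, two of them came with H2 on the ice, and H2 took only one of them himself, which mirrors the Brendan Smith example above on a much smaller scale.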
CORSI FOR PER-60 (CF/60)
The amount of shot attempts a team accumulates per-60 minutes of play. While team-level CF/60 is not necessarily a per-game statistic, it still helps us account for the differences in games played at any point in the season between teams, as teams do not accumulate games played at the exact same rate. It also helps us account for differences in the numbers of penalties teams take and draw, and the amount of times they go into overtime, all of which can influence a team’s raw shot attempt totals across all situations.
At an individual level, the stat shows the impact that a player has on his team’s shot attempt rate while he was on the ice, and helps us account for the fact that some players get more ice time than others. For example, through the first 30 games of the Rangers’ 2017-2018 season, the Rangers had generated 364 shot attempts with Pavel Buchnevich on the ice, 7th most on the team. However, when you looked at the per-60 stats, the Rangers took 63.61 shot attempts per-60 minutes with Buchnevich on the ice, 2nd most on the team amongst players with at least 200 minutes of ice time. Using rate statistics helps us account for playing time differences to help us understand who truly has the most significant impact on shot generation. It should be noted that rate statistics can be heavily influenced by deployment as well, so it is important to understand how a player is being utilized by his coach when discussing any rate statistic.
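The per-60 conversion itself is simple arithmetic, and a short sketch may help. The function names `per_60` and `implied_toi` are my own; the inputs are the Brendan Smith and Pavel Buchnevich figures quoted in this section.

```python
def per_60(count: float, toi_minutes: float) -> float:
    """Scale a raw on-ice count to a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return count * 60.0 / toi_minutes


def implied_toi(count: float, rate_per_60: float) -> float:
    """Back out the ice time (in minutes) implied by a raw count and a per-60 rate."""
    if rate_per_60 <= 0:
        raise ValueError("rate must be positive")
    return count * 60.0 / rate_per_60


# 19 on-ice shot attempts in 17.8 minutes of 5v5 ice time (Brendan Smith's game).
smith_cf60 = per_60(19, 17.8)

# 364 on-ice shot attempts at 63.61 per 60 (Buchnevich through 30 games).
buchnevich_toi = implied_toi(364, 63.61)

print(f"{smith_cf60:.2f} attempts per 60")
print(f"{buchnevich_toi:.1f} minutes")
```

Smith’s single game works out to about 64 shot attempts per-60, and Buchnevich’s totals imply roughly 343 minutes of ice time. That ice time figure is my own back-calculation from the two numbers above, not a number published by Corsica.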
CORSI AGAINST PER-60 (CA/60)
The amount of shot attempts a team allows per-60 minutes of play. All of the same context provided within the CF/60 definition can be applied to CA/60, with the obvious exception being that with CA/60, we are talking about a team’s ability to suppress shot attempts or a specific player’s impact on shot attempts against.
CORSI FOR PERCENTAGE (CF%)
The percentage of all shot attempts that are taken by a team. This is the most common shot attempt-based metric used, and often when a fan or analyst simply refers to a team’s or player’s “Corsi,” they are in fact referencing the Corsi for percentage. The formula is CF% = CF/(CF+CA). Similar to both Corsi for and Corsi against, Corsi for % can be used to demonstrate the percentage of shot attempts taken by an entire team during a game, or it can be used to illustrate the percentage of shot attempts taken by a team when a specific player is on the ice. A CF% of 50% means that the team and the opponent took the exact same number of shot attempts over the specified period of time.
As I discussed in the beginning of the Corsi section, the bulk of Corsi’s value comes in its predictive ability, particularly compared to goals and standard shots. CF% in particular is the metric that has been used in the various predictive studies. CF% has been proven by multiple renowned analysts to be a better predictor of future goal differential than goals or standard shots at both the team and player level.
A few analysts now make graphical representations of team and player Corsi to help fans digest and interpret the information, and nearly all of the graphs are labeled the same way. The x-axis depicts Corsi for, increasing as you move further to the right (i.e. more shot attempts as you move to the right), while the y-axis depicts Corsi against and is inverted, meaning it decreases as you move further up (i.e. fewer shot attempts allowed as you move up). The four quadrants of the graph are typically labeled something to the effect of: upper-right = good (lots of shot attempts for, few shot attempts against); bottom-right = fun (lots of shot attempts for and against); bottom-left = bad (few shot attempts for and lots of shot attempts against); top-left = dull (few shot attempts for and against). While this isn’t technically charting CF%, it can be interpreted as such, as it is charting CF against CA, which is in essence what CF% represents.
The graph below, courtesy of Sean Tierney, who is a fantastic follow on Twitter, is one such example of a Corsi graph. You will notice that he does not label the graph as “team possession,” and instead calls it “team shot rates.” The data used in the graph is as of January 16, 2018, and at the time the Rangers had a poor Corsi For/60 of 46.51, which was 27th in the NHL, and a bloated Corsi Against/60 of 53.98, 2nd worst in the NHL (all data used in the chart was 5v5 and score adjusted). This resulted in the Rangers having an abysmal CF% of 46.28%, 2nd worst in the NHL and firmly entrenched in the “Bad” quadrant of the graph.
It works similarly when used to evaluate player performance. If a player has a CF% over 50%, it means that the team controls over half of the shot attempts while the player is on the ice. When you look at the CF%s of a team’s players, it should come as no surprise that most of the best players are at the top of the rankings. I am not saying that just because a player has a strong CF% means he is a good player; but more often than not, the better players on a team are typically in the upper echelon of the team’s CF% ranks. Using the Rangers through the first 45 games of the 2017-2018 season as an example, the top four players in CF% are (in order from 1-4): Mika Zibanejad, Pavel Buchnevich, Chris Kreider and Rick Nash. The worst, from worst to best, are David Desharnais, Paul Carey, Steven Kampfer and Jesper Fast. It should be noted that these rankings are among NYR skaters with at least 200 minutes of ice time. When analyzing player CF%, it is important to place a TOI minimum on the data set in order to get rid of any outliers whose performance may be skewed by a tiny sample size.
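For anyone who wants to check the math, the sketch below applies the CF% formula to the Rangers’ two rates and then assigns a quadrant label. The `quadrant` function and its `league_avg` cut-off of 50 shot attempts per-60 are assumptions on my part; a real chart would draw its dividing lines at whatever the league averages were at the time.

```python
def corsi_for_pct(cf: float, ca: float) -> float:
    """CF% = CF / (CF + CA), expressed as a percentage."""
    if cf + ca == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * cf / (cf + ca)


def quadrant(cf_per_60: float, ca_per_60: float, league_avg: float = 50.0) -> str:
    """Label a team using the usual four-quadrant scheme.

    `league_avg` is a hypothetical dividing line used for both axes.
    """
    generates_a_lot = cf_per_60 >= league_avg
    allows_little = ca_per_60 <= league_avg
    if generates_a_lot and allows_little:
        return "good"
    if generates_a_lot:
        return "fun"
    if allows_little:
        return "dull"
    return "bad"


# Rangers as of January 16, 2018 (5v5, score adjusted), per the text above.
rangers_cf_pct = corsi_for_pct(46.51, 53.98)
rangers_quadrant = quadrant(46.51, 53.98)

print(f"CF% = {rangers_cf_pct:.2f}, quadrant = {rangers_quadrant}")
```

Because CF% is a ratio, it gives the same answer whether you feed it raw counts or per-60 rates measured over the same ice time, which is why the two rates alone reproduce the 46.28% figure.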
RELATIVE CORSI FOR PERCENTAGE (RELCF%)
The difference between a team’s Corsi for percentage when a player is on the ice and when he is off the ice. As I discussed in the Relative Statistics portion of this resource, relative statistics help us view a player’s impact on a team and mitigate the effects that a team has on a player, thereby making it easier for us to truly compare a player on a poor shot attempt differential team to a player on a strong shot attempt differential team.
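A quick sketch of the calculation, assuming the common convention of on-ice CF% minus off-ice CF%. The function name and the sample totals are hypothetical and do not belong to any real player.

```python
def cf_pct(cf: float, ca: float) -> float:
    """Corsi for percentage from shot attempts for and against."""
    return 100.0 * cf / (cf + ca)


def rel_cf_pct(on_cf: float, on_ca: float, off_cf: float, off_ca: float) -> float:
    """Team CF% with the player on the ice minus team CF% with him off the ice."""
    return cf_pct(on_cf, on_ca) - cf_pct(off_cf, off_ca)


# Invented season totals: 400-380 with the player on the ice, 900-1000 without him.
rel = rel_cf_pct(on_cf=400, on_ca=380, off_cf=900, off_ca=1000)
print(f"RelCF% = {rel:+.2f}")
```

In this made-up case the team controls about 51.3% of shot attempts with the player on the ice and about 47.4% without him, for a relative CF% of roughly +3.9. A positive number means the team does better with him than without him, even though the team as a whole is below 50%.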
CORSI QUALITY OF COMPETITION (CF.QOC)
The weighted average Corsi for percentage of the opponents that an individual faces over a specified period of time. In most models, the weight used is ice time, with the theory that, in general, the best players are those that get the most ice time.
CORSI QUALITY OF TEAMMATES (CF.QOT)
The weighted average Corsi for percentage of the teammates a player shares the ice with over a specified period of time. In most models, the weight used is ice time, with the theory that, in general, the best players are those that get the most ice time.
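Both quality of competition and quality of teammates boil down to a weighted average, which can be sketched as follows. I am assuming here that the weight is the number of minutes the player shared the ice with each opponent or teammate; the exact weighting differs from model to model, and the function name and sample values are hypothetical.

```python
def weighted_cf_pct(matchups) -> float:
    """Ice-time-weighted average CF%.

    `matchups` is an iterable of (cf_pct, shared_minutes) pairs, one per
    opponent (for QoC) or teammate (for QoT).
    """
    matchups = list(matchups)
    total_minutes = sum(minutes for _, minutes in matchups)
    if total_minutes <= 0:
        raise ValueError("total shared ice time must be positive")
    return sum(pct * minutes for pct, minutes in matchups) / total_minutes


# Invented example: three opponents with CF%s of 52, 48 and 50,
# faced for 10, 5 and 5 minutes respectively.
qoc = weighted_cf_pct([(52.0, 10.0), (48.0, 5.0), (50.0, 5.0)])
print(f"CF.QoC = {qoc:.1f}")
```

The result is 50.5 rather than the simple average of 50.0, because the strongest opponent accounts for half of the shared ice time.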
Fenwick
Fenwick measures all unblocked shot attempts. Fenwick is very similar to Corsi in that it recognizes the value in shot attempts that miss the net, but it strips out blocked shots. This is because some analysts feel that shot blocking is a skill, or at least a purposeful tactic employed by an opposing team, and therefore they believe that a team should not be given credit for a shot attempt when it is blocked. The stat is named after Matt Fenwick, who was a writer for an old Calgary Flames blog, The Battle for Alberta. Long story short, Matt Fenwick believed that Corsi’s best application was to gauge scoring chances, and a blocked shot is not something that should be deemed a scoring chance; therefore, stripping out blocked shots would improve the stat when it comes to measuring scoring chances.
Both Fenwick and Corsi have been proven to have more predictive power than shots on goal and there are strong arguments for using each. Personally, I believe both stats are useful and can be used and dissected in different ways to tell different stories; I do not think you should use one or the other, and instead you should consider both. In terms of predictive ability, noted hockey statistician Micah Blake McCurdy wrote a piece for Hockey-Graphs a few years ago that, among many things, concluded that score-adjusted Corsi serves as a better predictor than score-adjusted Fenwick (it should be noted this was not the purpose of the article, simply one takeaway from it).
All of the context I provided in the above Corsi section can also be applied to Fenwick. It is NOT meant to be a WAR-like stat, and it can be applied in a variety of ways to help us understand team, player and goalie performance.
FENWICK FOR (FF)
The amount of unblocked shot attempts a team takes, including shots on goal and shots that miss the net (or hit the post). Similar to Corsi, Fenwick in this context can be applied to illustrate how a whole team does, or the impact a specific player has on the team’s ability to generate unblocked shot attempts. Specifically, you can use FF to state how many unblocked shot attempts a team has had over a specified course of time (period, game, season etc.), or you can use it to convey how many unblocked shot attempts a team takes while a specific player is on the ice.
For example, during the Rangers game against the Anaheim Ducks on December 19, 2017, the Rangers as a team amassed a Fenwick for during 5v5 play of 40, and a Corsi for of 45. This means that when both teams had 5 skaters on the ice, the Rangers attempted 45 shots, but 5 of those shot attempts were blocked by the Ducks, resulting in a Fenwick for of 40. At an individual level, the team generated the most 5v5 unblocked shot attempts while Ryan McDonagh was on the ice, with 18 unblocked shot attempts. This does not mean Ryan McDonagh himself took 18 shot attempts that were not blocked during the game, it means the entire New York Rangers team took 18 unblocked shot attempts during his 17.5 minutes of 5v5 ice time.
FENWICK AGAINST (FA)
The amount of unblocked shot attempts a team allows, including shots on goal and shots that miss the net (or hit the post). Similar to Fenwick for, Fenwick against can be used to show how many unblocked shot attempts an entire team allows over a specified period, or it can be used to illustrate the impact a specific player has on the unblocked shot attempts allowed by a team.
INDIVIDUAL FENWICK FOR (IFF)
The amount of unblocked shot attempts an individual player takes himself. Using the same example from the Fenwick for definition, while the Rangers logged 18 total unblocked shot attempts during 5v5 play while Ryan McDonagh was on the ice, McDonagh personally took only 3 shot attempts that went unblocked, giving him an iFF of 3 for the game. There is no individual Fenwick against measurement, as it is extremely difficult in many cases to accurately state whether a specific unblocked shot attempt was given up by a specific player.
FENWICK FOR PER-60 (FF/60)
The amount of unblocked shot attempts a team accumulates per-60 minutes of play. While team-level FF/60 is not necessarily a per-game statistic, it still helps us account for the differences in games played at any point in the season between teams, as teams do not accumulate games played at the exact same rate. It also helps us account for differences in the numbers of penalties teams take and draw, and the amount of times they go into overtime, all of which can influence a team’s raw unblocked shot attempts across all situations.
At an individual level, the stat shows the impact a player has on his team’s unblocked shot attempt rate while he was on the ice, and helps us account for the fact that some players get more ice time than others. For example, throughout the first 45 games of the Rangers 2017-2018 season, the Rangers have generated 369 unblocked shot attempts with Pavel Buchnevich on the ice, 10th most on the team. However, when you look at the per-60 stats, the Rangers take 44 unblocked shot attempts per-60 minutes with Buchnevich on the ice, 3rd most on the team amongst players with at least 200 minutes of ice time. Using rate statistics helps us account for playing time differences to help us understand who truly has the most significant impact on shot generation. It should be noted that rate statistics can be heavily influenced by deployment as well, so it is important to understand how a player is being utilized by his coach when discussing any rate statistic.
FENWICK AGAINST PER-60 (FA/60)
The amount of unblocked shot attempts a team allows per-60 minutes of play. All of the same context provided within the FF/60 definition can be applied to FA/60, with the obvious exception being that with FA/60, we are talking about a team’s ability to suppress unblocked shot attempts or a specific player’s impact on unblocked shot attempts against.
FENWICK FOR PERCENTAGE (FF%)
The percentage of all unblocked shot attempts that are taken by a team. This is the most common Fenwick-based metric used, and often when a fan or analyst simply refers to a team’s or player’s “Fenwick,” they are in fact referencing their Fenwick for percentage. The formula is FF% = FF/(FF+FA). Similar to both Fenwick for and Fenwick against, Fenwick for % can be used to demonstrate the percentage of unblocked shot attempts taken by an entire team during a game, or it can be used to illustrate the percentage of unblocked shot attempts taken by a team when a specific player is on the ice. An FF% of 50% means that the team and the opponent took the exact same number of unblocked shot attempts over the specified period of time.
RELATIVE FENWICK FOR PERCENTAGE (RELFF%)
The difference between a team’s Fenwick for percentage when a player is on the ice and when he is off the ice. As I discussed in the Relative Statistics portion of this resource, relative statistics help us view a player’s impact on a team and mitigate the effects that a team has on a player, and therefore make it easier for us to truly compare a player on a poor shot attempt differential team to a player on a strong shot attempt differential team.
FENWICK QUALITY OF COMPETITION (FF.QOC)
The weighted average Fenwick for percentage of the opponents that an individual faces over a specified period of time. In most models, the weight used is ice time, with the theory that, in general, the best players are those that get the most ice time.
FENWICK QUALITY OF TEAMMATES (FF.QOT)
The weighted average Fenwick for percentage of the teammates a player shares the ice with over a specified period of time. In most models, the weight used is ice time, with the theory that, in general, the best players are those that get the most ice time.
Shots on Goal
Shots on goal is the stat that even the most novice hockey fan is familiar with, as this metric is used in all hockey broadcasts and box scores, and is often simply referred to as “shots.” When you are watching a hockey game on TV, and you hear the announcer mention something like, “the Rangers are currently being outshot by the Ducks 22 to 14,” what the announcer is saying is that the Ducks have taken 22 shot attempts that caused the goalie to have to physically make a save to prevent a goal, compared to just 14 by the Rangers. Shots that hit the post are not included in this shots metric, nor are shots that miss the net or those that get blocked by the opponent. Goals however, are included in this stat (and goals are also included in both Corsi and Fenwick).
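Since shots on goal, Fenwick and Corsi nest inside one another, a small sketch can show which stats a given shot attempt counts toward. The outcome labels (`"goal"`, `"save"`, `"post"`, `"miss"`, `"block"`) and the function names are my own shorthand, not the event codes used by any particular data source.

```python
ON_GOAL = {"goal", "save"}            # the goalie had to stop it, or it went in
UNBLOCKED = ON_GOAL | {"post", "miss"}
ALL_ATTEMPTS = UNBLOCKED | {"block"}


def counts_toward(outcome: str) -> dict:
    """Say whether one shot attempt counts as a shot on goal, Fenwick and Corsi."""
    if outcome not in ALL_ATTEMPTS:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return {
        "shots": outcome in ON_GOAL,
        "fenwick": outcome in UNBLOCKED,
        "corsi": True,
    }


def tally(outcomes) -> dict:
    """Total shots on goal, Fenwick and Corsi for a list of attempt outcomes."""
    totals = {"shots": 0, "fenwick": 0, "corsi": 0}
    for outcome in outcomes:
        for stat, counted in counts_toward(outcome).items():
            totals[stat] += counted
    return totals


# Invented sequence of six shot attempts.
totals = tally(["save", "post", "block", "goal", "miss", "block"])
print(totals)
```

For this invented sequence the team is credited with two shots on goal, four unblocked attempts (Fenwick) and six total attempts (Corsi). The shot off the post is the one that most often surprises people: it counts toward Corsi and Fenwick but not toward shots on goal.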
The reason I am including shots on goal here, despite its widespread popularity, is because I think it is important to note that shots data is also available in all of the flavors that we just discussed in the previous Corsi and Fenwick sections. While you will likely only hear broadcasters discuss shots in terms of team or individual player shot attempts, you can break down the data in all of the same ways you can with Corsi and Fenwick. Shots data is not as predictive as Corsi or Fenwick, but it still matters and helps round out the analysis of a team.
I will not waste everyone’s time with the same level of context I provided within the Corsi and Fenwick sections, but below are quick definitions of all of the forms of shots on goal that are freely available on data sites such as Corsica.
SHOTS FOR (SF)
The amount of shots on goal a team takes. Similar to Corsi and Fenwick, shots in this context can be applied to illustrate how a whole team does, or the impact a specific player has on the team’s ability to generate shots on goal. Specifically, you can use SF to state how many shots on goal a team has had over a specified course of time (period, game, season etc.), or you can use it to convey how many shots on goal a team takes while a specific player is on the ice.
SHOTS AGAINST (SA)
The amount of shots on goal a team allows. Similar to shots for, shots against can be used to show how many shots on goal an entire team allows over a specified period, or it can be used to illustrate the impact a specific player has on the shots on goal allowed by a team.
INDIVIDUAL SHOTS FOR (ISF)
The amount of shots on goal an individual player takes himself.
SHOTS FOR PER-60 (SF/60)
The amount of shots on goal a team accumulates per-60 minutes of play. Similar to SF and SA, SF/60 can be applied to see how the entire team did or to view the impact a specific player has on the team.
SHOTS AGAINST PER-60 (SA/60)
The amount of shots on goal a team allows per-60 minutes of play. Similar to SF and SA, SA/60 can be applied to see how the entire team did or to view the impact a specific player has on the team.
SHOTS FOR PERCENTAGE (SF%)
The percentage of all shots on goal that are taken by a team. The formula is SF% = SF/(SF+SA). SF% can be used to demonstrate the percentage of shots on goal taken by an entire team during a game, or it can be used to illustrate the percentage of shots on goal taken by a team when a specific player is on the ice. An SF% of 50% means that the team and the opponent took the exact same number of shots on goal over the specified period of time.
RELATIVE SHOTS FOR PERCENTAGE (RELSF%)
The difference between a team’s shots for percentage when a player is on the ice and when he is off the ice. As I discussed in the Relative Statistics portion of this resource, relative statistics help us view a player’s impact on a team and mitigate the effects that a team has on a player, making it easier for us to truly compare players on poor and strong shot differential teams.
Expected Goals
Expected goals is a statistic that considers both shot quantity and quality in order to provide a metric for how many goals a team (or player) should have scored, given the quality of scoring chances generated, if the opposing goalie played at a league-average level. Expected goals accomplishes this by weighting each unblocked shot attempt by a variety of shot attributes, with heavier weightings applied to shot characteristics with a higher chance of leading to a goal.
The shot characteristics considered by expected goals include shot type (wrist, slap, deflection etc.), distance from the net, shot angle, whether a shot was a rebound or generated off the rush, and if it was taken on the power play, even strength or on the penalty kill. So, for a quick example of how the model works, a slap shot taken from the slot that was generated off a two-on-one rush would have a much heavier weight applied to it than a weak wrist shot taken by a defenseman from the point.
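To give a feel for how shot characteristics can be turned into a goal probability, here is a deliberately simplified toy model. Every coefficient below is invented purely for illustration, and the logistic form is just one reasonable choice; this is not Manny Perry’s model or any other published expected goals model.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    shot_type: str        # "wrist", "slap" or "deflection"
    distance_ft: float    # distance from the net
    angle_deg: float      # 0 means straight on
    rebound: bool = False
    rush: bool = False
    strength: str = "ev"  # "ev", "pp" or "pk"


# Made-up weights. A real model would estimate these from years of shot data.
INTERCEPT = -1.0
PER_FOOT = -0.05
PER_DEGREE = -0.02
REBOUND_BONUS = 0.8
RUSH_BONUS = 0.5
TYPE_ADJ = {"wrist": 0.0, "slap": 0.1, "deflection": 0.3}
STRENGTH_ADJ = {"ev": 0.0, "pp": 0.3, "pk": -0.2}


def goal_probability(shot: Shot) -> float:
    """Toy estimate of the chance an unblocked shot attempt becomes a goal."""
    z = (
        INTERCEPT
        + PER_FOOT * shot.distance_ft
        + PER_DEGREE * abs(shot.angle_deg)
        + (REBOUND_BONUS if shot.rebound else 0.0)
        + (RUSH_BONUS if shot.rush else 0.0)
        + TYPE_ADJ[shot.shot_type]
        + STRENGTH_ADJ[shot.strength]
    )
    return 1.0 / (1.0 + math.exp(-z))


def expected_goals(shots) -> float:
    """Expected goals is the sum of the individual shot probabilities."""
    return sum(goal_probability(s) for s in shots)


slot_slap_shot = Shot("slap", distance_ft=20, angle_deg=5, rush=True)
point_wrist_shot = Shot("wrist", distance_ft=55, angle_deg=20)

print(f"slot slap shot off the rush: {goal_probability(slot_slap_shot):.3f}")
print(f"wrist shot from the point:   {goal_probability(point_wrist_shot):.3f}")
print(f"xG for both shots:           {expected_goals([slot_slap_shot, point_wrist_shot]):.3f}")
```

Even with made-up weights, the toy model reproduces the intuition from the example above: the slap shot from the slot off the rush is worth far more than the wrist shot from the point, and a team’s expected goals total is simply the sum over all of its unblocked shot attempts.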
It should be noted that multiple models for expected goals exist and they have some variations in their calculations. The most commonly referenced model at the time of this Glossary’s publication is courtesy of Emmanuel (Manny) Perry, who created and runs the fantastic hockey stats site Corsica. Personally, at the time I am writing this, I typically use Manny’s model, as it is freely available to all on Corsica at the player and team level and it is provided in (near) real-time as games are occurring.
Expected goals is far from a perfect stat, and quite frankly there is no such thing as a perfect stat. Nonetheless, expected goals is a valuable tool for evaluating games, players and teams. In fact, expected goals has been proven by multiple leading hockey statisticians to have even more predictive power than Corsi and Fenwick. Rob Vollman stated in his Hockey Abstract 2017 that, “even-strength expected goals, by themselves, slightly outperformed shot differential when predicting next season’s results.” Dawson Sprigings (DTMAboutHeart) also noted that expected goals serves as a better predictor of future success than Corsi, and provides a graph demonstrating the predictive abilities of expected goals against Corsi and goal differential.
Lastly, and least analytically, expected goals just passes the smell test better than any other commonly used advanced metric. You often hear fans say things like “we deserved the win/loss!” after a game. I personally have found that expected goals totals from a game align with these sentiments far more often than Corsi totals do.
A perfect example of this was the December 28, 2017 game between the Rangers and Capitals, which went scoreless into the shootout. Despite neither team scoring a goal outside of the shootout, Twitter fans and the NBC studio analysts after the game made the comment that the Rangers “deserved the win more than the Capitals.” At 5v5, the Rangers had a Corsi For advantage of 49-43 over the Capitals, making for a fine CF% of 53.26. While this is an advantage, it certainly does not warrant everyone and their mothers commenting that the Rangers definitely deserved the win. However, when you look at the 5v5 expected goals data, you see that the Rangers had a 3.57-1.32 advantage, making for an impressive expected goals for % of 73%. This aligns much more closely with what everyone saw, and does in fact paint the picture that the Rangers were the better team on that evening.
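The two percentages in that example are the same calculation applied to different inputs. A minimal Python sketch, using only the totals quoted above (the function name `share_pct` is my own):

```python
def share_pct(for_total, against_total):
    """Percentage of the combined total that belongs to the 'for' side."""
    return 100 * for_total / (for_total + against_total)


# 5v5 totals from the December 28, 2017 Rangers vs. Capitals game
cf_pct = share_pct(49, 43)       # shot attempts (Corsi)
xgf_pct = share_pct(3.57, 1.32)  # expected goals

print(f"CF% = {cf_pct:.2f}, xGF% = {xgf_pct:.2f}")
```

Same game, same teams, but the quality-weighted view gives the Rangers roughly 73% of the share instead of roughly 53%.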
Similar to the shot metrics discussed above, expected goals data comes in many forms, and to understand how to use the data in player or team analysis it is important to learn all of its versions.
EXPECTED GOALS FOR (XGF)
The number of goals a team should have scored against league-average goaltending, given the quality of scoring chances they generated across a specified period of time (period, game, season etc.). Similar to shot attempt metrics, expected goals in this context can be applied to illustrate how a whole team does, or the impact a specific player has on the team’s ability to generate scoring chances.
For example, during the December 19, 2017 game between the Rangers and the Ducks, which the Rangers won 4-1, the Rangers as a team boasted an xGF across all situations of 3.22, meaning that if the Ducks had league-average goaltending that night, they would have only allowed about 3 goals if we round to the closest whole number. At an individual level, Ryan McDonagh led all skaters with an xGF of 1.71 across all situations, meaning that the team boasted an xGF of 1.71 across the 20.1 minutes that McDonagh was on the ice.
EXPECTED GOALS AGAINST (XGA)
The amount of goals a team should have allowed with league-average goaltending, given the quality of scoring chances they allowed across a specified period of time. Similar to expected goals for, expected goals against can be used to show how a whole team does, or the impact a specific player has on the team’s ability to suppress scoring chances against.
INDIVIDUAL EXPECTED GOALS FOR (IXGF)
The amount of expected goals an individual player generates himself. Using the same example from the xGF definition, while the Rangers logged 1.71 expected goals for while Ryan McDonagh was on the ice, McDonagh himself only accumulated 0.26 expected goals. Similar to the shot-based metrics, there is no individual expected goals against measurement, as it is nearly impossible in most cases to accurately state whether a specific scoring chance was given up by one specific player.
One way you can use ixGF is to compare a player’s actual goals scored over a period of time to their individual expected goals for, in order to gain an understanding of whether they are under or over-performing in terms of goal production. Sean Tierney has a fantastic dynamic chart (which he calls Goals vs. Expectation) on his Tableau profile that charts players’ ixGF against their actual goal totals for each team. Below is the Rangers’ Goals vs. Expectation chart, as of January 2, 2018.
A quick explanation of how to read the chart: the colors indicate how many actual goals the player has scored, ranging from dark red (0 goals) to dark blue (25). The x-axis (the only axis that matters) charts the difference between a player’s actual goal total and their ixGF (x-axis = goals - ixGF), so a negative value means the player has scored fewer goals than expected. Also, if you are viewing the chart on the Tableau profile, you can hover over any chart to view the actual data. As you can see, Rick Nash has the worst differential, as he has only scored 9 goals, but has an ixGF of 13.23, good for a differential of -4.23. Michael Grabner is over-performing his expected goals total by the most, with 17 goals against 13.37 individual expected goals, a differential of +3.63.
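If you want to reproduce those differentials yourself, the arithmetic is simple. This sketch assumes the differential is actual goals minus ixGF, which is the sign convention that matches the -4.23 quoted for Nash; the dictionary layout and function name are my own.

```python
# Goals and ixGF as quoted above (as of January 2, 2018)
players = {
    "Rick Nash": {"goals": 9, "ixgf": 13.23},
    "Michael Grabner": {"goals": 17, "ixgf": 13.37},
}


def goals_vs_expectation(goals, ixgf):
    """Positive = scoring more than expected, negative = scoring less."""
    return goals - ixgf


diffs = {
    name: goals_vs_expectation(p["goals"], p["ixgf"])
    for name, p in players.items()
}

for name, diff in diffs.items():
    print(f"{name}: {diff:+.2f}")
```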
EXPECTED GOALS FOR PER-60 (XGF/60)
The amount of expected goals a team accumulates per-60 minutes of play. Similar to xGF and xGA, xGF/60 can be applied to see how the entire team did or to view the impact a specific player has on the team. Also, just a reminder, per-60 stats are not the same as per-game stats, so be sure not to confuse the two.
At an individual level, the stat shows the impact that a player has on his team’s ability to accumulate scoring chances while he is on the ice, and helps us account for the fact that some players get more ice time than others. For example, throughout the first 45 games of the Rangers 2017-2018 season, the Rangers have generated 24.45 expected goals with Pavel Buchnevich on the ice, 10th most on the team. However, when you look at the per-60 stats, the Rangers generate 2.92 expected goals per-60 minutes with Buchnevich on the ice, the most on the team amongst players with at least 200 minutes of ice time.
Using rate statistics helps us account for playing time differences to help us understand who truly has the most significant impact on shot generation. It should be noted that rate statistics can be heavily influenced by deployment as well, so it is important to understand how a player is being utilized by his coach when discussing any rate statistic.
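For anyone who wants to see the per-60 conversion spelled out, here is a short Python sketch. Buchnevich's ice time is not given above, so the code back-solves it from the two figures that are (24.45 on-ice expected goals and 2.92 per-60); treat the resulting minutes as an implied estimate rather than a reported number. The function names are my own.

```python
def per_60(total, minutes):
    """Convert a cumulative on-ice total into a per-60-minutes rate."""
    return total / minutes * 60


def implied_minutes(total, rate_per_60):
    """Back-solve ice time from a cumulative total and its per-60 rate."""
    return total / rate_per_60 * 60


# Figures quoted above for Pavel Buchnevich's on-ice expected goals
toi = implied_minutes(24.45, 2.92)
print(f"Implied ice time: {toi:.0f} minutes")
print(f"xGF/60 check: {per_60(24.45, toi):.2f}")
```

Note that the denominator is the player's own minutes on the ice, not games played, which is exactly why per-60 and per-game numbers should never be mixed up.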
EXPECTED GOALS AGAINST PER-60 (XGA/60)
The amount of expected goals a team allows per-60 minutes of play. Similar to xGF and xGA, xGA/60 can be applied to see how the entire team did or to view the impact a specific player has on the team. All of the same context provided within the xGF/60 definition can be applied here, with the obvious exception being that with xGA/60, we are talking about a team’s ability to suppress scoring chances or a specific player’s impact on expected goals against.
EXPECTED GOALS FOR PERCENTAGE (XGF%)
The percentage of all expected goals accumulated by both teams that are generated by a specific team. The formula is xGF% = xGF/(xGF+xGA). xGF% can be used to demonstrate the percentage of expected goals generated by an entire team during a game, or it can be used to illustrate the percentage of expected goals generated by a team when a specific player is on the ice. An xGF% of 50% means that the team and the opponent generated the exact same number of expected goals over the specified period of time.
When I discussed the predictive power of expected goals in the preamble to this section, it was mainly with this specific stat in mind, expected goals for percentage.
One particularly useful way to use xGF% in team analysis is by examining a team’s standing in the league in xGF% compared to where they stand in terms of Corsi. Since shot quantity plays a role in accumulating expected goals, one can conclude that if a team has a much higher standing in xGF% than CF%, then a relatively high percentage of their shot attempts are those of a higher quality, and more likely to be converted to a goal. As of January 18, 2018, the Rangers are 19th in the NHL in 5v5 xGF% at 48.66%, but 30th in the NHL in CF% at 46.20%. Because of the fact that the Rangers are 11 spots higher in the xGF% standing than CF%, we can conclude that the Rangers are a team that typically waits for higher quality chances, as opposed to going by the old mantra, “get pucks to the net.”
RELATIVE EXPECTED GOALS FOR PERCENTAGE (RELXGF%)
The difference between a team’s expected goals for percentage (xGF%) when a player is on the ice and when he is off the ice. As I discussed in the Relative Statistics portion of this resource, relative statistics help us view a player’s impact on a team and mitigate the effects that a team has on a player. A player with a positive RelxGF% is one whose team performs better in terms of xGF% when the player is on the ice, compared to when he is off the ice.
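As a rough sketch of the calculation, assuming RelxGF% is simply on-ice xGF% minus off-ice xGF%: the on-ice and off-ice totals below are hypothetical numbers for an imaginary player, not data from Corsica, and the function names are my own.

```python
def xgf_pct(xgf, xga):
    """xGF% = xGF / (xGF + xGA), expressed as a percentage."""
    return 100 * xgf / (xgf + xga)


def rel_xgf_pct(on_xgf, on_xga, off_xgf, off_xga):
    """Team xGF% with the player on the ice minus team xGF% with him off."""
    return xgf_pct(on_xgf, on_xga) - xgf_pct(off_xgf, off_xga)


# Hypothetical player: 12.0 xGF / 10.0 xGA on the ice,
# and 30.0 xGF / 34.0 xGA for the team when he is on the bench.
rel = rel_xgf_pct(12.0, 10.0, 30.0, 34.0)
print(f"RelxGF% = {rel:+.2f}")
```

The same pattern works for RelGF% or relative Corsi; only the inputs change.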
Goals
I’m not going to waste anyone’s time defining what a goal is. However, I do want to point out that we can dissect goal data in the very same manner we do with expected goals and shot attempts. Broadcasts and box scores use the stat plus/minus as a way to explain goal differential when a player is on the ice. Plus/minus is a horrendous stat that counts goals scored by penalty killers and those on the ice with the opposing goalie pulled, but does not count power play goals. By doing this, plus/minus greatly favors players like Michael Grabner, who play a lot of penalty kill and 5v6 time, and hurts players like Kevin Shattenkirk, who log a lot of minutes on the power play.
However, just because plus/minus is an atrocious statistic, doesn’t mean that we shouldn’t examine goal differential. In fact, it’s quite the opposite. We should treat goal differential like expected goals, in that we should primarily focus on 5v5 data for a truer gauge of a player’s ability, and use rate and relative versions to help round out our analysis of a player.
GOALS FOR (GF)
The number of goals a team scored across a specified period of time (period, game, season etc.). Similar to expected goals, goals in this context can be applied to illustrate how a whole team does, or the impact a specific player has on the team’s ability to score goals.
GOALS AGAINST (GA)
The amount of goals a team allowed across a specified period of time. Similar to goals for, goals against can be used to show how a whole team does, or the impact a specific player has on the team’s ability to prevent goals against.
GOALS SCORED (G)
This is the statistic you all know and love, the amount of goals a player scored. In the expected goals and shot attempt metric sections, the individual player statistic was designated with an “i” before it, standing for “individual.” Here this is unnecessary, as we simply refer to this as Goals instead of individual goals.
GOALS FOR PER-60 (GF/60)
The amount of goals a team accumulates per-60 minutes of play. Similar to GF and GA, GF/60 can be applied to see how the entire team did or to view the impact a specific player has on the team. Also, just a reminder, per-60 stats are not the same as per-game stats, so be sure not to confuse the two.
GOALS AGAINST PER-60 (GA/60)
The amount of goals a team allows per-60 minutes of play. Similar to GF and GA, GA/60 can be applied to see how the entire team did or to view the impact a specific player has on the team.
GOALS FOR PERCENTAGE (GF%)
The percentage of all goals accumulated by both teams that are generated by a specific team. The formula is GF% = GF/(GF+GA). GF% can be used to demonstrate the percentage of goals scored by an entire team during a game, or it can be used to illustrate the percentage of goals scored by a team when a specific player is on the ice. A GF% of 50% means that the team and the opponent scored the exact same number of goals over the specified period of time. This statistic is the one that should be used instead of plus/minus if you wish to evaluate a player’s impact on generating goals for and preventing goals against.
RELATIVE GOALS FOR PERCENTAGE (RELGF%)
The difference between a team’s goals for percentage (GF%) when a player is on the ice and when he is off the ice. As I discussed in the Relative Statistics portion of this resource, relative statistics help us view a player’s impact on a team and mitigate the effects that a team has on a player. A player with a positive RelGF% is one whose team performs better in terms of GF% when the player is on the ice, compared to when he is off the ice.
Save and Shooting Accuracy Metrics
Currently, goaltender discussions are at the level that baseball discussions were 10 years ago, where many fans over-emphasize a team success stat like wins to evaluate a goalie’s performance and rely heavily on metrics that can be significantly impacted by the play of the defense. In the case of goalies, those metrics are simple save percentage and goals against average, whereas for pitchers it’s ERA. All are flawed statistics that can be influenced by the play of the defense and only offer a specific perspective on how the player performed (yes, I know ERA doesn’t include runs scored as a result of an error, but the quality of your defense along with a litany of other factors absolutely plays a role here).
Now, I’m not saying that numbers such as save percentage for goaltenders and ERA for pitchers aren’t helpful; they certainly are. What I am saying, however, is that they are flawed, and there are other metrics out there that help isolate the performance of the actual player, and strip out some of the background noise that may be influencing the more basic numbers.
Further, many of the same principles that make standard save percentage a flawed stat can be applied to standard shooting percentage. A player’s shooting percentage can be drastically influenced one way or another by the level of goaltending they face, which obviously is not something the skater can control. For these reasons, advanced shooting metrics help us make more informed conclusions on a player’s shooting ability.
SHOOTING PERCENTAGE (SH%)
The percentage of all shots on goal that are goals. This is the standard shooting percentage that is typically being discussed when a broadcaster or beat writer is discussing a team’s or player’s shooting percentage.
SAVE PERCENTAGE (SV%)
The percentage of all shots on goal that are saved. This is the standard save percentage that is typically being discussed when a broadcaster or beat writer is discussing a team’s or goalie’s save percentage.
EXPECTED SHOOTING PERCENTAGE (XSH%)
The shooting percentage that a skater (or team) should have with a league average performance given the quality of chances he generated. It is important to note that expected shooting percentage is NOT a measure of how well a skater or team has actually shot; it merely serves as a benchmark for the shooting percentage that an average skater or team should have posted, given the quality of chances generated.
EXPECTED SAVE PERCENTAGE (XSV%)
The save percentage that a goalie (or team) should have with a league average performance given the quality of chances he faced. It is important to note that expected save percentage is NOT a measure of how the goalie actually performed; it merely serves as a benchmark for the save percentage that an average goalie should have posted, given the quality of chances he faced. With that said, I once saw a guy on Twitter make the argument that because Halak’s xSv% is greater than Lundqvist’s, Halak has had a better season. Please don’t be that guy. That is a very false statement. What it does mean, however, is that the Islanders defense has performed better in front of Halak than the Rangers defense has in front of Lundqvist, so Halak has faced on average lesser quality scoring chances than Lundqvist.
ADJUSTED (DELTA) SHOOTING PERCENTAGE (DSH%)
The difference between a skater’s (or team’s) actual shooting percentage and his expected shooting percentage. The formula is dSh% = Sh% – xSh%. This stat helps show us whether a skater’s shooting percentage is sustainable or not, given the quality of scoring chances he has generated. If a player has a high dSh%, then his actual shooting percentage is higher than his expected shooting percentage. This means that he is shooting at a higher percentage than what would be expected of him given the quality of scoring chances he has generated, meaning that it is likely unsustainable and will regress downwards eventually.
ADJUSTED (DELTA) SAVE PERCENTAGE (DSV%)
This stat is the difference between a goalie’s (or team’s) actual save percentage and his (or its) expected save percentage. The formula is dSv% = Sv% – xSv%. This is a very valuable stat that helps show how much better (or worse) a goalie is doing compared to how an average goalie would have performed given the quality of shots faced. A dSv% of 0 means that a goalie has performed exactly to the level of an average goalie given the quality of shots faced. Using our Lundqvist and Halak example from before, as of January 18, 2018, Lundqvist has a dSv% of 0.93% during 5v5 play, good for 7th in the league among goalies with 1,000 minutes played. Halak, however, has a dSv% of 0.71%, 9th among qualified goaltenders. What this tells us is that Lundqvist has outperformed his expected save percentage by 0.22 percentage points more than Halak to this point in the season.
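The delta stats are plain subtraction, and the same helper covers both dSh% and dSv%. In this Python sketch the Lundqvist and Halak dSv% values are the ones quoted above, while the Sv% and xSv% pair for the imaginary goalie is a made-up input used only to show the formula; the function name is my own.

```python
def delta_pct(actual_pct, expected_pct):
    """dSv% = Sv% - xSv% (or dSh% = Sh% - xSh%), in percentage points."""
    return actual_pct - expected_pct


# Hypothetical goalie: 92.5% actual vs. 91.8% expected save percentage
example_dsv = delta_pct(92.5, 91.8)

# dSv% figures quoted above (5v5, as of January 18, 2018)
lundqvist_dsv = 0.93
halak_dsv = 0.71
gap = lundqvist_dsv - halak_dsv

print(f"Hypothetical dSv%: {example_dsv:+.2f}")
print(f"Lundqvist minus Halak: {gap:.2f} percentage points")
```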
CORSI SHOOTING PERCENTAGE (CSH%)
The percentage of all shot attempts that are goals. By comparison, standard shooting percentage only counts shots on goal, so a player’s (or team’s) Corsi shooting percentage is always lower than his (or its) standard shooting percentage.
CORSI SAVE PERCENTAGE (CSV%)
The percentage of all shot attempts that are saved. By comparison, standard save percentage only counts shots on goal, so a goalie’s (or team’s) Corsi save percentage is always higher than his (or its) standard save percentage.
FENWICK SHOOTING PERCENTAGE (FSH%)
The percentage of all unblocked shot attempts that are goals. By comparison, standard shooting percentage only counts shots on goal, so a player’s (or team’s) Fenwick shooting percentage is always lower than his (or its) standard shooting percentage. Because it strips out blocked shots, Fenwick shooting percentage is often the preferred calculation when discussing shot quality and individual shooting metrics by quality, such as low danger shooting percentage.
FENWICK SAVE PERCENTAGE (FSV%)
The percentage of all unblocked shot attempts that are saved. By comparison, standard save percentage only counts shots on goal, so a goalie’s (or team’s) Fenwick save percentage is always higher than his (or its) standard save percentage. Because it strips out blocked shots, Fenwick save percentage is often the preferred calculation when discussing shot quality and individual save metrics by quality, such as low danger save percentage.
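Here is a small Python sketch of how the standard, Fenwick and Corsi versions relate to one another. The shot counts are hypothetical, and I am assuming that the Fenwick and Corsi save percentages treat every attempt that did not become a goal as "saved" (that is, 100 minus the matching shooting percentage), which is consistent with the expected Fenwick save percentage formula given later in this section.

```python
def pct(numerator, denominator):
    return 100 * numerator / denominator


# Hypothetical totals for one team's shooting
goals, shots_on_goal, missed, blocked = 3, 30, 12, 18

fenwick = shots_on_goal + missed  # unblocked shot attempts
corsi = fenwick + blocked         # all shot attempts

sh_pct = pct(goals, shots_on_goal)  # standard shooting percentage
fsh_pct = pct(goals, fenwick)       # Fenwick shooting percentage
csh_pct = pct(goals, corsi)         # Corsi shooting percentage

# Assumed save-side counterparts from the opposing goalie's point of view
sv_pct = 100 - sh_pct
fsv_pct = 100 - fsh_pct
csv_pct = 100 - csh_pct

print(f"Sh% {sh_pct:.2f} > FSh% {fsh_pct:.2f} > CSh% {csh_pct:.2f}")
print(f"Sv% {sv_pct:.2f} < FSv% {fsv_pct:.2f} < CSv% {csv_pct:.2f}")
```

The denominators only ever grow as you move from shots on goal to Fenwick to Corsi, which is why the shooting percentages shrink and the save percentages rise.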
EXPECTED FENWICK SHOOTING PERCENTAGE (XFSH%)
The Fenwick shooting percentage (percentage of all unblocked shots that convert to goals) that a player or team would have shot if the opposing goalie performed at a league-average level, given the quality of scoring chances generated. The formula for the stat is: expected Fenwick shooting percentage = expected goals for/Fenwick for.
EXPECTED FENWICK SAVE PERCENTAGE (XFSV%)
The Fenwick save percentage that a goalie (or team) should have with a league average performance given the quality of chances faced. The formula for the stat is: expected Fenwick save percentage = 1 – expected goals against/Fenwick against.
ADJUSTED (DELTA) FENWICK SHOOTING PERCENTAGE (DFSH%)
The difference between a skater’s (or team’s) actual Fenwick shooting percentage and his expected Fenwick shooting percentage. The formula is dFSh% = FSh% – xFSh%. All of the same context provided in the dSh% definition can be applied here, with the caveat that dFSh% considers all unblocked shot attempts, while dSh% considers only shots on goal.
ADJUSTED (DELTA) FENWICK SAVE PERCENTAGE (DFSV%)
The difference between a goalie’s (or team’s) actual Fenwick save percentage and his expected Fenwick save percentage. The formula is dFSv% = FSv% – xFSv%. All of the same context provided in the dSv% definition can be applied here, with the caveat that dFSv% considers all unblocked shot attempts, while dSv% considers only shots on goal.
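The four Fenwick-based formulas above fit in a few lines of Python. Every input here is a hypothetical number picked to keep the arithmetic easy to follow, and the function names are my own.

```python
def xfsh_pct(xgf, fenwick_for):
    """Expected Fenwick shooting % = expected goals for / Fenwick for."""
    return 100 * xgf / fenwick_for


def xfsv_pct(xga, fenwick_against):
    """Expected Fenwick save % = 1 - expected goals against / Fenwick against."""
    return 100 * (1 - xga / fenwick_against)


# Hypothetical game totals
xfsh = xfsh_pct(xgf=2.5, fenwick_for=40)       # 6.25
xfsv = xfsv_pct(xga=2.0, fenwick_against=50)   # 96.00

# Hypothetical actual results: 4 goals on 40 unblocked attempts for,
# 1 goal allowed on 50 unblocked attempts against
fsh = 100 * 4 / 40
fsv = 100 * (1 - 1 / 50)

dfsh = fsh - xfsh  # dFSh% = FSh% - xFSh%
dfsv = fsv - xfsv  # dFSv% = FSv% - xFSv%

print(f"xFSh% {xfsh:.2f}, dFSh% {dfsh:+.2f}")
print(f"xFSv% {xfsv:.2f}, dFSv% {dfsv:+.2f}")
```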
GOALS SAVED ABOVE AVERAGE (GSAA)
This is a cumulative stat that represents the number of goals allowed by a goaltender compared to the number of goals that would have been allowed by a league average goalie. It is similar in nature to WAR in baseball that way, except it is specific to goals allowed by goaltenders.
As of January 18, 2018, Anaheim netminder John Gibson leads the NHL in all situation GSAA with 19.6. This means that if an average goalie was substituted in for Gibson for all of his time played this year, and the games unfolded exactly the same way, the Ducks would have yielded about 20 more goals throughout the season.
GOALS SAVED ABOVE AVERAGE PER-30 (GSAA/30)
This is a rate version of GSAA, and tells us how many goals saved above average a goalie has per every 30 shots he has faced. Think of this similarly to how you think of Corsi For per-60 or any other per-60 skater stat, in that it helps us account for differences in playing time to compare players. The primary difference between GSAA/30 and GSAA/60 in terms of application is that GSAA/30 helps equalize goalies who face varying workloads. Despite not accounting for workload, GSAA/60 is still useful, as goalies that can handle an increased workload at a consistently high level, such as Henrik Lundqvist, are extremely valuable.
GOALS SAVED ABOVE AVERAGE PER-60 (GSAA/60)
This is a rate version of GSAA, and tells us how many goals saved above average a goalie has per-60 minutes of 5v5 play. Think of this similarly to how you think of Corsi For per-60 or any other per-60 skater stat, in that it helps us account for differences in playing time when comparing players. The primary difference between GSAA/30 and GSAA/60 in terms of application is that GSAA/30 helps equalize goalies who face varying workloads. Despite not accounting for workload, GSAA/60 is still useful, as goalies that can handle an increased workload at a consistently high level, such as Henrik Lundqvist, are extremely valuable.
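To tie GSAA and its two rate versions together, here is a Python sketch. It uses one common formulation of GSAA (the goals a league-average save percentage would have allowed on the same number of shots, minus the goals actually allowed); I am not claiming this is the exact calculation Corsica uses. The goalie totals and the league-average save percentage are hypothetical.

```python
def gsaa(shots_against, goals_against, league_sv_pct):
    """Goals an average goalie would allow on these shots, minus actual goals."""
    average_goals_against = shots_against * (1 - league_sv_pct)
    return average_goals_against - goals_against


def gsaa_per_30(gsaa_total, shots_against):
    """GSAA per 30 shots faced."""
    return gsaa_total / shots_against * 30


def gsaa_per_60(gsaa_total, minutes):
    """GSAA per 60 minutes played."""
    return gsaa_total / minutes * 60


# Hypothetical goalie: 1,000 shots faced, 70 goals allowed, 1,800 minutes,
# measured against an assumed league-average save percentage of .915
total = gsaa(1000, 70, 0.915)

print(f"GSAA: {total:.1f}")
print(f"GSAA/30: {gsaa_per_30(total, 1000):.2f}")
print(f"GSAA/60: {gsaa_per_60(total, 1800):.2f}")
```

The per-30 version divides by shots faced and the per-60 version divides by time on the ice, which is the whole difference between the two.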
PDO
PDO is the sum of a team’s shooting percentage and its save percentage, converted from percentages into whole numbers. This stat is almost always discussed in terms of 5v5 play only, but you can generate it for other situations. So, if a team has a 92% 5v5 save percentage, and an 8.5% 5v5 shooting percentage, they have a PDO of 100.5. PDO is often referred to as a measurement of a team’s luck, with 100 being average, anything over 100 being deemed lucky (the higher you go, the luckier the team is) and anything under 100 being deemed unlucky (the lower you go, the unluckier the team is).
The reason that PDO is considered a measurement of luck, is because over very large sample sizes, teams will sport a 5v5 save percentage of about 92%, and a 5v5 shooting percentage of about 8%. In theory, if a team has a high PDO, that means they are getting an abnormally high shooting or save percentage (or both), and in most cases over large sample sizes, this is unsustainable and you can expect it to eventually regress to the mean.
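In code, PDO is nothing more than an addition. A minimal Python sketch using the figures from the text above (the function name is my own):

```python
def pdo(sh_pct, sv_pct):
    """PDO = shooting percentage + save percentage, both as whole-number percentages."""
    return sh_pct + sv_pct


team_pdo = pdo(8.5, 92.0)      # the example team above
baseline_pdo = pdo(8.0, 92.0)  # long-run 5v5 norms

print(f"Team PDO: {team_pdo}, baseline: {baseline_pdo}")
```

The baseline of 100 falls out of the long-run norms: roughly 8% shooting plus roughly 92% goaltending.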
There are certainly outliers, and certain teams loaded with snipers or elite goalies have been able to sustain above-100 PDOs over multiple seasons. In fact, the New York Rangers have had a PDO of above 100 in each of the previous 3 seasons, and as of January 2, 2018, have a PDO of 101.31. This is likely because of three primary factors: Henrik Lundqvist is elite, Benoit Allaire is a backup goalie whisperer, and Alain Vigneault’s counter-attacking system has led to the team consistently having a shooting percentage closer to 9% than 8%.
So, while factors like elite (or dreadful) goaltending and specific systems can cause a team to be able to sustain a high or low PDO over multiple years, when you look at the data for the entire league, a much larger sample, you see that in fact, 5v5 save percentage and 5v5 shooting percentage tend to hover around 92% and 8%, respectively. PDO is not perfect, but it is one of many tools we can use in our analysis of a team.
Similar to many of the other stats we have already discussed, PDO can also be used in player analysis. We can look at a team’s PDO when a specific player is on the ice to help round out our analysis, particularly when discussing a player’s goal differential. For example, as of January 2, 2018, Boo Nieves leads the Rangers in 5v5 goals for % with a GF% of 73.33%, a remarkably impressive figure. However, when you look at his PDO (the team’s PDO when he is on the ice), we see he also leads the team here, with an unsustainably high PDO of 106.97. This tells us that Nieves’ high GF% is in part due to the team having unsustainably high shooting and/or save percentages while he is on the ice, so we can expect these numbers to come down as the season progresses.
We have discussed numerous times so far the role that shot quality has in player and team analysis, and we can include this information into our examination of PDO in a similar manner we do with goalie save percentage.
EXPECTED PDO (XPDO)
The sum of a team’s expected Fenwick shooting percentage and expected Fenwick save percentage, converted from percentages into whole numbers. Because most expected goals models use unblocked shot attempts (Fenwick), the expected PDO model also uses Fenwick shooting and save percentages instead of standard. Like expected goals, expected PDO helps us mitigate goaltender performance, and helps us understand how a team would have fared given league-average goaltending. A high expected PDO indicates that a team is either generating a lot of high quality scoring chances, or suppressing high quality scoring chances against (or both), and can therefore be expected to have a higher PDO than your average team.
ADJUSTED (DELTA) PDO (DPDO)
The difference between a team’s actual PDO and their expected PDO. The formula is: dPDO = PDO – xPDO. This serves as a more accurate measure of a team’s (or skater’s) luck than normal PDO, because expected PDO considers the scoring chance quality of teams. So, if a team has a positive dPDO, it means that their actual PDO is higher than their expected PDO. This indicates that they are getting higher shooting and/or save percentage(s) than what should be expected, given the quality of scoring chances involved.
As of January 4, 2018, the Rangers have a relatively high PDO of 101.21, 7th highest in the NHL and fairly typical of the team during Alain Vigneault’s tenure. Their expected PDO, however, is 100.58, 4th highest in the NHL (expected PDO hovers near 100 much more often than actual PDO, so despite being closer to 100, it is actually further away from league average). This leaves the Rangers with an adjusted PDO (dPDO) of 0.63, good for 11th in the league. In other words, the Rangers have gotten slightly lucky so far throughout the 2017-2018 season, but are around the center of the pack in the NHL. For context, at the time of writing this, Tampa Bay led the league with a dPDO of 2.86.
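The Rangers example can be checked with a couple of lines of Python. The PDO and expected PDO values are the ones quoted above; the expected Fenwick percentages fed into `xpdo` are hypothetical, since the text does not break the Rangers' 100.58 into its two components. Function names are my own.

```python
def xpdo(xfsh_pct, xfsv_pct):
    """Expected PDO = expected Fenwick shooting % + expected Fenwick save %."""
    return xfsh_pct + xfsv_pct


def dpdo(actual_pdo, expected_pdo):
    """dPDO = PDO - xPDO."""
    return actual_pdo - expected_pdo


# Hypothetical components, only to show how an xPDO is assembled
example_xpdo = xpdo(6.0, 94.5)

# Rangers as of January 4, 2018, using the figures quoted above
rangers_dpdo = dpdo(101.21, 100.58)

print(f"Example xPDO: {example_xpdo}")
print(f"Rangers dPDO: {rangers_dpdo:.2f}")
```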
Catch-All Statistics
Catch-all statistics like baseball’s now popularized WAR are valuable metrics that help us quantify the overall value a player brings to the ice. Nobody will argue that these catch-all stats are perfect and that they should be the end all be all in player analysis (well, nobody worth listening to will argue that at least), but they most certainly are valuable analytical tools. Baseball is by far the easiest sport of the “big four” in the U.S. to have a catch-all statistic for, thanks to the nature of the game; it is more or less a series of individual events, making it much easier to assess exactly what was the cause and effect of a play. Free flowing sports such as basketball and hockey are much more difficult to assess in this manner, but that hasn’t stopped leading statisticians in each sport from creating their own models for a catch-all statistic.
The now-defunct WAR On Ice hockey stats site (both site creators were hired by NHL teams, so the site has since been shut down) made the first prominent foray into hockey catch all stats with their WAR statistic, which they published a series of posts about. More recent catch-all models include Dawson Sprigings’ (better known as DTM About Heart) Goals Above Replacement (GAR), Luke Solberg’s (Evolving Wild) Weighted Points Above Replacement (wPAR) and Manny Perry’s (Corsica creator) version of WAR. Analysts have also come up with catch-all metrics to evaluate entire teams—Manny Perry’s K Rating—and individual game performances: Dom Luszczyszyn’s Game Score.
None of these analysts would ever argue that their models are perfect, and in fact, it’s quite the opposite. In a Hockey Graphs podcast in early October, Dawson Sprigings flat out stated that he hopes people examine his GAR model and offer him suggestions with how they feel it can be improved. It is Dawson’s and these other analysts’ hope that these productive conversations, as well as improved access to data (such as player tracking or improved passing data for example), will lead to even better and more accurate models.
So please, whether you are a complete stats nerd or the most ardent eye-tester ever, don’t be that guy that just looks at a stat and shouts “this is stupid!” Let’s have a conversation about it. Why do you think it is stupid? What about it do you think can be improved? What would you do differently? These sorts of conversations can lead us to better data and analysis that we all can enjoy.
WAR
Stands for Wins Above Replacement (just like baseball) and is one popularized model for a catch-all statistic to evaluate hockey player performance. Manny Perry’s WAR model data can be accessed on Corsica. WAR is only a player-level statistic, and it aims to quantify the total value a player brings to the team by accounting for a myriad of individual components. Manny’s WAR model consists of eight total components for skaters: offensive shot rates, defensive shot rates, offensive shot quality, defensive shot quality, shooting, penalties taken, penalties drawn and zonal transitions.
Manny also has a WAR statistic for goalies that attempts to measure their ability to prevent goals. Since WAR measures how much better a player is than a “replacement player,” it should be noted that Manny defines a replacement player as “one who can be signed at the league minimum salary.”
Corsica logically provides data on all of the individual components that comprise its WAR statistic for each player, along with the total WAR (which is a cumulative stat, similar to baseball’s version), WAR per-82 games and WAR per-60 minutes for every player. Analyzing the individual components is a great way to understand what aspects of the game various players excel at. As far as the individual components, here were the leaders in each throughout the 2016-2017 season (minimum 500 minutes played to qualify):
- Offensive Shot Rates – Patrice Bergeron
- Defensive Shot Rates – TJ Brodie
- WAR Rate (shot rate differential) – Patrice Bergeron
- Offensive Shot Quality – Connor McDavid
- Defensive Shot Quality – Ryan Suter
- WAR Quality (shot quality differential) – Connor McDavid
- Shooting – Evgeni Malkin
- Penalties Taken – Oscar Klefbom
- Penalties Drawn – Connor McDavid
- WAR Penalties (penalty differential) – Connor McDavid
- Zonal Transitions – Tom Wilson
In terms of the overall WAR statistics, the top-3 skaters in each were (minimum 500 minutes played):
- WAR – Sidney Crosby, Connor McDavid and Nikita Kucherov
- WAR per-82 games – Sidney Crosby, Nikita Kucherov and Patrik Laine
- WAR per-60 minutes – Sidney Crosby, Nikita Kucherov and Patrik Laine
- WAR Goalie – Sergei Bobrovsky, Cam Talbot and John Gibson
GAR
Stands for Goals Above Replacement and was first created by Dawson Sprigings (DTM About Heart). Unfortunately for us fans (and fortunately for Dawson, who is brilliant and deserves all the success in the world), Dawson was hired in the fall by a sports data company, and his methodology and data are no longer available. However, GAR is still worth talking about, and Manny Perry incorporated his own version of GAR into Corsica.
What GAR does at its core, is it takes the wins above replacement stat (WAR), and converts it to goals above replacement, with the methodology being that scoring a goal is the ultimate goal of any play in hockey. Because Dawson’s data is not available, I won’t dive into it with the level I did WAR, but it is worth pointing out that it functions similarly to WAR, in that it analyzes individual components of a player’s game in order to generate the single catch-all stat.
WPAR
Stands for Weighted Points Above Replacement and was created by Luke Solberg. Luke (better known as EvolvingWild) released the methodology and data for his new aggregated value model in the summer of 2017, and it follows a similar construct to both predecessors, albeit with slightly different components and weights to make it unique. Like his predecessors, Luke was very open about the fact that this was simply a starting point for him, and he hopes to continuously improve it as we learn more about the game and get access to better data.
The biggest difference between wPAR and its predecessors is its linear regression model, which, to be completely honest, is above even my head. (I was a financial economics major in college, so I have a better understanding of statistics than most, but I’m admittedly not at the point where I can explain exactly how this changes the final stat compared to the regression models used previously.)
If you’d like to know more about the specifics of how the model is calculated, I encourage you to read the methodology, which I linked above. Another difference that is worth noting here is that WAR focuses on wins added by player, GAR focuses on goals added, and wPAR focuses on point totals. Lastly, wPAR scales the data to consider the season in order to account for the fact that the value of the “replacement” player may differ from one season to the next.
The individual components that wPAR consists of and weights individually include goals, primary assists, secondary assists, tango shots (shots that miss the net; in other words, an individual player’s Corsi minus their shots on goal), individual expected goals, relative Corsi differential (RelCF%), penalties taken, penalties drawn and faceoff differential. The model also includes interaction variables, which Luke notes are very useful for accounting for the effect that independent variables may have on one another. Similar to WAR, wPAR is a cumulative stat, and it also comes in the form of wPAR per-60 to provide us with a method of comparing players with varying amounts of ice time.
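To make the cumulative-versus-rate distinction concrete, here is a minimal Python sketch of how a cumulative stat is usually scaled to a per-60 or per-82 rate. The function names and the two sample players are hypothetical and purely illustrative; this is the standard rate arithmetic, not code or data from Luke's or Manny's models.

```python
def per_60(total: float, toi_minutes: float) -> float:
    """Scale a cumulative stat to a per-60-minutes rate."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return total / toi_minutes * 60.0


def per_82(total: float, games_played: int) -> float:
    """Scale a cumulative stat to a full 82-game season pace."""
    if games_played <= 0:
        raise ValueError("games played must be positive")
    return total / games_played * 82.0


# Hypothetical players (made-up numbers, not real wPAR or WAR values).
# Player A has the larger cumulative total, but Player B produces more
# value per minute once ice time is taken into account.
player_a = per_60(total=3.0, toi_minutes=1500.0)
player_b = per_60(total=2.0, toi_minutes=800.0)
print(f"Player A per-60: {player_a:.2f}")
print(f"Player B per-60: {player_b:.2f}")
```

The takeaway is that a cumulative total rewards players who simply play more, while the rate version puts a sheltered third-liner and a top-line workhorse on the same footing.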
Analyzing the individual components is a great way to understand what aspects of the game various players excel at. In terms of the player data that Luke shared, he grouped the components into five categories: Counts (includes goals, primary and secondary assists, tango shots, iXG and the interaction variables), Differential (RelCF%), Penalties Taken, Penalties Drawn and Faceoff Differential. As far as the individual components, here were the leaders in each throughout the 2016-2017 season (minimum 500 minutes played to qualify):
- Counts – Connor McDavid
- Differential – Patrice Bergeron
- Penalties Taken – Oscar Klefbom
- Penalties Drawn – Connor McDavid
- Faceoff Differential – Patrice Bergeron
In terms of the overall wPAR statistics, the top-3 skaters in each were (minimum 500 minutes played):
- wPAR – Connor McDavid, Sidney Crosby, Nikita Kucherov
- wPAR per-60 – Connor McDavid, Sidney Crosby, Nikita Kucherov
K RATING
This is “a composite tailored regression model” created by Manny Perry of Corsica. In layman’s terms, it is a comprehensive model that, similar to the catch-all statistics provided above, accounts for a variety of individual components that are each “optimally accounted for.” However, unlike WAR, GAR and wPAR, K Rating is a team-level statistic that attempts to gauge the overall quality of a team with one metric. Think of K Rating as the team-level version of WAR. In Manny’s methodology for K Rating, he demonstrates the predictive power of the metric, and he flat out states in the conclusion, “given its predictive power, I strongly believe K will be the single best publicly available team metric in hockey.”
You can access the K Ratings on Corsica from a dedicated tab on the Team Stats page. The K Ratings tab presents a 14-column table covering the individual components of the model and, of course, the overall K Rating. Similar to the player-level catch-all statistics, there can be a lot of value derived from analyzing the individual components of K Rating. Here are the teams that led each individual component as well as the overall K Rating throughout the 2016-2017 season at 5v5 play:
- Shot Rates For (RF) – Toronto Maple Leafs
- Shot Rates Against (RA) – Los Angeles Kings
- Shot Rate Differential (Rates) – Boston Bruins
- Shot Quality For (QF) – Pittsburgh Penguins
- Shot Quality Against (QA) – Minnesota Wild
- Shot Quality Differential (Qual) – Minnesota Wild
- Shooting – Washington Capitals
- Goalie – Columbus Blue Jackets
- Penalties Taken (PT) – Carolina Hurricanes
- Penalties Drawn (PD) – Philadelphia Flyers
- Penalty Differential (Pens) – Carolina Hurricanes
- Offensive K Rating (OK) – Pittsburgh Penguins
- Defensive K Rating (DK) – Washington Capitals
- K Rating (K) – Washington Capitals
GAME SCORE
Game Score is a catch-all statistic created by Dom Luszczyszyn that quantifies the total value of a player’s productivity from a single game. Dom notes in his methodology that he got the idea for Game Score from noted basketball analyst John Hollinger (also the creator of PER), and that the original Game Score statistic is in fact from famous baseball statistician Bill James. Dom’s NHL version of Game Score incorporates all of the following stats in an attempt to quantify the overall performance of a player: goals, primary assists, secondary assists, shots on goal, blocked shots, penalty differential, faceoffs, 5v5 Corsi differential and 5v5 goal differential. However, as you all know, not all stats carry the same importance, so Dom assigned weights to each of the aforementioned stats to come up with the following formula for Game Score:
Skater Game Score = (0.75 * G) + (0.7 * A1) + (0.55 * A2) + (0.075 * SOG) + (0.05 * BLK) + (0.15 * PD) - (0.15 * PT) + (0.01 * FOW) - (0.01 * FOL) + (0.05 * CF) - (0.05 * CA) + (0.15 * GF) - (0.15 * GA)
Dom also created a Game Score model to assess goaltender performance, which includes two stats, goals against and saves, each weighted according to importance. The goalie model is as follows:
Goalie Game Score = (-0.75 * GA) + (0.1 * SV)
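For anyone who wants to compute Game Score themselves, the two formulas above can be sketched in a few lines of Python. The weights are copied directly from the formulas; the function names, the dictionary layout and the sample stat lines are my own illustrative assumptions, not Dom's actual code or real game data.

```python
# Weights taken from the skater formula above. Stats that count
# against a player (PT, FOL, CA, GA) carry negative weights.
SKATER_WEIGHTS = {
    "G": 0.75, "A1": 0.7, "A2": 0.55, "SOG": 0.075, "BLK": 0.05,
    "PD": 0.15, "PT": -0.15, "FOW": 0.01, "FOL": -0.01,
    "CF": 0.05, "CA": -0.05, "GF": 0.15, "GA": -0.15,
}


def skater_game_score(stats: dict) -> float:
    """Weighted sum of a skater's single-game stats.

    Any stat missing from `stats` is treated as zero.
    """
    return sum(weight * stats.get(name, 0)
               for name, weight in SKATER_WEIGHTS.items())


def goalie_game_score(goals_against: int, saves: int) -> float:
    """Goalie Game Score from the formula above."""
    return -0.75 * goals_against + 0.1 * saves


# Hypothetical stat lines (made up for illustration).
skater = {"G": 1, "A1": 1, "SOG": 4, "CF": 12, "CA": 10, "GF": 2, "GA": 1}
print(f"Skater Game Score: {skater_game_score(skater):.2f}")
print(f"Goalie Game Score: {goalie_game_score(goals_against=2, saves=30):.2f}")
```

Keeping the weights in a single dictionary makes it easy to see which stats drive the score: a goal is worth ten shots on goal, and on-ice goals for and against matter three times as much as on-ice shot attempts.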
Game Score has multiple applications, and Dom frequently uses it in his writing when assessing player and team performance across single games as well as entire seasons (and everything in between). Dom notes in his methodology that, “there’s many applications for Game Score across hockey analysis that I think can further our understanding of the sport and how players work at the game level. Consistency, streakiness, clutchiness; whether they’re real or random is a question a stat like Game Score can help answer and one that we perhaps couldn’t answer properly beforehand.”
Dom also recently created another stat based off of Game Score, which he calls Game Score Value Added (GSVA). GSVA is a three-year version of Game Score that is translated to its value in wins. In other words, GSVA is similar to WAR, GAR and wPAR, in that it communicates player value in terms of wins, as opposed to points or any other production metric. One particularly useful application of GSVA is Dom’s use of the model in the pre-season to project individual player performance and then aggregate those projections by team to project team performance.
On January 10, 2018, Dom published an article for The Athletic that discussed how each team and player had performed to that point in the season compared to his pre-season expectations. The following infographic shows how the Rangers had performed to about the midway point of the season, and it includes the team-level data at the top, and a table beneath displaying the 2017-2018 average Game Score for each player, along with their pre-season GSVA projection, their current GSVA and the differential between the two.
As you can see, Mika Zibanejad has the highest average Game Score, Pavel Buchnevich has exceeded his pre-season projection by the greatest amount, and Kevin Shattenkirk has been the greatest disappointment. It should be noted that despite being a disappointment, Shattenkirk still ranks as the second-best defenseman on the team, but his GSVA is a whopping 0.66 below his expected GSVA. I have included an image of the color key as well, which explains that the darkest blue indicates top-tier production, while red indicates replacement-level numbers.
Resources
There are a number of fantastic resources available for free that can help hockey fans understand many aspects of the game, ranging from comprehensive analytic database sites to player and team contract and salary cap websites. Below is a list of some of the sites I personally find to be very helpful. If there are any sites you like that are missing from the list, please let me know and I will make sure they get added.
All-3-Zone Player Comparison Tool – A new player comparison tool created by CJ Turtoro using Corey Sznajder’s excellent passing and zone entry/exit data to help fans easily compare players’ abilities at generating shots themselves, setting up teammates, entering the offensive zone and exiting their own defensive zone.
Blue Seats Blogs – Rangers blog run by David Shapiro that has a number of great writers, including Josh Khalfin, The Suit and many others. Most notably, the blog has a series of fantastic posts that help break down various hockey systems, including those employed by the Rangers.
Blueshirt Banter – SB Nation’s New York Rangers website, run by Joe Fortunato. The site has a number of excellent writers and contributors, including Adam Herman, Shayna Goldman, HockeyStatMiner and many others.
Canucks Army – Excellent Vancouver-focused hockey website and blog with a heavy focus on advanced stats and data analysis. The blog includes a variety of fantastic resources, including detailed prospect rankings, and is home to a number of excellent writers such as Ryan Biech, J.D. Burke, Jeremy Davis, Vanessa Jang and Jackson McDonald.
Cap Friendly – Player and team contract and salary cap data, including both AHL and NHL players.
Clear Sight Analytics – Website for Steve Valiquette’s excellent hockey data and tracking company. Includes team and player-level shot quality, scoring chance and finishing ability data and much more.
Corsica – Hockey stats website run by Manny Perry that includes player (skater and goalie) and team-level metrics as well as a live games feature that provides near real-time advanced stats on games. Player data includes WAR and expected goals models while team data includes K Ratings. Also includes line and pairing stats, WOWY analysis (not currently live yet), a blog and game predictions.
CrowdScout Hockey – Analytics and crowdsourcing hockey data website created by Cole Anderson that also includes fantastic goalie and skater comparisons tools as well as an expected goals model and a wealth of skater and goalie data.
Dispelling Voodoo – Goalie analysis-focused website created by Ian Fleming that houses the SAVE goalie comparison and evaluation tool that incorporates a variety of advanced goaltender metrics, including save percentage breakdowns by shot quality, expected goals against data and goals saved above average information.
Elite Hockey Prospects – Hockey prospects website and database that includes a ton of valuable information pertaining to hockey prospects from around the world as well as drafts, awards, free agents and country program information.
EvolvingWild’s Website – Presents the excellent RAPM and expected goals data and charts (and more to come) generated from EvolvingWild’s proprietary statistical models.
Hockey Abstract – The Hockey Abstract is Rob Vollman’s website as well as the name of his fantastic series of hockey analytics books, which he updates on a yearly basis. The site is also home to his recently released Player Usage Charts tool, which we discussed in detail within the Deployment section of the Lexicon.
Hockey Reference – Similar to its basketball and baseball counterparts, Hockey Reference is a solid site for hockey data. It does not contain nearly the level of advanced metrics that Corsica or Natural Stat Trick do, but it offers some of the more basic fancy stats and all of the standard statistics that casual fans will look for.
Hockey-Graphs – Analytics-focused hockey website that features statistical analysis posts from a number of leading hockey statisticians, including Garret Hohl, Carolyn Wilke, Matt Cane and many others.
HockeyViz – Offers fantastic team and player visualizations that provide digestible information on a number of important concepts, including game flow, team and player shot heat maps, WOWY analysis, deployment information and much more. Run by Micah Blake McCurdy.
MetaHockey – A hockey analytics repository created by Prashanth Iyer and Mike Gallimore that houses a vast array of hockey analytics-focused publications and resources along with information on related events.
Natural Stat Trick – Hockey stats website, includes player (skater and goalie) and team-level metrics including data on individual games. Player and team data includes valuable shot/scoring chance quality information. Includes WOWY analysis, data on how players did against specific opposing skaters and line combination data.
NHL Drafter – NHL draft and prospect analysis website that also puts out a variety of additional noteworthy content, including a Backup Goalie Report. Also features a great Twitter account that provides a lot of the great content to followers.
Own the Puck HERO Charts – Intuitive player evaluation tool created by Domenic Galamini Jr. that allows users to quickly analyze a player’s production and shot attempt impact on his team. The left-side of the chart presents scoring tier line probabilities based on a player’s per-60 primary point production while the right-side demonstrates the player’s offensive and defensive usage-adjusted shot attempt impacts, relative to the positional average.
PuckPedia – This resource-rich site provides (but is not limited to) player contract and team salary cap information, agent profiles and client information, CBA and salary cap details and history, a litany of NHL news and information (includes trades, transactions, injuries, scores and schedule), prior and future draft information, team coaches, recommended follows/resources at the team and league-level, and a variety of help resources including FAQs and a fantastic Ask the Capologist feature that allows users to submit questions directly to the site’s capologist, who in turn provides a quick and thorough answer for everyone to see.
SKATR Comparison Tool – Intuitive player comparison and evaluation tool created by Bill Comeau that functions similarly to HERO Charts. Users simply must select the players and seasons they want to compare from the dropdowns at the top in order to view a side-by-side comparison of the two skaters across nine individual statistics and 11 on-ice metrics.
Spotrac – Player salary and team salary cap data, including both AHL and NHL players.
Tape to Tape Tracker – Brand new tracking system designed to help standardize the way that passing and zone entry/exit data is tracked. The tool was launched by Prashanth Iyer, Rushil Ram and Mike Gallimore, and it is a free, public point-and-click tracking system that uses crowd sourcing to add passing data to events already tracked by the NHL (such as shot attempts and goals).
The Athletic – Subscription-based sports coverage website home to multiple leading NHL writers (many of whom are also leading analytics writers), such as James Mirtle, Craig Custance, Pierre LeBrun, Tyler Dellow, Sean Tierney, Dom Luszczyszyn and many more.
Twitter Follows
There are a number of great people on Twitter who have been a tremendous help to us and many others with respect to learning more about hockey analytics and the game in general. Below is a list of individuals who have been a great resource for us and whom we would recommend to anyone looking to learn more. As always, please let us know if you think of someone else worthy of being added to the list. The first group consists of follows specifically for Rangers fans (although we’d highly recommend that any hockey fan give them a follow, because they are all worth listening to), while the larger second group consists of more general NHL follows.
NEW YORK RANGERS-FOCUSED FOLLOWS
Brandon Fitzpatrick – Super cool guy and smart writer for Gotham Sports Network. He also creates fantastic GIFs of key plays and moments of Ranger games that he posts to his Twitter timeline.
Joe Fortunato – Managing editor of Blueshirt Banter and host of the Bantering the Blueshirts podcast. Big analytics writer and proponent who is very active on Twitter and willing to interact with hockey fans.
Shayna Goldman – Very smart writer for The Athletic NYC, Hockey-Graphs, Blueshirt Banter and FanRag Sports NHL. She also creates fantastic GIFs of key plays and moments of Ranger games that she posts to her Twitter timeline.
Adam Herman – Analytics and prospects expert and writer for Blueshirt Banter and Sporting News NHL.
HockeyStatMiner – Contributor to Blueshirt Banter, CBA and NHL salary cap expert, and an excellent follow on Twitter who frequently engages with fans to help them learn.
Josh Khalfin – Prospects expert and a well-rounded analytical mind. Maintains an excellent Tableau Public profile with all sorts of prospect and analytical data visualizations.
Alex Nunn – European prospect expert and contributor to Blueshirt Banter and Pro Hockey News.
Dave Shapiro – Owner of the Rangers blog Blue Seats Blogs and very smart and analytically-inclined Rangers writer.
Steve Valiquette – CEO of Clear Sight Analytics, analyst for the New York Rangers on MSG and one of the best studio analysts in all of sports.
NHL FOLLOWS
Cole Anderson – Creator of CrowdScout Hockey, former collegiate goalie, goaltender and analytics expert and one of the best follows on Twitter for anyone who wants to learn from and interact with an analytics expert.
Matt Barlowe – Data analysis expert and proprietor of Barlowe Analytics, which is a fantastic website for individuals looking to learn about data analysis and related tools, such as Python, R and Tableau. The site also features Matt’s expected goals model, whose data is also available on the Barlowe Analytics Twitter account. Also an excellent follow on Twitter for anyone looking to learn more about hockey or data analysis in general.
Ryan Biech – NHL prospects expert and contributor to Sportsnet 650, The Athletic Vancouver and Canucks Army. Prospects analysis includes an excellent mix of advanced stats analysis and video breakdowns.
Justin Bourne – Senior NHL Columnist for The Athletic, Toronto Maple Leafs analyst for Sportsnet, former Toronto Marlies video coach and former collegiate and minor league hockey player.
Mitch Brown – NHL prospects analyst and writer for The Athletic Montreal. Creator of an excellent CHL Data Tracking Project which tracks passing and zone entry/exit data for prominent junior hockey teams.
Stephen Burtch – Hockey analytics writer for Sportsnet and one of the more active analytics experts on Twitter.
Matt Cane – Analytics expert, editor for Hockey-Graphs and creator of the analytics website Puck++ which houses much of his great work.
Bill Comeau – Creator of a variety of excellent hockey visualization tools, including the SKATR player comparison tool and the STACKS goal share visualizer.
Colin Cudmore – NHL prospect analyst and writer for Silver Seven, the Ottawa Senators’ SB Nation affiliate. Maintains an excellent Tableau Public profile that houses a variety of excellent hockey visualizations, including charts for the game prediction models posted on Corsica, and for junior league and minor league player performance.
Craig Custance – Editor-in-Chief of The Athletic Detroit, host of the hockey podcast The Full 60 and author of the excellent hockey book Behind the Bench: Inside the Minds of Hockey’s Greatest Coaches.
Tyler Dellow – NHL columnist at The Athletic and former analytics employee of the Edmonton Oilers who has been at the forefront of the development of hockey analytics for over 10 years.
Dimitri Filipovic – Extremely smart hockey writer for Sportsnet and host of the excellent Hockey PDOcast podcast.
Ian Fleming – Creator of the Dispelling Voodoo goalie analysis website, which contains the SAVE goalie evaluation and comparison tool. Contributor to Hockey-Graphs and NHL Numbers.
Domenic Galamini Jr – Hockey analytics expert and creator of the HERO Charts player comparison and evaluation tool.
Garret Hohl – CTO and co-founder of HockeyData Inc., consultant for hockey teams, players and agencies at varying levels and co-founder of Hockey-Graphs.
Prashanth Iyer – Hockey statistics and systems expert, contributor to Hockey-Graphs and The Athletic Detroit and co-creator of MetaHockey and the Tape to Tape Tracker tool.
Pierre LeBrun – NHL insider, writer and analyst for TSN and the Senior NHL Columnist for The Athletic’s Toronto branch.
Dom Luszczyszyn – Analytics-focused hockey writer for The Athletic and contributor at Hockey-Graphs. Most notably, Dom created the Game Score statistical model.
Jeff Marek – Sportsnet NHL and CHL TV host, hockey prospects expert and writer for Sportsnet, host of the former Marek vs. Wyshynski podcast and current co-host of 31 Thoughts: The Podcast with Elliotte Friedman.
Micah Blake McCurdy – One of the leading hockey statisticians and proprietor of HockeyViz. Very active on Twitter and frequently posts his great data visualizations to his page.
Bob McKenzie – One of the leading NHL insiders, TSN writer and TV analyst/insider and host of his own podcast, The Bobcast.
Nick Mercadante – Goalie and analytics expert, contributor to Hockey-Graphs and frequent guest on Dimitri Filipovic’s Hockey PDOcast podcast.
James Mirtle – Editor-in-Chief of The Athletic NHL and The Athletic Toronto and analyst for TSN.
Namita Nandakumar – Friend of the Blueshirts Breakaway podcast, hockey analytics expert, editor at Hockey-Graphs and a diehard Philly fan.
Emmanuel (Manny) Perry – Proprietor of the comprehensive hockey analytics database site Corsica. Also one of the most noteworthy hockey statisticians in the industry, having created his own WAR and expected goals models, as well as multiple unique models such as K Rating and a game prediction model.
Mike Pfeil – Hockey analytics and systems expert, contributor to Hockey-Graphs and very active on Twitter and willing to help fans.
Corey Pronman – NHL prospects writer and expert for The Athletic.
Catherine Silverman – Goalie expert and writer for InGoal Magazine and The Athletic Chicago. Also one of the single best people to follow on Twitter if you wish to learn more about goaltenders, and she frequently engages with fans to help them learn.
Luke Solberg (EvolvingWild) – Noted hockey statistician, creator of the wPAR catch-all stat and contributor to Hockey-Graphs.
Dawson Sprigings (DTM About Heart) – Recently hired by a professional sports data company so he is no longer very active in the hockey analytics community, but he is one of the most notable hockey statisticians and creator of the catch-all stat GAR.
Ryan Stimson – Editor for Hockey-Graphs and the creator and manager of the Passing Project which aims to gather data and provide insight on how players contribute to shot attempts as well as new methods for evaluating goalies and how teams score goals.
Corey Sznajder – NHL micro-stat extraordinaire who specializes in passing data. Contributor to Arizona Coyotes blog Five for Howling and proprietor of The Energy Line.
Sean Tierney – Friend of the Blueshirts Breakaway podcast, analytics writer for Hockey-Graphs and The Athletic. Also one of the single best follows on Twitter, and he frequently creates excellent data visualizations that are easy to understand and shares them on his Tableau Public profile and Twitter.
CJ Turtoro – Analytics writer for SB Nation’s New Jersey Devils blog All About the Jersey. Most notably, he is the creator of the All-3-Zone Player Comparison Tool.
Connor Tompkins – Analytics and data visualizations expert and contributor to Hockey-Graphs.
Rob Vollman – Noted hockey statistician, hockey writer for ESPN and NHL, and author of the excellent hockey analytics books Hockey Abstract and Stat Shot.
Carolyn Wilke – Self-described “hockey analytics hedgewitch.” Director of Social Media and writer for Hockey-Graphs who is best known for her salary and player valuations. Also the co-host of Dallas Stars-focused podcast Deep in the Heart of Hockey.
Kent Wilson – Analytics expert and Flames-focused writer for The Athletic Calgary, former writer at a number of prominent publications such as the Calgary Herald who has been at the forefront of hockey analytics for over 10 years.