A Journey in Hockey Analytics and the Shortcomings of the Discipline

By Brendan Collins

The use of “analytics” in hockey circles has expanded dramatically in the past decade or so. Until very recently, most NHL teams tracked little beyond the basics (goals, assists, faceoff percentage and plus/minus) and didn’t even have a statistician on staff, let alone an analytics department. Michael Lewis’ book Moneyball, which later became a movie starring Brad Pitt, changed the way professional sports owners looked at statistics and analytics. Part of Lewis’ skill as a writer is that he can break down complex topics, such as the 2008 financial crisis in the US (The Big Short) or the inner workings of US government agencies (The Fifth Risk), and make them easy and entertaining to read. This is both good and bad: it helps us understand complicated topics, but it gives the false impression that they are less complicated than they actually are.

My journey in analytics started at Connecticut College, where I studied statistics and economics and was fascinated by Bill James, who is at the heart of the story in Moneyball. While Billy Beane is the focus of the story because he was the first major league GM to implement the analytics, it was the work of Bill James and his creation of sabermetrics on which the book is actually based. It’s important to note that James’ entire focus was on using available data to determine how teams win and lose games. He looked for patterns, analyzed all the variables he could find, ran regressions and found certain data points that had strong correlations to the outcome of the game, and from that built an entire discipline that is still used in baseball today. I, with the hubris only an ignorant 20-year-old could have, read his book, felt I was now an expert and was going to do the same thing for the sport of hockey. I’d pair my undergraduate math prowess with my knowledge of the game and be the next Bill James.

In my second year at college I took a course called Linear Regressions, and our first project was to analyze a large set of data and look for correlations that might point to a cause-and-effect relationship. If I study the perspiration of 100 people in different scenarios and look at the relationship between heat and sweat, I could make a fairly accurate assertion that the higher the heat, the more perspiration. The next step is to bring in other variables, such as body movement, nerves, etc., and ask how perspiration relates to heat and exercise together, as well as to multiple other factors. Someone who is hydrated and running up a hill in 70-degree Fahrenheit weather is going to perspire at a different level than someone who is still and dehydrated in 80-degree Fahrenheit weather, so heat alone doesn’t explain the cause of the perspiration.

I, of course, focused my project on hockey. I used NCAA Division 1 Men’s hockey as the sample and started with two variables: power play scoring percentage and winning. I found right away that in that particular year 15 of the top 20 teams in the league were also in the top 20 in power play scoring percentage! Voilà! I ran some regressions and was able to show a significant correlation between power play percentage and winning games. I thought I had made a breakthrough and submitted my paper to the professor with a sincere feeling of accomplishment, only to be peppered with more and more questions that pushed me to get more and more granular with the data. He wanted to know how that season compared to the previous five seasons’ data; what the penalty kill percentages were for the teams they had played; what the goalies’ save percentages were and how many goals were scored on starting goalies as opposed to backups; the percentage of home games versus away games; etc. He was essentially challenging me to add relevant variables I had not accounted for in order to get a clearer picture of the relationship between power play scoring percentage and winning games.

I rushed back, input all the new data, got a slightly weaker correlation and was excited to hand it back in. Then he wanted to know the impact on 5v5 goals, the impact of strength of schedule, the size of the players on the team, the age of the players, the conference they played in, etc. What was once a single-variable study turned into 25 variables, and while I had a lot more data, it was showing less and less correlation between winning and power play percentage. What this meant was that I couldn’t simply build an algorithm where I plugged in a team’s power play scoring percentage and a few other details and it spit out a fairly close prediction of their win percentage on the season. I use this example because this is the journey of analytics: you have a question and you try to understand all the elements as best you can to find patterns in the dataset. To do this accurately you need to make sure the math is right, that the samples are both large enough and randomized to avoid bias, and that you are factoring in the correct variables and calculating their relationship to the primary question (how are games won) and to the other variables.
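The shrinking correlation described above is what tends to happen when an unaccounted-for variable drives both of the things being compared. Here is a minimal sketch of that effect using made-up, simulated data (not real NCAA numbers). The hidden "team strength" variable, the coefficients and the helper name ols are all assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams = 600  # synthetic team-seasons, not real NCAA data

# Hypothetical hidden driver: overall team strength lifts both the
# power play percentage and the win percentage.
strength = rng.normal(0.0, 1.0, n_teams)
pp_pct = 18.0 + 3.0 * strength + rng.normal(0.0, 2.0, n_teams)
win_pct = (
    50.0
    + 8.0 * strength
    + 0.3 * (pp_pct - 18.0)  # small direct effect of the power play
    + rng.normal(0.0, 5.0, n_teams)
)


def ols(y, *columns):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


pp_only = ols(win_pct, pp_pct)                  # single-variable study
with_strength = ols(win_pct, pp_pct, strength)  # add the missing variable

print(f"PP% slope, single-variable model:    {pp_only[1]:.2f}")
print(f"PP% slope, controlling for strength: {with_strength[1]:.2f}")
```

In this toy setup the power play slope looks large when it is the only variable and shrinks toward its small direct effect once team strength is included, which is the same pattern the professor's questions uncovered.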

The project made me realize how little I really knew about analytics in hockey and how difficult it is. Understanding its importance in the game, one of our first tasks at Neutral Zone was to build out an entire analytics/education focus to track how minor hockey players performed at the junior level, how junior players performed at the NCAA or CHL level, how those players performed in the NHL, etc. We broke down particular skill sets like skating or agility, speed, stickhandling, hockey IQ, etc., gave them different weights, tweaked them and developed a model, all in an effort to have the most accurate rankings using both subjective and objective data. What we found was that our subjective data, scouts in the stands, had a much higher rate of accuracy than our objective data. Part of that was because we were asking the wrong questions, which is typically the first mistake made in analytics; the other part was that there wasn’t enough quality data for amateur hockey at the time.

The study of analytics is important. It’s a great tool for analyzing team and player performance, and every year improved technology generates more and more data. In our years of learning and developing models internally, talking with NHL scouting departments, talking with NHL analytics departments and listening to sports analytics conferences, I have come to find that while great advances are being made and the available data is consistently being put to use, there are two major issues in the current hockey analytics landscape. For these reasons we at Neutral Zone take a very careful approach to analytics in our analysis of player performance and projectability.

#1 Using Past Performance to Predict Future Outcomes – this has a low yield across almost all disciplines, but particularly in hockey. Analyzing how a player on the Calgary Flames (NHL) would do on the Colorado Avalanche (NHL) will have a higher level of accuracy than analyzing a player from the OHL, Minnesota High School or the Swedish Elite League and trying to determine from their analytics how that is going to translate to NHL success. In fact, trying to evaluate across leagues and predict future success at a higher level is so inaccurate that it’s a potentially detrimental exercise. Why? Because the leagues are vastly different in style, ability, pace of play, and the age, size and strength of opponents, etc. Sometimes the attitude is that more data is better, and that’s not always the case. I see more and more analytics on NHL Draft prospects from various leagues produced by people who don’t know the differences and either treat the leagues as the same or put in their own subjective weighting system to account for different league values. I am even seeing companies that already know who the best, say, 50 or 100 players are and reverse engineer the analytics to fit a narrative and explain why these players are the best. Essentially, these are people who don’t understand the importance of bias, sample sizing, randomized methodology, etc., and who are just trying to use objective data to support their subjective narrative.

I spoke with one team this summer who had analytics on how each amateur league had produced NHL players, with the success rate of High School players versus CHL players versus Europeans, etc. I asked how they factor in the round a player was selected in, because it obviously wouldn’t make sense to compare a 7th round High School player to a 1st round OHL player. They said they take the historical percentage of success at the given draft slot; i.e., the 55th draft selection has a 48% chance of playing at least 100 games in the NHL (just an example, not an actual stat). They use that in connection with the league the player is coming out of to see if there are patterns where particular leagues generate a higher percentage of NHL players than their historical averages. On top of that they layer in the player’s points, +/- and other data points. They do this to provide historical context when they are watching a player and trying to figure out where he fits in. How Minnesota HS players who scored 40 goals in their draft year have done at higher levels is the kind of question they are looking to answer.
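As a rough sketch of the comparison that team described, each league's actual number of players reaching 100 NHL games can be set against what the draft slots alone would have predicted. Every number below is a placeholder invented for illustration (like the 48% above, none is an actual stat), and the function name and record layout are assumptions, not the team's real model.

```python
from collections import defaultdict

# Placeholder records only: (league, historical success rate at the pick's
# draft slot, whether the player went on to reach 100 NHL games).
drafted = [
    ("OHL", 0.48, True),
    ("OHL", 0.30, False),
    ("OHL", 0.12, False),
    ("MN-HS", 0.30, True),
    ("MN-HS", 0.12, True),
    ("MN-HS", 0.05, False),
]


def league_vs_slot_baseline(picks):
    """Return {league: (expected_hits, actual_hits)} summed over its picks."""
    totals = defaultdict(lambda: [0.0, 0])
    for league, slot_rate, reached_100 in picks:
        totals[league][0] += slot_rate         # what the slot alone predicts
        totals[league][1] += int(reached_100)  # what actually happened
    return {league: (exp, act) for league, (exp, act) in totals.items()}


for league, (expected, actual) in league_vs_slot_baseline(drafted).items():
    print(f"{league}: expected {expected:.2f}, actual {actual}, "
          f"difference {actual - expected:+.2f}")
```

A positive difference would suggest a league has historically outperformed its draft slots. That is context for a scout, not a prediction about any one player.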

The key word here is “historical”: it provides context for the team, but it cannot and does not attempt to predict how this particular player this year in Minnesota HS will project as a professional. It’s helpful for a team to know that of the last 25 Minnesota HS players drafted into the NHL who were 5’11” and 185 lbs in their draft year and scored 30+ goals that season, only 12% made it to 100 games in the NHL. Some teams will weigh that more than others, but it’s interesting context not only for the local scout but for the scouting director who is going to be responsible for where that player falls on their draft board. However, when you give too much weight to PAST performance as an indicator or predictor of future success, it yields very low success rates. While every NHL team feels they have the “secret sauce,” a close analysis of NHL Draft success would clearly show that nobody has found the secret. At the end of the day analytics does a great job providing data points for past performance, but taking those data points and trying to predict future success is a dangerous game that has yielded a very low rate of success. Why is that? One major reason takes us to #2.

#2 Focusing too much on the individual player output and not the relationship to the other players on the ice – this is the biggest hurdle with analytics as they stand today. There are some outstanding resources that can break down game film into zone entries, first touches, puck battles, faceoff percentages, etc. You can pull up a player on InStat, for example, and get all of his individual shifts, goals, +/- plays, puck battles, etc., and data is generated from that. The issue is that it doesn’t explain the relationship to the other players on the ice, which is essential to a true understanding of what the numbers actually represent.

There are many great examples of the impact of playing on a line with a top player, but we’ll use Rob Brown for this particular case. He played on a line with Mario Lemieux in the 1988-89 and 1989-90 seasons and accumulated 195 points and a +17 rating in 148 games (1.32 PPG). He was traded to Hartford and over the next two seasons had 105 points and a -22 rating in 136 games (0.77 PPG). To put that in context, in the NHL today the difference between scoring 1.32 points per game and 0.77 points per game is the difference between a top 10 scorer and the 124th. He’s just one example of how who you play on a line with impacts your statistics in all areas. Players on good teams will have better analytics than players on bad teams, and vice versa. Players who are playing with talented linemates, where there is either a star player or great chemistry, are going to have better analytics than someone who is playing with weaker players and little chemistry.
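For anyone who wants to check the arithmetic, here is a small sketch that reproduces the points-per-game figures from the totals quoted above. The function and variable names are just illustrative.

```python
def points_per_game(points, games):
    """Points divided by games played."""
    return points / games


with_lemieux = points_per_game(195, 148)  # two seasons on Lemieux's line
after_trade = points_per_game(105, 136)   # the following two seasons
drop = 1 - after_trade / with_lemieux     # share of production lost

print(f"{with_lemieux:.2f} PPG -> {after_trade:.2f} PPG ({drop:.0%} drop)")
```

On these totals the drop works out to roughly 41 percent of his per-game production.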

One metric that has been scrutinized in years past for this very reason is +/-, as it is seen as more of a team stat than an individual player stat. So when I brought this up to an NHL analytics team this year, I asked whether they look at the raw number or at the player’s plus/minus in relation to his teammates’ plus/minus. For example, if the team is weak and the average is -17 and he’s -5, that could be seen as better than a player on a good team whose team average is +16 and who is only +7. The team’s response was that they go a step further and have a metric that calculates each player’s distance from the puck in the 15 seconds before a goal is scored, so they can assess whether the minus had anything to do with him, and the same for the plus. One staffer showed me an example of a player who made a great effort play off the cycle, cut to the net and made a centering pass for a catch-and-shoot goal. The credit for the plus went just to the player who dropped it down on the cycle, the player who made the pass and the player who scored, so three of the five players on the offense. The minuses went to the center and defender who got walked on the cycle and the weak side defender who saw the pass go across the slot. That nuanced approach to calculating plus/minus is exactly the kind of step that needs to be taken to generate more accurate data, but this case was an alarming example of the shortcomings in the current process. When you rewind the clip you can clearly see that the goal scorer was the weak side defender, who collapsed from the blue line and came down into the slot all alone because the opposing team’s winger tasked with covering him at the point had his back to the play and didn’t pick up his man. That winger wasn’t given a minus because he was never within 10 feet of the puck, but any coach would tell you he’s the primary cause of the goal against. This is another example of the data side of hockey not fully grasping the hockey concepts.
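The team-relative comparison I asked about can be sketched in a couple of lines, using the example numbers above. This is the simplest possible version (the player's rating minus the team average) and the function name is an assumption. It is not the proximity-based metric the team described.

```python
def relative_plus_minus(player_rating, team_average):
    """How far the player's +/- sits above or below his team's average."""
    return player_rating - team_average


weak_team_player = relative_plus_minus(-5, -17)   # 12 better than his team
strong_team_player = relative_plus_minus(7, 16)   # 9 worse than his team

print(f"Weak-team player:   {weak_team_player:+d} relative to team")
print(f"Strong-team player: {strong_team_player:+d} relative to team")
```

Read this way, the -5 player looks better than the +7 player, the opposite of what the raw numbers suggest.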

You can get as granular as you want, and most events in hockey are tied to the actions of other players on the ice. Even if you break down faceoff percentage or 50/50 battle win rates, which are 1v1 scenarios, there are still intricacies that will impact that data as a whole. For example, who is the player facing off against? Are they going against the opposing team’s top center most of the time? Does the top center have a 60% win rate or a 40% win rate? In a 50/50 scrum for a loose puck, is the opposing player 5’7” and 150 lbs or 6’3” and 215 lbs? To take that a step further, if you are drafting a player for the NHL, how relevant are puck battles he won against players who are 140-165 lbs? How many players is he going to face in the NHL who are that size and build? Not many. How relevant is faceoff data against a centerman who is actually a wing filling in because of an injury?

Goal scoring might be the best example of this, as most analytics assume that a goal is a goal. Anyone in hockey knows that’s not the case: there is a big difference between a player who scores a far-post tap-in or a loose rebound that just happened to land on his stick in the right spot and someone who beat a defender, was taking contact and still managed to score on a goalie who had position and time to react. Those goals both go in as “1,” but the circumstances are vastly different, as is the difficulty. Who deserves the most credit for a goal: the player who created a turnover, went end to end and fed a pass cross slot for his teammate to tap in, or the player who tapped it in? There is a significant difference between a goal that goes in on an average OHL goalie and a goal that goes in on an NHL goalie. If you are an NHL team drafting a goal scorer, how relevant are the goals he scored against 5’8” goalies, goalies with a sub-.900 SV%, backups, etc.? He won’t face anyone of that caliber at the next level, so total goals scored in a season is not the most accurate predictor of goals at the next level. In fact, if a team brought in an NHL-level goalie expert, they might find that a player with 15 goals had more NHL-quality goals than someone with 30 total goals on the season, and that would likely be a better indicator of future scoring potential.

While goals and faceoffs have been two major statistics in the past, two data points that have been growing in popularity in recent years are first touches and zone entries. Like goal scoring, these have context as well. Did the initial pass on the breakout come to the player in stride and on the tape, or did it arrive in their skates or a foot behind them so that they had to slow down or stop to catch it? Was the pass delivered with touch and accuracy by the best defenseman on the team, giving the teammate space to skate once they received it, or was it rushed by the weakest defenseman on the team, who telegraphed the pass so that the receiver had opponents in their face the second it arrived? Those two scenarios will have vastly different outcomes and, more importantly, the difference between the two outcomes has very little to do with the player being analyzed. Therein lies the shortcoming of analyzing data without real context or an understanding of the relationship between the other players on the ice.

On a zone entry, is the player going against the top pair defensemen, who may be aggressive and stand up opponents at the blue line, or against the bottom pair, who don’t trust their feet and back up to the faceoff circles before they make any attempt to steal, check or pressure the puck carrier? Those nuances in each scenario matter as well. Is the weak side winger creating space by attacking the net and splitting the defense, and is the center reading how the defense plays it and moving to get open, or are the linemates gliding in a straight line across the ice, giving the puck carrier no real option for a rebound or a pass and no space to operate? All of these situational intricacies are a factor, as they impact the data. Without going into depth on every scenario like this, it’s fair to say that in order to understand what the data actually represents you have to understand the relationship between the other players on the ice, not just the individual player.

When I brought this up to teams, the analytics staff would say that is why they use large sample sizes and the details should shake out in the aggregate. Well, that’s not true: if you are on a bad line all season or on a bad team, then more data doesn’t lead to more accuracy. Another growing concern in the “secret sauce” concoction is that teams are combining subjective data with objective data with the intention that it provides more context to the objective data and more accuracy to the subjective data. This could be true, and it’s what is going on in every NHL team’s back office. However, the reverse could also be true: it could muddy the waters and produce less accuracy in both areas.
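The point about sample size can be illustrated with a small simulation. It assumes a hypothetical player whose "true" scoring rate is dragged down by the same amount every game because of a weak line. The rates, the Poisson scoring model and the function name are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

TRUE_RATE = 0.80     # hypothetical points per game with neutral linemates
LINE_EFFECT = -0.25  # hypothetical drag from a weak line, present every game


def observed_ppg(n_games):
    """Average points per game over n_games played entirely on the weak line."""
    points = rng.poisson(TRUE_RATE + LINE_EFFECT, size=n_games)
    return points.mean()


for n in (20, 80, 2000):
    print(f"{n:>5} games: observed {observed_ppg(n):.2f} PPG "
          f"(true rate {TRUE_RATE:.2f})")
```

More games make the observed number steadier, but it settles around the line-affected rate rather than the player's own. A bigger sample reduces the noise, not the bias.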

Like anything in its early stages, there are bumps along the way, and with AI and other technological advances in video quality, the analytics are only going to get better and become a more integrated tool for talent evaluators. With that being said, be careful about making future predictions based on past performance, and be careful about context, as a player’s analytics are greatly impacted by his linemates, his team and the opponent he’s matched up against in nearly every measured statistic. Ignoring those intricacies, as is the case in almost all analytics today, can be inaccurate and misleading.

Photo Credit: Dan Hickling, Hickling Images