Scouting: Finding the right balance between projectable skills and point projections
Analyzing the trade-off between skills projectability and point equivalencies
A few months ago, a former colleague of mine with the McGill Martlets asked me a question about point equivalencies. Having just discovered Byron Bader’s NHLe model, she was wondering how point equivalencies were used to add value in our CEGEP scouting.
At the time, I simply replied that the equivalency models we had been using add value to our scouting as they allow us to leverage more of the tools available to evaluate how players would project into our lineup. In other words, point equivalencies allowed us to find a balance between traditional scouting methods and statistical models when analyzing different players.
The idea behind using point equivalencies as a statistical tool for scouting is that players who are able to dominate at their current level (i.e., put up points) have a higher probability of making and being successful at the next level.
But more recently, this discussion from the start of the year got me thinking about another question: what is the right balance between traditional scouting methods and point equivalencies to optimize talent evaluation decisions in women’s hockey?
Defining the metrics to use
Traditional scouting methods encompass in-person viewings and video analysis, among other things. In essence, the goal of these methods is to identify players who have projectable skills (i.e., skills that will transfer well to the next level).
Point equivalencies are quantitative in nature. But to answer our question, we would need to quantify the projectability of skills.
To this end, we can use a metric we recently developed with a few women’s hockey coaches and analysts called SARAH (Scouting Automated Ratings Analyzing Habits). In short, SARAH is a tool that quantifies the foundational habits and skills of different women’s hockey skaters. It calculates the probability that different projectable habits are completed successfully by players based on their micro-stats.
As for point equivalencies, we will use an updated version of the N-WHKYe model which standardizes offensive production across 40+ leagues in the world.
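To make the idea of standardizing production across leagues concrete, here is a minimal sketch in the spirit of NHLe-style translation factors: points per game scaled by a league factor. This is not the N-WHKYe formula itself, which is not detailed here. The function name, the league names and the factor values are all assumptions made up for illustration.

```python
# Hypothetical league factors, for illustration only (NOT values from N-WHKYe).
LEAGUE_FACTORS = {"League A": 0.60, "League B": 0.30}


def point_equivalency(points, games_played, league):
    """Points per game scaled by an (assumed) league translation factor."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return points / games_played * LEAGUE_FACTORS[league]


# Same raw production, different league strength.
equiv_a = point_equivalency(points=20, games_played=20, league="League A")
equiv_b = point_equivalency(points=20, games_played=20, league="League B")
print(round(equiv_a, 2), round(equiv_b, 2))
```

Under these assumed factors, a point-per-game player in the stronger league receives twice the equivalency of a point-per-game player in the weaker one.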
Data exploration
First things first, we can assess the direct relationship between the projectability of skills (scores derived from SARAH) and production equivalency (N-WHKYe), looking at individual player data for the 2021-2022 season. The players considered in the graphs below participated in the most recent World Championship and/or Olympic Games.
Separating skaters by position, we also excluded outliers from the graphs below as certain players skewed the results (yes, Marie-Philip Poulin broke yet another statistical model).
Running a simple OLS regression yields interesting results. In both cases, we observe a positive correlation between our 2 variables. Put simply, players who have more projectable skills are likely to have a stronger offensive impact.
These positive relationships are not surprising, but overall, coefficients of determination (R-squared) between 0.5 and 0.6 suggest a moderate relationship between the projectability of skills and production equivalencies.
However, the relationship between our 2 variables seems stronger for forwards than for defenders. While projectability of habits explains 59% of the variation of offensive production for forwards, it only explains 53% of it for defenders.
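For readers who want to reproduce this kind of check, the sketch below fits a one-variable OLS line and computes its R-squared in plain Python. The numbers are made up for illustration and are not the player data behind the graphs. The variable names `sarah` and `nwhkye` are placeholders.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance of y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up values: overall skill score (x) and production equivalency (y).
sarah = [0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70]
nwhkye = [0.10, 0.22, 0.18, 0.35, 0.30, 0.48, 0.45]

slope, intercept = ols_fit(sarah, nwhkye)
r2 = r_squared(sarah, nwhkye, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

In a simple regression with a single predictor, R-squared is the square of the Pearson correlation, so values of 0.59 and 0.53 correspond to correlations of roughly 0.77 and 0.73.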
This could be due to the fact that offensive production is driven by different skills for forwards as opposed to defenders. To test this hypothesis, we could perform the same experiment looking at the different skill sets outlined in the SARAH project (passing, skating, shooting, stickhandling, reception, physical play, defending).
From this R-squared analysis at the skill set level, we can derive a few insights:
For forwards, distributing the puck is one of the most important drivers of offensive production. Skating, shooting, stickhandling & puck reception are also skill sets that have a low to moderate impact on offensive production. In contrast, physicality and defensive ability have very little impact on offensive production, as was to be expected.
For defenders, the main difference is that stickhandling skills (such as loading the puck to the hip pocket) come out at the top of the list. Defenders who have strong stickhandling skills are more likely to contribute offensively both off the rush and in-zone by escaping pressure more easily to open valuable ice.
However, as we were able to observe with the overall SARAH score, the relationship between skill sets and offensive production is weaker for each category when analyzing defenders as opposed to forwards.
Also, when comparing the R-squared values in the graphs above, we note that the overall assessment of a player’s projectable habits is a better linear predictor of offensive production than any individual skill set. In the next step, we will look at how much importance should be put on the overall SARAH score as opposed to production equivalency, in order to answer our initial question.
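The skill-set comparison described above can be sketched as one R-squared per skill set, ranked from strongest to weakest. The skill names come from the SARAH categories, but every number below is invented and the resulting order is a by-product of those invented numbers, not evidence. The helper `pearson_r2` is a hypothetical name.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation, equal to R-squared in a one-variable OLS."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Invented skill-set scores for six players (illustration only).
skills = {
    "passing": [0.50, 0.55, 0.60, 0.65, 0.70, 0.75],
    "shooting": [0.60, 0.50, 0.65, 0.55, 0.75, 0.70],
    "defending": [0.60, 0.70, 0.55, 0.65, 0.60, 0.70],
}
# Invented production equivalencies for the same six players.
production = [0.10, 0.20, 0.25, 0.35, 0.40, 0.50]

# Rank skill sets by how much production variance each explains on its own.
ranking = sorted(
    ((pearson_r2(scores, production), name) for name, scores in skills.items()),
    reverse=True,
)
for r2, name in ranking:
    print(f"{name:10s} R2={r2:.2f}")
```

Each R-squared here comes from a separate one-variable fit, so the values do not add up to the R-squared of the overall score.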
Building the models
When thinking about the importance of point equivalencies compared to projectability of habits, one interesting aspect to consider in our analysis is the variation in the level of play in different leagues.
Based on the average N-WHKYe over a 5-year period, we can identify 3 different tiers of pro WHKY leagues:
Tier 1 with leagues that have an average N-WHKYe > 0.5: PWHPA & PHF
Tier 2 with leagues that have an average N-WHKYe between 0.25 and 0.5: SDHL
Tier 3 with leagues that have an average N-WHKYe below 0.25: EWHL, SWHL & Naisten Liiga
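The tier thresholds above translate directly into a small helper. The function name and the sample averages are hypothetical. Assigning a league that sits exactly at 0.25 or 0.5 to Tier 2 is an assumption, since "between 0.25 and 0.5" does not say which side the boundaries fall on.

```python
def league_tier(avg_equivalency):
    """Map a league's 5-year average N-WHKYe to tier 1, 2 or 3."""
    if avg_equivalency > 0.5:
        return 1
    if avg_equivalency >= 0.25:  # boundaries 0.25 and 0.5 assumed to be Tier 2
        return 2
    return 3


# Hypothetical league averages, for illustration only.
sample_averages = {"League A": 0.62, "League B": 0.31, "League C": 0.12}
tiers = {league: league_tier(avg) for league, avg in sample_averages.items()}
print(tiers)
```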
College leagues (NCAA & USports) and other U23 players are excluded from the analysis. Given that the goal of this study is to uncover optimal talent evaluation strategies, we exclude players coming from teams that feed into the pro leagues of interest in our models, as most of these players have not yet had the opportunity to reach (or fail to reach) the pro levels.
With these different tiers, we separate the modelling process into 3 parts. Each model attempts to predict the probability that different players “make” a league (GP > 7) based on point equivalencies and projectability of habits.
Model 1 relates to tier 1 and classifies players as either successfully making or not making the PWHPA & PHF. Model 2 relates to tier 2 and attempts to predict whether players make the SDHL or not, while Model 3 pertains to the remaining pro leagues of Tier 3 (EWHL, SWHL and Naisten Liiga).
In these binary classification models (logistic regression), we also adjust for other factors such as age and position. These logistic regressions incorporate L2 regularization to account for the moderate multicollinearity that exists between our predictors.
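As a rough sketch of this setup, the code below fits an L2-regularized logistic regression by gradient descent in plain Python, using the GP > 7 rule as the label. It is not the model behind the results: the data is synthetic, and the feature order, generating process, learning rate, epoch count and penalty strength are all assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(rows):
    """Scale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def predict_proba(w, b, x):
    """Predicted probability of making the league for one feature row."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic_l2(X, y, l2=1.0, lr=0.5, epochs=500):
    """Minimize mean log-loss + (l2 / 2n) * ||w||^2; intercept unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gw / n + l2 * wj / n) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic players (assumption): skill score and equivalency are correlated,
# and both raise the chance of dressing for each of 20 games.
rng = random.Random(0)
rows, made = [], []
for _ in range(200):
    sarah = rng.uniform(0.3, 0.8)
    equiv = 0.6 * sarah + rng.gauss(0, 0.08)
    age = rng.uniform(23, 35)
    is_defender = float(rng.random() < 0.35)
    p_game = sigmoid(8 * (sarah - 0.55) + 8 * (equiv - 0.33))
    games_played = sum(rng.random() < p_game for _ in range(20))
    rows.append([sarah, equiv, age, is_defender])
    made.append(1 if games_played > 7 else 0)  # "makes" the league: GP > 7

X = standardize(rows)
w, b = fit_logistic_l2(X, made, l2=1.0)
w_strong, _ = fit_logistic_l2(X, made, l2=100.0)  # heavier penalty, for contrast

for name, coef in zip(["sarah", "equivalency", "age", "is_defender"], w):
    print(f"{name:12s} {coef:+.3f}")
```

Standardizing the predictors first puts the coefficients on a common scale, which is what makes comparing the weight on skills with the weight on equivalencies meaningful. The second fit shows the effect of the penalty: a larger `l2` shrinks all coefficients toward zero, which stabilizes estimates when predictors overlap.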
Results & Conclusions
The resulting coefficients for our variables of interest (Projectability of Skills and Point Equivalencies) are summarized in the table below, for our 3 models:
The main takeaway from the results above is that as league strength increases, projectability of habits becomes more and more important as a predictor of successfully making the next level.
In stronger leagues such as the PWHPA & PHF, scouting decisions can be optimized by relying more on the projectability of habits than on production equivalency. Conversely, in leagues with a lower level of play, production equivalency grows in importance as a predictor of successfully reaching the next level.
Therefore, when scouting to identify the best of the best in the world, putting up points at the lower levels is important. But the main way to gain a competitive advantage and reach the highest levels is by doing things the right way on the ice and developing technical skills that will transfer well from one level to the next, as most players at these levels will already be great at what they do.