A big gap in our knowledge today is the relationship between what players do and what scouts see. To what extent are the two domains complementary? How do they differ? How much do they overlap?
In a recent paper Luca Pappalardo found that the performance ratings given by sports journalists are largely dependent on a few highly salient performance metrics like goals scored, shots saved and so on. He showed that the ratings depend on a few memorable incidents and that more subtle features of the performance are unimportant. (He called this effect the noticeability heuristic.)
At first blush, this finding has some uncomfortable implications for the scouting process. The dataset was quite extensive and the analysis first-rate, and in fact Pappalardo was able to construct a machine learning algorithm that predicted journalist ratings quite accurately. If his algorithm works for club scouts as well as journalists, there would seem to be very little need for scouts at all.
But there are some reasons to be hopeful! First, the analysis only recognized three categories of outfield player – Forward, Midfielder, and Defender. Broad categories like these aren’t much use to a club seeking to recruit. If narrower categories had been used, other influential performance indicators might have been spotted. Also, we might expect club scouts to be more sensitive to specific skills than football journalists, who are only considering a player’s overall contribution to the match result, and not really thinking about how he would fit into a team, or whether his skill set would plug a particular gap. We might also argue that different journalists are influenced by different aspects of performance, and tend to cancel each other out, while scouts from the same club would use similar criteria for good performance, and more performance metrics would therefore show up in the results.
To get started on this problem I will look at the relationship between one aspect of what players do – event data – and scout evaluations in a real club situation. The context of this study is a closer fit to a real-world recruitment situation than Pappalardo’s. On the downside, I only have about a tenth as much data, and it is not very well distributed at that, so there is not much point in a very detailed analysis.
However, some of the results turned out to be quite interesting, and I think it is worth reporting the findings.
Data and Analysis
I had 853 ratings by 19 scouts on 458 players in 199 different matches. Players were rated between 1 and 12 times. Table 1 shows the number of players in each category and the number of ratings.
Table 1. Player Categories
| Player Category | Number of Players | Number of Ratings |
Players were rated from 1 to 4 in each match. Objective performance was measured by per 90 counts of Opta events.
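As a minimal sketch of what a "per 90" count means, the helper below scales a raw event count by the minutes a player was on the pitch. The function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def per_90(event_count: float, minutes_played: float) -> float:
    """Scale a raw event count to a rate per 90 minutes played."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return event_count * 90.0 / minutes_played


# Hypothetical example: 3 key passes in 60 minutes on the pitch.
print(per_90(3, 60))
```

Rates like this make players with different amounts of playing time comparable, although very short appearances can produce unstable values.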
I calculated correlations between the scout rating and a selection of Opta-based metrics. (Because many players were observed more than once, I used the Bland-Altman method, which is designed for correlations with repeated observations.)
Each player role was analysed individually. The results are shown in Figure 1. Positive correlations are shown bristling outwards from the circle, and negative correlations inwards. Statistically significant correlations are shown in darker colours (but remember that there are different numbers of players in each category, so a given size of correlation might be significant for one category but not for another).
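Bland and Altman described two ways of handling repeated observations, a within-subject and a between-subject correlation, and the text does not say which was used here. The sketch below illustrates the within-subject version only, as an assumption: each player's values are centred on that player's own mean before the correlation is computed. The function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def within_subject_correlation(subjects, x, y):
    """Within-subject correlation for repeated observations.

    Centres x and y on each subject's own mean and correlates the
    pooled deviations. Returns (r, residual degrees of freedom).
    """
    groups = defaultdict(list)
    for s, xi, yi in zip(subjects, x, y):
        groups[s].append((xi, yi))

    sxx = syy = sxy = 0.0
    for obs in groups.values():
        mean_x = sum(o[0] for o in obs) / len(obs)
        mean_y = sum(o[1] for o in obs) / len(obs)
        for xi, yi in obs:
            sxx += (xi - mean_x) ** 2
            syy += (yi - mean_y) ** 2
            sxy += (xi - mean_x) * (yi - mean_y)

    if sxx == 0 or syy == 0:
        raise ValueError("no within-subject variation in x or y")

    r = sxy / sqrt(sxx * syy)
    # Residual degrees of freedom: observations - subjects - 1.
    dof = len(x) - len(groups) - 1
    return r, dof


# Hypothetical toy data: two players, three rated matches each.
players = ["a", "a", "a", "b", "b", "b"]
metric = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
rating = [2.0, 3.0, 4.0, 1.0, 2.0, 3.0]
print(within_subject_correlation(players, metric, rating))
```

Note that in this version a player rated only once contributes no within-player variation, so such players add nothing to the coefficient.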
Figure 1. Event Data and Scout Ratings for Seven Types of Player
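To illustrate why the same correlation can be significant in one category and not in another, the sketch below approximates the smallest correlation that reaches two-sided significance for a given sample size. It uses the Fisher z approximation for an ordinary Pearson correlation, which is an assumption made for illustration and not the test used in the analysis; the sample sizes are hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n: int, alpha: float = 0.05) -> float:
    """Approximate smallest |r| significant at level alpha (two-sided).

    Uses the Fisher z transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z / sqrt(n - 3))


# Hypothetical category sizes, small to large.
for n in (20, 50, 150):
    print(n, round(approx_critical_r(n), 3))
```

The threshold falls as the sample grows, so a modest correlation can be shaded as significant for a well-populated role and not for a sparse one.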
Most of the metrics in Figure 1 are straightforward, but some need explaining.
- The Team segment contains measures of team rather than individual performance. The purpose was to see if match-level variables such as goal difference affect the ratings.
- Defensive play is a composite variable counting the total number of blocks, clearances and long balls. (These three Opta events generally go together.)
- The Possession Adjusted variables (Possn. Adj.) are the Opta metric divided by the number of opposition team touches; for example Fouls Conceded (Possn. Adj.) is the number of fouls conceded per opposition team touch.
- % Pass contribution is the number of passes made by the player divided by the total number of passes made by his team.
- Pass Visibility is the number of passes made by the player divided by the total number of touches by both teams. It is a rough measure of a player’s overall involvement in the match, excluding memorable events like goals or fouls.
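The three derived metrics defined above can be sketched as simple ratios. The function names and the example counts are hypothetical, and expressing % Pass contribution on a 0 to 100 scale is an assumption based on the metric's name.

```python
def possession_adjusted(event_count: float, opposition_touches: float) -> float:
    """Events per opposition team touch, e.g. Fouls Conceded (Possn. Adj.)."""
    return event_count / opposition_touches


def pass_contribution_pct(player_passes: float, team_passes: float) -> float:
    """Player's share of his team's passes, as a percentage (assumed scale)."""
    return 100.0 * player_passes / team_passes


def pass_visibility(player_passes: float, team_touches: float,
                    opposition_touches: float) -> float:
    """Player's passes divided by the total touches of both teams."""
    return player_passes / (team_touches + opposition_touches)


# Hypothetical match: 2 fouls conceded, 40 passes out of a team total of 400,
# and 500 touches for each side.
print(possession_adjusted(2, 500))
print(pass_contribution_pct(40, 400))
print(pass_visibility(40, 500, 500))
```

The first two metrics normalise by one team's activity, while Pass Visibility normalises by the activity of the whole match.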
Space (and most people’s attention span) is limited so I won’t discuss the findings in detail. Mostly they can speak for themselves, but I would make a few points.
First, we shouldn’t necessarily expect that scouts, looking at individuals, would pick up features that are relevant to performance at the team level. For example, it could be that possession adjusted statistics are more predictive of team performance than the unadjusted statistics. However, the correlations with ratings are similar for the adjusted and unadjusted versions in every case except for Centre Backs, where the adjusted fouls metric just creeps over the significance line.
Generally speaking, the scouts in this sample do seem to be assessing performance on relevant criteria – wingers and strikers on Key Passes, central midfielders but not strikers on Passing, and so forth.
Importantly, and contrary to what Pappalardo found, some ‘non-memorable’ events like through balls, losing aerial duels, and some of the percentage metrics do influence ratings. In addition, shots on target seem to influence striker ratings more than goals.
That said, the results are subject to some limitations. The modest sample sizes in some player categories mean that I didn’t have much power to detect correlations for rare events such as goals. Also, the poor distribution of the ratings probably limited the robustness of some of the correlation coefficients. And importantly, I’m limited to event data and don’t have any tracking data, so information about pace, one of the player attributes most frequently mentioned by scouts, is missing. So the results should be regarded as exploratory rather than conclusive.
The Bottom Line
What players do and what scouts see clearly overlap. But of course, scouting cannot simply be reduced to counting different kinds of events. Neither event data, nor even tracking data, can capture the subtleties of decision-making and positioning on the pitch, player attitude, teamwork, leadership, guile and a host of other intangibles, at least in any way that we can yet interpret. It is here that the eye of the seasoned scout is indispensable.
Nevertheless, understanding how subjective ratings and event data are related could still be very useful to clubs and scouting teams. A great deal of the scouting process seems to depend on ‘implicit’ knowledge – expertise gained by long experience that is locked up inside scouts’ heads, unarticulated, and inaccessible to clubs and perhaps even to the scouts themselves. Understanding more about the rating process and the factors that influence it could help make that process more explicit, enabling the scouting team to pool their knowledge and experience, and develop a shared view of performance. For scouts too, information about statistical correlates of the rating process could help them identify and counteract individual biases.
Integrating objective performance statistics and the subjective ratings of players is a key challenge in football, and the emergence of objective performance measures gives us the opportunity to explore the scouting process in ways not previously possible. Clubs that get this right will have a competitive edge, and I hope they will be inspired to conduct research in this area.
Overall I found the results rather encouraging and I think they give some credence to the validity of scouting observations. Fortunately, scouts reporting on player performance in a recruitment context appear to be more attuned to the technical nuances of the game than journalists reporting how matches unfold for an audience of fans.