In this post I will draw on some ideas from the operational research literature to examine how the concept of efficiency can be applied in football analytics.
Efficiency Analysis in Football
Efficiency can be defined as the ratio of outputs to inputs: what you get out divided by what you put in. We use efficiency measures all the time in football, though we are not usually aware of it. Familiar metrics such as goals per 90 or conversion ratio are in fact efficiency measures. Goals per 90 is the rate at which a valuable input resource (minutes of playing time) is converted into a desired output (goals), while the conversion ratio measures the efficiency of turning shots into goals.
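To make the ratio concrete, here is a minimal Python sketch of these two familiar measures. The function names and the sample numbers are illustrative assumptions, not figures from the data used in this post.

```python
def per_90(output_count, minutes_played):
    """Output produced per 90 minutes of playing time (the input)."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return 90.0 * output_count / minutes_played


def conversion_ratio(goals, shots):
    """Goals (output) divided by shots (input)."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return goals / shots


# Hypothetical forward: 20 goals from 100 shots in 2700 minutes.
goals_per_90 = per_90(20, 2700)
conversion = conversion_ratio(20, 100)
print(f"goals per 90: {goals_per_90:.2f}, conversion: {conversion:.0%}")
```

Both functions have the same shape, one output divided by one input, which is why they can each describe only one aspect of performance.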
But in reality football is more complex than this. The basic unit of performance in football is the player, and generally speaking players consume multiple inputs and deliver multiple outputs. A forward, for example, consumes playing time and the attacking opportunities created by his team, and delivers shots on goal and key passes. But it is not obvious how to measure efficiency when we have multiple inputs and outputs. Is Ronaldo, who produces more goals and fewer assists, more efficient than Messi, who produces fewer goals but more assists for the same amount of playing time? We would often use two separate measures here. However, each performance indicator represents only one aspect of performance, and because the inputs are finite, the various aspects of performance are unlikely to be independent.
How can we derive a player efficiency measure in such cases?
Combining Multiple Efficiencies
I will illustrate the concept of efficiency analysis using a one-input, two-output example, because this problem can be solved graphically without complex maths. The input will be playing time, the two outputs will be shots and key passes, and I will use data from the 2014-15 La Liga season.
The chart below shows the efficiencies for shots and key passes plotted against each other for all forwards who played 720 or more minutes during the season. We see that Ronaldo has the highest shooting efficiency, and Nolito the highest key pass efficiency. The segmented boundary that encompasses the data has been extended by drawing a horizontal line from Ronaldo to the y-axis. This is justified because if Ronaldo can produce around 6 shots and 1.8 key passes per 90 minutes, he can produce around 6 shots and no key passes per 90 minutes. Similar reasoning means we can drop a vertical from Nolito to the x-axis. The green area now defines the limits of performance efficiencies in the data sample, that is, all the feasible input-output combinations evident in La Liga in the 2014 season.
Figure 1. Forward Performance in La Liga
Existing players lying on the boundary (in this case, Ronaldo and Nolito), or hypothetical players that would lie on the boundary if they existed, are given an efficiency of 100%. It is easy to see why Ronaldo and Nolito are efficient; each is best on one of the two measures under consideration. But it isn’t necessary to top the rankings in any one measure to be 100% efficient. Messi is almost on the boundary, and very nearly 100% efficient. With a slight increase in either shots or key passes, he would also be 100% efficient. Incidentally, the large empty gap between the boundary defined by these three players and the rest of the La Liga forwards is a stark reminder of how dominant they are in the Spanish game.
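The boundary construction can be sketched in a few lines of Python. This is one possible illustration, not necessarily how the chart was produced: it assumes each player is a (shots per 90, key passes per 90) pair, and the function names and sample values are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def efficient_frontier(points):
    """Vertices of the boundary, from best key-pass rate to best shot rate."""
    pts = sorted(set(points))  # left to right by shot rate
    upper = []
    for p in pts:
        # Keep only right turns, which builds the upper convex hull.
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) >= 0:
            upper.pop()
        upper.append(p)
    # The frontier is the falling part of the hull, starting at the
    # player with the highest key-pass rate.
    top = max(range(len(upper)), key=lambda i: (upper[i][1], upper[i][0]))
    return upper[top:]


# Hypothetical (shots per 90, key passes per 90) values.
forwards = [(6.0, 1.5), (3.0, 3.0), (3.0, 1.5), (1.0, 1.0), (4.5, 2.0)]
print(efficient_frontier(forwards))
```

The horizontal and vertical extensions to the axes are not returned as vertices; they follow directly from the first and last points in the list.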
Figure 2 shows a simplified version of the data. One difference between the players evident from Figure 2 is that each favours a different performance mix m, defined as the ratio of shots to key passes; Ronaldo (m=3.9) goes for shots rather than assisting, Nolito and Piatti place more emphasis on assisting than shooting (m=1.3 and 0.6 respectively) while Messi (m= 2.5) and Bale (m=2.6) have, by comparison, a more balanced strategy. m can be regarded as an index of a player’s role or style. The geometry of Figure 2 implies that all hypothetical players that lie on the line from the origin through a player such as Suarez would have the same performance mix (m = 1.7) as he does. So by projecting a line from the origin through Suarez to the boundary, we can construct a virtual player (V) that, were he to exist, would have the same balance between shots and key passes, as Suarez, but an efficiency of 100%.
Figure 2. Calculating Player Efficiencies and Targets
We can now calculate Suarez’s efficiency by evaluating him against the maximally efficient player with the same mix, m. Taking L1 as the distance from the origin to Suarez and L2 as the distance from Suarez to V, the efficiency is L1/(L1 + L2), or 78%. We can also define two further quantities, St and Kt. These are the shots and key passes for player V, who we already know has a performance mix of 1.7 and is 100% efficient. St and Kt can therefore be regarded as targets for Suarez, that is, the maximal combination of shots and key passes that is theoretically attainable without changing his performance mix. In the same way we can describe the efficiencies and set targets for all players in the data set while respecting differences in their roles.
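The projection onto the boundary can also be done numerically. The sketch below is a minimal illustration under my own assumptions: players are (shots per 90, key passes per 90) pairs, the boundary is built from blends of observed players, and the name `radial_efficiency` and the sample values are hypothetical. It relies on the fact that, with two outputs, the best comparison point is always a blend of at most two observed players.

```python
from itertools import combinations_with_replacement
from math import inf


def radial_efficiency(player, peers):
    """Return (efficiency, (shot_target, key_pass_target)) for one player.

    The player's outputs are scaled up along his own ray (fixed mix m)
    until they reach the boundary formed by blends of the peers.
    """
    s, k = player
    if s < 0 or k < 0 or (s == 0 and k == 0):
        raise ValueError("player needs non-negative, non-zero outputs")
    best = 0.0  # largest feasible scale factor found so far
    for (s1, k1), (s2, k2) in combinations_with_replacement(peers, 2):
        ts = [0.0, 1.0]  # the two peers themselves
        # Blend weight at which both outputs are scaled by the same factor.
        denom = k * (s1 - s2) - s * (k1 - k2)
        if denom != 0:
            t = (s * k2 - k * s2) / denom
            if 0.0 < t < 1.0:
                ts.append(t)
        for t in ts:
            blend_s = t * s1 + (1 - t) * s2
            blend_k = t * k1 + (1 - t) * k2
            scale = min(blend_s / s if s > 0 else inf,
                        blend_k / k if k > 0 else inf)
            best = max(best, scale)
    if best <= 0:
        raise ValueError("no peer produces any of the player's outputs")
    return 1.0 / best, (best * s, best * k)


# Hypothetical data: two boundary players and one player inside the boundary.
peers = [(6.0, 1.5), (3.0, 3.0), (3.0, 1.5)]
efficiency, (shot_target, key_pass_target) = radial_efficiency((3.0, 1.5), peers)
print(f"efficiency {efficiency:.0%}, targets {shot_target} shots, "
      f"{key_pass_target} key passes per 90")
```

The returned efficiency corresponds to L1/(L1 + L2), and the targets are the outputs of the virtual player V with the same mix.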
The table below shows the efficiencies of the 20 most efficient forwards in La Liga 2014-2015.
Shot-Key Pass Efficiencies by Minutes Played: La Liga 2014-15
|Rank|Player|Shot-Key Pass Efficiency|
Efficiency in the Premier League 2015-2016
Results for the EPL 2015-16 season are shown below, this time using goals and assists as the outputs, and players with at least 900 minutes playing time.
Figure 3. EPL Efficiencies for Goals and Assists by Minutes Played
If we were looking for a support striker, Deulofeu would be a good choice. He has a 100% efficiency and a strong focus on assists; and the gap separating him from the other players in the chart reflects his superior performance as a provider. For a pure goal-scorer, Aguero is the best performer, while for a more balanced contribution, Vardy, Giroud and Firmino are the most efficient players.
The efficiencies for the top 20 players are shown below.
Goal-Assist Efficiencies by Minutes Played: EPL 2015-16
A Different Choice of Input
The efficiencies we find depend on the inputs we choose as well as the outputs. Forwards at stronger teams, whose midfields dominate games, get more opportunities than forwards at weaker teams. The efficiency measures discussed so far therefore contain an element of team performance as well as player performance. To get an efficiency which is more representative of player contribution alone, we could use a metric like team possession or total team possession as a second input. Efficiencies with two inputs and two outputs can be computed, but can’t be represented graphically, so I won’t consider them here. However, we can still use the graphic method if we replace the time-played input by a team-based measure, such as the total number of passes made by the team while the player is on the pitch. This is a rough proxy for the support provided by the team. Better measures might be the number of attacks or final third entries, but total passes is a widely available measure and serves to illustrate the principle.
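Swapping the input only changes the denominator of each rate. As a rough sketch, with a hypothetical function name and made-up numbers, two forwards with identical goals and assists can look quite different once team passes replace minutes:

```python
def goal_assist_rates(goals, assists, team_passes_on_pitch):
    """Goals and assists per 100 successful team passes made while on the pitch."""
    if team_passes_on_pitch <= 0:
        raise ValueError("team_passes_on_pitch must be positive")
    scale = 100.0 / team_passes_on_pitch
    return goals * scale, assists * scale


# Hypothetical forwards with the same output but different team support.
well_supported = goal_assist_rates(goals=15, assists=5, team_passes_on_pitch=15000)
poorly_supported = goal_assist_rates(goals=15, assists=5, team_passes_on_pitch=9000)
print(well_supported, poorly_supported)
```

The resulting pairs can be plotted and evaluated against a boundary in exactly the same way as the per-90 rates.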
The chart below shows goal-assist efficiencies per 100 successful team passes.
Figure 4. EPL Efficiencies for Goals and Assists by Team Passes
This makes quite a difference. On this definition of efficiency, Vardy is now 100% efficient. Interestingly, Defoe now emerges as a highly efficient player, with a similar mix to Aguero, but scoring more goals for the same amount of team possession.
At this point I suspect some readers will be clutching their pearls and feeling faint - surely this can’t be right? Er, yes it can. It doesn’t mean that Defoe is a better player than Aguero, because whatever being a “better player” means, it cannot be reduced to a single statistic, especially one based on a single season. But the numbers don’t lie; on this particular performance measure, Defoe is at least comparable to Aguero, if not slightly better. It could be argued that Defoe’s efficiency is inflated, because he is the only decent striker Sunderland have, while Aguero’s Manchester City have several highly effective strikers. This in turn would mean that Manchester City’s possession is shared between more strikers, so Aguero’s inputs are overestimated compared to Defoe’s. But this argument doesn’t work. Defoe and Aguero are about equally important to their teams’ scoring; in 2015, Defoe scored 31% of Sunderland’s goals and Aguero scored 34% of City’s. (Squawka incidentally ranked Aguero No. 3 and Defoe No. 6 amongst forwards.)
Efficiencies for the top 20 EPL players are shown below.
Goal-Assist Efficiencies by Team Passes: EPL 2015-16
Some Final Thoughts
The discussion has been limited to one-input, two-output efficiency, and I have assumed a linear relationship between inputs and outputs. However, operational research scholars have developed linear programming methods for computing efficiencies in more complex cases. Such methods allow more inputs and outputs to be used, and can handle non-linear relationships between inputs and outputs such as the law of diminishing returns, or economies of scale.
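For readers who want to see what such a linear programme looks like, here is the standard output-oriented formulation from the data envelopment analysis literature, written in my own notation rather than taken from the analysis in this post. Player 0 is the one being evaluated, x_{ij} is the amount of input i used by player j, y_{rj} is the amount of output r produced by player j, and there are n players, p inputs and q outputs.

```latex
\begin{aligned}
\max_{\phi,\,\lambda}\quad & \phi \\
\text{subject to}\quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{i0}, && i = 1,\dots,p \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge \phi\, y_{r0}, && r = 1,\dots,q \\
  & \lambda_j \ge 0, && j = 1,\dots,n
\end{aligned}
```

The efficiency of player 0 is 1/\phi^*, where \phi^* is the optimal value, and the targets are the scaled outputs \phi^* y_{r0}. Adding the constraint \sum_j \lambda_j = 1 gives the variable-returns-to-scale version, which is how diminishing returns and economies of scale are accommodated.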
No system is perfect, but the ability of efficiency analysis to compare like with like according to a player’s performance mix makes it a powerful tool for player evaluation in a scouting context. The ability to set credible targets for a given performance mix should also prove useful.
Finally, although this post has focused on player efficiency, the same methods could be used to analyse team efficiency as well.