Tracking data has long been used by football clubs to measure physical activity on the field, such as distance covered, sprints and top speed. In this post I show how TRACAB spatial location data can be visualised and quantified, and used to compare and contrast players.
The data for this post was a sample of five Premier League matches played in 2015.
An obvious first step is to map the player locations as points on the pitch. To illustrate, I show three players in the match between Hull City and Manchester United which took place at the KC Stadium on the last day of the 2014-15 season. Hull fought strongly to avoid relegation, but to no avail, and the match finished goalless.
Figure 1 shows the point maps of three players in that match (whether they were in possession or not):
- Dawson, the Hull captain and defender
- Manchester United’s attacking midfielder Juan Mata
- Manchester United’s right back Antonio Valencia
Figure 1. Point maps
|1a. Dawson (Hull)||1b. Mata (Man Utd.)||1c. Valencia (Man Utd.)|
To avoid overcrowding the plot I show the locations sampled 2.5 times per second rather than the full 25 times per second. This lets us see the main features of each player’s activity without unnecessary detail. In all the plots the direction of play is left to right.
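As a rough illustration of this thinning step, the sketch below keeps every tenth frame of a 25-frames-per-second track, which gives 2.5 locations per second. The synthetic `track` array and the `downsample` helper are assumptions made for the example; they are not part of the TRACAB data format.

```python
import numpy as np

def downsample(positions, step=10):
    """Keep every `step`-th frame; with 25 Hz input, step=10 gives 2.5 Hz."""
    positions = np.asarray(positions, dtype=float)
    return positions[::step]

# Synthetic stand-in for one player's (x, y) track: 10 seconds at 25 Hz
# on an assumed 105 m x 68 m pitch.
rng = np.random.default_rng(0)
track = rng.uniform([0, 0], [105, 68], size=(250, 2))

sampled = downsample(track)
print(sampled.shape)  # (25, 2)
```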
We can see that Dawson played centrally in Hull’s half of the field, but he was also active in Manchester United’s penalty area. Mata and Valencia obviously played on the right, but the difference between them is not terribly clear.
Point maps are useful visualisations, but are not especially revealing. In the next section I show how point maps can be quantified and used to develop more insightful visualisations and player metrics.
I define a player’s “Range” as the area encompassing most of his point map. ‘Most’ can be defined however we wish, for example 80%, 85% or 90%. In this post I will mainly be using 80%. Importantly, I don’t restrict the Range to being a single continuous region. As we shall see, for certain players the Range consists of distinct areas or islands.
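One possible way to compute such a region is sketched below. It is only an assumption about how a Range could be built, not a description of the method used for the figures in this post: the pitch is divided into square cells, the cells are ranked by how many locations they contain, and the densest cells are kept until they hold the chosen share of the points. The pitch dimensions, the 2-metre cell size and the `compute_range` helper are all hypothetical choices.

```python
import numpy as np

PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # assumed pitch size in metres

def compute_range(points, coverage=0.80, cell=2.0):
    """Return (mask, area): a boolean grid of the densest cells that together
    hold at least `coverage` of the points, and their total area in sq. m."""
    points = np.asarray(points, dtype=float)
    x_edges = np.arange(0.0, PITCH_LENGTH + cell, cell)
    y_edges = np.arange(0.0, PITCH_WIDTH + cell, cell)
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=[x_edges, y_edges])
    flat = counts.ravel()
    order = np.argsort(flat)[::-1]            # densest cells first
    cumulative = np.cumsum(flat[order])
    # Smallest number of cells whose combined count reaches the target share.
    n_cells = int(np.searchsorted(cumulative, coverage * flat.sum())) + 1
    mask = np.zeros(flat.shape, dtype=bool)
    mask[order[:n_cells]] = True
    return mask.reshape(counts.shape), n_cells * cell * cell

# Synthetic player: mostly in his own half, with a small cluster in the far box.
rng = np.random.default_rng(1)
main = rng.normal([30, 34], [8, 8], size=(900, 2))
island = rng.normal([95, 34], [2, 3], size=(100, 2))
points = np.clip(np.vstack([main, island]), [0, 0], [PITCH_LENGTH, PITCH_WIDTH])

mask, area = compute_range(points)
print(f"80% Range area: {area:.0f} sq. m")
```

Because cells are chosen by density rather than by adjacency, nothing forces the selected cells to be connected, so separate islands can appear on their own. The number of selected cells multiplied by the cell area gives an area in square metres.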
In the next set of diagrams, I have drawn the computed Ranges on top of the point maps. In each case the Range covers 80% of the player’s locations on the field.
Figure 2. Point Maps and Ranges
|2a. Dawson (Hull)||2b. Mata (Man Utd.)||2c. Valencia (Man Utd.)|
Here we see that Dawson’s Range consists of two regions: a main region in the defensive area of the field, and a smaller region in Manchester United’s penalty box. We can also see the difference between Mata and Valencia more clearly: Mata ranges deeper into the opposition half than Valencia, while Valencia is more active on the right wing and in the centre of his own penalty area.
At this stage, you might be wondering: why not just draw heat maps? Heat maps are fine if you want to visualise a single player, but they don’t let you visualise multiple players on the same map, and they don’t lend themselves to metrics. I consider these ideas next.
VISUALISING MULTIPLE PLAYERS
If we want to visualise how two or more players work together on the field, we can superimpose their Ranges. The next map shows Liverpool’s back four in their match against Stoke. (This time I have smoothed the Range boundaries to simplify the shapes and, to highlight the player roles, plotted the 60% Range.)
Figure 3a. Liverpool Back Four (v Stoke)
We can also use Ranges to map the activity of a whole team as shown in the next figure.
Figure 3b. Hull City (v Manchester United)
Importantly – and this is an advance on current analytics – we can also go further and quantify the Ranges and the extent of their interactions.
One basic metric we can derive is the Range Area. The table below shows the Range Areas for the players considered so far:
Table 1. Typical Range Areas
80% Range Area (sq. m.)
|Liverpool Back Four||Sakho||1255|
We can also quantify interactions between players. For example:
a) Player gaps.
We can calculate the percentage of time any pair of players are within a certain distance of each other. The table for the Liverpool back four is shown below.
Table 2. Percentage of time Liverpool back four are within 10 metres of each other
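The calculation behind a table like this can be sketched as follows, assuming the two players’ tracks are aligned frame by frame and have the same length. The `pct_time_within` helper and the toy coordinates are illustrative assumptions.

```python
import numpy as np

def pct_time_within(track_a, track_b, threshold=10.0):
    """Percentage of shared frames in which two players are within
    `threshold` metres of each other."""
    a = np.asarray(track_a, dtype=float)
    b = np.asarray(track_b, dtype=float)
    distances = np.linalg.norm(a - b, axis=1)   # one distance per frame
    return 100.0 * np.mean(distances <= threshold)

# Toy example: four frames, with distances of 5, about 8.5, 15 and 50 metres.
player_a = [[0, 0], [0, 0], [0, 0], [0, 0]]
player_b = [[3, 4], [6, 6], [9, 12], [30, 40]]

print(pct_time_within(player_a, player_b))  # 50.0
```

Running the same function over every pair in a back four would fill in a table of this kind.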
b) Range Overlaps
We can also calculate the degree of overlap between Ranges. Range overlaps for the Liverpool back four are shown below.
Table 3. Liverpool Back 4: Percentage Overlap between Ranges
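The post does not spell out which area the overlap is measured against, so the sketch below makes an assumption: each Range is a boolean grid of pitch cells, and the overlap is the shared area as a percentage of the combined area (intersection over union). Dividing by the smaller Range instead would be an equally plausible definition. The `range_overlap_pct` helper and the toy grids are hypothetical.

```python
import numpy as np

def range_overlap_pct(mask_a, mask_b):
    """Shared cells as a percentage of the cells covered by either Range
    (intersection over union; an assumed definition of 'overlap')."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return 100.0 * np.logical_and(mask_a, mask_b).sum() / union

# Toy 4 x 4 grids: each Range covers three rows, and two rows are shared.
range_a = np.zeros((4, 4), dtype=bool)
range_b = np.zeros((4, 4), dtype=bool)
range_a[0:3, :] = True
range_b[1:4, :] = True

print(range_overlap_pct(range_a, range_b))  # 50.0
```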
Perhaps one of the most powerful applications of the techniques described above is comparing players. The degree of overlap between their Ranges is a measure of player similarity. For instance, the map below compares the Ranges of Vardy and Aguero.
Figure 4. Comparing Players: Vardy versus Aguero
The map shows that Vardy’s Range (1999 square metres) is a little larger than Aguero’s (1734 square metres); the degree of overlap (82%) measures their similarity, and shows they operate in very similar areas of the pitch.
We can also overlay narrower Ranges, e.g. 40%, to visualise and contrast the “core” locations of each player. The next map shows the comparison for Vardy and Aguero, with the core Ranges coloured in.
Figure 5. Comparing Players: Vardy versus Aguero: Including 40% Core Ranges
Vardy’s core Range is 552 sq. m., and Aguero’s is 424 sq. m. The core Ranges overlap by 68%, and we can see that when Aguero goes forward he goes deep into the box in front of the centre of the goal, while Vardy prefers areas on both sides of the edge of the penalty box.
This kind of analysis could be useful in recruitment or match analysis.
In the same way we could compare a particular player in different matches, or even in different phases of the same match. When combined with OPTA KPI data, these spatial metrics add further insight into the performance of individual teams or players.
SOME PROPERTIES OF PLAYER RANGES
If the Range is a meaningful concept, we would expect its metrics to show consistent differences across positions. In fact, they do. The table below shows the average Range areas and standard deviations for Goalkeepers, Defenders, Midfielders and Forwards.
Table 4. Average Range Areas for Different Positions
|Position||Number of Players||80% Range Area (sq. m.)||Std dev.|
As expected, Goalkeepers have much smaller Ranges than the outfield players. More interestingly, Defenders have smaller Ranges than Midfielders and Forwards (the difference is statistically significant). However, there are also differences within each position, as shown in the next table.
Table 5. Three Smallest and Three Largest Ranges in Each Position
|Player||Team||80% Range Area (sq. m.)|
|Di Maria||Man Utd||1281|
There are considerable differences in the Ranges within each position; for instance, the widest-ranging forwards cover an area more than twice as large as the narrowest-ranging forwards. (The significance of this will be the subject of future research.)
USING RANGES TO CLUSTER PLAYERS
Finally, we can use Ranges to derive similarity measures and cluster or classify players. Here we define the similarity between two players as the degree of overlap between their Ranges. I used a technique called multi-dimensional scaling, which positions players on a 2-D map according to their similarity, although any clustering algorithm could be applied to the same similarity measures. Players whose Ranges are similar (i.e. overlap to a considerable extent) appear close together on the map. Players whose Ranges do not overlap much appear far apart.
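To make the idea concrete, here is a minimal sketch of classical (Torgerson) multi-dimensional scaling in NumPy. Which variant of multi-dimensional scaling was used for the map in this post is not stated, so treat this as one possible choice. The four-player overlap matrix is made up, and turning an overlap percentage into a dissimilarity as `1 - overlap / 100` is also an assumption.

```python
import numpy as np

def classical_mds(dissimilarity, n_components=2):
    """Place items in `n_components` dimensions so that distances between
    them approximate the given symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering      # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(b)
    top = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical overlap percentages between four players' Ranges:
# players 0 and 1 overlap heavily, as do players 2 and 3.
overlap = np.array([
    [100, 80, 10,  5],
    [ 80, 100, 15, 10],
    [ 10, 15, 100, 70],
    [  5, 10, 70, 100],
], dtype=float)

dissimilarity = 1.0 - overlap / 100.0
coords = classical_mds(dissimilarity)
print(coords.shape)  # (4, 2)
```

Plotting `coords` with player names as labels would give a map in which players 0 and 1 sit close together and well away from players 2 and 3.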
The map that emerges represents the clustering of players. In the map below player names are coloured according to their OPTA position.
Figure 7. Cluster Mapping: Players with Similar Ranges appear Close Together
We can see that a considerable degree of organisation has emerged from the analysis. Goalkeepers tend to cluster towards the bottom left corner of the map and Forwards towards the top right, with bands of Defenders and Midfielders appearing in succession as we move diagonally across the map from bottom left to top right. Right-sided players appear on the right-hand side of the diagonal, and left-sided players on the left.
This kind of map enables us to see where players fit in. For example, in the matches in our sample, Gerrard, Coutinho and Lambert, who are classified as Midfielders by OPTA, were indistinguishable from Forwards in their use of space. Rooney and Bony are shown as playing deeper than traditional forwards in the matches I examined.
THE BOTTOM LINE
This post has shown that spatial location data can be used to develop cogent and useful metrics for evaluating and classifying players. A simple metric that quantifies player locations and their interactions may have considerable potential for scouting and match analysis.