An old cricketing adage says that “Form is temporary, class is permanent.”  As we know from football, even a good player can experience a dip in performance or a loss of confidence, while his innate skill or ability is presumed not to have changed.  In psychology the  distinction between a fluctuating ‘state’ and a persisting characteristic is known as the State-Trait distinction.

A Trait refers to a fairly stable characteristic of a person, or a persistent pattern of behaviour they display.  For example, individuals high on Trait anxiety will routinely be more anxious than people low on Trait anxiety, across a wide variety of situations.  A State, on the other hand, represents an individual’s experience or behaviour in a given moment or particular situation.  For example, State anxiety tends to rise in the presence of stressful conditions such as the diagnosis of an illness or money problems.  This kind of anxiety comes and goes.  Expressed statistically, a State is determined by the interaction between a person and an occasion, and reflects the individual’s particular adaptation to the situation.

In this post I will use Generalizability Theory (G-theory) to examine the State-Trait distinction for some well-known football metrics. G-theory allows us to decompose the variance of a variable into separate components. Here I will use a simple three-component model to examine the stability of these metrics across occasions.  The first source of variance, and the first component of the G-theory model, is the person, in this case the Player. Football metrics vary because (amongst other things) players differ in their ability, so the Player component is the Trait component.  The second source of variance in the model is Occasion; football metrics may differ on average from one occasion to another.  The third source of variance is the Player x Occasion interaction. This component arises because differences in performance across occasions are not necessarily constant, but vary by player.  The Player x Occasion component represents the State variance of the metric.
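To make the decomposition concrete, here is a minimal sketch of how the three components could be estimated for a balanced Player x Occasion table with one observation per cell, using the classical ANOVA expected-mean-squares approach. This is an illustration under stated assumptions, not necessarily the estimator behind the tables in this post: real data with players missing from some seasons would normally call for a mixed-model estimator. With one observation per cell, the interaction component also absorbs residual error. The function name and the simulated data are hypothetical.

```python
import numpy as np

def variance_components(scores):
    """ANOVA estimates for a balanced players x occasions table.

    scores: 2-D array, rows = players, columns = occasions, one value per cell.
    Returns (player, occasion, interaction) variance components. In this
    design the interaction term cannot be separated from residual error.
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_o = scores.shape
    grand = scores.mean()
    player_means = scores.mean(axis=1)
    occasion_means = scores.mean(axis=0)

    # Mean squares for the two main effects and the interaction/residual.
    ms_p = n_o * np.sum((player_means - grand) ** 2) / (n_p - 1)
    ms_o = n_p * np.sum((occasion_means - grand) ** 2) / (n_o - 1)
    resid = scores - player_means[:, None] - occasion_means[None, :] + grand
    ms_po = np.sum(resid ** 2) / ((n_p - 1) * (n_o - 1))

    # Solve the expected-mean-square equations; negative estimates are set to 0.
    var_p = max((ms_p - ms_po) / n_o, 0.0)
    var_o = max((ms_o - ms_po) / n_p, 0.0)
    return var_p, var_o, ms_po

# Simulated (hypothetical) data: Trait variance 9, State variance 4, no Occasion effect.
rng = np.random.default_rng(42)
n_players, n_occasions = 2000, 3
trait = rng.normal(0.0, 3.0, size=(n_players, 1))
state = rng.normal(0.0, 2.0, size=(n_players, n_occasions))
scores = 10.0 + trait + state

var_p, var_o, var_po = variance_components(scores)
print(f"Player: {var_p:.2f}  Occasion: {var_o:.4f}  Player x Occasion: {var_po:.2f}")
```

The estimates should land close to the simulated values, with an Occasion component near zero.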

To formalise the calculations, I use a Trait Component Index (TCI) and a State Component Index (SCI) which are defined as follows (see Medvedev et al., 2017):

TCI = {\sigma^2_P \over \sigma^2_P + \sigma^2_{PS}}

SCI = {\sigma^2_{PS} \over \sigma^2_P + \sigma^2_{PS}}

Here \sigma^2_P is the Player or Trait variance component, and \sigma^2_{PS} is the Player x Season interaction variance, which represents the State component.  A metric with a high Trait component and a low State component is a metric that predominantly measures player ability; for the purposes of player evaluation and recruitment, such metrics are preferable to those with a high State component, because they will be more likely to be reproduced on different occasions.
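As a quick check on the two indices, the sketch below computes them from a pair of variance components, with the interaction (State) variance in the numerator of the SCI, which is what reproduces the SCI column of Table 1. The function name is hypothetical; the input values are the Total Goals components reported in Table 1.

```python
def trait_state_indices(var_player, var_interaction):
    """Return (TCI, SCI) as proportions of the combined State + Trait variance."""
    total = var_player + var_interaction
    if total <= 0:
        raise ValueError("State + Trait variance must be positive")
    return var_player / total, var_interaction / total

# Total Goals row of Table 1: Player = 9.908, Player x Season = 8.37
tci, sci = trait_state_indices(9.908, 8.37)
print(f"TCI = {tci:.1%}, SCI = {sci:.1%}")  # TCI = 54.2%, SCI = 45.8%
```

Because both indices share the same denominator, they always sum to 100%.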

In the first analysis I use Season as the occasion. Table 1 shows the variance components for nine different attacking metrics, covering goals, xG and assists.  The data is taken from the 2016, 2017 and 2018 seasons of the Big 5 European leagues, and is for players whose predominant position is either Forward or Attacking Midfielder.  Seasons in which a player played fewer than 540 minutes are excluded.

Table 1. Variance component analysis for nine football metrics (Occasion = Season)

| Metric (DV) | Player (Trait) | Season | Player*Season (State) | State+Trait | TCI | SCI |
|---|---|---|---|---|---|---|
| Total Goals | 9.908 | 0.011 | 8.37 | 18.278 | 54.2% | 45.8% |
| Goals per 90 minutes | 0.011 | 2.05E-05 | 0.014 | 0.025 | 44.0% | 56.0% |
| Goals per 100 passes | 0.261 | 0 | 0.236 | 0.497 | 52.5% | 47.5% |
| Total xG | 7.747 | 0.031 | 4.133 | 11.88 | 65.2% | 34.8% |
| xG per 90 minutes | 0.008 | 7.74E-05 | 0.005 | 0.013 | 61.5% | 38.5% |
| xG per 100 passes | 0.239 | 0.002 | 0.105 | 0.344 | 69.5% | 30.5% |
| Total Assists | 107.361 | 0.008 | 61.287 | 168.648 | 63.7% | 36.3% |
| Assists per 90 minutes | 0.124 | 0 | 0.071 | 0.195 | 63.6% | 36.4% |
| Assists per 100 passes | 3.51 | 0.011 | 1.359 | 4.869 | 72.1% | 27.9% |

First, Table 1 shows that the variance component for Season is vanishingly small. That is what we would expect, because there are no characteristics of particular seasons that affect average performance; in other words, Season is not a meaningful source of variance. Next, the variance components for the per 90 metrics are a little counter-intuitive. It might be expected that the Trait component for a per 90 metric would be higher than the Trait component for a totals metric, because the per 90 metric adjusts for a factor (minutes played) that should reduce non-ability differences in player scores.  However, the TCI for Goals per 90 is actually lower than the TCI for Total Goals, and the TCI for xG per 90 is lower than the TCI for Total xG, the reverse of what we would expect.

Nevertheless, an important takeaway from Table 1 is that the three xG metrics are more Trait-like than the corresponding Goals metrics. The TCIs for the Goals metrics range from 44.0% to 54.2%, while the TCIs for the xG metrics range from 61.5% to 69.5%.  In particular, xG per 100 passes (TCI = 69.5%) has a considerably stronger Trait component than the often-used Goals per 90 (TCI = 44.0%).  But the key finding here is that the per 100 metrics consistently have a higher TCI than the corresponding per 90 metrics.  This means that, from season to season, the per 100 metrics are more reliable indicators of player performance than the per 90 metrics.

In the next analysis I look at transfers, and use Team as the occasion instead of Season. By the same logic as before, a high Player component indicates a strong Trait characteristic, and a high Player x Team interaction indicates a strong State characteristic.

Table 2. Variance component analysis for nine football metrics (Occasion = Team)

| Metric (DV) | Player (Trait) | Team | Player*Team (State) | State+Trait | TCI | SCI | %Team |
|---|---|---|---|---|---|---|---|
| Total Goals | 21.494 | 13.388 | 37.62 | 59.114 | 36.4% | 63.6% | 18.5% |
| Goals per 90 minutes | 0.005 | 4.00E-03 | 0.012 | 0.017 | 29.4% | 70.6% | 19.0% |
| Goals per 100 passes | 0.207 | 0.027 | 0.21 | 0.417 | 49.6% | 50.4% | 6.1% |
| Total xG | 19.384 | 8.647 | 26.052 | 45.436 | 42.7% | 57.3% | 16.0% |
| xG per 90 minutes | 0.005 | 2.00E-03 | 0.005 | 0.01 | 50.0% | 50.0% | 16.7% |
| xG per 100 passes | 0.193 | 0.014 | 0.116 | 0.309 | 62.5% | 37.5% | 4.3% |
| Total Assists | 266.584 | 103.35 | 376.589 | 643.173 | 41.4% | 58.6% | 13.8% |
| Assists per 90 minutes | 0.095 | 0.023 | 0.068 | 0.163 | 58.3% | 41.7% | 12.4% |
| Assists per 100 passes | 3.218 | 0.091 | 1.387 | 4.605 | 69.9% | 30.1% | 1.9% |

The pattern is similar to the previous analysis.  First, the unexpected drop in the TCI of Goals per 90 (29.4%) compared with Total Goals (36.4%) is repeated here, but this time the TCI for xG per 90 (50.0%) is higher than the TCI for the corresponding Total xG (42.7%). Secondly, the xG metrics are once again more Trait-like than their Goals counterparts, and xG per 100 passes (TCI = 62.5%) is the most Trait-like of the six goal-scoring metrics.

Finally, the per 100 metrics all have a higher TCI than their per 90 counterparts. This has implications for technical scouting.  When moving to a different team, the per 100 metrics will be more robust to change than the per 90 metrics.

Why per 90 metrics are contaminated

A natural question to ask is why player per 90 metrics are more subject to change than their per 100 counterparts.  One part of the answer is that per 90 metrics are more influenced by the player’s team than per 100 metrics.  To see this, look at the %Team column in Table 2, which shows the percentage of variance in the DV due to the Team factor. For all three measures (goals, xG and assists), Team is much more influential for the per 90 metric than for the per 100 metric.
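The %Team figures in Table 2 are consistent with dividing the Team component by the sum of all three components (Player, Team and Player x Team). That reading is inferred from the numbers in the table rather than stated outright, so treat the sketch below as an assumption; the function name is hypothetical.

```python
def team_share(var_player, var_team, var_interaction):
    """Share of total variance attributable to Team (assumed definition of %Team)."""
    total = var_player + var_team + var_interaction
    return var_team / total

# Variance components taken from Table 2.
goals_per_90 = team_share(0.005, 0.004, 0.012)
goals_per_100 = team_share(0.207, 0.027, 0.210)
print(f"Goals per 90: {goals_per_90:.1%}, Goals per 100 passes: {goals_per_100:.1%}")
# Goals per 90: 19.0%, Goals per 100 passes: 6.1%
```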

A possible explanation for this is that players get more opportunities in strong teams than in weak teams, thus boosting their metrics and creating variability when they transfer between teams.  To test this possibility I looked at the effect of team strength on the metrics, while controlling for player. The average team strengths for each season were calculated from Club Elo scores.   I estimated two mixed models with random intercepts for player:

{Goals_{ijk} = \beta_0 + u_{0i} + \beta_1 Elo_{jk} + \beta_2 Season_k + \beta_3 Minutes Played_{ijk} + \beta_4 (Minutes Played_{ijk} \times Elo_{jk}) + e_{ijk}} … (1)

{Goals_{ijk} = \beta_0 + u_{0i} + \beta_1 Elo_{jk} + \beta_2 Season_k + \beta_3 Passes Received_{ijk} + \beta_4 (Passes Received_{ijk} \times Elo_{jk}) + e_{ijk}} … (2)

where i = player, j = team and k = season.  Goals_{ijk}, Minutes Played_{ijk}, Passes Received_{ijk} and e_{ijk} are respectively the goals scored, minutes played, passes received and error term for player i at team j in season k.  Elo_{jk} is the average Elo rating of team j in season k.  \beta_0 is the overall intercept and u_{0i} is the random intercept for player i.  \beta_1, \beta_2, \beta_3 and \beta_4 are fixed-effect regression coefficients (for simplicity, the coefficients for the season dummy variables are represented by the single coefficient \beta_2).  Equation (1) is the model corresponding to the per 90 metric, and equation (2) is the model corresponding to the per 100 metric.
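For readers who want to try a model of this shape, here is a minimal sketch of equation (1) fitted to simulated data with the statsmodels mixed-model interface. Everything about the data is hypothetical: the column names, the simulated coefficients, and the choice to centre and scale Elo and minutes played are assumptions for illustration, not a description of how the results in this post were produced. Equation (2) would follow by swapping the minutes column for passes received.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_players, n_seasons = 200, 3

# Simulated, hypothetical data: one row per player-season.
player = np.repeat(np.arange(n_players), n_seasons)
season = np.tile(np.arange(n_seasons), n_players)
u0 = rng.normal(0.0, 1.0, n_players)[player]      # random intercept per player
elo_c = rng.normal(0.0, 1.0, player.size)         # team Elo, centred and scaled
minutes_c = rng.normal(0.0, 1.0, player.size)     # minutes played, centred and scaled
goals = (5.0 + u0 + 1.0 * elo_c + 2.0 * minutes_c
         + 0.8 * elo_c * minutes_c + rng.normal(0.0, 1.0, player.size))

data = pd.DataFrame({
    "player": player,
    "season": season,
    "elo_c": elo_c,
    "minutes_c": minutes_c,
    "goals": goals,
})

# "elo_c * minutes_c" expands to both main effects plus their interaction,
# C(season) adds the season dummies, and groups= gives each player a random intercept.
model = smf.mixedlm("goals ~ elo_c * minutes_c + C(season)", data, groups=data["player"])
result = model.fit()
print(result.fe_params)
```

With centred predictors, the coefficient on `elo_c` is the effect of team strength for a player with average minutes, and the `elo_c:minutes_c` coefficient shows how that effect changes with playing time.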

The key results are shown in Figure 1 below.

Figure 1. Mixed Model Effects Plots

The left-hand chart shows the results for the per 90 metric.  The dotted vertical line at x = 0 represents players with an average number of minutes played.  We can see that the goals scored at this line differ from panel to panel. In a team with a very low Elo rating, the average is about 2.5 goals.  In a team with a very high Elo rating, the average is about 7.5 goals.

The right-hand chart shows the results for the per 100 metric.  Here we can see that the goals scored for players with an average number of passes received is about the same whatever the Elo rating of the team.

The Bottom Line

It is easy to understand why the xG metrics are more Trait-like than their Goals counterparts. The Goals metrics are subject to an extra source of noise, caused by, amongst other things, small random effects in the strike and in the goalkeeper's response.  This makes Goals-based metrics less reproducible across Occasions than the comparable xG-based metrics, even when aggregated across a whole season or within a spell at a club.  Interestingly, several of the per 90 metrics (Goals per 90 in both analyses, and xG per 90 in the season analysis) are less Trait-like than their Totals counterparts.  This is an unexpected result and at present I don’t have an explanation.

The main finding is that per 100 metrics are more stable across teams and seasons than per 90 metrics. This appears to be due, at least in part, to differences in team strength, which affect player per 90 ratings noticeably more than per 100 ratings.  Technical scouting departments should be aware that the per 90 ratings examined here flatter players in high-ranking teams, and are therefore not ideal for comparing players.  The recommendation is to use per 100 metrics instead.

Leader Board for a Trait-heavy Metric

One clear finding is that the per 100 metrics are consistently more Trait-like than the corresponding per 90 metrics, and they are therefore probably better indicators of skill and ability.  With this in mind, Table 3 lists the top 30 players for Goals per 100 passes received. The numbers are averages over the 2016, 2017 and 2018 seasons and are for open play goals only. Players must have played a minimum of 1800 minutes (20 matches equivalent).

Table 3. Goals per 100 Passes: Top 30 Players.

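| Player | Goals/100 Rank | Goals/100 | Goals/90 Rank | Goals/90 |
|---|---|---|---|---|
| Edinson Cavani | 1 | 4.6 | 3 | 0.71 |
| Moussa Dembele | 2 | 3.2 | 11 | 0.58 |
| Paco Alcacer | 3 | 3.12 | 8 | 0.59 |
| Pierre-Emerick Aubameyang | 4 | 3.09 | 5 | 0.62 |
| Mariano | 5 | 3.08 | 17 | 0.52 |
| Mauro Icardi | 6 | 3.03 | 50 | 0.41 |
| Krzysztof Piatek | 7 | 2.88 | 13 | 0.56 |
| Kylian Mbappe | 8 | 2.74 | 1 | 0.82 |
| Cedric Bakambu | 9 | 2.59 | 11 | 0.58 |
| Robert Lewandowski | 10 | 2.51 | 5 | 0.62 |
| Jamie Vardy | 11 | 2.48 | 63.5 | 0.37 |
| Robert Beric | 12 | 2.47 | 57.5 | 0.39 |
| Michy Batshuayi | 13 | 2.46 | 19.5 | 0.51 |
| Harry Kane | 14 | 2.42 | 14.5 | 0.53 |
| Jean-Kevin Augustin | 15 | 2.38 | 14.5 | 0.53 |
| Raul de Tomas | 16 | 2.31 | 57.5 | 0.39 |
| Loren Moron | 17 | 2.23 | 63.5 | 0.37 |
| Anthony Modeste | 18 | 2.22 | 17 | 0.52 |
| Bafetimbi Gomis | 19 | 2.15 | 35.5 | 0.46 |
| Roger Marti | 20 | 2.13 | 57.5 | 0.39 |
| Arkadiusz Milik | 21 | 2.12 | 23 | 0.5 |
| Cheick Diabate | 22 | 2.11 | 42.5 | 0.44 |
| Gonzalo Higuain | 23.5 | 2.07 | 23 | 0.5 |
| Gabriel Jesus | 23.5 | 2.07 | 8 | 0.59 |
| Sergio Aguero | 25.5 | 2.05 | 11 | 0.58 |
| Mohamed Salah | 25.5 | 2.05 | 5 | 0.62 |
| Angel Rodriguez | 27 | 2.02 | 57.5 | 0.39 |
| Luis Suarez | 29 | 2 | 8 | 0.59 |
| Sergio Leon | 29 | 2 | 121.5 | 0.31 |
| Anastasios Donis | 29 | 2 | 46 | 0.43 |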

Table 3 highlights some differences between the rankings for Goals per 90 minutes and Goals per 100 passes.  Mo Salah, for example, is ranked 5th equal on the per 90 metric, and Jamie Vardy is ranked 63rd equal.  However, when we look at their per 100 rankings, the order changes: Vardy is ranked 11th and Salah 25th equal. What this suggests is that Goals per 90 is more contextually dependent than Goals per 100 passes, and tends to flatter players in the best teams.  This would be consistent with the idea that Goals per 100 passes is a more Trait-like metric.
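The fractional ranks in Table 3 (23.5, 25.5 and so on) look like average ranks for tied values. The sketch below shows one way the metric and the tied ranks could be computed. It assumes that Goals per 100 passes is simply 100 x goals / passes received and that ties are resolved on the rounded values shown in the table; the function names and the 20 goals / 800 passes example are hypothetical.

```python
def per_100_passes(goals, passes_received):
    """Goals per 100 passes received (assumed definition)."""
    return 100.0 * goals / passes_received

def average_ranks(values):
    """Rank values in descending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

# Hypothetical player: 20 goals from 800 passes received.
print(per_100_passes(20, 800))  # 2.5

# Goals/100 values for the last eight rows of Table 3 (places 23 to 30).
tail = [2.07, 2.07, 2.05, 2.05, 2.02, 2.0, 2.0, 2.0]
table_ranks = [rank + 22 for rank in average_ranks(tail)]
print(table_ranks)  # [23.5, 23.5, 25.5, 25.5, 27.0, 29.0, 29.0, 29.0]
```

The offset of 22 accounts for the 22 players ranked above this group, and the result matches the Goals/100 Rank column for those rows.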