I had intended Part 2 of this series to be about using expected goals (xG) models to predict match scores. But something more important came up.
Jan Mullenberg recently discovered that the Opta Big Chance indicator is a very powerful predictor of goals. In Part 1 of this series, I found that the Big Chance indicator by itself predicts goals almost as well as a basic xG model containing several location and pattern-of-play features, and David Sumpter also found that the Opta Big Chance feature on its own predicted goals almost as well as a model with 18 features in it. These kinds of findings call into question the value of xG models. Do they really contain any valuable information? Are they little more than glorified counters of easy goal-scoring opportunities?
Nils Mackay benchmarked a number of well-known xG models and found that models with a Big Chance indicator systematically outperformed models without the indicator. As there is no formal definition of a Big Chance, and the designation of an attempt as a Big Chance is at the discretion of the Opta match coders, there has been some speculation that the indicator might be contaminated with outcome bias; in other words, match coders are more likely to tag a shot as a Big Chance after they have seen the goal attempt succeed. This would certainly explain its power to predict goals, and rescue the xG model, which after all cannot be expected to outperform an indicator whose value is partly determined by the thing it is trying to predict.
But natural justice demands that Opta match coders be regarded as innocent until proven guilty!
So in this post I want to present the case for the, er… defence. First I want to see if the Big Chance indicator is consistently related to more objective features of the attempt on goal. If it is, then I would be less inclined to believe it is a biased variable.
The Shots Dataset(s)
In this analysis I use shot features from two data suppliers, Opta and Stratagem. Opta do not currently collect defensive metrics for an attempt; Stratagem do. I used shots from the 2016-17 seasons of the EPL and La Liga. Out of 18,879 shots recorded by Opta, I was able to identify 12,579 with both an Opta Big Chance indicator and Stratagem defensive metrics. Shots were deemed to be matched if they were made by the same player in the same match, within 15 seconds and 5 metres of each other in both datasets. One reason why more shots were not matched is a difference of focus between the two suppliers: Opta are focussed on the sports analytics industry and record shots, while Stratagem are focussed on sports trading and record chances. So for example, Stratagem record ‘Dangerous Moments’, where a player just fails to make contact with a ball close to goal; these are not recorded as shots by Opta. Similarly, shots from distance that are blocked early are recorded by Opta because they are shots, but Stratagem don’t record them because they are not meaningful ‘chances’.
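The matching rule described above (same player, same match, within 15 seconds and 5 metres) can be sketched as a pandas merge. This is a minimal illustration, not the actual matching code; the column names (`match_id`, `player_id`, `time`, `x`, `y`) are hypothetical stand-ins for whatever the real Opta and Stratagem feeds contain.

```python
import pandas as pd

def match_shots(opta: pd.DataFrame, strata: pd.DataFrame,
                max_dt: float = 15.0, max_dist: float = 5.0) -> pd.DataFrame:
    """Pair each Opta shot with a Stratagem chance by the same player
    in the same match, within max_dt seconds and max_dist metres.
    Column names are illustrative, not the real feed schemas."""
    # Candidate pairs: same match, same player.
    cand = opta.merge(strata, on=["match_id", "player_id"],
                      suffixes=("_opta", "_strata"))
    # Keep only pairs close enough in time and space.
    dt = (cand["time_opta"] - cand["time_strata"]).abs()
    dist = ((cand["x_opta"] - cand["x_strata"]) ** 2 +
            (cand["y_opta"] - cand["y_strata"]) ** 2) ** 0.5
    return cand[(dt <= max_dt) & (dist <= max_dist)]
```

In practice one would also need to de-duplicate cases where a shot matches more than one chance (e.g. keep the nearest in time), but the filter above captures the basic idea.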
Characteristics of Big Chances
In this section I exclude penalties and free kicks. Penalties are always big chances and free kicks are never big chances, so including these unconditional categories of Big Chance in the analysis might mask any outcome bias in the Big Chance indicator. This left 12,391 shots for analysis.
I begin by proposing that a Big Chance is a chance emerging a) close to goal b) with a clear sight of goal and c) space and time to make the shot. But what do the numbers say?
We have indicators of all three features in our combined dataset. Closeness to goal is measured by the x-coordinate; path to goal is represented by Players in front of Goal; and space and time to shoot is measured by Defensive Pressure. (The x coordinate is recorded by Opta and runs from 0 to 100, with 100 representing the goal line the shooter is attacking and 0 the opposite goal line. Defensive Pressure and Players in front of Goal are recorded by Stratagem and are defined here.) The next three figures show how these three metrics vary between Big Chances and regular shots on goal.
Figure 1 shows that Big Chances fall almost entirely goal-side of the 18-yard line (edge of the area); only about half of regular shots do.
Figure 2 shows that Big Chances are more likely than regular chances to be contested by light Defensive Pressure or none at all. Regular chances are more likely to be contested by low to intense pressure.
Figure 3 shows that over 60% of Big chances are taken with only one defender between the shooter and the goal. (This defender is almost always the goalkeeper.)
So far so good. These figures indicate systematic differences between Big Chances and regular chances that conform to expectations, and raise the question: can we predict whether a shot is a Big Chance from its features?
The Case for the Defence: A Predictive Model for Big Chances
In this section I develop a predictive model for Big Chances. Once again I exclude penalties and free kicks because of their consistent relationship with Big Chances. The model outcome is the Big Chance indicator, and the shot features used as predictors are:
Shot location features: (i) Opta x coordinate (ii) Distance from mid-line (0 = mid-line, 50 = touchline) (iii) Angle subtended between ball and goal-posts
Defensive Pressure (0 = None … 5 = Intense)
(Opposition) Players In Front of Goal
To check whether the influence of the defensive metrics depends on the shot location, I also include their interactions with the location features in the model.
For model fitting purposes I split the data at random into a training set (70%) and a test set (30%), and I fitted the training data with a logistic regression model.
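As a sketch of this fitting procedure, here is the 70/30 split and logistic regression in scikit-learn, with the location-by-defence interaction terms built by hand. Since the StrataData shots are proprietary, the example runs on synthetic data whose feature names and effect sizes are purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
# Synthetic stand-ins for the real shot features (names are illustrative).
x_coord = rng.uniform(50, 100, n)      # Opta x: 100 = goal line
y_dist = rng.uniform(0, 50, n)         # distance from mid-line
pressure = rng.integers(0, 6, n)       # 0 = none ... 5 = intense
defenders = rng.integers(0, 6, n)      # players in front of goal

# Location features, defensive metrics, and their interactions.
X = np.column_stack([x_coord, y_dist, pressure, defenders,
                     x_coord * pressure, x_coord * defenders])

# Synthetic Big Chance outcome, loosely mimicking the patterns in
# Figures 1-3: close to goal, little pressure, few defenders.
logit = 0.3 * (x_coord - 93) - 0.8 * pressure - 0.9 * (defenders - 1)
y = rng.random(n) < 1 / (1 + np.exp(-logit))

# 70% training / 30% test split, then fit a logistic regression.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```

On real data one would typically standardise the features and build the interactions with something like `PolynomialFeatures`, but the hand-rolled columns keep the structure of the model visible.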
The model fit was very good. The fit statistics in the test sample were: MAE (Mean Absolute Error) 0.125; RMSE (Root Mean Square Error) 0.25; McFadden’s pseudo R² 0.49 – an exceptionally high value for this metric; and Area Under the ROC Curve (AUC) 94%, also a commendably high value. (I got a slightly better fit using a gradient boosting model, but that’s a story for another day.) The ROC curve is shown in Figure 4. (See here for a tutorial on ROC curves if you haven’t come across them before.)
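For readers who want to reproduce these fit measures, here is one way to compute them from a vector of outcomes and predicted probabilities. The MAE/RMSE definitions (mean absolute and root-mean-square gaps between outcome and predicted probability) and the McFadden formula are standard, but I am assuming they match the post's definitions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def fit_stats(y_true, p_hat):
    """Test-set fit measures: MAE, RMSE, McFadden's pseudo R^2, AUC.
    Definitions assumed to match those used in the post."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_hat, dtype=float)
    mae = np.mean(np.abs(y - p))
    rmse = np.sqrt(np.mean((y - p) ** 2))
    # McFadden's pseudo R^2: 1 - LL(model) / LL(intercept-only model).
    eps = 1e-12  # guard against log(0)
    ll_model = np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    p0 = y.mean()  # intercept-only model predicts the base rate
    ll_null = np.sum(y * np.log(p0 + eps) + (1 - y) * np.log(1 - p0 + eps))
    r2 = 1 - ll_model / ll_null
    return {"MAE": mae, "RMSE": rmse, "McFadden R2": r2,
            "AUC": roc_auc_score(y, p)}
```

Note that MAE and RMSE on a binary outcome are bounded below by the irreducible noise in the outcome itself, which is worth remembering when comparing models later in the post.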
The conclusion I draw is that the Big Chance indicator is not arbitrary; it exhibits a strong relationship with features of the attempt like its location and associated defensive parameters recorded by independent observers. While this doesn’t disprove the existence of outcome bias, it certainly offers a persuasive alternative explanation; attempts are designated as Big Chances – and consistently so – because they embody a certain combination of location and defensive features.
This opens up an interesting possibility. Can we use defensive features to build an xG model that performs as well as, or even better than, a model with a Big Chance indicator? Since defensive metrics are recorded for all shots, we might see an all-round improvement. I explore this in the next section.
Expected Goals Models
I evaluated four models, this time including penalties and free kicks:
The Standard Model. This is a typical basic xG model which has three location features (x, y, angle) and three pattern of play features (assisted/non-assisted; header/non-header; shooting context). Shooting context is one of the seven types defined by Opta: Penalty, Regular play, Fast break, Set piece, From corner, Free kick, Throw-in set piece.
Big Chance Model. One predictor only – the Opta Big Chance feature.
Standard + Big Chance. Same predictors as the Standard Model, but with the Opta Big Chance feature added.
Standard + Defensive metrics. Same predictors as the Standard Model, but with defensive features added.
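The four specifications differ only in their predictor sets, so they can be organised as a simple lookup before fitting. The feature names below are illustrative stand-ins, not the actual column names in either data feed.

```python
# Predictor sets for the four xG models (feature names are illustrative).
LOCATION = ["x", "y", "angle"]
PLAY = ["assisted", "header", "shooting_context"]
DEFENSIVE = ["defensive_pressure", "players_in_front_of_goal"]

MODELS = {
    "Standard": LOCATION + PLAY,
    "Big Chance": ["big_chance"],
    "Standard + Big Chance": LOCATION + PLAY + ["big_chance"],
    "Standard + Defensive": LOCATION + PLAY + DEFENSIVE,
}

for name, features in MODELS.items():
    print(f"{name}: {len(features)} predictors")
```

Structuring the comparison this way means the same fit-and-evaluate loop can be run over all four models, so any difference in the fit measures is attributable to the predictor set alone.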
Once again I used logistic regression to estimate the models. Here is a summary of the fit measures:
| Model | MAE | RMSE | McFadden’s R² | AUC |
|---|---|---|---|---|
| Standard + Big Chance | 0.19 | 0.30 | 0.26 | 82.0% |
| Standard + Defensive | 0.20 | 0.31 | 0.23 | 81.4% |
There are several features to note in this table. First, all four models are virtually indistinguishable on the MAE and RMSE measures. The Big Chances model does surprisingly well for a single predictor model, but its AUC looks somewhat inferior to the other three. Overall there is not much to choose between the Standard Model and its two augmented versions, but the AUC suggests the augmented versions fit slightly better. Figure 5 compares the ROC curves for the three variations on the Standard Model. They are all fairly close, but augmenting the Standard Model with either Big Chances or defensive metrics does seem to add some discriminatory power.
Finally, it seems that you can replace the Big Chance indicator with associated defensive metrics without sacrificing much predictive power. I suspect that if Big Chances were subject to outcome bias you would lose predictive power. Modellers suspicious of Big Chances do have an alternative – at least in theory: use defensive metrics instead.
What Have We Learned?
The most exciting finding for me is that the Big Chance designation can be predicted by more objective shot features. OK, assigning a number to Defensive Pressure is still a subjective exercise, but the metric has an easily observable zero, and is likely to be quite accurate at the lower end at least. Counting the number of defenders in front of the goal is of course also subject to human error, but again it is likely to be quite accurate when the number of defenders is low.
The idea that Big Chances include outcome bias originally arose from a misunderstanding about modelling benchmarks in a recent test of several different xG models. Models with the Big Chance indicator seemed to be performing better than a supposedly ‘perfect’ model that was used to define the upper bound of performance. This led to some anxiety and confusion among xG’ers, as you can hear in this podcast (about 5 minutes in). But the supposedly perfect model wasn’t a perfect model at all; it was simply a model that was assessed on the data used to generate it. There is therefore no need to think the Big Chance indicator possesses any dark supernatural powers allowing it to improve on perfection. Big Chances are simply shots with a very high probability of scoring, which is why they do such a good job of predicting actual goals.
I hoped that adding defensive metrics to the predictor mix would improve the fit of my basic xG model more than it actually did. I know my Standard Model is pretty simple compared to some, and is no doubt outperformed by other more complex models. But my feeling is we may already be close to the limits of what xG models can do.
Think about the difference between the fit of the Big Chances model and the fit of the xG models I have estimated in this blog post: predicting a Big Chance is far easier than predicting a goal. I think the critical distinction is that unlike a shot on goal, a Big Chance call arises from a perceptual process that is not subject to occasional catastrophic failure. Close calls may be assigned to the wrong category, but there is nothing equivalent to the massive predictive error that occurs when a player facing an open goal kicks the ball wildly into the stands, or tamely straight at the goalkeeper. Perhaps there is a substantial and irreducible degree of noise in goal scoring, and we are in a region of diminishing returns where we can tinker a bit at the edges of our models without seeing much increase in fit.
That said, small improvements or changes in xG models may have more noticeable effects when predicting long-run performance. It would certainly be worth investigating whether augmenting an xG model with defensive features predicts long-run performance better than augmentation by Big Chances, and whether outlying performances are brought closer to their predicted scores.
This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.