This post will describe some of the properties of possession sequences, using data from seasons 2015-2017 of the Big 5 European domestic leagues. In particular I will look at how the probability of shooting develops as a possession progresses. We will see some differences between strong and weak teams even after controlling for game state.
A possession is defined as a succession of events in which a team retains control of the ball. A possession ends when the opposition regains control, when a shot is taken, or at the end of a period. Possession may continue across breaks in play: for example, a foul by the opposing team would not terminate a possession, nor would an unsuccessful pass that hits an opposition player and sends the ball out of play for a throw-in. On the other hand, an unsuccessful pass that goes straight out of play or is gathered by an opposition player would end the possession. This concept of possessions is intended to convey a continuous period of control of the game. Most possessions are actually quite short; about 78% consist of 5 or fewer actions. In a typical match, each team will have about 115 possessions.
As shown in Figure 1, the distribution of possession lengths defined in this way follows a geometric distribution quite well. This accords with theoretical expectations. A geometric distribution models the number of successes in a series of binary success/fail events before the first failure. In our case a successful event is defined as continuation of the possession, which corresponds roughly to a successful pass, and failure corresponds roughly to an unsuccessful pass. The probability mass function P(k) of the geometric distribution is:
P(k) = (1 - p)^k p
where k = {0, 1, 2, 3, …} is the number of events preceding the terminating event and p is the probability of failure (unsuccessful pass).
Because the geometric distribution models the number of events before the terminating event, note that possession lengths start at zero. A solitary unsuccessful pass for example has a length of zero, because zero events precede termination of the possession. In Figure 1, the fitted distribution has p = 0.217. This is very close to the actual pass failure rate of 0.228.
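As a rough illustration of the fit, here is a minimal Python sketch that simulates possession lengths with a fixed failure probability and recovers p by maximum likelihood. For the geometric form P(k) = (1 - p)^k p with k starting at zero, that estimate is 1 / (1 + mean length). The simulated data, the function names and the sample size are assumptions for illustration only; this is not the code or data behind Figure 1.

```python
import random


def simulate_possession_length(p_fail, rng):
    """Count the events that precede the terminating event."""
    k = 0
    # Each event continues the possession with probability 1 - p_fail.
    while rng.random() >= p_fail:
        k += 1
    return k


def fit_geometric(lengths):
    """Maximum-likelihood estimate of p for P(k) = (1 - p)^k * p, k = 0, 1, 2, ..."""
    mean_length = sum(lengths) / len(lengths)
    return 1.0 / (1.0 + mean_length)


def geometric_pmf(k, p):
    """Probability that exactly k events precede the terminating event."""
    return (1.0 - p) ** k * p


rng = random.Random(0)  # fixed seed so the sketch is repeatable
lengths = [simulate_possession_length(0.217, rng) for _ in range(100_000)]

p_hat = fit_geometric(lengths)
share_short = sum(1 for k in lengths if k <= 5) / len(lengths)

print(f"estimated p = {p_hat:.3f}")
print(f"share of possessions with length <= 5 = {share_short:.3f}")
```

With real data, `lengths` would simply be the observed possession lengths, counted from zero as described above.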
1. Determinants of Possession length
I looked at two factors which might be expected to determine possession length: team strength and game state. Team strength was measured by a team’s Elo rating at the start of the 2015 season; above-average teams were classified as High strength, and below-average teams as Low strength. For game state I used the current goal difference in the match. Possession lengths were modelled using Gamma regression, which is appropriate for the kind of right-skewed data depicted in Figure 1.
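The grouping described above can be sketched in a few lines of Python. This is only a plain tabulation of mean possession length by strength class and goal difference, not the Gamma regression itself. The team names, Elo ratings and possession records are made up, and reading "average" as the arithmetic mean of the ratings is an assumption.

```python
from collections import defaultdict

# Hypothetical Elo ratings at the start of the season (illustrative values only).
elo = {"Team A": 1850, "Team B": 1620, "Team C": 1710, "Team D": 1540}

# Assumption: "above average" means above the arithmetic mean rating.
mean_elo = sum(elo.values()) / len(elo)
strength = {team: ("High" if rating > mean_elo else "Low")
            for team, rating in elo.items()}

# Hypothetical records: (team, goal difference when the possession began, length).
possessions = [
    ("Team A", 0, 6), ("Team A", -1, 9), ("Team B", 0, 3),
    ("Team B", -1, 4), ("Team C", 1, 5), ("Team D", 1, 2),
    ("Team D", 0, 3), ("Team C", 0, 7),
]


def mean_length_by_group(records, strength):
    """Average possession length for each (strength, goal difference) cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [sum of lengths, count]
    for team, goal_diff, length in records:
        cell = totals[(strength[team], goal_diff)]
        cell[0] += length
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


means = mean_length_by_group(possessions, strength)
for (team_strength, goal_diff), value in sorted(means.items()):
    print(team_strength, goal_diff, round(value, 2))
```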
Figure 2 shows the results. The average possession for strong teams is significantly longer than for weak teams at all game states. However, both strong and weak teams show the same overall pattern, with possessions becoming longer when a team goes behind. This is probably related to other well-established effects of game state that are associated with losing teams attacking more as they chase the game. The lengthening effect of going behind is, however, noticeably smaller for strong teams; perhaps this is because they already have longer possessions. One curious feature of the results is the asymmetry between being behind and being in front. Possession lengths increase as teams go further behind, but the converse is not true. Possession length decreases with a one-goal lead, then increases again when a team goes further in front. More detailed analysis is needed to understand this point.
Figure 3 illustrates the difference between strong and weak teams in more detail. We can see that High and Low strength teams produce about the same number of short possessions, but High strength teams produce considerably more long possessions. This is important because, as discussed next, longer possessions are more effective.
2. Effectiveness of Possessions
We define the effectiveness of a possession as the probability that it ends with a shot on goal. It has long been known that longer possession sequences are more effective than shorter ones. Hughes and Franks, for example, in their 2005 article Analysis of Passing Sequences, Shots and Goals in Soccer, concluded that “Teams produced significantly more shots per possession for … longer passing sequences …”. The authors pointed out that Reep and Benjamin (arguably the statistical fathers of the long-ball game) had failed to understand this point, and that their landmark 1968 paper Skill and chance in association football was deeply flawed. An excellent discussion of this can be found in Anderson and Sally’s book The Numbers Game.
Figure 4 shows how effectiveness increases with possession length in the present data set. The data did not seem to fit a continuous curve well, and instead are fitted by a segmented logistic regression, in which the break points are estimated statistically. This suggests a rule of thumb (which should not be taken too seriously) that for short possessions the probability of ending with a shot increases quite rapidly with each additional pass; for intermediate length possessions it increases more slowly, and for longer possessions it hardly changes at all.
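To make the shape of a segmented logistic model concrete, here is a minimal Python sketch of a piecewise-linear logit with two break points. The intercept, slopes and break points are invented purely for illustration; they are not the values fitted for Figure 4. The sketch only evaluates such a model and does not show how the break points are estimated.

```python
import math


def shot_probability(length, intercept, slopes, breakpoints):
    """Logistic model whose logit is piecewise linear in possession length.

    The slope changes at each break point, but the curve itself does not jump.
    """
    logit = intercept
    edges = [0.0] + list(breakpoints) + [math.inf]
    for slope, lower, upper in zip(slopes, edges[:-1], edges[1:]):
        # Part of `length` that falls inside the segment [lower, upper).
        logit += slope * max(0.0, min(length, upper) - lower)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical parameters: steep, then shallow, then flat.
params = dict(intercept=-3.0, slopes=(0.25, 0.05, 0.0), breakpoints=(5, 15))

for n in (0, 2, 5, 10, 15, 30):
    print(n, round(shot_probability(n, **params), 3))
```

With these made-up numbers the probability rises quickly up to the first break point, more slowly up to the second, and is constant afterwards, which is the qualitative pattern described above.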
Figure 5 shows the data split by team strength. We can see that possessions are generally less effective for weaker teams, and the difference appears to be greatest for longer possessions.
Another way to analyse possession data is to compute the probability of a shot being taken after the nth pass in a possession. “Patient” teams wait until n is quite large; impatient teams shoot early, when n is quite small. It might appear from Figure 5 in the previous section that strong teams are less patient, because they are more likely to shoot at the end of a possession. However, this would be a mistake, as Figure 6 shows.
Figure 6 shows that given a level scoreline, Low strength teams are more likely to shoot after n passes than High strength teams. At first sight this looks contradictory to the situation depicted in Figure 5. However, Figures 5 and 6 are displaying two different metrics. The probabilities of Figure 5 come from counting just the final events of possessions. So for a length of say 10, when we calculate the probabilities, we ignore the tenth pass in a sequence unless it was also the end of a possession. In Figure 6 however, the probabilities are derived from the tenth passes in every sequence, whether or not the sequence ended at that point. (This is why the probabilities in Figure 6 are much lower than the probabilities in Figure 5.)
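The difference between the two metrics is easiest to see on a toy example. The Python sketch below computes both for a small, invented list of possessions, each recorded as (number of passes, whether it ended with a shot). The exact counting rules are my reading of the description above, so treat them as an assumption rather than the precise definitions behind Figures 5 and 6.

```python
def shot_rate_at_end(possessions, n):
    """Figure 5 style: among possessions of exactly n passes, the share ending in a shot."""
    ended_here = [shot for length, shot in possessions if length == n]
    return sum(ended_here) / len(ended_here) if ended_here else float("nan")


def shot_rate_after_pass(possessions, n):
    """Figure 6 style: among possessions that reached pass n, the share with a shot right after it."""
    reached = [(length, shot) for length, shot in possessions if length >= n]
    shots = sum(1 for length, shot in reached if length == n and shot)
    return shots / len(reached) if reached else float("nan")


# Hypothetical possessions: (number of passes, ended with a shot?)
possessions = [
    (2, False), (2, True), (2, False), (3, True),
    (4, False), (5, True), (6, False), (2, False),
]

print(shot_rate_at_end(possessions, 2))      # denominator: the 4 possessions of length 2
print(shot_rate_after_pass(possessions, 2))  # denominator: all 8 possessions reaching pass 2
```

The second rate can never exceed the first, because its denominator also counts the possessions that carried on beyond pass n.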
The conclusion here is that strong teams are more patient than weaker teams.
The “pressure” associated with any pass n in a sequence is measured by an expected goals (xG) metric. The pressure at point n is a location-only xG, based on the origin coordinates of pass n. Of course I am not suggesting the xG implies any real-world expectation of shooting; I use it merely to indicate the pressure associated with being in control of the ball at a particular location. To see pressure more clearly, the average xG is calculated over groups of passes as shown in Figure 7 below.
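A minimal Python sketch of the averaging step follows. The location-only xG function here is a toy stand-in (a logistic curve in distance to goal), not the model used in this post. The 0-100 pitch coordinates, the goal centre at (100, 50) and the group size of five passes are likewise assumptions made only so that the example runs.

```python
import math
from collections import defaultdict


def location_xg(x, y):
    """Toy location-only xG: falls with distance to an assumed goal centre at (100, 50)."""
    distance = math.hypot(100.0 - x, 50.0 - y)
    return 1.0 / (1.0 + math.exp(0.15 * distance - 1.0))


def pressure_by_pass_group(possessions, group_size=5):
    """Average location xG of pass origins, grouped by pass number within the possession."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for passes in possessions:
        for n, (x, y) in enumerate(passes, start=1):
            group = (n - 1) // group_size  # passes 1-5 -> 0, 6-10 -> 1, ...
            sums[group] += location_xg(x, y)
            counts[group] += 1
    return {group: sums[group] / counts[group] for group in sorted(sums)}


# Hypothetical possessions, each a list of pass-origin coordinates.
possessions = [
    [(30, 50), (40, 45), (55, 40), (60, 50), (70, 55), (85, 50)],
    [(50, 20), (65, 30), (80, 45)],
]

for group, value in pressure_by_pass_group(possessions).items():
    print(f"pass group {group}: mean xG-pressure {value:.3f}")
```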
Figure 7 shows that Low strength teams produce xG-pressure quite quickly and apply higher levels of pressure than strong teams early in the possession sequence; however, as the possession develops, pressure tends to drop off. Stronger teams on the other hand apply pressure more slowly, and continue to build it over a long period of possession. This is quite an interesting result.
Figure 8 shows the xG-pressure build up for teams in the EPL. Data points with fewer than 30 possessions are excluded.
Figure 9 shows the build up for four selected teams.
6. xG-Pressure and Probability of Shooting
Finally, Figure 10 shows the relationship between xG-pressure and the probability of shooting.
Figure 10 confirms that strong teams are more patient than weaker ones. At a given level of xG within a possession (and controlling for game state), strong teams are less likely to shoot. Of course, in the absence of tracking data, we don’t know the position of off-the-ball players, which may differ for strong and weak teams. However, at a given location xG, it is unlikely that strong teams would be in a worse attacking position, and it therefore seems unlikely that off-the-ball differences can account for their reduced disposition to shoot. The conclusion is that strong teams are making more considered shooting decisions.
The Bottom Line
I started looking at possessions because I wanted to understand how apparently small differences in pass completion rates between teams can have such large effects on the number of shots produced. Now that I know a bit more about possessions, I will explore how possessions depend on pass completion rates and how that translates into shots on goal and, ultimately, winning matches. Hopefully I will be able to complete the circle in the next post.