Over the last few seasons, teams in the EPL have used a variety of playing formations. OPTA currently recognise 16. In the four seasons 2010-2013, EPL teams used 3,819 different positional set-ups, which works out at 1.26 formations per team per match.
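As a quick check on that last figure, here is a minimal sketch of the arithmetic. It assumes the standard EPL schedule of 380 matches per season, with each match counted once for each of the two teams involved.

```python
# Four 20-team EPL seasons: 380 matches per season, two teams per match.
SEASONS = 4
MATCHES_PER_SEASON = 380
team_matches = SEASONS * MATCHES_PER_SEASON * 2

set_ups = 3819  # positional set-ups reported in the text
formations_per_team_match = set_ups / team_matches

print(team_matches)                         # 3040
print(round(formations_per_team_match, 2))  # 1.26
```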

Here’s the list of the OPTA formations used, starting with the most popular in terms of minutes played:

## Team Formations used in EPL 2010-2013

| Formation | Minutes Played | Percent usage |
|---|---|---|
| 4231 | 85,248 | 29.7% |
| 442 | 69,976 | 24.4% |
| 4411 | 45,572 | 15.9% |
| 451 | 30,103 | 10.5% |
| 433 | 29,005 | 10.1% |
| 4141 | 11,228 | 3.9% |
| 41212 | 3,194 | 1.1% |
| 343 | 3,179 | 1.1% |
| 352 | 2,614 | 0.9% |
| 541 | 2,068 | 0.7% |
| 532 | 1,572 | 0.5% |
| 3511 | 914 | 0.3% |
| 3421 | 591 | 0.2% |
| 3412 | 525 | 0.2% |
| 4222 | 506 | 0.2% |
| 4321 | 293 | 0.1% |

The 4231 and 442 formations together account for 54% of play in the EPL. But are these the best formations for creating good shooting chances?
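The usage percentages, and the combined share of the two most popular formations, follow directly from the minutes played. A short sketch using the figures from the table:

```python
# Minutes played per formation, EPL 2010-2013 (from the table above).
minutes = {
    "4231": 85248, "442": 69976, "4411": 45572, "451": 30103,
    "433": 29005, "4141": 11228, "343": 3179, "41212": 3194,
    "352": 2614, "541": 2068, "532": 1572, "3511": 914,
    "3412": 525, "3421": 591, "4222": 506, "4321": 293,
}

total = sum(minutes.values())

# Percent usage of each formation.
share = {formation: 100 * m / total for formation, m in minutes.items()}

# Combined share of the two most-used formations.
top_two = share["4231"] + share["442"]

print(total)              # 286588
print(round(top_two))     # 54
```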

We can answer that by calculating the expected goals (*Xg*) produced in the different formations. Goals scored is a notoriously noisy statistic because of the low numbers involved; using expected goals rather than actual goals removes some of the noise in the data, and gives a more accurate picture of the production of good scoring chances independent of the finishing quality of individual players.

The first step was to calculate the expected goals from regular play for each of the 3,819 set-up periods. As the prime focus of the analysis was the production of chances, I used a simple pre-shot model based on shot location.
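To give a flavour of what a location-based pre-shot model can look like, here is a minimal sketch. It is not the model used for this analysis: the logistic form, the use of distance and goal angle as inputs, the coordinate convention, and all of the coefficients are assumptions made purely for illustration.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts

def shot_xg(x, y, b0=-1.0, b_dist=-0.1, b_angle=1.0):
    """Scoring probability for a shot taken x metres from the goal line
    and y metres to the side of the goal centre.

    The coefficients are made-up illustrative values, not fitted ones.
    """
    distance = math.hypot(x, y)
    # Angle (in radians) subtended by the two goalposts at the shot location.
    angle = math.atan2(GOAL_WIDTH * x, x * x + y * y - (GOAL_WIDTH / 2) ** 2)
    z = b0 + b_dist * distance + b_angle * angle
    return 1.0 / (1.0 + math.exp(-z))

def spell_xg(shots):
    """Expected goals for one formation spell: the sum over its shots."""
    return sum(shot_xg(x, y) for x, y in shots)

# Hypothetical spell with three shots, given as (x, y) locations in metres.
example_spell = [(11.0, 0.0), (20.0, 8.0), (6.0, -2.0)]
print(spell_xg(example_spell))
```

Summing shot probabilities within each set-up period gives one expected goals value per period, which is the quantity the rest of the analysis works with.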

It would be tempting to simply calculate the average expected goals for each formation, but there are reasons why that would be misleading. First of all, teams might systematically adopt different formations home and away, so the effect of formation on *Xg* would be confounded with the effect of venue. Similarly, weaker teams might generally tend to adopt more defensive formations, confounding the effects of formation and overall team strength.

For these reasons, I used a statistical model to control for these other variables. For the technically minded, I used a Gamma distribution, which fitted the observed values quite well. The control variables were venue, team and opposition, and an offset term was included to adjust for the different lengths of time spent in each formation.
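The role of the offset is easiest to see in code. The sketch below assumes a log link, which the text does not state, and uses hypothetical coefficient names. Under that assumption the offset is the log of minutes played, so expected *Xg* scales in proportion to time, and each formation effect turns into a percentage rate relative to the reference formation.

```python
import math

def expected_xg(minutes, intercept, venue, team, opposition, formation):
    """Mean Xg for one formation spell under an assumed log link.

    log(mean) = log(minutes) + intercept + venue + team + opposition + formation
    where log(minutes) is the offset (its coefficient is fixed at 1).
    """
    linear_predictor = intercept + venue + team + opposition + formation
    return minutes * math.exp(linear_predictor)

def production_rate(formation_effect, reference_effect=0.0):
    """Formation's Xg production as a percentage of the reference formation."""
    return 100.0 * math.exp(formation_effect - reference_effect)

# Hypothetical coefficients, chosen only to show the mechanics.
half = expected_xg(45, intercept=-4.5, venue=0.2, team=0.1, opposition=-0.1, formation=0.0)
full = expected_xg(90, intercept=-4.5, venue=0.2, team=0.1, opposition=-0.1, formation=0.0)
print(full / half)                     # doubling the minutes doubles the mean
print(production_rate(math.log(0.85)))
```

A formation effect of zero corresponds to the reference formation, which is why the reference is reported as exactly 100%.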

A preliminary analysis showed that *Xg* production for the two five-at-the-back formations, 532 and 541, was very similar, so I combined these into a new formation labelled “5--”. Similarly, 3511 and 352 were combined into a “35-” category. I then eliminated the remaining rarely used formations 3412, 3421, 4222 and 4321, which together accounted for less than 1% of playing time.

The next table shows the *Xg* production rates for the various formations. 442 is the reference formation, and is given a production rate of 100%. Asterisks mark formations whose difference from 442 is statistically significant.

## Production Rates of Expected Goals

| Formation | Xg Production Rate |
|---|---|
| 442 | 100% |
| 41212 | 100.5% |
| 35- | 91.2% |
| 4411 | 92.0% * |
| 343 | 88.3% |
| 4231 | 85.0% *** |
| 433 | 83.2% *** |
| 451 | 78.4% *** |
| 4141 | 76.5% *** |
| 5-- | 57.2% *** |

The results show there are clear differences in productive efficiency. The only formation outperforming 442 is 41212, but the difference is negligible and not statistically significant. The next best formations are 35-, 4411 and 343, which are about 90% as efficient at producing expected goals. Because of their small sample sizes, though, the differences for 35- and 343 don’t reach statistical significance. All the other formations are significantly less productive than 442. The least productive formations are those with five at the back, which produce fewer than 60% of the expected goals generated when playing 442.

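Finally, the figure below shows the expected goals per 90 minutes for an average team playing at home against an average opponent under the various formations.

## Expected Goals Per 90 Minutes

[Figure: expected goals per 90 minutes by formation, for an average team at home against an average opponent]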
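The per-90 values can be approximated from the production rates by scaling a 442 baseline. In the sketch below the rates come from the table above, but the baseline is a hypothetical input: the text does not report the actual value for an average home team playing 442, so none is assumed here.

```python
# Xg production rates relative to 442 (percent), from the table above.
RATES = {
    "442": 100.0, "41212": 100.5, "35-": 91.2, "4411": 92.0, "343": 88.3,
    "4231": 85.0, "433": 83.2, "451": 78.4, "4141": 76.5, "5--": 57.2,
}

def xg_per_90(baseline_442, rates=RATES):
    """Scale a 442 baseline (Xg per 90 minutes) by each formation's rate.

    baseline_442 is a hypothetical input supplied by the caller; it is
    not a figure taken from the analysis.
    """
    return {formation: baseline_442 * rate / 100.0
            for formation, rate in rates.items()}

# With a baseline of 1.0 the output is simply each rate as a fraction of 442.
relative = xg_per_90(1.0)
for formation, value in sorted(relative.items(), key=lambda kv: -kv[1]):
    print(formation, round(value, 3))
```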

Of course, the picture I’ve presented is incomplete, because I haven’t looked at goals conceded. It’s very likely that there are significant differences here as well. I’ll pick this up at a later date, and see which of the formations fare best on that metric.