## The myth of the Weakest Link - Part 3

In my previous post I described a failure to replicate Anderson and Sally’s finding that weak links matter more in football than strong ones.  Now I want to explore this further.

Improving the specification

A rather awkward aspect of the Anderson-Sally specification is that only the strongest and weakest players are used as predictors in their regression model, while the remaining nine team members are excluded. Now in the real world, the middle nine players in a team are clearly an important driver of team performance, and the worry is that excluding them causes omitted variable bias. In a model with omitted variable bias, the coefficients can be substantially over- or under-estimated. So any conclusions we might draw about the relative importance of strong and weak links from a mis-specified model are open to question.

In theory there is a simple way round this: we can include the average quality of the middle nine players as a third predictor. But if we try to do that directly, we hit a problem. The three predictors are strongly correlated with each other; the average pairwise correlation among Mid-9 quality, strongest link and weakest link is .81. With this degree of correlation between the predictors, regression coefficients are highly unstable, and it is difficult to interpret the results. So to incorporate both the average and the extremes in the model, I use a transformation. I construct two new variables to replace the strong and weak link scores:

STRONG-GAP = [Strongest link] - [Mid-9 average]
WEAK-GAP = [Mid-9 average] - [Weakest link]

STRONG-GAP measures how much better the strongest player is than his team-mates, and WEAK-GAP measures how much worse the weakest player is. This is quite a good way of representing the strong and weak links in a team, and has the advantage that the correlation among the predictors we want to use in the model is much lower (.31).
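The two definitions above can be sketched in a few lines of Python. This is only an illustration, assuming each team is represented by a list of eleven player ratings; the function name `link_gaps` and the sample ratings are hypothetical and are not taken from the data set used here.

```python
from statistics import mean


def link_gaps(ratings):
    """Return (strong_gap, mid9_average, weak_gap) for one team's 11 ratings."""
    if len(ratings) != 11:
        raise ValueError("expected exactly 11 player ratings")
    ordered = sorted(ratings)
    weakest, middle, strongest = ordered[0], ordered[1:-1], ordered[-1]
    mid9 = mean(middle)  # average quality of the middle nine players
    # STRONG-GAP = strongest - mid-9 average; WEAK-GAP = mid-9 average - weakest
    return strongest - mid9, mid9, mid9 - weakest


# Hypothetical ratings, invented purely for illustration
example_team = [5.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 9.0]
strong_gap, mid9_average, weak_gap = link_gaps(example_team)
print(strong_gap, mid9_average, weak_gap)
```

For this invented team the star player sits 3 rating points above the middle nine and the weakest player 1 point below, so both gaps come out non-negative by construction.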

The Effect of Strong and Weak Links

Next I ran a series of regression models using the new predictor set. As before, I kept dummy variables in the model to account for league effects. As well as running a model for Points, I also ran models for Goals Scored and Goals Conceded. The results are very interesting when examined together:

| Predictor | Coefficients for POINTS | Coefficients for GOALS SCORED | Coefficients for GOALS CONCEDED |
|---|---|---|---|
| STRONG-GAP | .18*** | .35*** | .05 |
| MID-9 AVERAGE | .90*** | .70*** | -.90*** |
| WEAK-GAP | -.05 | .08 | .15* |
| Ligue 1 | .16** | .07 | -.26*** |
| Bundesliga | .04 | .09 | .07 |
| Serie A | .17** | .08 | -.24*** |
| La Liga | .00 | -.02 | .06 |
| Premiership [Reference] | 0 | 0 | 0 |
| R-square | 83% | 77% | 77% |

Standardized regression coefficients for three models. *** p < .001, ** p < .01, * p < .05

Looking first at the column for Points, we see that, as expected, the average quality of the team has a strong effect on the points won (.90). Teams with a big quality gap at the top end (a star player) accumulate even more points (.18). The weak link effect, however, has all but disappeared: the quality gap at the bottom end of the team has no significant effect on points won (-.05). The same pattern holds for Goals Scored; a big quality gap at the top end of the team translates into more goals, but the quality gap at the bottom end has no significant impact.

However, the model for Goals Conceded tells a different story. As shown by the negative coefficient for Mid-9 average, stronger teams leak fewer goals. But in this case, strong links in the team have no significant effect, and it is here that the weakest link comes into play. The positive coefficient for WEAK-GAP shows that teams with a big quality gap at the bottom end of the team concede more goals than teams with a small gap.
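For readers who want to fit models of this form themselves, here is a minimal sketch of how standardized coefficients with league dummies might be estimated using NumPy. It is not the code behind the results reported here. It assumes that every column, including the dummies, is z-scored before fitting, which is one common convention for standardized coefficients. The data are synthetic and all variable names are hypothetical.

```python
import numpy as np


def standardized_betas(X, y):
    """OLS on z-scored predictors and outcome; one coefficient per column of X."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    # All z-scored variables have mean zero, so no intercept column is needed.
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas


# Synthetic data, invented purely to exercise the function (not real results)
rng = np.random.default_rng(0)
n_teams = 100
strong_gap = rng.normal(size=n_teams)
mid9 = rng.normal(size=n_teams)
weak_gap = rng.normal(size=n_teams)

# Five hypothetical leagues; league 0 is the reference and gets no dummy.
league = np.arange(n_teams) % 5
dummies = np.column_stack([(league == k).astype(float) for k in range(1, 5)])

X = np.column_stack([strong_gap, mid9, weak_gap, dummies])
assumed_effects = np.array([0.5, 2.0, 0.0, 0.3, 0.0, 0.3, 0.0])  # arbitrary
points = X @ assumed_effects + rng.normal(scale=0.5, size=n_teams)

betas = standardized_betas(X, points)
labels = ["STRONG-GAP", "MID-9 AVERAGE", "WEAK-GAP", "L1", "L2", "L3", "L4"]
for name, beta in zip(labels, betas):
    print(f"{name:14s} {beta: .2f}")
```

Running the same function with goals scored or goals conceded as `y` would give the other two models. Significance stars would need standard errors as well, which this sketch does not compute.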

The Bottom Line

These results cast doubt on the notion that football is simply a weak link game. First of all, the mis-specified model used by Anderson and Sally, which is the same one I used in Part 2 of this series, gives conflicting results. Anderson and Sally find a weak link effect for the 2010/11 season, and I find a strong link effect for the 2012/13 season. Assuming neither of us has done anything stupid, this discrepancy alone weakens the evidence for the weak link story quite considerably.

But if we go further, as I believe we should, and specify a model that corresponds more nearly to the real world, quality gaps at the top of the scale appear to be rather more important than gaps at the bottom.  Whether that would also be the case with Anderson and Sally’s data set I don’t know. But what I find persuasive are the different findings for goals scored and conceded.

In the terminology of group dynamics, defending is a conjunctive task. In conjunctive tasks the team’s performance is limited by the ability of its weakest member. All the team members must perform well to achieve a positive team outcome.