Archive: August 2014

The myth of the Weakest Link – Part 3

In my previous post I described a failure to replicate Anderson and Sally’s finding that weak links matter more in football than strong ones.  Now I want to explore this further.

Improving the specification

A rather awkward aspect of the Anderson-Sally specification is that only the strongest and weakest players are used as predictors in their regression model, while the remaining nine team members are excluded. Now in the real world, the middle nine players in a team are clearly an important driver of team performance, and the worry is that excluding them causes omitted variable bias. In a model with omitted variable bias, the coefficients can be substantially over- or under-estimated. So any conclusions we might draw about the relative importance of strong and weak links from a mis-specified model are open to question.
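To see how large the distortion can be, here is a minimal simulation sketch. It is not based on the football data: the variable names (`star`, `rest`, `points`) and the data-generating process are assumptions invented for illustration. Both predictors share a common component, the outcome depends on each with a true coefficient of 1.0, and the regression then leaves one of them out.

```python
import random

def slope(x, y):
    """Slope of a simple one-predictor least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical data: two correlated predictors built from a shared component.
shared = [rng.gauss(0, 1) for _ in range(n)]
star = [s + rng.gauss(0, 1) for s in shared]   # stands in for the included predictor
rest = [s + rng.gauss(0, 1) for s in shared]   # stands in for the omitted predictor

# The outcome depends on BOTH predictors, each with a true coefficient of 1.0.
points = [a + b + rng.gauss(0, 1) for a, b in zip(star, rest)]

# Regressing on `star` alone credits it with part of the effect of `rest`.
naive = slope(star, points)
print(f"true coefficient: 1.0, estimate with 'rest' omitted: {naive:.2f}")
```

Under these assumptions the estimate settles around 1.5 rather than 1.0, because the included predictor picks up the part of the omitted one that it is correlated with.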

In theory there is a simple way round this: we can include the average quality of the middle nine players as a third predictor. But if we try to do that directly, we hit a problem. The three predictors are strongly correlated with each other; the average correlation among Mid-9 quality, strongest link and weakest link is .81. With this degree of correlation between the predictors, regression coefficients are highly unstable, and it is difficult to interpret the results. So to incorporate both the average and the extremes in the model, I use a transformation. I construct two new variables to replace the strong and weak link scores:

STRONG-GAP = [Strongest link] - [Mid-9 average]
WEAK-GAP = [Mid-9 average] - [Weakest link]

STRONG-GAP measures how much better the strongest player is than his team-mates, and WEAK-GAP measures how much worse the weakest player is. This is quite a good way of representing the strong and weak links in a team, and has the advantage that the correlation among the predictors we want to use in the model is much lower (.31).
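As a concrete illustration of the two definitions, the following sketch computes both gaps for a single team. The function name `link_gaps` and the ratings are hypothetical, made up for the example rather than taken from the data set used here.

```python
from statistics import mean

def link_gaps(ratings):
    """Return (strong_gap, mid9_average, weak_gap) for one team of 11 ratings."""
    if len(ratings) != 11:
        raise ValueError("expected exactly 11 player ratings")
    ordered = sorted(ratings)
    weakest, middle, strongest = ordered[0], ordered[1:-1], ordered[-1]
    mid9 = mean(middle)  # average of the nine players between the extremes
    return strongest - mid9, mid9, mid9 - weakest

# Hypothetical ratings: one star, one weak link, nine identical team-mates.
team = [60, 70, 70, 70, 70, 70, 70, 70, 70, 70, 90]
strong_gap, mid9, weak_gap = link_gaps(team)
print(strong_gap, mid9, weak_gap)
```

With gaps defined this way, both are zero or positive for every team, so a larger value always means a more extreme player at that end of the squad.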

The Effect of Strong and Weak Links

Next I ran a series of regression models using the new predictor set. As before, I kept dummy variables in the model to account for league effects. As well as running a model for Points, I also ran models for Goals Scored and Goals Conceded. The results are very interesting when examined together:

Standardized regression coefficients for three models

                           Coefficients   Coefficients   Coefficients
                           for            for            for
                           POINTS         GOALS SCORED   GOALS CONCEDED
STRONG-GAP                  .18***         .35***         .05
MID-9 AVERAGE               .90***         .70***        -.90***
WEAK-GAP                   -.05            .08            .15*
Ligue 1                     .16**          .07           -.26***
Bundesliga                  .04            .09            .07
Serie A                     .17**          .08           -.24***
La Liga                     .00           -.02            .06
Premiership [Reference]     0              0              0
R-square                    83%            77%            77%

*** p < .001, ** p < .01, * p < .05
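For readers who want to try this kind of model on their own data, here is a rough sketch of one way to obtain standardized coefficients with a league dummy. It is not the procedure or software used for the table above. It assumes that the outcome and every predictor, dummies included, are z-scored before an ordinary least-squares fit, and all names and numbers in it are hypothetical.

```python
from statistics import mean, pstdev

def zscore(values):
    """Rescale a column to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def standardized_betas(columns, outcome):
    """OLS on z-scored columns; no intercept is needed because all means are 0."""
    names = list(columns)
    xs = [zscore(columns[name]) for name in names]
    y = zscore(outcome)
    xtx = [[sum(p * q for p, q in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(p * q for p, q in zip(xi, y)) for xi in xs]
    return dict(zip(names, solve(xtx, xty)))

# Hypothetical data for eight teams in two leagues (league A is the reference).
data = {
    "strong_gap": [5, 10, 8, 12, 6, 9, 15, 7],
    "mid9":       [70, 72, 68, 75, 71, 69, 74, 73],
    "weak_gap":   [4, 6, 9, 3, 7, 5, 8, 6],
    "league_b":   [0, 0, 0, 0, 1, 1, 1, 1],   # dummy: 1 if the team plays in league B
}
# Invented outcome, an exact linear function of the predictors for checking.
points = [40 + 2 * s + 3 * m - w + 5 * d
          for s, m, w, d in zip(*data.values())]

betas = standardized_betas(data, points)
for name, beta in betas.items():
    print(f"{name:12s} {beta: .2f}")
```

Because everything is on the same scale, the coefficients can be compared directly with each other, which is what makes a table like the one above readable at a glance.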

Looking first at the column for Points we see that, as expected, the average quality of the team has a strong effect on the points won. And teams with a big quality gap at the top end (a star player) accumulate even more points. The weak link effect, however, has all but disappeared. The quality gap at the bottom end of the team has no significant effect on points won. The same pattern is true for Goals Scored; a big quality gap at the top end of the team translates into more goals, but the quality gap at the bottom end has no significant impact.

However, the model for Goals Conceded tells a different story. As shown by the negative coefficient for Mid-9 average,  stronger teams leak fewer goals.  But in this case, strong links in the team have no  effect, and it is here that the weakest link comes into play. The positive coefficient for WEAK-GAP shows that teams with a big quality gap at the bottom end of the team concede more goals than teams with a small gap.

The Bottom Line

These results cast doubt on the notion that football is simply a weak link game. First of all, the mis-specified model used by Anderson and Sally, which is the same one that I used in Part 2 of this series, gives conflicting results. Anderson and Sally find a weak link effect for the 2010/11 season and I find a strong link effect for the 2012/13 season. Assuming neither of us has done anything stupid, this discrepancy alone weakens the evidence for the weak link story quite considerably.

But if we go further, as I believe we should, and specify a model that corresponds more nearly to the real world, quality gaps at the top of the scale appear to be rather more important than gaps at the bottom.  Whether that would also be the case with Anderson and Sally’s data set I don’t know. But what I find persuasive are the different findings for goals scored and conceded.

In the terminology of group dynamics, defending is a conjunctive task. In conjunctive tasks the team’s performance is limited by the ability of its weakest member. All the team members must perform well to achieve a positive team outcome.

A well-known conjunctive task is the popular quiz show The Weakest Link. The quiz begins with a team of nine contestants who take turns answering general knowledge questions, and collaborate to accumulate sums of money. The object of each round is to create a chain of consecutive correct answers, and each correct answer increases the amount of money earned. An incorrect answer however breaks the chain and loses any money earned in that chain.  However, before their question is asked, a contestant can elect to “bank” the money earned in the chain so far, after which it is transferred to a safe pot and the chain starts afresh with zero funds. When the allotted time for each round ends, any money not banked is lost.  A contestant’s decision to attempt to answer the upcoming question allows the money to grow, as each successive correct answer earns proportionally more money, but failing to answer correctly, or banking too frequently, reduces the total earnings of the group. At the end of each round, the contestants vote to decide which of them contributed the least and who will therefore be eliminated.  The host then dismisses that contestant from the show with the words “You are the Weakest Link – Goodbye.”

Now I would argue that defending is a conjunctive task, because any weakness in the defence will be sought out and exploited by the opposition. If so we can see why having a big quality gap at the bottom end of the team is detrimental to defending and results in conceding more goals.

But attacking is rather different. Attacking seems to me more like a disjunctive task. An example of a disjunctive task is answering a question on University Challenge. Here team performance depends on the strongest member, which in the case of University Challenge, is the one with the most knowledge. A person who has studied a subject for four years is likely to beat a team of four people who have each studied for a year. In the same way, I suspect that one strong and one weak attacker are more likely to break down a defence than two mediocre attackers of the same average ability. This would explain why a big quality gap at the top of the team leads to more goals scored.
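The distinction can be put in toy form. The sketch below uses the simplest textbook reading of the two task types, with group performance set by the weakest member in a conjunctive task and by the strongest in a disjunctive one. The ability numbers are invented, and real attacking and defending are obviously not this clean.

```python
from statistics import mean

def conjunctive(abilities):
    """Group performance is capped by the weakest member."""
    return min(abilities)

def disjunctive(abilities):
    """Group performance is set by the strongest member."""
    return max(abilities)

# Two hypothetical pairs with the same average ability.
star_and_passenger = [90, 50]
two_mediocre = [70, 70]
assert mean(star_and_passenger) == mean(two_mediocre)

for label, pair in [("star + passenger", star_and_passenger),
                    ("two mediocre", two_mediocre)]:
    print(f"{label:17s} conjunctive={conjunctive(pair)} "
          f"disjunctive={disjunctive(pair)}")
```

On the disjunctive reading the uneven pair comes out ahead, and on the conjunctive reading it comes out behind, which is the same asymmetry the goals scored and goals conceded models point to.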

Do I have the definitive answer? No. I’d like to analyse more data, and maybe use a different measure of player quality to be sure of the conclusions.  But I am ready to say “You are the Weakest Link (Theory) – Goodbye”.