As we are in the middle of the Olympics and contemplating the national medal tables, I thought it would be useful to resurrect a piece I did on population size and sporting success. It was originally part of a paper (Gelade & Dobson, 2007) on the strength of national football teams, but the arguments apply to any sport.

Other things being equal, we would expect that countries with a large pool of talent to select from will produce stronger teams than those with a smaller pool to select from; more populous countries should therefore outperform smaller ones in competitive sports. Previous studies of national sporting success support this view. For example, Bernard and Busse (2004), and Johnson and Ali (2004) found that larger countries win more Olympic medals.

Intuitively and empirically therefore, we can write *S* = *f*(*N*), where *S* is the strength of the national team, and *N* is the size of the talent pool. But what is the appropriate functional form for *f*?

We first assume that individual sporting ability is distributed according to a common probability distribution which has the same parameters in all countries. (Matthew Syed will probably kill me for this.)

A natural choice for this probability distribution would be a Gaussian (normal) distribution; however, the integral of the normal distribution does not have a closed form, and for simplicity of subsequent calculation, we assume a logistic distribution. (As can be seen in the figure below, the logistic has a very similar shape to the normal and is close enough to normal for our purposes.) Without loss of generality, we may assume this distribution to have a mean of 0 and a scale parameter of 1 in all countries.
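The similarity between the two shapes can be checked numerically. The sketch below is only an illustration, not part of the original paper: it assumes the logistic is rescaled to unit variance (scale = √3/π) so that it can be compared with a standard normal, and the function names are hypothetical.

```python
import math

# Scale that gives the logistic distribution unit variance,
# since its variance is (pi * scale)^2 / 3. This rescaling is an
# assumption made here purely so the two curves are comparable.
UNIT_VARIANCE_SCALE = math.sqrt(3.0) / math.pi


def logistic_cdf(x, scale=1.0):
    """CDF of a logistic distribution with mean 0 and the given scale."""
    return 1.0 / (1.0 + math.exp(-x / scale))


def normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_cdf_gap(lo=-6.0, hi=6.0, steps=1201):
    """Largest absolute difference between the two CDFs on a grid."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(
        abs(logistic_cdf(x, UNIT_VARIANCE_SCALE) - normal_cdf(x)) for x in xs
    )


print(f"largest CDF gap on [-6, 6]: {max_cdf_gap():.4f}")
```

The gap is a measure of overall shape only; the argument that follows depends on the logistic form itself, not on this comparison.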

We next assume that the strength of a country’s team is determined by the abilities of its *n* most able players, where *n* is some small number representing the effective size of the national squad. To allow for tactical variations and replacements in the event of injury, and to encourage effective competition for team places, we would expect *n* to be somewhat larger than the size of the squad (on-field players and substitutes) for any single match. However, the precise value of *n* is unimportant; as we show below, the only consideration is that it is the same for all countries, and much smaller than the size of the pool, *N*, from which the players are selected.

Figure 1 shows the logistic distribution of individual football ability in a country, with the national squad depicted by the shaded area.

**Hypothetical Distribution of Ability**

We shall assume here that the strength, $S$, of the national team is proportional to the median ability $A_m$ of the players in the squad. Determining $S$ then reduces to the problem of calculating $A_m$, the value of $A$ such that the best $n/2$ players in the population have $A \ge A_m$.

$A_m$ can be derived from the inverse cumulative distribution function of the probability distribution. This is denoted $F^{-1}(p)$, and gives the value of a random variable $X$ such that the probability that $X \le F^{-1}(p)$ is $p$. For the logistic distribution with mean 0 and scale parameter 1:

$F^{-1}(p) = \ln\left(\frac{p}{1-p}\right)$ …(1)
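As a quick check on equation (1), the following sketch implements the logistic quantile function and confirms that feeding its output back into the logistic CDF recovers *p*. The function names are hypothetical and chosen only for this illustration.

```python
import math


def logistic_cdf(x):
    """CDF of the logistic distribution with mean 0 and scale 1."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_quantile(p):
    """Inverse CDF from equation (1): ln(p / (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Round trip: the CDF evaluated at the quantile should return p.
for p in (0.5, 0.9, 0.999):
    x = logistic_quantile(p)
    print(p, round(x, 4), round(logistic_cdf(x), 4))
```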

Let the pool size be $N$; then the area under the distribution to the right of $A_m$, which corresponds to $1-p$, is $n/(2N)$, so we can write $p = 1 - n/(2N)$.

Substituting for *p* in (1) and re-arranging yields:

$F^{-1}(p) = A_m = \ln(2N-n) - \ln(n)$
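One way to sanity-check this expression is a small simulation: draw *N* abilities from a logistic distribution, keep the best *n*, and compare the median of that squad with ln(2*N* − *n*) − ln(*n*). The sketch below is an illustration under assumed values (a pool of 20,000 and a squad of 30, neither taken from the paper), and the function names are hypothetical. Any single simulated squad varies quite a bit, so the result is averaged over several trials.

```python
import heapq
import math
import random
import statistics


def predicted_median_ability(pool_size, squad_size):
    """Median squad ability from the derivation: ln(2N - n) - ln(n)."""
    return math.log(2 * pool_size - squad_size) - math.log(squad_size)


def simulated_median_ability(pool_size, squad_size, rng):
    """Draw a pool of logistic abilities and take the median of the top n."""
    abilities = []
    for _ in range(pool_size):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() can return exactly 0.0
            u = rng.random()
        # Inverse-CDF sampling from the logistic with mean 0, scale 1.
        abilities.append(math.log(u / (1.0 - u)))
    squad = heapq.nlargest(squad_size, abilities)
    return statistics.median(squad)


def average_simulated(pool_size, squad_size, trials=20, seed=0):
    """Average the simulated median over several independent pools."""
    rng = random.Random(seed)
    return statistics.fmean(
        simulated_median_ability(pool_size, squad_size, rng)
        for _ in range(trials)
    )


# Illustrative values only; not taken from the paper.
POOL_SIZE, SQUAD_SIZE = 20_000, 30
print("predicted:", round(predicted_median_ability(POOL_SIZE, SQUAD_SIZE), 3))
print("simulated:", round(average_simulated(POOL_SIZE, SQUAD_SIZE), 3))
```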

Because $n \ll N$, we have $2N - n \approx 2N$; and because $\ln(n)$ is a constant ($k$) for all countries, we can approximate as follows:

$A_m \approx \ln(2N) - k = \ln(N) + \ln(2) - k$

Thus, given the same (logistic) distribution of ability and the same squad size in each country, the median squad ability, and hence the strength of the national team, will be a linear function of the logarithm of the pool size from which the team is chosen.
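The log-linear relationship can be seen by tabulating the exact expression ln(2*N* − *n*) − ln(*n*) against the large-*N* form ln(2*N*) − ln(*n*), obtained by dropping *n* from 2*N* − *n*. In the sketch below the squad size of 30 and the pool sizes are assumed values for illustration only, and the function names are hypothetical. If the relationship holds, each tenfold increase in the pool should add roughly ln(10) ≈ 2.303 to the median squad ability.

```python
import math


def exact_median_ability(pool_size, squad_size):
    """Exact result of the derivation: ln(2N - n) - ln(n)."""
    return math.log(2 * pool_size - squad_size) - math.log(squad_size)


def approx_median_ability(pool_size, squad_size):
    """Large-pool approximation: ln(2N) - ln(n), valid when n << N."""
    return math.log(2 * pool_size) - math.log(squad_size)


SQUAD_SIZE = 30  # hypothetical effective squad size

previous = None
for pool_size in (10_000, 100_000, 1_000_000, 10_000_000):
    exact = exact_median_ability(pool_size, SQUAD_SIZE)
    approx = approx_median_ability(pool_size, SQUAD_SIZE)
    # Gain in ability relative to a pool ten times smaller.
    step = "" if previous is None else f"  gain: {exact - previous:.4f}"
    print(f"N={pool_size:>10,}  exact={exact:.5f}  approx={approx:.5f}{step}")
    previous = exact
```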

In the paper I go on to show the theory works in practice; logarithmic measures of pool size (such as total population, total number of registered players) predict FIFA ratings much more strongly than do the corresponding linear measures.

References

Bernard, A. B., and M. E. Busse. 2004. “Who wins the Olympic Games? Economic Resources and Medal Totals.” *The Review of Economics and Statistics* 86: 413–417.

Gelade, G. A., and P. Dobson. 2007. “Predicting the Comparative Strengths of National Football Teams.” *Social Science Quarterly* 88(1): 244–258.

Johnson, D. K. N., and A. Ali. 2004. “A Tale of Two Seasons: Participation and Medal Counts at the Summer and Winter Olympic Games.” *Social Science Quarterly* 85: 974–993.

Does your theory take other factors such as economic strength or tradition of a sport in a country into account?

If we just look at the population, China and India should be dominating powers in world sports and e.g. Indonesia should play a major role.

As you suggest, the number of active players is certainly a better starting point.

Hi Matthias,

Yes, tradition and GDP are both taken into account, and they both contribute positively to ranking.