Recently a club asked me for some advice on psychometric tests that would be useful in a football environment. This prompted me to dust down a bit of research I did a few years ago on the relationship between scores on a psychometric test called the RESTQ and performance on the field. I will describe the main findings in this post.
What is Psychometrics?
Psychometrics is the measurement of mental capabilities and processes. Psychometric instruments can roughly be divided into tests of ability or mental capacity, such as IQ tests, and tests of personality characteristics or temperament, for example Extraversion/Introversion. Psychologists have developed a huge range of instruments over the years, and they are aggressively marketed by a number of commercial publishers in a market that Time Magazine valued at $2 billion in 2015.
Psychometric tests are used extensively by commercial and government organizations in recruitment, succession planning, team-working, personal development and more, with the aims of identifying and nurturing talent, reducing recruitment costs, and increasing productivity and customer satisfaction. About 80% of US Fortune 500 and 75% of UK Times 100 companies use them.
In recent years, psychometrics has benefited from advances in technology and test theory; for instance computer adaptive testing allows the test items to be varied according to the responses of the individual test taker, resulting in shorter, but equivalent tests. This will be of interest to sports organizations who need to administer the same tests repeatedly for monitoring purposes, and want to reduce the effort required by the test-takers.
The RESTQ (Recovery and Stress Questionnaire) is designed to monitor the physical and mental aspects of training and performance stress and to facilitate strategies for enhancing recovery. The RESTQ comes in two forms: the standard form consists of 76 items and the short form of 52 items. It measures 12 General Stress and Recovery scales and seven Sport-specific Stress and Recovery scales. The General Stress and Recovery component includes factors such as emotional stress, social stress, fatigue, physical complaints, sleep quality, physical recovery and general well-being. The Sport-specific component includes scales measuring factors like emotional exhaustion (the desire to give up the sport), injury (being physically hampered), perceived fitness, and self-regulation (goal-setting, mental training).
In this form, the questionnaire is rather unwieldy and the large number of items makes it unsuitable for repeated administration in a club context. My first step was therefore to examine the psychometric properties of the instrument and simplify and shorten it if possible.
Re-Engineering the RESTQ
The initial data consisted of 338 administrations of the 52-item RESTQ to a total of 39 players from the first team, reserve and youth squads. I first examined the quality of the 19 scales using a measure of scale reliability known as Cronbach's alpha. This is a measure of internal consistency amongst the items of a scale; it normally falls between 0 and 1, with values above 0.7 deemed acceptable. Only 9 of the 19 scales met this criterion. I also used confirmatory factor analysis to check the proposed organization of the 19 RESTQ scales into their four higher-order dimensions (General Stress, General Recovery, Sport-specific Stress and Sport-specific Recovery). It turned out that this structure was not a good fit to the data. In this sample, then, the 52-item RESTQ looked to be psychometrically unsound, and I decided to re-engineer the questionnaire.
Exploratory factor analysis suggested six dimensions; eliminating one low-reliability dimension and one with a relatively narrow scope left four. Further trimming of the dimensions, guided by congeneric factor analysis, resulted in a 30-item, four-factor instrument, as shown in Table 1.
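For readers who want to see how the reliability check works, here is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score), where k is the number of items in the scale. The function name and the response data are made up for illustration; they are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores: list of rows, one row per questionnaire administration,
    each row holding the scores for the items of a single scale.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("a scale needs at least two items")
    # Sample variance of each item (column) across administrations.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five administrations of a three-item scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 6],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

In this toy data the three items rise and fall together, so alpha comes out well above the 0.7 threshold; items that did not track each other would pull it down.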
Table 1. Scales of the RESTQ-30
The Confidence and Happiness scales are correlated (r = .69) and form the higher-order Recovery domain of the RESTQ-30; similarly, Frustration and Fatigue are correlated (r = .59) and form the higher order Stress domain. The Recovery and Stress domains are negatively correlated (r = -.29).
RESTQ scores were available for 23 first team players, and performance scores were taken from 30 Premiership matches. On average each player completed the RESTQ 15 times. The RESTQ was administered within one day either side of match day, except in two cases where there was a gap of two days between the match and the administration of the questionnaire. Player performance ratings in each match were taken from the WhoScored website. In addition, key performance indicators (KPIs) for 18 of the matches were available from the club’s Prozone database.
Performance was thus measured in two ways: by the WhoScored ratings and by the KPIs. Three KPIs (Number of Passes Made, Number of Passes Received, Time in Possession) were combined to form a metric I called Involvement. These three KPIs were fairly highly correlated, and the composite Involvement scale had a Cronbach's alpha of .93. While this metric is reasonably applicable to all outfield players, it does not work for goalkeepers; goalkeepers were therefore excluded from this element of the analysis.
I conducted a number of regression analyses in which each of the dependent variables (either the WhoScored rating or Involvement) was regressed on each of the four RESTQ-30 dimensions. Because players were measured multiple times, I used a ‘multi-level’ or ‘mixed’ model with player as a random effect. In effect, this treats the data as a set of individual within-player regressions, as illustrated in Figure 1 below.
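To give a feel for the "within-player" idea, the sketch below removes each player's own averages before fitting a single pooled slope. This is a deliberate simplification and not the mixed model used in the analysis: it ignores random slopes and the shrinkage a proper mixed-model routine would apply. The function name and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def within_player_slope(records):
    """Pooled slope of y on x after centring each player's data.

    records: iterable of (player, x, y) tuples, where x might be a
    RESTQ-30 scale score and y the Involvement score for one match.
    """
    by_player = defaultdict(list)
    for player, x, y in records:
        by_player[player].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in by_player.values():
        # Subtract the player's own means so that differences
        # between players do not drive the slope.
        x_bar = mean([x for x, _ in obs])
        y_bar = mean([y for _, y in obs])
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


# Made-up data: two players with very different baselines
# but the same within-player relationship.
records = [
    ("A", 1, 10), ("A", 2, 12), ("A", 3, 14),
    ("B", -1, 50), ("B", 0, 52), ("B", 1, 54),
]
slope = within_player_slope(records)
print(slope)
```

Because each player is compared only with himself, a player who is always highly involved does not distort the estimate; only changes within a player count.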
There were no statistically significant associations between the WhoScored ratings and any of the RESTQ-30 scales. However, there were some significant relationships between the two Recovery scales and the KPIs. As an illustration of what is happening in the analysis, Figure 1 shows the within-player relationships between Involvement and Confidence for the sixteen outfield players who had played at least one match. (To avoid bunching up on the x-axis, the Confidence scale has been normalised to have a mean of zero and a standard deviation of one for each player.)
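The per-player normalisation mentioned in the parenthesis can be sketched as follows. I am assuming the sample standard deviation here; the helper name and the scores are illustrative only.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardise_within_player(records):
    """Rescale scores to mean 0 and SD 1 separately for each player.

    records: list of (player, score) pairs. Each player needs at least
    two scores, and they must not all be identical.
    """
    by_player = defaultdict(list)
    for player, score in records:
        by_player[player].append(score)
    # Mean and sample standard deviation of each player's own scores.
    stats = {p: (mean(v), stdev(v)) for p, v in by_player.items()}
    return [(p, (s - stats[p][0]) / stats[p][1]) for p, s in records]


# Hypothetical Confidence scores for two players on different raw levels.
raw = [("A", 10), ("A", 12), ("A", 14), ("B", 30), ("B", 40), ("B", 50)]
z = standardise_within_player(raw)
print(z)
```

After rescaling, both players occupy the same range on the x-axis even though their raw scores are far apart.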
Figure 1. Involvement and Confidence
We can see that for fourteen out of the sixteen players, an increase in Confidence is associated with an increase in Involvement, the only exceptions being players M and N. Table 2 shows the regression coefficients and their significance levels for Involvement and its component KPIs.
Table 2. Fixed Effect Coefficients for Regression of KPIs on RESTQ-30 Recovery Scales
|No. of Passes Made|0.072***|0.035*|
|No. of Passes Received|0.065**|0.038*|
|Time in Possession|0.074|0.078*|
Note: Significance levels. *** p <= .001, ** p <= .01, * p <= .05.
As can be seen, all the coefficients are positive, and most are significant, indicating that higher scores on the Recovery domain of the RESTQ are associated with more involvement in matches.
Next, I looked at selection. I compared the RESTQ scores for players when they had been selected to start and when they had not (i.e., either on the bench or not in the squad at all). There was a significant increase in the Frustration score when a player was not selected (about a quarter of a standard deviation, p < .01 in a multilevel regression analysis).
Summary and Conclusions
This study used convenience data, so the study design was somewhat haphazard. First, data collection ran from January to January rather than over a single season. Secondly, not all players completed the questionnaire on every occasion. Finally, administration of the RESTQ sometimes preceded and sometimes followed the match. This means it is not possible to ascribe any causal direction to the relationships found.
My first finding was a significant relationship between Involvement and Recovery. For now, it seems plausible that the relationship is circular: positive player attitudes lead to higher involvement on the pitch, and higher involvement leads to more positive attitudes. However, I failed to find any relationship between Involvement and the Stress scales, or between any of the RESTQ-30 scales and the quality of performance as measured by the WhoScored ratings.
My second finding was that players showed raised levels of frustration when they were not included in the starting line-up. However, there was no evidence that this frustration built up over the analysis period.
Overall, I found the results quite encouraging; although by no means conclusive, they provide evidence of the validity of psychometric instruments in a footballing population.
Psychometric instruments are not routinely used in football, but a good case can be made for doing so. It is generally accepted that there is a huge mental component in elite performance, and while enormous effort goes into monitoring the physical well-being of players, it seems that their mental well-being receives much less attention. One study, for example, found that about a quarter of young players from eight different academies showed signs of stress and burnout. It is possible that burnout could affect senior players as well, and detecting this in its early stages could allow interventions to be made. Other studies have found that tests of so-called “executive function” are able to distinguish between average and elite football players.
Psychometrics could also be used to support the personal development plans of young players, both by identifying areas of weakness, and increasing levels of self-awareness. Understanding the psychology of individual players can also assist managers to incentivise and motivate them appropriately, and build them into an effective team.
I am currently working with Dr. Chris Gibbons of the Psychometrics Centre at Cambridge University to explore the possibilities of using psychometrics and their adaptive testing platform in sports organizations, including football clubs. I think there are exciting prospects for clubs willing to explore such tools. A word of caution, though: some commercial test publishers market psychometrically unsound or inappropriate tests, so it is important to have sound, impartial advice before embarking on a program of psychometric testing.