The aim of this paper was to critically evaluate the Resilience Scale (RS). The RS is a standardized 25-item self-report assessment tool that measures the degree of individual resilience, focusing on positive psychological characteristics rather than deficits. Participants rate, on a 7-point Likert scale, how much they agree or disagree with each statement and how much they identify with it; higher scores reflect higher levels of resilience. The test authors suggest that five dimensions underpin the RS: equanimity, perseverance, meaningfulness, self-reliance, and existential aloneness, and that the scale loads onto two factors described as personal competence and acceptance of self and life. However, there is little empirical support for this conceptual framework. The tool has been translated and validated in several languages and administered to over 3 million people in 150 countries, making it the most widely used resilience measure. Nevertheless, there are questions with regard to its underlying construct and content validity, since the proposed theoretical constructs underpinning the scale are open to debate. Despite its popularity and apparent reliability, there are potential difficulties with the measure, which are presented here. Finally, it is suggested that the scale would benefit from further examination of the underlying constructs that contribute to resilience.
The origins of the concept of resilience can be found in two main bodies of literature: the physiological aspects of stress and the psychological aspects of coping (Tusaie & Dyer, 2004). Resilience has evolved from a variety of earlier concepts including hardiness (e.g., Kobasa, 1979), adaptability to change (e.g., Rutter, 1985), and the concept of ego-resilience incorporated in early personality inventories such as the MMPI (Hathaway & McKinley, 1943). Perhaps the core constructs that typically emerge across definitions are self-efficacy, adaptability, and problem-solving, none of which are used to describe the proposed dimensions within the RS. Applying new labels to pre-existing constructs is generally unhelpful in psychological measurement; for such labels to be acceptable in measurement terms, they must be empirically based.
It is evident that interest in the concept of resilience is growing. However, owing to the recognized complexity of the construct and the limited consensus among researchers on its definition and measurement, developing a single operational definition of resilience has proved challenging (Luthar & Cicchetti, 2000; Luthar et al., 2000; Wagnild, 2009; Windle et al., 2011). To tackle this issue, authors and work programs have conducted literature reviews and concept analyses to provide a benchmark that allows the concept to be operationalized and measured (Windle et al., 2011).
Notwithstanding the above, it is important to note that none of the tests scored highly in the quality assessment, suggesting that there may not be a gold standard on which to judge the quality of resilience scales. It is, however, important to note that there were substantial omissions in the ratings despite available evidence. Given this, and the fact that across measures anything from one to twelve dimensions was proposed as the basis for resilience, it seems unlikely that content or construct validity has been established in a sufficiently meaningful way to be scored within a quality assessment.
The aim of this paper is to critically evaluate the Resilience Scale. It will achieve this by first providing an overview of the measure with reference to how the measure was developed, including some initial critique points. The review will then progress to focus more specifically on the psychometric properties of the measure. This will include information regarding the level of measurement, the self-report nature of the measure, and the norms and populations used in the development of the measure. The reliability and validity of the measure will then be discussed. The article will conclude by providing suggestions regarding the clinical application of the measure with reference to the limitations outlined in the critique. Recommendations for further research will then be made.
Factors I and II contained factor loadings at 0.40 or higher, explaining a total of 44.0% of the variance (Wagnild & Young, 1993). Factor I, labeled Personal Competence, comprised 17 items and suggested self-reliance, determination, independence, mastery, invincibility, resourcefulness, and perseverance. Factor II, labeled Acceptance of Self and Life, encompassed 8 items and suggested adaptability, flexibility, balance, and a well-adjusted perspective on life. Both factors reflected, according to the authors, definitions of resilience and provided support for the construct validity of the scale. Arguably, both factors could have been labeled in various ways, and the list of suggested constructs does not appear to be supported by the item content (e.g., invincibility). Subsequent analyses (albeit not their own) suggested that the RS items constitute a unitary construct (Wagnild, 2016). It would appear that the psychometric evaluation did not support the constructs in the way the authors proposed. Given that resilience is the product of various internal and external influences, it is not surprising that a factor analysis would yield one general factor which likely reflected the core of resilience, while the second factor could be more of a trait construct showing elements of both extraversion and low neuroticism. This would be aligned with a number of studies that have shown resilience to be fundamentally related to these two core features of personality (e.g., Oshio et al., 2018). The items and their factor loadings can be seen in Table 1.
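The variance figure above can be checked with simple arithmetic. In an exploratory factor analysis of standardized items, the total variance equals the number of items, so the proportion explained by a factor set can be read off from its summed eigenvalues. The sketch below is illustrative only (the function name is ours, not the authors'); the 25-item count and the 44.0% figure are from the source.

```python
# Illustrative arithmetic, not the authors' raw data: for 25
# standardized items, total variance = 25, so two factors explaining
# 44.0% of the variance imply summed eigenvalues of about 11.0.

N_ITEMS = 25  # the RS has 25 items

def variance_explained(eigenvalue_sum: float, n_items: int = N_ITEMS) -> float:
    """Proportion of total variance explained by the retained factors."""
    return eigenvalue_sum / n_items

print(round(0.44 * N_ITEMS, 1))           # -> 11.0
print(round(variance_explained(11.0), 2))  # -> 0.44
```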
The Resilience Scale has been translated and validated in several languages including, but not limited to, Chinese, English, French, German, Greek, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Tamil, Turkish, and Urdu. Since 2006, more than 6000 researchers have requested permission to use the RS, administering it to over 3 million people around the world in 150 countries, making it the most widely used resilience measure (Wagnild, 2016, 2017). According to the author, at the time the RS was developed there was no validated resilience measure, so the RS was the first scale to measure the resilience construct (Wagnild, 2017).
Kline (1986, 2000) suggests that a good psychological test requires the following characteristics: 1) needs at least an interval scale (although this is not always achievable within psychological measurement, as scores often represent a construct that is ordinal), 2) needs to be reliable, 3) needs to be valid, 4) needs to be discriminating, and 5) needs to have appropriate normative data. Essentially, the test should measure the intended construct both accurately and consistently (Kline, 1986).
The RS is a self-report assessment, which simplifies the administration of the scale. However, this can result in limitations to the instrument such as response set bias, especially social desirability and acquiescence (Paulhus, 1984, 2017). Wagnild acknowledges that responses to the RS tend to be negatively skewed, with most participants scoring in the upper range of the scale (i.e., the maximum achievable score is 175, and the average for most samples is between 140 and 148). Moreover, it is acknowledged that the most desirable/adequate responses to the RS may be obvious to most participants. Arguably, these limitations may be due less to social desirability than to a flaw in the item response format: options 1 to 3 all represent disagreement, and for the items in the test the levels of disagreement appear to be viewed as essentially the same by respondents, with no clear discrimination between the lower levels. There are likely no plausible degrees of disagreement for the test items, and half of the Likert response options could be considered redundant. The other issue is that all items are worded positively and keyed in the same direction, which means the scale is particularly vulnerable to the effects of an acquiescence response bias (Paulhus, 2017; Wagnild, 2009).
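The scoring arithmetic behind these figures can be made explicit. The sketch below assumes simple summation of the 25 item responses (the item count, 1-7 response range, maximum of 175, and typical sample means of 140-148 are from the source; the function name is illustrative). It also shows that those sample means correspond to average item responses of roughly 5.6 to 5.9, i.e. the upper-range, negatively skewed responding described above.

```python
# Minimal sketch of the RS scoring arithmetic: 25 items, each rated 1-7,
# summed to a total score. Higher totals indicate higher resilience.

N_ITEMS = 25
MIN_RESPONSE, MAX_RESPONSE = 1, 7

def total_score(responses):
    """Sum of the 25 item responses (assumed simple summation)."""
    assert len(responses) == N_ITEMS
    assert all(MIN_RESPONSE <= r <= MAX_RESPONSE for r in responses)
    return sum(responses)

# Score bounds: 25 to 175.
print(total_score([MIN_RESPONSE] * N_ITEMS))  # -> 25
print(total_score([MAX_RESPONSE] * N_ITEMS))  # -> 175

# Sample means of 140-148 imply average item responses of ~5.6 to 5.92
# on the 7-point scale, consistent with the reported negative skew.
print(round(140 / N_ITEMS, 2), round(148 / N_ITEMS, 2))  # -> 5.6 5.92
```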
In order to overcome these biases, Wagnild (2009) suggests rewording statements and negatively keying some of the current items. Wagnild further advises that revising the current response set to a forced-choice format might also minimize some of these response biases. For example, instead of allowing seven options (including a neutral response), there could be only four possible options for each statement, thus forcing the participant to endorse one side of the statement. Lastly, ensuring anonymity is necessary to reduce the likelihood of social desirability bias occurring (Wagnild, 2009).
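One way to picture the suggested revision is as a collapse of the 7-point scale onto a 4-point forced-choice scale with no neutral midpoint. The mapping below is purely hypothetical (the source does not prescribe a specific mapping, and the function name is ours); it illustrates that a neutral response has no target under a forced-choice format and would have to be re-elicited.

```python
# Hypothetical sketch of a forced-choice revision: map a 1-7 Likert
# response onto a 1-4 scale with the neutral midpoint (4) removed.
# This mapping is one possible collapse, not prescribed by Wagnild (2009).

def collapse_to_four(response: int) -> int:
    """Map 1-2 -> 1, 3 -> 2, 5 -> 3, 6-7 -> 4; reject the neutral 4."""
    if response == 4:
        raise ValueError("forced-choice format has no neutral option")
    return {1: 1, 2: 1, 3: 2, 5: 3, 6: 4, 7: 4}[response]

print(collapse_to_four(2))  # -> 1
print(collapse_to_four(7))  # -> 4
```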
In the development of psychometric measures, normative populations or references are useful for researchers and practitioners to interpret the meaning of the individual scores. Moreover, the norms describe the range of scores that should be expected from the population being tested (Kline, 2000). Without norms, the interpretation at individual and group levels becomes meaningless (Kline, 2000).
In a review of 12 studies using the RS, the author of the scale concluded that the scale had been used with a variety of age groups ranging from adolescents to older adults (16 to 103 years old) (Wagnild, 2009). She reported that, in all studies, there were no age-related differences in RS scores and that the predominant group studied was European American, highlighting the need to study the RS with respect to race and ethnicity (Wagnild, 2009).
Face validity concerns the extent to which a test appears to be measuring what it claims to measure (Kline, 2000). Clear wording (designed to be easy to understand for the intended population to be tested) can improve the face validity of a test. In contrast, if items are too complex, participants may be discouraged and disengage from completing the measure (Kline, 2000).