"Deliberate Practice and Performance in Music, Games, Sports,
Education, and Professions: A Meta-Analysis"
https://dl.dropboxusercontent.com/u/182368464/2014-macnamara.pdf ;
Macnamara, Hambrick, Oswald 2014:
> More than 20 years ago, researchers proposed that individual differences in performance in such domains as music, sports, and games largely reflect individual differences in amount of "deliberate practice", which was defined as engagement in structured activities created specifically to improve performance in a domain. This view is a frequent topic of popular-science writing, but is it supported by empirical evidence? To answer this question, we conducted a meta-analysis covering all major domains in which deliberate practice has been investigated. We found that deliberate practice explained 26% of the variance in performance for games, 21% for music, 18% for sports, 4% for education, and less than 1% for professions. We conclude that deliberate practice is important, but not as important as has been argued.
>
> ...Ericsson et al. (1993) concluded that "high levels of deliberate practice are necessary to attain expert level performance" and added, "Our theoretical framework can *also provide a sufficient account* [emphasis added] of the major facts about the nature and scarcity of exceptional performance. Our account does not depend on scarcity of innate ability (talent) . . ." (p. 392). They continued, "We argue that the differences between expert performers and normal adults reflect a life-long period of deliberate effort to improve performance in a specific domain" (p. 400). Ericsson (2007) reiterated this perspective when he claimed that "the distinctive characteristics of elite performers are adaptations to extended and intense practice activities that selectively activate dormant genes that all healthy children's DNA contain[s]" (p. 4). The deliberate-practice view has inspired a great deal of interest in expert performance. A Google Scholar search in April 2014 showed that the article by Ericsson et al. (1993) has been cited more than 4,200 times (
> http://scholar.google.com/scholar?cites=11519303805153777449&as_sdt=20000005&sciodt=0,21&hl=en), and their research has been discussed in a number of popular books, including Gladwell's (2008) _Outliers_, Levitt and Dubner's (2009) _SuperFreakonomics_, and Colvin's (2008) _Talent Is Overrated_. Ericsson et al.'s findings were also the inspiration for what Gladwell termed the "10,000-hour rule": the idea that it takes 10,000 hr of practice to become an expert.
> At the same time, the deliberate-practice view has been sharply criticized in the scientific literature. Gardner (1995) commented that the view requires a "blindness . . . to decades of psychological theorizing" (p. 802), and Sternberg (1996) observed that "deliberate practice may be correlated with success because it is a proxy for ability: We stop doing what we do not do well and feel unrewarded for" (p. 350). Anderson (2000) stated that "Ericsson and Krampe's research does not really establish the case that a great deal of practice is sufficient for great talent" (p. 324), and Marcus (2012) concluded that "it would be a logical error to infer from the importance of practice that talent is somehow irrelevant, as if the two were in mutual opposition" (p. 94).
> Furthermore, although deliberate practice is important, growing evidence indicates that it is not as important as Ericsson and colleagues (Ericsson, 2007; Ericsson et al., 1993; Ericsson & Moxley, 2012) have argued. Gobet and Campitelli (2007) found a large amount of variability in total amount of deliberate practice even among master-level chess players, from slightly more than 3,000 hr to more than 23,000 hr. In a recent reanalysis of previous findings, Hambrick et al. (2014) found that deliberate practice accounted for about one third of the reliable variance in performance in chess and music. Thus, in these domains, a large proportion of the variance in performance is explainable by factors other than deliberate practice.
>
> ...Our meta-analysis is a broad investigation of studies relevant to the deliberate-practice view. It is the first formal meta-analysis of the relationship between deliberate practice and human performance, and we cover all major domains in which this relationship has been studied: music, games, sports, professions, and education. Our first goal was to estimate the overall correlation between amount of deliberate practice and performance.
> ...Our second goal was to investigate factors that might moderate the relationship between deliberate practice and performance. The first set of factors, which we term theoretical moderators, included domain (music, games, sports, professions, or education) and predictability of the task environment (i.e., the degree to which the task environment can change while the performer is planning and executing an action and the range of possible actions)... The second set of factors, which we term methodological moderators, included (a) the method used to assess deliberate practice: retrospective questionnaire, retrospective interview, or log; and (b) the method used to assess performance: expert rating of performance, standardized objective measure of performance (e.g., chess rating), group membership (e.g., amateur vs. professional), or performance on an objectively scored laboratory task. When a retrospective method is used to assess deliberate practice (questionnaire or interview), participants are asked to recall and estimate their past engagement in deliberate practice. By contrast, when the log method is used, deliberate practice is recorded on an ongoing basis, either by the participant in a diary or by a computer.
>
> ...Our search and e-mail request yielded 9,331 potentially relevant articles. After examining these articles and discarding irrelevant ones (e.g., literature reviews, commentaries), we identified 88 studies that met all the inclusion criteria. We coded each study and the measures collected in it for reference information, methodological characteristics, and results (the data file is openly available at
> https://osf.io/rhfsk ). These studies included 111 independent samples, with 157 effect sizes and a total sample size of 11,135 participants. For a list of studies included in the meta-analysis, see the Supplemental Method and Results in the Supplemental Material available online.
>
> ...The meta-analysis involved four steps. The first step was to obtain correlations between time spent in one or more activities interpretable as deliberate practice and performance, along with their sampling error variances. The second step was to search for extreme values. One effect size exceeded 1.0 (r = 1.15); we judged this effect size to be invalid and deleted it. There also were four outliers, effect sizes whose residuals had z scores of 3 or greater (rs = .91, .90, .90, and .84); we Winsorized these values to z scores equaling 2.99 (rs = .83, .83, .84, and .83, respectively). The third step was to estimate overall effects and heterogeneity in the effect sizes using random-effects meta-analysis modeling, and then to test whether some of the heterogeneity was predictable from moderator variables using mixed-effects meta-analysis modeling. The final step was to perform publication-bias analyses. We used the Comprehensive Meta Analysis (Version 2; Biostat, Englewood, NJ) software package to conduct the meta-analyses and publication-bias analyses. (See also Methodological Details and Screen Shots of Results, Figs. S3-S16, in the Supplemental Method and Results in the Supplemental Material.)
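For readers unfamiliar with the third step, here is a rough Python
sketch of random-effects pooling of correlations. It is not the
authors' code (they used the Comprehensive Meta Analysis package); I
am assuming the common DerSimonian-Laird estimator applied to
Fisher-z-transformed correlations, and the five correlations and
sample sizes at the bottom are made up purely for illustration. The
function and variable names are my own.

```python
import math

def random_effects_meta(rs, ns):
    """DerSimonian-Laird random-effects pooling of correlations via Fisher's z.

    rs: sample correlations; ns: sample sizes (each must exceed 3).
    Returns (pooled r, tau^2 on the z scale, I^2 in percent).
    """
    zs = [math.atanh(r) for r in rs]      # Fisher z transform of each r
    vs = [1.0 / (n - 3) for n in ns]      # sampling variance of each z
    ws = [1.0 / v for v in vs]            # fixed-effect (inverse-variance) weights
    sum_w = sum(ws)
    z_fixed = sum(w * z for w, z in zip(ws, zs)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, zs))
    df = len(rs) - 1
    c = sum_w - sum(w * w for w in ws) / sum_w
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance
    ws_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(w * z for w, z in zip(ws_re, zs)) / sum(ws_re)
    return math.tanh(z_re), tau2, i2

# HYPOTHETICAL inputs, not taken from the paper's data file
example_rs = [0.10, 0.35, 0.60, 0.45, 0.20]
example_ns = [120, 80, 60, 150, 200]
pooled_r, tau2, i2 = random_effects_meta(example_rs, example_ns)
print(f"pooled r = {pooled_r:.2f}, tau^2 = {tau2:.3f}, I^2 = {i2:.1f}%")
```

An I^2 near zero would mean the studies disagree no more than sampling
error predicts; a high value says that most of the spread is real
between-study heterogeneity, which is what motivates moderator
analyses.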
>
> ...Results
> Figure 2 shows that nearly all correlations between deliberate practice and performance were positive: High levels of deliberate practice were associated with high levels of performance. Of the small number of negative correlations (10 of 157), only 2 (< 1.5% of all correlations) were statistically significant (p < .05).
> The meta-analytic average correlation between deliberate practice and performance was .35, 95% confidence interval (CI) = [.30, .39], which indicates that deliberate practice explained 0.35^2 = 12% of the variance in performance, 95% CI = [9%, 15%]; thus, 88% of the variance was unexplained. However, as indicated by the I^2 statistic, which specifies the percentage of the between-study variability in effect sizes that is due to heterogeneity rather than random error, there was a high degree of heterogeneity in the effect sizes, I^2 = 84.90. We investigated the source of this heterogeneity through the moderator analyses reported next.
> Moderator analyses
> Theoretical moderators. Domain was a statistically significant moderator, Q(4) = 49.09, p < .001. Percentage of variance in performance explained by deliberate practice was 26% for games (r = .51, p < .001), 21% for music (r = .46, p < .001), 18% for sports (r = .42, p < .001), 4% for education (r = .21, p < .001), and less than 1% for professions (r = .05, p = .62; see Fig. 3). Predictability of the task environment was also a statistically significant moderator, Q(1) = 20.49, b = 0.14, T^2 = .05, p < .001. As hypothesized, the percentage of variance in performance explained by deliberate practice was largest (24%) for activities high in predictability (r = .49), intermediate (12%) for activities moderate in predictability (r = .35), and smallest (4%) for activities low in predictability (r = .21; see also Fig. S1 in the Supplemental Method and Results in the Supplemental Material).
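The percentages quoted so far are just the squared correlations. A
minimal Python check follows; only the r values come from the
excerpt, while the dictionary and function names are mine.

```python
# Correlations as reported in the excerpt; the labels are illustrative only.
reported_r = {
    "overall": 0.35,
    "games": 0.51,
    "music": 0.46,
    "sports": 0.42,
    "education": 0.21,
    "professions": 0.05,
    "high predictability": 0.49,
    "low predictability": 0.21,
}

def variance_explained(r: float) -> float:
    """Percentage of variance explained by a correlation r (100 * r squared)."""
    return 100 * r ** 2

# Round to whole percentages, as the paper reports them
percent = {name: round(variance_explained(r)) for name, r in reported_r.items()}

for name, pct in percent.items():
    print(f"{name}: r = {reported_r[name]:.2f} -> {pct}% of variance")
```

Professions comes out as 0.25%, which rounds to 0 here and matches the
"less than 1%" in the text.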
> Methodological moderators. The method used to assess deliberate practice was a statistically significant moderator, Q(2) = 16.19, p < .001. The percentage of variance in performance explained by deliberate practice was 20% for studies that used a retrospective interview (r = .45, p < .001), 12% for studies that used a retrospective questionnaire (r = .34, p < .001), and 5% for studies that used a log method (r = .22, p < .001).
[If I'm understanding this correctly, this finding is very troubling:
a log should be more accurate than a consistent, structured
retrospective questionnaire, and a structured questionnaire more
accurate than a freeform retrospective interview; yet the more
accurate the method, the weaker the correlation between deliberate
practice and performance! This suggests that researcher allegiance
may be playing a role too: interviews offer more scope for
unconscious bias to skew the numbers than a questionnaire does, and
likewise a questionnaire more than a subject-kept log. Possibly there
is also bias from the subjects themselves, as the experts play up how
'meritocratic' the system is and how they worked *sooo* hard while
trying to become experts and *definitely* deserve it and weren't
talented or naturally gifted or anything like that.]
> ...We ran three additional models. The first model excluded the 38 effect sizes for team sports, leaving 119 effect sizes (games: 11, music: 28, individual sports: 22, education: 51, professions: 7). We ran this model because interpretation of correlations between deliberate practice and performance in team sports is complicated by the fact that an individual's performance is not independent of the team's performance (Hutchinson, Sachs-Ericsson, & Ericsson, 2013). The overall percentage of variance explained by deliberate practice was 11% in this model (games: 26%, music: 21%, sports: 19%, and education: 4%, all p's < .001; professions: < 1%, p = .62). The second model included only the 59 effect sizes for solitary deliberate practice (games: 6; music: 9; sports: 14; education: 30; professions: 0). We tested this model to address the question of whether deliberate practice must be performed in isolation to be maximally effective (Charness, Tuffiash, Krampe, Reingold, & Vasyukova, 2005; Ericsson et al., 1993). The overall percentage of variance explained by deliberate practice was 11% in this model (games: 23%; music: 23%; sports: 22%; and education: 3%; all p's < .001), which indicates that solitary deliberate practice is not a stronger predictor of performance than deliberate practice with other people. The third model included only the 53 effect sizes for solitary deliberate practice available after excluding effect sizes for team sports (games: 6; music: 9; individual sports: 8; education: 30). The overall percentage of variance explained by deliberate practice was 10% in this model (games: 23%; music: 23%; sports: 28%; and education: 3%; all p's < .001). Thus, results of the additional analyses were similar and consistent with the overall analysis, indicating that deliberate practice explained a considerable amount of the variance in performance, but a large amount of the variance remains unexplained.
>
> ...We first inspected a funnel plot depicting the relationship between standard error and effect size; it was approximately symmetrical, suggesting that smaller-sample studies with weak effect sizes were not missing from our meta-analysis (see Fig. S2 in Additional Publication-Bias Analyses in the Supplemental Method and Results in the Supplemental Material).
>
> ...Moderator analyses further revealed that the effect of deliberate practice on performance tended to be larger for activities that are highly predictable (e.g., running) than for activities that are less predictable (e.g., handling an aviation emergency), as we hypothesized. Furthermore, the effect of deliberate practice on performance was stronger for studies that used retrospective methods to elicit estimates of deliberate practice than for those that used a log method. In fact, for studies using the log method, which presumably yields more valid estimates than retrospective methods do, deliberate practice accounted for only 5% of the variance in performance. This finding suggests that the use of what Ericsson (2014) termed a "high-fidelity" (p. 13) approach to assessing deliberate practice (e.g., video monitoring) might reveal that the relationship between deliberate practice and performance is weaker than the results of this meta-analysis indicate. Finally, the relationship between deliberate practice and performance was weaker for studies that used a standardized objective measure of performance (e.g., chess rating) than for studies that used group membership as the measure of performance.
>
> ...We did not correct individual effect sizes for the attenuating effect of measurement error (i.e., measurement unreliability), because very few studies in the meta-analysis reported a reliability estimate for both deliberate practice and performance. However, measures of both deliberate practice and performance are typically found to have acceptable or better reliability (≥ .70). For example, Tuffiash et al. (2007) stated that test-retest reliabilities for self-report practice estimates in sports and music are typically at or above .80, and Hambrick et al. (2014) found reliability of .91 for chess ratings. Furthermore, the percentage of variance in performance explained by deliberate practice is smaller than the percentage of variance not explained by deliberate practice across a wide range of reliability assumptions (see Table S1 in the Supplemental Method and Results in the Supplemental Material). For example, if it is assumed that reliability of both deliberate practice and performance is .80, the mean overall correlation between deliberate practice and performance is .43 after correction for unreliability. This correlation indicates that deliberate practice accounts for 19% of the reliable variance and that 81% of the reliable variance is potentially explainable by other factors; corresponding percentages of variance explained are 41% for games, 33% for music, 28% for sports, 7% for education, and less than 1% for professions.
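The reliability correction can be reproduced with the standard
correction for attenuation (Spearman's formula): divide the observed
correlation by the square root of the product of the two
reliabilities. A small Python sketch, with names of my own choosing
and the rounded correlations from the excerpt as inputs:

```python
import math

def disattenuate(r: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures."""
    return r / math.sqrt(rel_x * rel_y)

# Observed (rounded) correlations as reported in the excerpt
observed_r = {
    "overall": 0.35,
    "games": 0.51,
    "music": 0.46,
    "sports": 0.42,
    "education": 0.21,
    "professions": 0.05,
}

reliability = 0.80  # the paper's example assumption for both measures

corrected = {
    name: disattenuate(r, reliability, reliability)
    for name, r in observed_r.items()
}
# Percentage of reliable variance explained after the correction
corrected_pct = {name: 100 * rc ** 2 for name, rc in corrected.items()}

for name in observed_r:
    print(f"{name}: corrected r = {corrected[name]:.3f}, "
          f"{corrected_pct[name]:.1f}% of reliable variance")
```

Starting from the rounded r = .35 this gives a corrected overall
correlation of about .44 rather than the quoted .43; my guess (the
excerpt does not say) is that the paper corrected the unrounded
estimate. The percentages match the quoted ones after rounding: about
19% overall, 41% games, 33% music, 28% sports, 7% education, and
under 1% for professions.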
>
> ...Ericsson and his colleagues' (1993) deliberate-practice view has generated a great deal of interest in expert performance, but their claim that individual differences in performance are largely accounted for by individual differences in amount of deliberate practice is not supported by the available empirical evidence. An important goal for future research on expert performance is to draw on existing theories of individual differences (e.g., Ackerman, 1987; Gagné, 2009; Schmidt, 2014; Simonton, 2014) to identify basic abilities and other individual difference factors that explain variance in performance and to estimate their importance as predictor variables relative to deliberate practice.
--
gwern
http://www.gwern.net