Thanks likeprestige.
On Thu, Oct 7, 2010 at 5:25 PM, Reece <rock...@hotmail.com> wrote:
The first study covered in the paper seems to've been the same study
covered by the _Psychonomics_ poster & cited in Jaeggi 2009
(http://www.gwern.net/N-back%20FAQ.html#jaeggi-2009)
This one included many details missing from the poster, including the
crucial details about how the matrix tests were administered.
They were administered speeded *again* (10min). But at least this time
there's a better attempt at justification for speeding the APM, though
the BOMAT comes with the same lame time excuse.
The second study is better in this respect - 16 minutes for BOMAT
compared to 10.
But I wonder if I'm interpreting the charts on page 7 correctly? It
seems that in study 2, each section of the APM had 18 questions and no
one answered more than 14 or 15 questions at the start, but after
n-back training both n-back groups had at least one person answer all
18 questions. Given that the score increase for the APM was only
around 1 or 2 questions... The same seems true of the BOMAT, but it
had more questions, so no one ever answered them all. Do the data
distinguish between subjects becoming more accurate on the questions
they answer, and subjects simply answering more questions with a
similar level of accuracy?
So an interesting paper as usual, but I am very disappointed that the
speed question seems to remain unanswered.
--
gwern
http://www.gwern.net/
I don't know how you can ask this given how many times we've argued
over this and my careful explanation of my interpretation of the data.
Study 1 has the IQ tests speeded and provides few details about
before-and-after scores, so I don't criticize it.
Study 2 has the IQ tests still speeded, but provides some more
details. Before training, the fastest person could not finish the APM
questions, and so the # of correct answers is necessarily low. After
training, the fastest person finished all the APM questions, and the #
of correct answers went up. Similarly for the BOMAT.
Did those people get smarter - become more likely to answer correctly
on the very hardest questions - or did they just get faster, and
would they have had nigh-identical before-and-after scores if they
had had time to answer or guess at every question provided? As
presented, we don't know.
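To make the ambiguity concrete, here is a toy sketch (all parameters
hypothetical - the item count, time limit, per-item times, and
accuracy are made up for illustration) of how a pure speed gain can
inflate the raw score on a time-limited test even when accuracy is
unchanged:

```python
def expected_score(accuracy, seconds_per_item, n_items=18, time_limit=600):
    """Expected raw score on a speeded test: credit only for items
    both reached within the time limit and answered correctly."""
    reached = min(n_items, time_limit // seconds_per_item)
    return reached * accuracy

# Same underlying accuracy (0.7) before and after "training";
# only the per-item speed changes:
before = expected_score(0.7, seconds_per_item=45)  # too slow to finish
after = expected_score(0.7, seconds_per_item=30)   # reaches all 18 items
print(before, after)  # roughly 9.1 vs 12.6
```

The ~3.5-point gain here comes entirely from reaching 5 more items,
not from any change in per-item accuracy - which is exactly the
confound an unspeeded administration would remove.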
I don't think speed on an IQ test is the same thing as intelligence; I
think speed and accuracy are two different things (although closely
related), and it is possible to intervene to improve speed but not
accuracy. Yes, they mentioned a study finding that speeded APM was
comparable to unspeeded APM - but that was not in the context of a
task which may be improving speed! To reuse my analogy, it's like saying that the
verbal subtest of Wechsler is well-correlated with the rest of it
(true), therefore we can save time by only administering that subtest
and then we prove that reading the dictionary vastly increases IQ
(false).
The authors did not cite Moody, though they apparently are aware of
the issue; but the best approach would have been simply not to speed
the tests! This is getting very frustrating.
--
gwern
http://www.gwern.net/
I've never suggested a conspiracy; I don't know why you bring the idea
up. I don't speculate about the inner life of Jaeggi and her coworkers
- I just note that the data do not say to me what they say to her.
Even if I thought this persistent use of speeding were non-kosher, I
still don't need to postulate a conspiracy or some bizarre misfiring
of the peer review process. Peer review is not perfect; frequent
failure is baked into it just by significance alone (5% = at least 5
out of 100 studies are completely wrong, no?), peer review is weak
against collusion/dishonesty (http://arxiv.org/abs/1008.4324v1), and
in sum, *most* findings may be wrong as
http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124
provocatively concludes. I take to heart the reminder of my professor
that when 2 psych studies agree, one is wrong.
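For what the base-rate arithmetic looks like, here's a
back-of-the-envelope sketch in the spirit of that Ioannidis paper
(the 10% prior and 80% power are assumed for illustration, not taken
from any study):

```python
alpha, power = 0.05, 0.80   # conventional significance level and power
true_rate = 0.10            # assume only 10% of tested hypotheses are true

false_positives = (1 - true_rate) * alpha  # nulls that pass by chance
true_positives = true_rate * power         # real effects detected
share_wrong = false_positives / (false_positives + true_positives)
print(share_wrong)  # ~0.36: over a third of "significant" findings false
```

Under those assumptions, more than a third of the published positive
results are wrong before counting any collusion, p-hacking, or
publication bias at all.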
--
gwern
http://www.gwern.net/
I assume you mean 'how much of the additional ~2 correct answers is
explained by speed improvements'. Anywhere from 0 to ~2...
--
gwern
http://www.gwern.net/
So you mean variation from person to person? I have no idea how much
is due to speed, and I don't think it matters very much. The question
is how much of the post-n-backing increase is due to speed increase.
--
gwern
http://www.gwern.net/
And the verbal subtest has a 0.x correlation with the rest of the
Wechsler test...
> Meaning that if I improve my speed with a
> factor 3 I would be able to increase my score by 30% that was the
> average improvement.
Really, you're inferring such a causation from your correlations?
> Given the large amount of people taking the test
> wouldn't there be some people who has optimal speed before training
> and hence don't show any improvement? Why do we not see these cases?
Why do none of them seem to finish all the questions in the pre-test?
After all, there should be some people who have optimal speed before
as well as after, if n-backing is improving IQ and not speed.
--
gwern
http://www.gwern.net/
This is what you are claiming: 10-min rapm correlates with IQ only if
you haven't trained on speed. Then you suggest that speed training
increases the score but the test variance doesn't depend on speed.
Isn't this a contradiction. The raw performance can't both depend on
speed and not. The conclusion is that speed rapm doesn't depend on
speed but rather other functions such as attentional control, updating
executive function and so on.
One is increasing speed all over the place - speed of switching one's
attention around, of refocusing on a new round, of fixating on the
squares, and so on.
> This is what you are claiming: 10-min rapm correlates with IQ only if
> you haven't trained on speed. Then you suggest that speed training
> increases the score
I followed you up to here, but past this I don't know what you're talking about.
> but the test variance doesn't depend on speed.
> Isn't this a contradiction. The raw performance can't both depend on
> speed and not. The conclusion is that speed rapm doesn't depend on
> speed but rather other functions such as attentional control, updating
> executive function and so on
--
gwern
http://www.gwern.net/
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2630965/
From a quick look, this is again a correlation. Again, my vocab
example is very simple and very real.
DNB research ought to be held to the highest standards. We all know
how very important IQ is, how it affects every part of life. If DNB
genuinely could boost IQ by ~10 points reliably in all sorts of
people, this would be one of the most important discoveries ever in
psychology. This would be the sort of thing you change society for,
and with: it would be educational malpractice not to have an hour or
two a week (at least!) of n-backing. Not to do so would be the
equivalent of not having fluoride in the drinking water, iodide in the
salt, or vitamin D in the milk. And so on.
Sloppy practices like speeding make me very angry: if DNB is for real,
it delays any general adoption, it delays any sensible cost-benefit
calculation, and it wastes our time discussing whether the speeding is
a major issue or not. And obviously if DNB is not for real, it delays
discussion and assessment of what DNB *is* good for. (I suspect that a
good fraction of n-backers would not be n-backing if all it did was
increase WM.)
--
gwern
http://www.gwern.net/
I suggest you study the completion frequencies for RAPM problems in
the article I will upload in a second, "working memory capacity and
fluid intelligence abilities."
Here are a few key facts:
The correlation between Gf and speed is around 0.2.
The correlation between RAPM and speeded RAPM is a lot higher -
meaning that there has to be something else going on.
The difficulty of the problems is not linearly increasing; they are
roughly equal in load, except for the last one, which shows a 5%
completion rate. That the later items were more difficult was the
whole rationale for not accepting a speeded RAPM. I agree that if
you give someone "easy questions" they would not be able to reach
their "true level", and therefore the test would correlate with
speed instead of IQ; but since the correlation is so much higher
between RAPM and speeded RAPM than with speed, this cannot be the
case.
That's all I am saying.
All this taken together, RAPM might very well be suited for training
studies, since it's a test that depends on accuracy over several
equally difficult problems (more or less). Attentional control might
very well make people more accurate, and hence able to solve more
problems correctly, thereby increasing the score. I believe this is
the case, because the speed theory doesn't hold.
I think the whole problem is that some have a very emotionally loaded
view of intelligence; it's a matter into which we project a lot of
our own beliefs and values. Yet Gf tests usually deal with low-level
working-memory operations and rules applied to visual figures. It's
not everything that makes us good humans, but rather a small fragment
that in many cases doesn't play that great a role compared to other
human qualities.
Think about it: the tests were developed in the '50s from intuition.
It's only in recent years that people have tried to understand
intelligence as a "biological computer", while the rest still hold
abstract philosophical discussions of its nature and correlations.
Naturally, people like Mensa would not welcome an increase in IQ of
10-20 points; it would ruin the whole society. It's almost as if we
were forbidden to understand intelligence and what IQ tests measure.
Another interesting article that will, perhaps, strike a little closer
to home because it discusses problems in psychology:
"The Truth Wears Off: Is there something wrong with the scientific method?"
http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer?currentPage=all
> But the data presented at the Brussels meeting made it clear that something strange was happening: the therapeutic power of the drugs appeared to be steadily waning. A recent study showed an effect that was less than half of that documented in the first trials, in the early nineteen-nineties. Many researchers began to argue that the expensive pharmaceuticals weren’t any better than first-generation antipsychotics, which have been in use since the fifties. “In fact, sometimes they now look even worse,” John Davis, a professor of psychiatry at the University of Illinois at Chicago, told me.
...
> [Psychology professor] Schooler tried to put the problem out of his mind; his colleagues assured him that such things happened all the time. Over the next few years, he found new research questions, got married and had kids. But his replication problem kept on getting worse. His first attempt at replicating the 1990 study, in 1995, resulted in an effect that was thirty per cent smaller. The next year, the size of the effect shrank another thirty per cent. When other labs repeated Schooler’s experiments, they got a similar spread of data, with a distinct downward trend. “This was profoundly frustrating,” he says. “It was as if nature gave me this great result and then tried to take it back.” In private, Schooler began referring to the problem as “cosmic habituation,” by analogy to the decrease in response that occurs when individuals habituate to particular stimuli. “Habituation is why you don’t notice the stuff that’s always there,” Schooler says. “It’s an inevitable process of adjustment, a ratcheting down of excitement. I started joking that it was like the cosmos was habituating to my ideas. I took it very personally.”
...
> The craziness of the hypothesis was the point: Schooler knows that precognition lacks a scientific explanation. But he wasn’t testing extrasensory powers; he was testing the decline effect. “At first, the data looked amazing, just as we’d expected,” Schooler says. “I couldn’t believe the amount of precognition we were finding. But then, as we kept on running subjects, the effect size”—a standard statistical measure—“kept on getting smaller and smaller.” The scientists eventually tested more than two thousand undergraduates. “In the end, our results looked just like Rhine’s,” Schooler said. “We found this strong paranormal effect, but it disappeared on us.”
...
> Then the theory started to fall apart. In 1994, there were fourteen published tests of symmetry and sexual selection, and only eight found a correlation. In 1995, there were eight papers on the subject, and only four got a positive result. By 1998, when there were twelve additional investigations of fluctuating asymmetry, only a third of them confirmed the theory. Worse still, even the studies that yielded some positive result showed a steadily declining effect size. Between 1992 and 1997, the average effect size shrank by eighty per cent.
>
> And it’s not just fluctuating asymmetry. In 2001, Michael Jennions, a biologist at the Australian National University, set out to analyze “temporal trends” across a wide range of subjects in ecology and evolutionary biology. He looked at hundreds of papers and forty-four meta-analyses (that is, statistical syntheses of related studies), and discovered a consistent decline effect over time, as many of the theories seemed to fade into irrelevance. In fact, even when numerous variables were controlled for—Jennions knew, for instance, that the same author might publish several critical papers, which could distort his analysis—there was still a significant decrease in the validity of the hypothesis, often within a year of publication. Jennions admits that his findings are troubling, but expresses a reluctance to talk about them publicly.
...
> In recent years, publication bias has mostly been seen as a problem for clinical trials, since pharmaceutical companies are less interested in publishing results that aren’t favorable. But it’s becoming increasingly clear that publication bias also produces major distortions in fields without large corporate incentives, such as *psychology* and ecology.
...
> Between 1966 and 1995, there were forty-seven studies of acupuncture in China, Taiwan, and Japan, and every single trial concluded that acupuncture was an effective treatment. During the same period, there were ninety-four clinical trials of acupuncture in the United States, Sweden, and the U.K., and only fifty-six per cent of these studies found any therapeutic benefits.
(Emphasis added.)
--
gwern
http://www.gwern.net