Discrepancy GLMFlex result - classical stat analysis on extracted values


Thierry Chaminade

Nov 6, 2013, 9:52:45 AM11/6/13
to fmri_mat...@googlegroups.com, Dietrich Stout
Hello,

Sorry to bother you again. For the last couple of weeks I have been wrestling with a troubling result that I have failed to understand.

I ran GLMFlex on a complex set of data; the results identify some clusters, and I extracted estimates from these clusters to run further analyses in classical statistical packages. In all cases but one, the results of these analyses and the GLMFlex analyses matched perfectly.

BUT: I've had one cluster that was reported as significant for a two-level factor (a simple t-test) by GLMFlex, but that I have never found to be even close to significance once the data was extracted. Suspecting I did something wrong, I tried zillions of things (changing the extraction procedure using "Load Plot Data" in FIVE or MarsBar, using peak vs. cluster vs. sphere, running the stats in SPSS or Matlab, removing outliers, playing with hypotheses about independence, including some or all factors, etc.) but never got anywhere close to significance. The p-values are between 0.5 and 0.9, never smaller, and all other results are relatively constant across all these attempts. In other words, whatever I try with this data, I get reproducible results that don't match the GLMFlex result.

I want to stress again: I've run this procedure on GLMFlex results, including this particular analysis, several times (if only to double-check results and to plot with other tools) without any hitch. But I've been struggling for hours, even days, on this one and couldn't figure out the whys and hows.

If you have any suggestions for things I could try to understand this mismatch, I would be more than interested.

Thanks in advance and again for your help,

Thierry C. 


Aaron Schultz

Nov 6, 2013, 9:08:56 PM11/6/13
to fmri_mat...@googlegroups.com
Can you send me an example VOI structure from FIVE, saved as a .mat file?

-Aaron

Thierry Chaminade

Nov 12, 2013, 9:40:48 AM11/12/13
to fmri_mat...@googlegroups.com
Dear Aaron and Donald,

Sorry, I was out of the office most of last week and couldn't continue the discussion.

I extracted the data again and attach the VOI file as IPL.mat. I also attach the experimental plan (I.mat). I tried attaching the contrast image file (T_Tech.nii) but the message was too large.

I just want to stress again: I am surprised because this is the first time I have seen a discrepancy between GLMFlex and an external analysis of data in a cluster identified by GLMFlex. In other words, it worked 100% of the time, including for other clusters in the same contrast, until this cluster.

Thanks a lot for your help, Thierry C.
IPL.mat
I.mat

Aaron Schultz

Nov 14, 2013, 6:40:51 PM11/14/13
to fmri_mat...@googlegroups.com
I think it may depend on how the model is run. Can you provide some
more information regarding what stats package you are using to
validate, and what analysis is being performed?

I was able to duplicate the effects using R. (estimateGLM should give results identical to GLM_Flex, and with a VOI structure can be run as: estimateGLM(VOI.data,VOI.F);.) The results reported by estimateGLM.m are identical to running the following model in R:

mod = aov(Y~(F1*F2*F3)+F4 + Error(Subj/((F1*F2*F3)+F4)), data = dat);
summary(mod)

Because all the factors are within-subject and the design is perfectly
balanced, the effect for factor 3 is identical to running the
following test:

mod = aov(Y~F3 + Error(Subj/F3), data = dat)
summary(mod)

There should also be a way of getting the same result with an LME, but
I need to refresh my memory on how to specify repeated-measures ANOVAs as LMEs.

I'm interested to hear what analyses you have been running that are
not matching up.


-Aaron





Thierry Chaminade

Nov 15, 2013, 5:19:25 AM11/15/13
to fmri_mat...@googlegroups.com, Dietrich Stout
Thanks,

I've tried estimateGLM on the IPL.mat VOI data and, indeed, the results match the whole-brain GLMFlex ones (see #1 below). But I find the very high F and very low error term for Factor 3, the one I could not reproduce, very surprising, to say the least.

I copied the exact same data from Matlab to SPSS, where I ran a standard ANOVA (#2) as well as a mixed-effect ANOVA (#3). In analysis #2, the SS for the factors are the same, but the error SS is the same for all factors.

I also tried anovan in Matlab, which, surprisingly, produced results very close to estimateGLM (#4).

I couldn't run R because I don't use it and haven't had the time to get into it on such short notice.

I guess at this point the questions become:
-do the SPSS ANOVA and mixed-effect ANOVA on the one hand, and anovan and estimateGLM on the other, use different error terms that would explain the discrepancy between their results?
-according to you, why would estimateGLM be preferable?
-and in the precise case that brought me to you first, don't you find the very low error term for the third factor suspect?
                                   df Sum Sq Mean Sq F-value P-value
Factor 3                           1  0.762  0.762 165.668 0.000050
Error                              5  0.023  0.005

Many thanks for your help,
Thierry C.

-------------------- results from different approaches to the GLM -----------------------------------

#1 estimateGLM

                                   df Sum Sq Mean Sq F-value P-value
Factor 1                           2  11.690  5.845    1.540 0.261163
Error                              10 37.950  3.795

Factor 2                           1   2.557  2.557    2.934 0.147425
Error                              5   4.358  0.872

Factor 3                           1   0.762  0.762  165.668 0.000050
Error                              5   0.023  0.005

Factor 4                           3   2.136  0.712    0.382 0.767274
Error                              15 27.939  1.863

Factor 1_by_Factor 2               2   2.279  1.139    0.751 0.496842
Error                              10 15.178  1.518

Factor 1_by_Factor 3               2   1.429  0.714    2.928 0.099775
Error                              10  2.440  0.244

Factor 2_by_Factor 3               1   1.100  1.100    0.546 0.493155
Error                              5  10.073  2.015

Factor 1_by_Factor 2_by_Factor 3   2   0.369  0.184    0.121 0.887345
Error                              10 15.244  1.524



#2 standard ANOVA

Tests of between-subjects effects
Dependent variable: VAR00001

Source                       Type III SS    df   Mean Sq    F        Sig.
Intercept     Hypothesis       484.489       1   484.489    680.484  .000
              Error              2.136       3      .712a
F1            Hypothesis        11.690       2     5.845      4.257  .015
              Error            374.806     273     1.373b
F2            Hypothesis         2.557       1     2.557      1.862  .173
              Error            374.806     273     1.373b
F3            Hypothesis          .762       1      .762       .555  .457
              Error            374.806     273     1.373b
F4            Hypothesis         2.136       3      .712       .519  .670
              Error            374.806     273     1.373b
F1 * F2       Hypothesis         2.279       2     1.139       .830  .437
              Error            374.806     273     1.373b
F1 * F3       Hypothesis         1.429       2      .714       .520  .595
              Error            374.806     273     1.373b
F2 * F3       Hypothesis         1.100       1     1.100       .801  .372
              Error            374.806     273     1.373b
F1 * F2 * F3  Hypothesis          .369       2      .184       .134  .874
              Error            374.806     273     1.373b

a. MS(F4)
b. MS(Error)


#3 Mixed-effect ANOVA

Type III tests of fixed effectsa

Source         Numerator df   Denominator df   F         Sig.
Intercept      1              179.859          477.158   .000
F1             2              112.144           12.424   .000
F2             1              163.517            1.805   .181
F3             1              167.461             .867   .353
F4             3               98.404            2.147   .099
F1 * F2        2              105.482             .850   .430
F1 * F3        2              106.020            2.082   .130
F2 * F3        1              154.899             .061   .806
F1 * F2 * F3   2              108.667             .290   .749

a. Dependent variable: VAR00001.

#4 anovan(VOI.data,VOI.F.FM,'random',[1 5],'model','full')

NB: X1 is subject, X2 is F1, X3 is F2, etc.


Aaron Schultz

Nov 15, 2013, 7:59:25 AM11/15/13
to fmri_mat...@googlegroups.com
For the standard ANOVA, the error terms should not be the same across all the effects.  The whole point of GLM_Flex is to utilize partitioned error, which results in different, and more appropriate, error terms for each effect.  So the difference you are seeing is due to how the error is modeled.  In your design there are 48 observations per subject spread across 4 factors, which means that the covariance structure for all the within-subjects observations and tests is rather complex.
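To make the partitioned-error idea concrete, here is a small Python sketch (illustrative only, with made-up numbers; this is not GLM_Flex's code) for a one-way within-subjects design with two levels. A single pooled error term absorbs the large, stable between-subject differences and buries the effect, while the partitioned error term removes the subject variance first and tests the effect against only the subject-by-condition variability:

```python
# Hypothetical subject-by-condition means (rows = subjects,
# columns = the two levels of one within-subject factor).
# Subjects differ a lot overall, but each shows a small,
# consistent drop from level 1 to level 2.
data = [
    [10.20, 10.01],
    [12.30, 12.12],
    [ 8.40,  8.19],
    [15.10, 14.93],
]

n = len(data)  # number of subjects
grand = sum(sum(row) for row in data) / (2 * n)
col_means = [sum(row[j] for row in data) / n for j in range(2)]
subj_means = [sum(row) / 2 for row in data]

ss_total = sum((x - grand) ** 2 for row in data for x in row)
ss_cond = n * sum((m - grand) ** 2 for m in col_means)   # effect SS, df = 1
ss_subj = 2 * sum((m - grand) ** 2 for m in subj_means)  # subject SS

# Pooled error (a between-subjects analysis): everything not explained
# by the condition effect, including the big subject differences.
ms_pooled = (ss_total - ss_cond) / (2 * n - 2)
f_pooled = ss_cond / ms_pooled

# Partitioned error (a within-subjects analysis): subject differences
# are removed first, leaving only subject-by-condition variability.
ms_part = (ss_total - ss_cond - ss_subj) / (n - 1)
f_part = ss_cond / ms_part

print(f"pooled error:      F(1,{2 * n - 2}) = {f_pooled:.4f}")
print(f"partitioned error: F(1,{n - 1}) = {f_part:.1f}")
```

With these hypothetical numbers the pooled F is tiny while the partitioned F is in the hundreds, which is exactly the flavor of discrepancy between the SPSS output and GLM_Flex here.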

It should be possible to reproduce the results with an LME, but I'm not as up to speed with LMEs as I am with the GLM, so I'm not quite sure how to properly specify the model.  In particular, the issue with the LME will be properly specifying the random effects, which will not just be subjects, but a set of subject-by-factor interactions.

Ultimately this is what I did for a sanity test.  You are worried that the results for F3 are wrong, so let's collapse the data to look at the effect of F3 (note: this works just fine here since the design is perfectly balanced, i.e., there is the same number of observations in all cells).

So what I did was to compute the mean for each subject for each level of F3.  This returned a nice tidy little matrix with subjects in rows and the two levels of F3 as columns:

data = [    
    1.0980    0.9824
    1.3347    1.2495
    0.4923    0.3926
    1.9883    1.8958
    1.5501    1.4622
    1.6271    1.4911];


Now, let's just do a simple paired-samples t-test on this data:

[h p ci st] = ttest(data(:,1),data(:,2));

Let's compare the results to what we get from GLM_Flex ...

For GLM_Flex the results were F(1,5) = 165.668, p = 0.00005

For the t-test from above we get t(5) = 12.8712, p = 0.00005

If we convert the t-stat into an F-stat by squaring we get 165.668

So huzzah!  We have parity in p, F, and df.  This is essentially what the partitioned approach is for: it gives you a parsimonious answer for each set of effects.  This exact parity only works because the design is balanced, and therefore the effects are independent from one another.  The independence, or lack thereof, of different factors is an important thing to consider, and one that unfortunately GLM_Flex does not always handle correctly (more on this later in another post).  But for this fully within design it works perfectly.  GLM_Flex is giving you the correct answer to the question 'what is the within-subjects effect of factor 3?'.  As to why you are getting a different answer in SPSS I don't know, but the use of the same error term for all effects suggests that either the right model is not being used, or the specification of that model was not done correctly.
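That parity can be checked by hand.  Here is a quick Python sketch (rather than Matlab, purely for illustration) that recomputes the paired t statistic from the six subject means listed above and squares it:

```python
import math

# Subject-by-condition means for the two levels of F3
# (the same six-subject matrix shown above).
f3_1 = [1.0980, 1.3347, 0.4923, 1.9883, 1.5501, 1.6271]
f3_2 = [0.9824, 1.2495, 0.3926, 1.8958, 1.4622, 1.4911]

# Paired t-test computed from the per-subject differences.
diffs = [a - b for a, b in zip(f3_1, f3_2)]
n = len(diffs)
mean_d = sum(diffs) / n
var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
t = mean_d / math.sqrt(var_d / n)

print(f"t({n - 1}) = {t:.4f}")  # ~12.87, matching the t-test above
print(f"t^2    = {t * t:.3f}")  # ~165.7, matching GLM_Flex's F(1,5)
```

The result (t(5) of about 12.87, t squared of about 165.7) matches the quoted values up to the four-decimal rounding of the data.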


For interpreting the effect of F3, it is important to note that the proper interpretation is potentially confounded by interactions with other factors.  In your model you are not looking at any interactions with F4 (whether two-, three-, or four-way).  It is quite possible that there are higher-order interactions that would curtail a simple explanation of the main effect of F3 (even the trending F1 x F3 effect might give you some pause).  The general rule of ANOVA is: DO NOT interpret effects in the presence of higher-order interactions.  Rather, what you should do is look at the simple effects.  For your design there are 24 different simple effects for the F3 factor.  A review of the individual simple effects may provide a much different interpretation of the F3 effect than looking at the overall main effect.

A quick review of the 24 simple effects for F3 shows that not a single one of them is significant; however, the higher-order terms are also not significant (other than the possible trend of F1:F3 at p = 0.0998).  It's possible that the F1:F3 interaction is driving the F3 effect, or the answer is simply measurement accuracy.  That is, the 24 repeated observations per subject of each level of F3 provide a more accurate, stable, and noise-free measurement of the behavior, which then enables the greatly improved significance of the overall F3 effect.  The call on the interpretation is of course up to you!

Finally, as seen from the data above, the difference between F3_1 and F3_2 isn't very big (probably around 10% of the mean); the significance of the statistics is being driven by the very consistent pattern that all subjects have higher values for F3_1 than F3_2.  With only 6 subjects in hand to evaluate this effect, it is difficult to determine whether it would hold up in a larger sample.  After all, making population generalizations from 6 subjects is a little dicey.  If the F3 effect is of particular interest to you, my advice would be to collect data on another 9 subjects and see if the effect still holds up.

-Aaron

P.S.  If I figure out the correct specification for the LME model I will post that here as well.


Thierry Chaminade

Nov 15, 2013, 10:41:53 AM11/15/13
to fmri_mat...@googlegroups.com
Many many thanks!

I will now take some time to fully understand and reproduce what you did. I am still puzzled by two things, but I'll try to understand them before asking you.

Again, thank you

PS: I concur that 6 is too small a sample. This fMRI experiment was done as a side study to a project training a very complex skill, and only 6 subjects were fully trained over the 3-year period...

Aaron Schultz

Nov 16, 2013, 9:20:46 AM11/16/13
to fmri_mat...@googlegroups.com
I'm still not 100% sure about the LME specification, and it also looks like the LME model is only an approximate equivalence and not an exact replication.  Using the lme4 package in R, I believe the specification would be:

anova(lmer(Y~(F1*F2*F3)+F4 + (1|Subj) + (1|F1:Subj) + (1|F2:Subj) + (1|F3:Subj) + (1|F4:Subj),family="gaussian",REML=TRUE,data=dat))

Using this model the F3 effect is not significant.  Still not 100% sure that this is specified correctly.  


An alternative would be to use lme from the nlme package which I think would be specified as follows:

anova(lme(fixed=Y~(F1*F2*F3)+F4 ,random=~1|Subj/F1/F2/F3/F4,method="REML",data=dat))


This does not give the exact same results as the lmer command, but they are very similar, and F3 is not significant in either model.

Again though, for this model I don't think there is a directly corresponding LME model, and the GLM and LME results will diverge.  The degree of divergence likely depends on how complicated the covariance structure is, and with 4 repeated measures the covariance structure gets pretty crazy.


The standard ANOVA model in R is specified as:

summary(aov(Y~(F1*F2*F3)+F4 + Error(Subj/(F1*F2*F3*F4)), data = dat))

The standard ANOVA will return the same result as GLM_Flex


If anyone else is more up to speed on LME models please feel free to weigh in, I'd like to hear your thoughts.


-Aaron





