Running MVPA when univariate differences exist?

Srikanth

Apr 11, 2012, 5:40:22 PM
to Princeton MVPA Toolbox for Matlab
Hi,

I have a small conceptual question. I would like to run an MVPA to
classify fMRI data between two conditions, A and B. But since I was
not able to perfectly balance the effects of A and B, there is a
significant univariate difference (A > B) in the regions I am
interested in. While doing the MVPA, if I remove the mean from each
example (across voxels) in the training and testing data, so that the
average signal in each example is zero, would that take care of the
issue of univariate differences between A and B? (I thought it would,
but wanted to confirm with someone with more expertise.)

Also, is there any reason to believe that MVPA is meaningful ONLY when
there are NO univariate differences?

Thanks,
-Srikanth.

J.A. Etzel

Apr 12, 2012, 2:01:14 PM
to mvpa-t...@googlegroups.com
This is not a trivial question to answer. I've been toying with the
notion of starting an MVPA-related blog, which might be a better place
to get at issues like this in some detail. (We could use more methods
papers, too.)

Anyway, it depends (like most things in fMRI analysis, and statistics
more generally). For instance, are you using a linear classifier (e.g.
a linear SVM)? Are you doing a searchlight analysis? ROI-based? When in
the processing stream would you take out the mean? And would you take
out the mean voxel-wise (each voxel averages zero over the examples) or
volume-wise (each group of voxels in an example averages zero; not a
good idea in searchlight-type analyses)?
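
To make the distinction concrete, here is a small sketch (Python with
numpy, illustrative variable names only, not toolbox code) of the two
kinds of demeaning on an examples-by-voxels matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 50))             # 20 examples x 50 voxels

    # voxel-wise: each voxel (column) averages zero over the examples
    X_voxelwise = X - X.mean(axis=0, keepdims=True)

    # volume-wise (row-wise): each example (row) averages zero over its voxels
    X_volumewise = X - X.mean(axis=1, keepdims=True)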

With a linear SVM, yes, it will use univariate differences. In
fact, it *must* have univariate differences (condition A != condition B
in individual voxels) in order to classify. I gave a talk that includes
a description of this (and some other MVPA issues); see
http://nrg.wustl.edu/events/jo-etzel-ph-d-2011-nil-niac-seminar-series/
or google "Multivariate Pattern Analysis of fMRI data: what it can and
can not tell us".

A linear classifier constructs a linear combination of the voxel values;
it can be pictured as pooling a weak bias across multiple voxels. So a
very weak bias (e.g. A > B) in a group of individual voxels might not be
detected by a standard mass-univariate GLM analysis (because the
individual voxels' bias is too weak and perhaps not distributed in a
tight cluster) but could be detected by an MVPA. Also, a mass-univariate
analysis generally looks for a cluster of voxels with a consistent bias,
whereas a linear classifier can use voxels with opposite biases (e.g.
A > B is as informative as B > A).
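
As a toy illustration of the pooling idea (a Python/scikit-learn sketch
with made-up effect sizes, so the exact numbers will vary): each voxel
carries a bias too weak to survive a per-voxel test, but a linear SVM
pooling all of them classifies well above chance.

    import numpy as np
    from scipy import stats
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n_per_class, n_vox, bias = 40, 30, 0.2        # weak A > B shift in every voxel
    A = rng.normal(bias, 1.0, size=(n_per_class, n_vox))
    B = rng.normal(0.0, 1.0, size=(n_per_class, n_vox))
    X = np.vstack([A, B])
    y = np.r_[np.ones(n_per_class), np.zeros(n_per_class)]

    # mass-univariate view: per-voxel t-tests, few (if any) survive correction
    t, p = stats.ttest_ind(A, B, axis=0)
    print("voxels passing Bonferroni:", np.sum(p < 0.05 / n_vox))

    # multivariate view: a linear SVM pools the weak bias across all 30 voxels
    print("cross-validated accuracy:", cross_val_score(LinearSVC(), X, y, cv=5).mean())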

So I don't think it's quite proper to ask whether MVPA is only meaningful
when there are no univariate differences. But it does vary with the
desired interpretation: for example, if you want to argue that the mean
activation in an anatomical region is not the only source of information,
you could subtract the mean from that group of voxels before the analysis.

Jo Etzel

Rainer Boegle

Apr 13, 2012, 7:38:31 AM
to mvpa-t...@googlegroups.com
Hi,

Thanks to both of you for starting this interesting discussion; here is my take on it.

Maybe I misunderstood what Dr. Etzel meant by volume-wise demeaning
(or did you mean voxel-wise and volume-wise combined?),
but I do not think it is a problem when doing searchlight-type analyses.


To me it seems to be simply a matter of asking different questions.
(From now on I am talking about input brain images in the sense of averages over time or regression parameters,
NOT spatio-temporal observations; that would open another can of worms, which might also contain some diamonds to be discovered, but that is another story.)

1.) "Is there any uni- or multivariate effect in a given sphere?"
If the mean for a given sphere (mean over voxels) is not removed, significant classification might utilize overall in/decreases of the ROI or multivariate patterns in the ROI.

2.) "Is there a multivariate effect in a given sphere aside from mean in/decreases of all voxels in the sphere for each example of both conditions?"
If one removes the mean for a given sphere for each example (this is always the case for "correlation distance to centroid" classification algorithm even if input data has not been demeaned, by virtue of correlation) then one will prevent the classifier from picking up on average univariate effects and thus focus on the multivariate pattern.
(Significant classification might then be interpreted as evidence for population codes.)

as a third option one could do a univariate Analysis to see if univariate effects exist (are significant) additionally to the multivariate effects.
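
To see why the correlation-based classifier behaves this way, here is a
tiny numeric check (Python, toy numbers): the Pearson correlation between
a pattern and a centroid is unchanged by adding a constant to the pattern,
so the per-example mean is effectively removed by the correlation itself.

    import numpy as np

    rng = np.random.default_rng(2)
    pattern = rng.normal(size=27)            # one example from a 27-voxel sphere
    centroid = rng.normal(size=27)           # a class centroid

    r_raw = np.corrcoef(pattern, centroid)[0, 1]
    r_shifted = np.corrcoef(pattern + 5.0, centroid)[0, 1]           # uniform activation shift
    r_demeaned = np.corrcoef(pattern - pattern.mean(), centroid)[0, 1]

    print(np.isclose(r_raw, r_shifted), np.isclose(r_raw, r_demeaned))   # True True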


NB: the choice of classifier (as long as it isn't correlation distance classification, see above) doesn't matter much as far as I can tell, judging (mostly) from reading the literature and from testing on the little bit of data that I have.



Have I neglected some subtle effects? Probably!

Please do continue with the discussion; I find it highly interesting.

Cheers!
Rainer

James

Apr 13, 2012, 12:20:41 PM
to Princeton MVPA Toolbox for Matlab

hi Srikanth, Jo and Rainer

indeed, this is not a "small" question, and I have been wondering
about the same thing. I was specifically interested in how one could
directly compare the GLM and searchlight approaches in their ability to
detect differences between conditions... one example is in the original
Kriegeskorte paper, but I think there are more questions than one short
paper could answer. I posted to the AFNI list about this and we had
some discussion that might be of interest to you:

http://afni.nimh.nih.gov/afni/community/board/read.php?f=1&i=41190&t=41190#reply_41190

Hopefully some of the points that are made there will be useful for
thinking about this.

I very much agree with Jo and Rainer that it would be great to have
more discussion about this.

I don't know if it's helpful, but Rainer's point that:

"If one removes the mean for a given sphere for each example (this is
always
the case for "correlation distance to centroid" classification
algorithm
even if input data has not been demeaned, by virtue of correlation)
then
one will prevent the classifier from picking up on average univariate
effects and thus focus on the multivariate pattern."

kind of jibes with the way I think of these things: imagine that the
activation patterns for conditions 1 and 2 are the buildings in a city
(so each voxel is a building). The GLM asks which city has higher
buildings on average, while MVPA asks whether the shapes of their
skylines are reliably different, and this is why smoothing is advisable
in the former but not the latter. But maybe I am missing something. In
the work we are currently doing in Bangor, we use searchlight analysis
on the correlation between contrast betas from independent sets of runs.
While in theory this is orthogonal to any main effects in the univariate
analysis (because the correlation doesn't change if you add a constant
to a variable), in practice there tends to be overlap, with the
searchlight analysis being much more sensitive.

Hope none of what I said is _too_ incorrect, and that some of it is
even useful.

James


Feng Rong

Apr 13, 2012, 12:09:31 PM
to mvpa-t...@googlegroups.com
Hi Sri, Jo, Rainer,
Thanks a lot for the discussion. I agree with Jo, it is not a trivial issue. Hopefully it can evoke even more thinking.
Here are my 2 cents (Let's keep the discussion within the paired comparison using linear SVM for now):
I guess what Jo meant by 'volume-wise' demeaning was whether we need to remove the mean among the voxels in an ROI or a 'searchlight' volume before sending them to the training & testing functions. First of all, I believe voxel-based demeaning (or, I would rather suggest, detrending) and normal scaling (z-scoring, etc.) of each scanning session is a must; we don't want to include inter-scan variability in our analysis anyway. For the further demeaning among the voxels in each volume, my opinion is that we'd better do both. Let's imagine the possible combinations of results:

(1) No significant classification for either the not-demeaned or the demeaned dataset: we can say this volume cannot classify the conditions, without hesitation.

(2) Significant classification for the not-demeaned dataset but no significance for the demeaned one: I guess that suggests there is a certain group of voxels that activate in both conditions but at different levels (a univariate difference). Imagine a 2-D case in which the values are distributed differently along the y-axis but identically along the x-axis; when you move them down to around the origin, it might be that the two clusters can no longer be distinguished.

(3) Significant classification for both: that means that, whether or not a univariate difference exists, there are two different voxel groups in the volume that are biased toward one condition or the other (a multivariate/pattern difference).

These are the reasons why I believe we'd better do both, to make sure. However, I cannot figure out by imagination what might be the case for the fourth possibility: no significance for the not-demeaned dataset but significance for the demeaned one. Does it exist? How would one interpret it?
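
A tiny two-voxel sketch (Python, toy numbers) of cases (2) and (3): a
pure level difference collapses to nothing after row-wise demeaning,
while a genuine pattern difference survives. (As for case (4), one
hedged guess is that it could arise if large trial-to-trial fluctuations
in the overall level act as noise that masks a weak pattern difference;
removing the per-example mean might then help the classifier.)

    import numpy as np

    def demean_rows(X):
        """Remove each example's mean across voxels (row-wise demeaning)."""
        return X - X.mean(axis=1, keepdims=True)

    # case (2): pure level difference -- separable before, identical after demeaning
    A = np.array([[1.0, 1.0]])                 # condition A, two voxels
    B = np.array([[2.0, 2.0]])                 # condition B, same pattern but higher overall
    print(demean_rows(A), demean_rows(B))      # both become [[0. 0.]]

    # case (3): genuine pattern difference -- still separable after demeaning
    A2 = np.array([[1.0, 0.0]])
    B2 = np.array([[0.0, 1.0]])
    print(demean_rows(A2), demean_rows(B2))    # [[ 0.5 -0.5]] vs [[-0.5  0.5]]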

Best,
Feng

Marc Coutanche

Apr 13, 2012, 3:00:54 PM
to mvpa-t...@googlegroups.com
Hi all,

This is a great discussion. Feng - I think you described the situation nicely.

Coming back to Srikanth's original message:
"But as I was not able to perfectly balance the effects of A and B, there is a significant univariate differences (A > B) in the regions I am interested in"

Srikanth - the answer to your particular question might depend on what you meant here. When you mentioned that you were not able to perfectly balance the univariate effects, why is this a concern for your particular question? For example, if a univariate difference exists, then the conditions are separable in your ROI (as Feng's e-mail described) and you have found an effect that may not need an additional multivariate analysis. On the other hand, if you had specific hypotheses regarding multi-voxel patterns (independent of the mean) then removing the mean from each trials' pattern (across-voxels) could be one way to look at this.

But, if you are concerned that some other (unwanted) factor is causing the univariate differences, such as attention to stimuli, then removing the mean would take out the unwanted univariate changes, but of course would leave in any multi-voxel pattern changes that might be induced by this variable.

Best,
Marc

----
Marc Coutanche
Center for Cognitive Neuroscience
University of Pennsylvania

Jesse Rissman

Apr 13, 2012, 3:41:39 PM
to mvpa-t...@googlegroups.com
I have also found this discussion to be interesting. One thing that I'd note about removing the mean in the context of a searchlight-style MVPA analysis is that you can create a situation where your classifier will capitalize on spheres that lie on the boundary of two opposing task effects.

For instance, let's say you're trying to decode the mental states associated with adding vs. subtracting two numbers. You've decided that you don't want your searchlight MVPA analysis to pick up regions that simply show a mean activity difference between these two conditions, so you go ahead and remove the mean from each sphere. So far, so good. However, your searchlight analysis will likely end up identifying regions that fall along the boundaries of the task-positive lateral frontoparietal network and the task-negative default mode network, by virtue of the fact that spheres placed on these boundaries will retain highly diagnostic information despite the fact that you've subtracted the mean. Sticking with this toy example, if the subtraction task is more cognitively demanding than the addition task, then the subtraction task will likely have more lateral frontoparietal network activity and less default mode activity. Any sphere situated at the boundary of these networks will have an activation pattern that looks kind of like the Pepsi logo, even with the mean voxel activity level artificially set to zero. And unless the spatial distribution of this pattern remained perfectly balanced across conditions, the classifier would conceivably be able to detect condition-specific differences.

One might then be led to erroneously conclude that distinct neural ensembles within such regions are involved in implementing these respective mathematical operations. This same kind of boundary concern applies to any adjacent brain regions that show differential, if not opposing, task effects. I'd be interested to hear whether any of you have thoughts on this seemingly insidious aspect of mean-corrected searchlight MVPA.
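
To make the boundary scenario concrete, here is a small simulation
sketch (Python/scikit-learn, with arbitrary effect sizes) of a single
sphere straddling a task-positive and a task-negative region: the two
conditions share the same spatial layout and differ only in how strongly
they engage it, yet classification of the within-sphere demeaned
patterns still succeeds.

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    n_trials, n_vox = 40, 20
    layout = np.r_[np.ones(10), -np.ones(10)]   # half task-positive, half default-mode voxels

    # subtraction engages both networks more strongly than addition; same layout
    subtraction = 2.0 * layout + rng.normal(0, 1, size=(n_trials, n_vox))
    addition = 1.0 * layout + rng.normal(0, 1, size=(n_trials, n_vox))

    X = np.vstack([subtraction, addition])
    X = X - X.mean(axis=1, keepdims=True)       # demean each example within the sphere
    y = np.r_[np.ones(n_trials), np.zeros(n_trials)]

    # still highly decodable: the demeaned "Pepsi logo" differs in contrast, not in layout
    print(cross_val_score(LinearSVC(), X, y, cv=5).mean())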

-- Jesse

--------------------------------------------------
Jesse Rissman, Ph.D.
Assistant Professor
Dept. of Psychology
University of California, Los Angeles
6639 Franz Hall, Box 951563
Los Angeles, CA 90095-1563

J.A. Etzel

Apr 13, 2012, 6:47:11 PM
to mvpa-t...@googlegroups.com, Yaroslav Halchenko
To clarify just this point: Yes, I did intend to say that across-voxels
de-meaning is potentially problematic if done before a searchlight
analysis. (I'm talking about "row-wise mean-subtraction" or "volume-wise
mean-subtraction": subtracting the mean from every voxel in an example,
not subtracting the mean from a single voxel (voxel-wise) or single
searchlight).

The potential problem is that volume-wise mean-subtraction could
introduce information to truly uninformative voxels.

This is easiest to picture in small numbers of voxels. I have (R) code
and pictures showing the example I describe here; email me directly if
you'd like a copy. (I need to get an MVPA blog started!)

In brief, imagine that I have 25 voxels and two classes. The activation
patterns for the two classes are identical in 15 voxels (no information
about the two classes) but different in the other 10: I added 1 to the
class 'a' activations to get the class 'b' activations for those 10
voxels. We now have 10 informative voxels and 15 uninformative ones. But
if I subtract the mean activation from each example, I'm not just
subtracting from my 10 informative voxels, I'm subtracting from the 15
uninformative ones as well, and the number I subtract will be different
in the two classes (I'll subtract a larger number from the class 'b'
examples, since I added 1 to some of the voxels). After volume-wise
mean-subtraction I therefore now have 25 informative voxels: 10 with
real signal and 15 with a small bias introduced by the mean-subtraction.
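
A minimal sketch of that example (in Python rather than R, with
arbitrary random values) showing the bias introduced into the
originally uninformative voxels:

    import numpy as np

    rng = np.random.default_rng(4)
    n_trials, n_vox = 50, 25
    a = rng.normal(size=(n_trials, n_vox))       # class 'a' examples
    b = a.copy()
    b[:, :10] += 1.0                             # class 'b': +1 in the 10 informative voxels

    # before demeaning, the other 15 voxels are identical across the classes
    print(np.allclose(a[:, 10:], b[:, 10:]))     # True

    # volume-wise (row-wise) mean subtraction of each example
    a_d = a - a.mean(axis=1, keepdims=True)
    b_d = b - b.mean(axis=1, keepdims=True)

    # afterwards each of those 15 voxels differs between classes by 10/25 = 0.4
    print(np.allclose(a_d[:, 10:] - b_d[:, 10:], 0.4))   # True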

Realistically, this will *probably* not be a big problem if you run a
small-diameter searchlight after whole-volume mean-subtraction. But it
could be a problem, and I suspect the distortion will be worse if there
are big differences in activation across conditions, or if the
searchlight size is close to the volume size (i.e. a searchlight
analysis within an ROI, or large-diameter searchlights).

Jo

PS: A short discussion of this issue was on the pymvpa message list last
year:
http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/2011q4/001929.html

PPS: I fully agree with Jesse's description of the weird boundary
effects that can happen with searchlight analyses, even if
mean-subtraction is done within each searchlight. And I don't have a
good fix; there are a very, very large number of unpleasant surprises
that can happen with searchlight analyses.

J.A. Etzel

May 7, 2012, 10:20:51 AM
to mvpa-t...@googlegroups.com
I decided to go ahead and start an MVPA methodology blog (at
http://mvpa.blogspot.com/), with the first posts illustrating the
effects of different types of scaling - some of what I was trying to
describe in this thread.

I hope you find it useful, and would appreciate your feedback and/or
contributions.

Jo
