Jesse and Thomas,

The different strategies that you proposed correspond to different
statistical models of the data. I can make this explicit by pointing
out a connection between spherical searchlights and Gaussian
smoothing.

I'm going to base my argument on the original implementation of the
spherical searchlight in Kriegeskorte, Goebel, and Bandettini (2006),
but I realize that there are other ways to do it. In that implementation
they used the Mahalanobis distance between activity on 2 conditions in
each spotlight as their information measure. To better understand this
measure, consider P voxels in a spherical spotlight. We partition the
timepoints into 2 conditions: A and B. Now imagine the P-dimensional
space spanned by the voxel activity, and that the activity on each
condition is distributed according to a multivariate Gaussian in this
space. According to the maximum likelihood estimate, we put the mean
of each Gaussian at the sample mean. Now what about the covariance?
Kriegeskorte et al. use a 'shrinkage' estimate that biases the
covariance matrix to be as diagonal as possible. This actually
corresponds to an "empirical Bayes" estimate with a particular kind of
prior. Specifically, it is a prior that embodies the belief that the
multivariate Gaussian can be decomposed into the product of
(independent) univariate Gaussians, one for each voxel. This estimator
'shrinks' the sample covariance towards the prior covariance. One
implication of this prior is that voxels within the searchlight share
very little information, according to the Mahalanobis distance. When
the covariance matrix is diagonal, the Mahalanobis distance reduces to
the Euclidean distance between the 2 means, normalized by the
per-voxel variances; if the response to each condition is correlated on average
across voxels, this distance will be small (because the normalization
will push it towards zero).
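
To make the computation concrete, here is a rough numpy sketch of the
kind of estimator I'm describing. This is not their actual code, and
the fixed shrinkage weight lam is my own simplification (as I
understand it, the weight would be estimated from the data):

import numpy as np

def shrinkage_cov(X, lam=0.5):
    # Sample covariance of X (timepoints x voxels), shrunk toward its
    # own diagonal, i.e. toward a product of independent univariate
    # Gaussians, one per voxel.
    S = np.cov(X, rowvar=False)
    return (1.0 - lam) * S + lam * np.diag(np.diag(S))

def mahalanobis_info(X_A, X_B, lam=0.5):
    # Mahalanobis distance between the two condition means, under the
    # shrunk covariance of the mean-removed activity.
    mu_A, mu_B = X_A.mean(axis=0), X_B.mean(axis=0)
    resid = np.vstack([X_A - mu_A, X_B - mu_B])
    Sigma = shrinkage_cov(resid, lam)
    diff = mu_A - mu_B
    return np.sqrt(diff @ np.linalg.solve(Sigma, diff))

With lam = 1 this reduces to the fully diagonal (independent-voxel)
model; with lam = 0 you keep the raw sample covariance.
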
Let's consider a different prior that embodies different assumptions
about the data. Suppose that, contrary to the diagonal prior, I
believe that voxels within the searchlight are fairly homogeneous in
their response to the different conditions. More realistically, I
might suppose that the degree of similarity between the responses of
two voxels decreases as the physical distance between them increases.
I could encode these beliefs in a non-diagonal prior with covariance
as a function of physical distance (note that in this case the
diagonal entries would still be the largest). In terms of
the Mahalanobis distance, in addition to penalizing the Euclidean
distance according to how correlated a voxel's response on condition A
is with its own response on condition B, we are now also penalizing
according to how correlated that voxel is with other voxels on
condition B. Now here is
the thrust of my argument: using this prior has the same effect on the
ensuing results as smoothing the data with a Gaussian kernel and then
using the shrinkage estimate. The Gaussian kernel is enforcing prior
beliefs about spatial covariance on your estimates. I would need to
work through the math to ascertain the precise quantitative
relationship, but I'm pretty sure that this holds qualitatively.
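
To illustrate, here is one hypothetical way to build such a prior as a
shrinkage target, continuing the numpy sketch above. The
squared-exponential fall-off and the length scale are assumptions of
mine, just one convenient way to encode "covariance decreases with
distance":

def distance_prior_cov(coords, variances, length_scale=2.0):
    # Hypothetical non-diagonal shrinkage target: covariance between
    # voxels falls off with the squared physical distance between them
    # (coords is voxels x 3, in the same units as length_scale). The
    # diagonal keeps each voxel's own variance, so the diagonal
    # entries remain the largest.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-0.5 * d2 / length_scale ** 2)
    sd = np.sqrt(variances)
    return K * np.outer(sd, sd)

def shrink_toward(S, target, lam=0.5):
    # Same shrinkage as before, but toward the distance-based prior
    # rather than toward the diagonal.
    return (1.0 - lam) * S + lam * target

Plugging the result into the same Mahalanobis computation as above is
the alternative I have in mind; my conjecture is that it behaves
qualitatively like smoothing the data first and then shrinking toward
the diagonal.
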
So my point is this: both strategies are reasonable under different
assumptions about the data. What are your assumptions?
Sam
On Aug 13, 3:01 pm, "Jesse Rissman" <riss...@gmail.com> wrote:
> Hi Thomas,
> For a spherical searchlight analysis, if you smooth your data with a kernel
> whose FWHM is approx. the same size as your spheres, you run the risk of
> reducing the total amount of information contained within the sphere, since
> all of its voxels will have highly correlated values after smoothing. Thus,
> it seems the classifier will largely base its classifications on the mean
> activity of the sphere, rather than on the distributed activation pattern within
> the sphere. As a result, the spheres that are best able to discriminate
> your two conditions will be those whose mean is consistently higher in one
> condition than the other. When I generate spherical searchlight maps, I run
> the analysis on unsmoothed data, but then smooth the resulting searchlight
> maps by 4mm or 8mm before averaging them across subjects, which serves to
> make the group maps more robust to slight anatomical/functional differences
> across subjects.
>
> -- Jesse
>
> On Wed, Aug 13, 2008 at 11:28 AM, Thomas Wolbers
> <thomas.wolb...@gmail.com> wrote: