More questions about permutation testing


Jonatan Ottino

Jan 30, 2020, 4:36:36 PM
to brainGraph-help
Hi Chris,

I have some new questions regarding permutation testing on structural covariance data. The first (1) is how to specify which contrast to run when there are more than 2 groups. Let's say I have 5 groups and would like to test the difference between groups A and D; by default, however, the brainGraph_permute function will return the contrast of A and B when summary() is called.

######################################################

kNumPerms <- 5e3

myPerms <- permute::shuffleSet(n = nrow(all.dat.resids$resids.all), 
                               nset = kNumPerms)

densities <- seq(0.1, 0.4, 0.01)

atlas <- 'dk'

level <- 'graph'

perms.grp <- brainGraph_permute(densities, all.dat.resids, 
                                perms =  myPerms,
                                level = level,      
                                atlas = atlas)

measure <- 'mod'

summary(perms.grp, measure = measure, alt = 'two.sided')    

permPlot <- plot(perms.grp, measure = measure, alt = 'two.sided')

print(permPlot[[2]])

########################################################

The other two questions I have are perhaps trivial, but: (2) is it possible to bootstrap and do permutation testing on normalized coefficients such as clustering and path length? And (3) has anyone performed a permutation analysis on robustness (random failure and targeted attacks)?

All the best,
J

Chris Watson

Feb 1, 2020, 9:24:21 PM
to brainGr...@googlegroups.com
Hi Jonatan,
For these permutation analyses, there are no "contrasts" as commonly used in GLM
analyses. As far as I am aware, it is only possible to calculate the difference
between 2 groups (typically a difference in means). So if you are interested in
the difference between groups A and D, you could do:

resids.AD <- all.dat.resids[, g=c('A', 'D')]
myPerms <- shuffleSet(n=nrow(resids.AD$resids.all), nset=5e3)
perms.AD <- brainGraph_permute(densities, resids.AD, perms=myPerms, level='graph', atlas='dk')

I believe that should work (assuming you are using the latest version, in which
the "[" method for the residuals object exists). You may have to change the
first line to "resids.AD <- all.dat.resids[, g=c(1, 4)]" if your groups are
coded numerically, but either one *is supposed to* work.

2) Not "out-of-the-box", and it actually might be too involved to be worthwhile.
It depends on what you are interested in showing, ultimately. The reason it
would be quite difficult is that, in both bootstrapping and permutation
analysis, you have to build new graphs based on the permuted groups. And then
you would have to generate N random graphs to get the normalized values. This
would have to be done *for each permutation/bootstrap replicate*. If you wanted
to generate 100 random graphs (which is generally recommended), and with 5,000
permutations (and potentially multiple densities), then this will increase
computation time and memory by a huge amount. If you would like to follow this
route, though, I could give you some pointers.
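As one pointer, the per-replicate normalization step alone would look something like the untested sketch below. It uses plain igraph with degree-preserving rewiring for the random reference graphs; the function name and that choice of null model are mine, not part of brainGraph, and 'g.perm' is assumed to be a graph already built from one permuted sample.

```r
# Sketch: normalized clustering for a single permuted replicate, using igraph.
# Degree-preserving rewiring generates the N random reference graphs.
library(igraph)
normalized_clustering <- function(g.perm, N = 100) {
  C.obs <- transitivity(g.perm, type = 'global')
  C.rand <- vapply(seq_len(N), function(i) {
    g.r <- rewire(g.perm, keeping_degseq(niter = 10 * ecount(g.perm)))
    transitivity(g.r, type = 'global')
  }, numeric(1))
  C.obs / mean(C.rand)  # normalized clustering coefficient
}
```

Running this inside every permutation (and at every density) is exactly what drives the computation time up.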

3) I am not aware of anyone doing this off the top of my head, but I haven't
gone looking for it. It is possible, but again it depends on what exactly you
want to show with such analyses. I imagine this would also take a pretty long
time (for random failure, at least) for a similar reason as outlined in 2). But
perhaps a bigger issue is that robustness/random failure analysis does not
return a single number, so I am not sure how you would compare these results.
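One hedged way around the "single number" problem would be to collapse each attack curve to its area under the curve (AUC) and permute that instead. The sketch below is untested and uses plain igraph; the function name is mine.

```r
# Sketch: targeted attack on graph 'g', summarized as the AUC of the curve of
# relative largest-component size vs. fraction of vertices removed.
library(igraph)
attack_auc <- function(g) {
  Nv <- vcount(g)
  frac <- numeric(Nv)
  for (i in seq_len(Nv)) {
    g <- delete_vertices(g, which.max(degree(g)))  # remove highest-degree vertex
    frac[i] <- if (vcount(g) > 0) max(components(g)$csize) / Nv else 0
  }
  x <- seq_len(Nv) / Nv
  sum(diff(x) * (head(frac, -1) + tail(frac, -1)) / 2)  # trapezoidal AUC
}
```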

Let me know if you have other questions.
Chris


Jonatan Ottino

Feb 3, 2020, 12:58:27 AM
to brainGraph-help
Thanks Chris,

I was asking because of these studies, which apparently performed a set of analyses on normalized measures as well as on random failure/targeted attacks.

https://doi.org/10.1016/j.nicl.2018.101619

http://dx.doi.org/10.1016/j.neurobiolaging.2016.08.013

https://doi.org/10.1093/cercor/bhq291


This next one is somewhat in line with the robustness analysis question. I've seen that, for this type of analysis, studies choose as the threshold the minimum density at which all the nodes are connected for each group/graph (i.e., the network is not fragmented). However, even though authors report the cut-off, they are normally not very clear on how they calculated it. Such values tend to range from 0.08 to 0.22.


I understand that a fully connected graph based on the dk atlas would have 2,278 (68 * 67 parcels / 2) edges. I believe the r.thresh object from corrs (corrs <- corr.matrix(all.dat.resids, densities = densities)) has each group's adjacency matrices at every density, so the number of existing edges could be easily calculated. Still, I'm not very sure how to follow up on this. I'll appreciate any kind of insight.
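For what it's worth, my untested attempt at pulling out those edge counts looks like this (assuming r.thresh is a vertices x vertices x densities binary array, as described above; the function name is my own):

```r
# Count existing edges at each density from one group's thresholded matrices
edge_counts <- function(r.thresh) {
  apply(r.thresh, 3, function(A) sum(A[upper.tri(A)]))
}
# e.g., edge_counts(corrs$Cases$r.thresh) / 2278  # fraction of possible 'dk' edges
```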


Best,

J

Chris Watson

Feb 3, 2020, 10:52:55 AM
to brainGr...@googlegroups.com
Hi Jonatan,

Thanks for the references. I will refer to them as Refs. 1-4.

In Ref. 1, they report the "mean relative size of remaining largest component"
and "mean relative global efficiency". It looks like they take the difference in
the AUC, for each permutation, of the targeted attack curves. However, they both
show asterisks at specific values of vertex removal (Figure 2), and in their
table they report a % for the relative component size which should be 0 after
removing all vertices. So their methods are incomplete and I am not sure how
they calculated this.

In Ref. 2, they don't mention how many random graphs they generate for each
group and density. They also don't mention it for the permutation analyses. So I
cannot comment more; they would have had to create N random graphs for each of
the 1,000 permutations (each group and density/threshold, as well) and then
calculate normalized coefficients and group differences. As I mentioned before,
this would be extremely computationally intensive.

In Ref. 3, it looks like they do permutation analysis at a single density. This
makes the method a little more understandable, but the actual result doesn't
make much sense to me. For example, if you found a significant between-group
difference after removing ~25% of vertices, then what does that mean, really? It
is hard to interpret, and it is limited to a single density, which in itself is
not very informative, in my opinion. Either way, doing this type of analysis
would require writing your own function.

The methods in Ref. 4 look to be about the same as those of Ref. 3, and the same
comments/caveats apply. However, they are again not entirely clear in the
methods.

To comment on the use of "fully connected graphs" (at least in Ref. 1), they are
actually using it incorrectly. A fully connected graph is one in which *all*
vertices are connected to *all* others, so the definition you mention is correct
and a fully connected graph with 68 vertices would indeed have 2,278 edges. What
they really mean is that there are no isolated vertices; i.e., all vertices have
at least 1 connection. I don't have a function to search for this, although I
have really thought about adding one since it shouldn't be too hard. This is
sometimes called a "percolation analysis", in which you search for the lowest
density/threshold that results in a graph with 1 large, connected component. If
you are unable to write the code for it, I can help you with that but you would
essentially try a large range of thresholds, calculate the largest connected
component, and return the value that results in e.g. 68. So for Ref. 1, that
value must have been 0.3. Here's a function that is untested but might work:

# "R" is the raw correlation matrix for 1 group
# (igraph and data.table are loaded with brainGraph)
max_comp_size <- function(R, threshes) {
  max.comp <- rep(0, length(threshes))
  for (i in seq_along(threshes)) {
    r <- ifelse(R > threshes[i], 1, 0)
    g <- graph_from_adjacency_matrix(r, mode='undirected', diag=FALSE, weighted=NULL)
    max.comp[i] <- max(components(g)$csize)
  }
  return(data.table(thresholds=threshes, csize=max.comp))
}
Then in your example, you would search for where "csize" changes from 68 to 67;
then you can re-run the function with more thresholds closer to that value.
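An untested usage sketch of that two-pass search (the threshold values are only placeholders, and rbindlist() from data.table is used to stack the one-row results):

```r
# Coarse scan over correlation thresholds, then a finer scan near the drop-off
library(data.table)
threshes <- seq(0.1, 0.6, by = 0.05)
coarse <- rbindlist(lapply(threshes, function(th) max_comp_size(R1, th)))
print(coarse)
# If csize drops below 68 somewhere between, say, 0.30 and 0.35, zoom in:
fine <- rbindlist(lapply(seq(0.30, 0.35, by = 0.005),
                         function(th) max_comp_size(R1, th)))
cutoff <- max(fine$thresholds[fine$csize == 68])
```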

Chris


Jonatan Ottino

Feb 3, 2020, 12:14:39 PM
to brainGraph-help
Chris, thank you again for your thoughtful explanation.

I wrote a very straightforward loop to calculate the size of the largest component at each density/threshold.

R1 <- corrs$Cases$R

R2 <- corrs$Controls$R

densities <- seq(0.1, 0.4 , 0.01)

for (d in densities) {
  print(max_comp_size(R1, threshes = d))
}

for (d in densities) {
  print(max_comp_size(R2, threshes = d))
}

I assume I should pick the minimum density at which both groups reach a csize of 68 (i.e., every vertex is connected to at least one other vertex), while keeping the threshold as high as possible so the graph stays biologically plausible (i.e., not simply picking 10%, the starting value of densities) – is this correct?

All the best,
Jonatan

Chris Watson

Feb 4, 2020, 7:27:20 PM
to brainGr...@googlegroups.com
Just remember that the "threshes" function argument should *not* be densities;
it should be *correlation coefficients*. The "cutoff" would be the highest
correlation coefficient which still results in a connected graph. Then you would
go down in threshold value (equivalently, up in density) as far as you would
like to.
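As an untested sketch of that search (assuming "R1" is one group's raw correlation matrix with 68 regions, and using rbindlist() from data.table to stack the one-row results):

```r
# Highest correlation threshold that still yields a connected graph,
# and the density that cutoff corresponds to
library(data.table)
threshes <- seq(0.05, 0.95, by = 0.01)
res <- rbindlist(lapply(threshes, function(th) max_comp_size(R1, th)))
cutoff <- max(res$thresholds[res$csize == 68])
dens <- sum(R1[upper.tri(R1)] > cutoff) / (68 * 67 / 2)
```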

Chris


Jonatan Ottino

Feb 4, 2020, 8:04:30 PM
to brainGraph-help
Thank you Chris, that makes more sense now.

I was getting weird results where csize was 68 at every density, and picking the 'lowest' threshold was not returning a connected graph.

Now, in the group of patients, the lowest correlation threshold was 0.16, while in the controls it was 0.21. Both thresholds corresponded to a density of 12%. Following the previous approach, I would have erroneously picked a density of 17%.

Best,
J

Jonatan Ottino

Feb 12, 2020, 1:37:58 PM
to brainGraph-help
Hi Chris,

The modified version of the loop, including your code to get the minimum Pearson-based threshold that corresponds to each density, is below:

```
R1 <- corrs$patients$R

R1.thresh <- t(sapply(corrs, with, thresholds))[1, ]

# One call of max_comp_size() per threshold; rbindlist() stacks the results
PAT <- rbindlist(lapply(R1.thresh, function(th) max_comp_size(R1, th)))
setnames(PAT, c('threshold', 'k_nodes'))

View(PAT)
```
It seems to work fine, but another question popped up along with this: whether or not to adjust for mean cortical thickness (cth).

When I'm not controlling for mean cth, correlations are very high, and to get the minimum threshold for a connected graph I have to push the densities' range up to 60%. This results in a very strange network at this density (i.e., the one at which the two groups converge on having a connected graph):

[image: not_controlling.png]


On the other hand, when controlling for mean cth, the picture looks more plausible (the minimum density drops to 0.125):

[image: blabla.png]



I know this might be outside the scope of this group, but is it really necessary to control for mean cth? The literature seems inconsistent when performing structural covariance with thickness, but not with volume. In the latter, it is very common to adjust for total intracranial volume. This seems adequate because you're taking 'head size' (grey matter, white matter, and CSF) into account; but looking at thickness while adjusting for thickness feels... 'unnatural'. On the other hand, it is also noticeable that correlations are inflated if this correction is not included...

By the way, the patient group exhibits a widespread pattern of cortical thinning, and mean cth (bilateral and unilateral) is also lower compared to controls – so I'm wondering about the viability of controlling for mean cth, as by doing so patients and controls 'lose' what was making them different from each other.

Is there any other way to take care of this? If I don't control for mean cth, spurious correlations are pooled into the network, while if I do adjust for it, differences between groups simply vanish.

Thank you in advance.

Best, 
J

Chris Watson

Feb 12, 2020, 5:50:29 PM
to brainGr...@googlegroups.com
This is indeed outside the scope of this mailing list as it is up to the user to decide based on the literature, their data, and what they ultimately would like to show from their analyses.

Either way, it would appear that, at least for your data, controlling for mean thickness is the sensible choice. To better understand why, you will probably have to inspect your data more closely. That is, you can try to check how the correlations change for certain "suspect" connections (e.g., connections that appear at a low threshold when controlling for mean thickness, but require a very high threshold without correction). You could also try to look at the residuals themselves, to see if they are appropriate. You can use the "plot" method for this. You can also compare the results when using a single model for all subjects (method='comb.groups', the default) or separate models for separate groups. Finally, you can try the "what='raw'" option to "corr.matrix" and see what that looks like, or you can calculate Spearman rather than Pearson correlations.
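For example (untested; what='raw' and the plot method are mentioned above, but the type='spearman' argument is an assumption about corr.matrix, so check its help page):

```r
plot(all.dat.resids)  # inspect the residuals themselves
corrs.raw <- corr.matrix(all.dat.resids, densities = densities, what = 'raw')
corrs.sp  <- corr.matrix(all.dat.resids, densities = densities, type = 'spearman')
```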

Chris
