ANOVA completely different results w pegas OR ade4 implementation...??

Gabriella Scatà

unread,

Jul 26, 2024, 10:14:27 AM7/26/24

to poppr

Hi,

I tried running poppr:amova with dist= NULL (Euclidean distance calculated by the function itself) and either ade4 or pegas implementation, and I get completely different results.

pegas:

Eucl_popprAMOVAres = poppr::poppr.amova(
myGenlight,
hier = ~clusters/pop,
clonecorrect = FALSE,
within = FALSE,
dist = NULL,
squared = FALSE, # false, otherwise it gets sqrooted!
freq = TRUE,
correction = NULL,
sep = "_",
filter = FALSE,
threshold = 0,
algorithm = "farthest_neighbor",
threads = 1L,
missing = "loci",
cutoff = 0.05,
quiet = FALSE,
method = c("pegas"),
nperm = 100000)

Results:

Warning message:
In poppr::poppr.amova(myGenlgiht, : Missing data are not filtered from genlight data

Analysis of Molecular Variance

Call: pegas::amova(formula = hier, data = hierdf, nperm = nperm, is.squared = FALSE)

SSD MSD df

clusters 1500.402 1500.4016 1

pop 3329.225 832.3063 4

Error 142431.371 847.8058 168

Total 147260.997 851.2196 173

Variance components:

sigma2 P.value

clusters 8.63952 0.0000

pop -0.53478 0.8762

Error 847.80578

Phi-statistics:

clusters.in.GLOBAL (Phi_CT) pop.in.GLOBAL (Phi_ST) pop.in.clusters (Phi_SC)

0.0100939536 0.0094691422 -0.0006311825

Variance coefficients:

a b c

28.98276 29.03448 77.33333

ade4:

Eucl_popprAMOVAres_ade4 = poppr::poppr.amova(
myGenlight,
hier = ~clusters/pop,
clonecorrect = FALSE,
within = FALSE,
dist = NULL,
squared = FALSE, # false, otherwise it gets sqrooted!
freq = TRUE,
correction = NULL,
sep = "_",
filter = FALSE,
threshold = 0,
algorithm = "farthest_neighbor",
threads = 1L,
missing = "loci",
cutoff = 0.05,
quiet = FALSE,
method = c("ade4"),
nperm = 100000)

Warning message:

In poppr::poppr.amova(myGenlight, :

Missing data are not filtered from genlight data

$call

ade4::amova(samples = xtab, distances = xdist, structures = xstruct)

$results

Df Sum Sq Mean Sq

Between clusters 1 984.7106 984.7106

Between samples Within clusters 4 3844.9162 961.2291

Within samples 168 142431.3705 847.8058

Total 173 147260.9974 851.2196

$componentsofcovariance

Sigma %

Variations Between clusters 0.2675142 0.03139829

Variations Between samples(*pops) Within clusters 3.9289742 0.46114598

Variations Within samples (*pops) 847.8057771 99.50745573

Total variations 852.0022654 100.00000000

$statphi

Phi

Phi-samples-total 0.0049254427

Phi-samples-clusters 0.0046129082

Phi-clusters-total 0.0003139829

Eucl_popprAMOVAres_ade4_amova.test = randtest( Eucl_popprAMOVAres_ade4 )

Eucl_popprAMOVAres_ade4_amova.test

class: krandtest lightkrandtest

Monte-Carlo tests

Call: randtest.amova(xtest = Scallop_Rep98_Loc95_Eucl_popprAMOVAres_ade4)

Number of tests: 3

Adjustment method for multiple comparisons: none

Permutation number: 99

Test Obs Std.Obs Alter Pvalue

1 Variations within samples/pops 847.8057771 -10.34458625 less 0.01

2 Variations between samples/pops 3.9289742 8.47441524 greater 0.01

3 Variations between clusters 0.2675142 0.00439018 greater 0.46

How is it possible that the variation between clusters is highly significant when using pegas but highly non significant when using ade4?

I would really appreciate somefeedback...

Thanks

Gabriella

Zhian Kamvar

unread,

Jul 26, 2024, 2:46:03 PM7/26/24

to Gabriella Scatà, poppr

This is because `nperm` only applies to the pegas implementation. When you used `randtest()`, it's default is 99 permutations.

To fix this, use

Eucl_popprAMOVAres_ade4_amova.test = randtest( Eucl_popprAMOVAres_ade4, nrepet = 99999)

--
You received this message because you are subscribed to the Google Groups "poppr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to poppr+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/poppr/7c4ddcc2-af81-4886-87f9-f2c6cb2c907bn%40googlegroups.com.

Gabriella Scatà

unread,

Jul 26, 2024, 7:10:34 PM7/26/24

to Zhian Kamvar, poppr

Hi Zhian,

thank you so much for your reply and explanation.

I ran the randtest with the same n repetitions as I used for the pegas amova (100000), but even if now I do get a significance for clusters, it's still much lower than the significance for variations between samples (which would be my populations).

While this is the opposite with pegas...cluster are highly significant, populations are not...

see below the result of the test:

class: krandtest lightkrandtest
Monte-Carlo tests

Call: randtest.amova(xtest = Scallop_Rep98_Loc95_Eucl_popprAMOVAres_ade4,
nrepet = 1e+05)

Number of tests: 3

Adjustment method for multiple comparisons: none

Permutation number: 100000

Test Obs Std.Obs Alter Pvalue

1 Variations within samples 847.8057771 -9.8469609 less 9.99990e-06
2 Variations between samples 3.9289742 8.3893135 greater 9.99990e-06
3 Variations between clusters 0.2675142 0.1030489 greater 4.00706e-01

Honestly this makes me feel really uncomfortable about both implementations as they give such different results, and I don't know which I can trust.

Any ideas or suggestions?

Thanks again!

Best

Gabriella

Gabriella Scatà

unread,

Jul 29, 2024, 9:14:45 PM7/29/24

to poppr

Also, can I ask:

in my pegas amova results ( see below again):

- is it correct that the sigma^2 for clusters = variation among clusters & sigma^2 for pop = variation among populations BUT within clusters?

- what does the Error correspond to in my case? would that be the remaining variation - so variation among individuals BUT within populations or within the all sample?

- so to obtain the % variance for each hierarchical level --> should i sum all sigma^2 values (cluster+pop+error) and then divide each by the sum?
i.e. sigma^2 for clusters/sum sigmas*100 = % variance explained by clusters, and
sigma^2 for error/sum sigmas*100 = % variance due to variation among individuals?

I would really appreciate some feedback and help interpreting these results!

Thanks!

Gabriella

Analysis of Molecular Variance

Call: pegas::amova(formula = hier, data = hierdf, nperm = nperm, is.squared = FALSE)

SSD MSD df

clusters 1500.402 1500.4016 1

pop 3329.225 832.3063 4

Error 142431.371 847.8058 168

Total 147260.997 851.2196 173

Variance components:

sigma2 P.value

clusters 8.63952 0.0000

pop -0.53478 0.8762

Error 847.80578

Phi-statistics:

clusters.in.GLOBAL (Phi_CT) pop.in.GLOBAL (Phi_ST) pop.in.clusters (Phi_SC)

0.0100939536 0.0094691422 -0.0006311825

Variance coefficients:

a b c

28.98276 29.03448 77.33333

Reply all

Reply to author

Forward