Getting silly, cryptic errors, trying to start my amova (FUN.VALUE = integer(nInd(poly)), MARGIN = 1, : values must be type 'integer', but FUN(X[[1]]) result is type 'double')

Niklas

Jan 9, 2020, 12:34:32 PM

to poppr

Hi there !

I hope somebody can help me with a problem getting an AMOVA to run on my data.

I start with a VCF file that I turned into a genind object, after creating a subset of my samples.

I then turned it into a genclone object to write the population information into the strata.

########

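library(vcfR)   # read.vcfR(), vcfR2genind()

library(poppr)  # as.genclone(), poppr.amova(); also attaches adegenet

svcf<-read.vcfR("sali1.vcf")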

s1<-vcfR2genind(svcf)

m<-as.matrix(s1)

as.matrix(s1)[1:5,1:3]

#Subsets

di.names<-read.csv("Namen diploid.txt", header=F); di.names<-as.matrix(di.names)

indNames(s1)

s1dsub<-s1[c(di.names),]

# convert the subset to a genclone object, as described above
dclone<-as.genclone(s1dsub)

pop.datad<-read.csv("pop.datad.csv", sep=";", header=T)

strata(dclone)<-data.frame(pop.datad$Location)

dclone@pop<-(pop.datad$Location)

damova <- poppr.amova(dclone, ~pop.datad.Location, missing = "mean")

########

After running the command I get following error message:

Replaced 1190162 missing values.

Error in vapply(ploc, FUN = apply, FUN.VALUE = integer(nInd(poly)), MARGIN = 1, :

values must be type 'integer',

but FUN(X[[1]]) result is type 'double'

In addition: Warning messages:

1: In validityMethod(as(object, superClass)) :

@tab does not contain integers; as of adegenet_2.0-0, numeric values are no longer used

2: In validityMethod(as(object, superClass)) :

@tab does not contain integers; as of adegenet_2.0-0, numeric values are no longer used

Sadly, I am not much of a pro in R or at understanding these programs, and I just wish

to get some information out of this population comparison that I worked my butt off to get.

Maybe somebody has an idea of what the problem might be.

As I am not too familiar with this kind of troubleshooting, please tell me which further information

is required to understand the issue and I'll provide it ASAP!

Cheers !

Niklas

Zhian Kamvar

Jan 10, 2020, 4:45:26 AM

to Niklas, poppr

Forgot to send this to the group:

Hello,

AMOVA is failing because of a ~slight~ bug that is triggered when missing data are replaced with mean allele frequencies (missing = "mean").

AMOVA can handle missing data if you set missing = "asis", so use that instead of "mean".

A couple of side notes:

1. There is a known issue with handling genind data from vcfR2genind() from the current CRAN version of vcfR. When this happens, within-individual variance cannot be calculated.

2. When you set the strata, you need only use the data frame: strata(dclone) <- pop.datad

from there, you can use poppr.amova(dclone, ~Location, missing = "asis"), which tells poppr.amova to take the Location column from the strata.
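As a rough illustration of the strata advice above, the sketch below uses the `nancycats` data set that ships with adegenet in place of the original data. `pop_info` is a hypothetical stand-in for `pop.datad`, assumed to have one row per sample in the same order as the samples in the genind object.

```r
library(poppr)  # also attaches adegenet

data(nancycats)

# Hypothetical stand-in for pop.datad: one row per sample, in sample order
pop_info <- data.frame(Location = pop(nancycats))
stopifnot(nrow(pop_info) == nInd(nancycats))

# Assign the whole data frame; its column names become the strata names
strata(nancycats) <- pop_info
names(strata(nancycats))

# A formula is resolved against those column names
head(strata(nancycats, ~Location))
```

With the strata set this way, the formula in `poppr.amova(dclone, ~Location, missing = "asis")` refers to the column by its own name, not to `pop.datad.Location`.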

Hope that helps,

Zhian

--
You received this message because you are subscribed to the Google Groups "poppr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to poppr+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/poppr/54da3f65-fc29-42a2-bc67-0590b2f81bb5%40googlegroups.com.

Niklas

Jan 16, 2020, 11:55:03 AM

to poppr

Hi there.

Thanks for your quick reply back then!

I only just had the time to try everything out.

THAT problem seems to be solved, though the next one came up.

##########

Error in eigen(delta, symmetric = TRUE, only.values = TRUE) :

infinite or missing values in 'x'

In addition: Warning messages:

1: In poppr.amova(dclone, ~pop.datad.Location, missing = "asis") :

Data with mixed ploidy or ambiguous allele dosage cannot have within-individual variance calculated until the dosage is correctly estimated.

This function will return the summary statistic, rho (Ronfort et al 1998) but be aware that this estimate will be skewed due to ambiguous dosage. If you have zeroes encoded in your data, you may wish to remove them.

To remove this warning, use within = FALSE

2: In is.euclid(xdist) : Zero distance(s)

##########

But it seems that the amount of missing data in my data set is simply too high: I tried the other options for "missing"

and all gave different messages hinting at too much missing data.

Thanks though.

Cheers mate !

Niklas


Zhian Kamvar

Jan 16, 2020, 12:10:24 PM

to Niklas, poppr

I would suggest that you try using the `incomp()` function to find any genotypes that are incompatible with other genotypes (e.g. they share no alleles in common). This function returns a square matrix of your samples with a 1 if they are compatible and 0 if they are incompatible. This will give you a quick idea of which samples you need to remove.

Also, vcfR was just recently updated on CRAN, so I would highly recommend you use `vcfR2genind(svcf, return.alleles = TRUE)` to allow for within-sample variance to be calculated.

Here is an example of using `incomp()` to find and remove incomparable samples.

library(poppr)
data(nancycats)
strata(nancycats) <- data.frame(p = pop(nancycats))
nan <- nancycats[pop = c(1, 17), loc = c(1, 4)]

poppr.amova(nan, ~p, missing = "asis")
#> Warning in is.euclid(xdist): Zero distance(s)
#> Error in eigen(delta, symmetric = TRUE, only.values = TRUE): infinite or missing values in 'x'

rowSums(incomp(nan))
#> N215 N216 N282 N283 N288 N291 N292 N293 N294 N295 N296 N297 N281 N289 N290
#>    2    2   13   13   13   13   13   13   13   13   13   13   13   13   13

poppr.amova(nan[rowSums(incomp(nan)) > 2, ], ~p, missing = "asis")
#> Warning in is.euclid(xdist): Zero distance(s)
#> Distance matrix is non-euclidean.
#> Using quasieuclid correction method. See ?quasieuclid for details.
#> Warning in is.euclid(distmat): Zero distance(s)
#> $call
#> ade4::amova(samples = xtab, distances = xdist, structures = xstruct)
#>
#> $results
#>                          Df    Sum Sq   Mean Sq
#> Between p                 1  3.238913 3.2389135
#> Between samples Within p 17 16.886126 0.9933015
#> Within samples           19 30.659856 1.6136766
#> Total                    37 50.784895 1.3725647
#>
#> $componentsofcovariance
#>                                          Sigma          %
#> Variations Between p                 0.1212120   8.507891
#> Variations Between samples Within p -0.3101875 -21.772115
#> Variations Within samples            1.6136766 113.264224
#> Total variations                     1.4247011 100.000000
#>
#> $statphi
#>                           Phi
#> Phi-samples-total -0.13264224
#> Phi-samples-p     -0.23796713
#> Phi-p-total        0.08507891
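The filter above indexes `nan` with a logical vector taken from `incomp()`. Judging by the printed output, `incomp()` lists only the samples involved in at least one incomparable pair, so that vector may be shorter than `nInd(nan)`. A slightly more defensive sketch, assuming the same threshold of 2 used in the example, picks the samples to drop by name instead of by position:

```r
library(poppr)  # also attaches adegenet

data(nancycats)
nan <- nancycats[pop = c(1, 17), loc = c(1, 4)]

# Row sums of the incomp() matrix, named by sample
n_comparable <- rowSums(incomp(nan))

# Samples at or below the threshold used in the example above (assumption: 2)
drop <- names(n_comparable)[n_comparable <= 2]

# Build a full-length logical index from the sample names
keep <- !indNames(nan) %in% drop
nan_clean <- nan[keep, ]

drop             # names of the removed samples
nInd(nan_clean)  # number of samples that remain
```

If `incomp()` finds nothing, `drop` is empty and every sample is kept. The cleaned object can then be tried with `poppr.amova()` in the same way as above; the right threshold will depend on the data set.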


Niklas

Jan 16, 2020, 1:16:29 PM

to poppr

Is this good ?

+++++

> rowSums(incomp(dclone))
numeric(0)

> samova<-poppr.amova(dclone, ~Location, missing = "asis")
Distance matrix is non-euclidean.
Using quasieuclid correction method. See ?quasieuclid for details.
Warning messages:
1: In is.euclid(xdist) : Zero distance(s)
2: In is.euclid(distmat) : Zero distance(s)

> samova
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)

$results
                                 Df    Sum Sq   Mean Sq
Between Location                 13  93027.11 7155.9316
Between samples Within Location 243 236804.08  974.5024
Within samples                  257 472072.13 1836.8565
Total                           513 801903.32 1563.1644

$componentsofcovariance
                                               Sigma         %
Variations Between Location                 169.1136  10.73878
Variations Between samples Within Location -431.1771 -27.37992
Variations Within samples                  1836.8565 116.64114
Total variations                           1574.7931 100.00000

$statphi
                            Phi
Phi-samples-total    -0.1664114
Phi-samples-Location -0.3067392
Phi-Location-total    0.1073878

+++++


Zhian Kamvar

Jan 17, 2020, 4:44:22 AM

to Niklas, poppr

Hi Niklas,

This looks fine. Negative variance components do tend to come up, and this has been addressed on this board: https://groups.google.com/d/msg/poppr/NSag-55d6bs/cfvOSV-VAQAJ

The fact that your data set passed through poppr.amova() the second time without any errors about missing data means that using the return.alleles = TRUE option with vcfR2genind() was the right way to go. I'll make a separate post on the forum explaining why.
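For reference, here is a minimal sketch of the conversion step with `return.alleles = TRUE`. It uses `vcfR_test`, a small example object bundled with vcfR, as a stand-in for the original `sali1.vcf`, so the values it prints say nothing about the real data.

```r
library(vcfR)
library(adegenet)

data(vcfR_test)  # small example vcfR object, stand-in for read.vcfR("sali1.vcf")

# Keep the actual allele states, as recommended in this thread
gid <- vcfR2genind(vcfR_test, return.alleles = TRUE)

nInd(gid)             # number of samples
nLoc(gid)             # number of loci
is.integer(tab(gid))  # the allele table is expected to hold integer counts
unique(ploidy(gid))   # a single value suggests uniform ploidy
```

Checking the ploidy and the type of the allele table before running `poppr.amova()` is one way to catch the kind of warnings seen earlier in this thread.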

Hope that helps.

Best,

Zhian


Niklas

Jan 18, 2020, 4:58:59 AM

to poppr

Hi Zhian,

You helped me more than you can imagine and helped me a lot with one of my chapters.

Thanks for your direct answers and the helpful advice beyond that.

Without it I would have run into traps like the old vcfR2genind versions.

Thanks so much,

I am very grateful !

Cheers,

Niklas

