Error with gl.filter.rdepth() and gl.filter.reproducibility()


Lana Austin

Feb 24, 2024, 8:47:35 PM
to dartR
Hi there,

I am re-running some code that worked okay a few months ago, and now I am getting an error I can't work out. Any insight would be most appreciated!

gl.report.reproducibility(gl.filtered.data.6.4)
Starting gl.report.reproducibility
Processing genlight object with SNP data
Reporting Repeatability by Locus
No. of loci = 23755
No. of individuals = 765
Minimum : 0.93
1st quartile : 0.972727
Median : 0.98731
Mean : 0.982335
3r quartile : 0.997041
Maximum : 1
Missing Rate Overall: 0.04
Error in quantile.default(repeatability, probs = seq(0, 1, 1/20), type = 1) :
  missing values and NaN's not allowed if 'na.rm' is FALSE

Starting gl.filter.reproducibility
Processing genlight object with SNP data
Error in if (repeatability[i] < threshold) { :
  missing value where TRUE/FALSE needed

And....

gl.report.rdepth(gl.filtered.data.6.4)
Starting gl.report.rdepth
Processing genlight object with SNP data
Reporting Read Depth by Locus
No. of loci = 23755
No. of individuals = 765
Minimum : 2.5
1st quartile : 4.4
Median : 5.7
Mean : 6.066246
3r quartile : 7.4
Maximum : 34.1
Missing Rate Overall: 0.04
Error in quantile.default(rdepth, probs = seq(0, 1, 1/20), type = 1) :
  missing values and NaN's not allowed if 'na.rm' is FALSE
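
Both error messages above can be reproduced in base R whenever the vector being summarised contains a missing value, which suggests (as an assumption, not a confirmed diagnosis) that some loci have NA in the underlying per-locus metric. A minimal sketch with a made-up vector; the values and the 0.99 threshold are hypothetical:

```r
# Hypothetical per-locus metric with one missing value (not real data)
repeatability <- c(0.93, 0.97, NA, 1)

# quantile() refuses NA input unless na.rm = TRUE
err_quantile <- tryCatch(
  quantile(repeatability, probs = seq(0, 1, 1/20), type = 1),
  error = function(e) conditionMessage(e)
)

# if () cannot branch on a missing value
threshold <- 0.99  # hypothetical threshold
err_if <- tryCatch(
  if (repeatability[3] < threshold) "drop" else "keep",
  error = function(e) conditionMessage(e)
)

# Dropping the missing values lets the quantiles be computed
q <- quantile(repeatability, probs = seq(0, 1, 1/20), type = 1, na.rm = TRUE)

print(err_quantile)
print(err_if)
print(q)
```

This only illustrates the base R behaviour behind the messages; it does not show how dartR itself handles missing metrics.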

If I run gl.filter.rdepth(), it leaves me with no SNPs (the object reports "NA SNPs"), even though the plots and my parameters look okay to me. (Sorry, I couldn't paste the plots into this post.)


gl.filtered.data.6.5 <- gl.filter.rdepth(gl.filtered.data.6.4, lower = 5, upper = 30)
Starting gl.filter.rdepth
Processing genlight object with SNP data
Removing loci with rdepth <= 5 and >= 30
Completed: gl.filter.rdepth
Warning messages:
1: Removed 1467 rows containing non-finite values (`stat_bin()`).
2: Removed 1467 rows containing non-finite values (`stat_bin()`).

gl.filtered.data.6.5
********************
*** DARTR OBJECT ***
********************
** 765 genotypes, NA SNPs , size: 30 Mb
missing data: 428821 (=NA %) scored as NA
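
One plausible way (an assumption, not taken from the dartR source) to end up with "NA SNPs" is a range filter applied to a metric that contains missing values: the comparison yields NA for those loci, and counting the kept loci then returns NA. A base R sketch with made-up read depths:

```r
# Hypothetical read depths, one of them missing (not real data)
rdepth <- c(2.5, 5.7, NA, 7.4, 34.1)
lower <- 5
upper <- 30

# The comparison propagates NA for the locus with missing depth
keep <- rdepth > lower & rdepth < upper

# Counting kept loci naively gives NA rather than a number
n_kept_naive <- sum(keep)

# Treating missing depth as "do not keep" gives a usable index
keep_safe <- !is.na(rdepth) & keep
n_kept_safe <- sum(keep_safe)

print(n_kept_naive)
print(n_kept_safe)
```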

gl.filtered.data.6.4 looks like this

********************
*** DARTR OBJECT ***
********************
** 765 genotypes, 23,755 SNPs , size: 36.1 Mb
missing data: 732218 (=4.03 %) scored as NA
** Genetic data
@gen: list of 765 SNPbin
@ploidy: ploidy of each individual (range: 2-2)
** Additional data
@ind.names: 765 individual labels
@loc.names: 23755 locus labels
@loc.all: 23755 allele labels
@position: integer storing positions of the SNPs [within 69 base sequence]
@pop: population of each individual (group size range: 52-412)
@other: a list containing: loc.metrics, ind.metrics, loc.metrics.flags, verbose, history
@other$ind.metrics: id, pop, sex, Mito, service, plate_location
@other$loc.metrics: AlleleID, CloneID, AlleleSequence, TrimmedSequence, Chrom_Yellow_robin_HiC_v2, ChromPos_Yellow_robin_HiC_v2, AlnCnt_Yellow_robin_HiC_v2, AlnEvalue_Yellow_robin_HiC_v2, SNP, SnpPosition, CallRate, OneRatioRef, OneRatioSnp, FreqHomRef, FreqHomSnp, FreqHets, PICRef, PICSnp, AvgPIC, AvgCountRef, AvgCountSnp, RepAvg, clone, uid, rdepth, monomorphs, maf, OneRatio, PIC
@other$latlon[g]: no coordinates attached
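
Since the listing shows that loc.metrics holds both rdepth and RepAvg, a quick check is to count the missing values in those two columns. The sketch below uses a small made-up data frame as a stand-in for gl.filtered.data.6.4@other$loc.metrics, so it runs without the real object; the values are hypothetical:

```r
# Stand-in for gl.filtered.data.6.4@other$loc.metrics (made-up values)
loc.metrics <- data.frame(
  rdepth = c(5.7, NA, 7.4, 2.5, NA),
  RepAvg = c(0.99, 1, NA, 0.93, 0.97)
)

# Number of loci with a missing value in each metric
na_counts <- colSums(is.na(loc.metrics[, c("rdepth", "RepAvg")]))
print(na_counts)
```

On the real object, the same check would be colSums(is.na(gl.filtered.data.6.4@other$loc.metrics[, c("rdepth", "RepAvg")])). A non-zero count for rdepth would be consistent with the 1467 non-finite rows reported in the warning.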
 

Some other things to note:
- I am using dartR 2.9.7.
- I tried re-running the code after installing the beta version with devtools::install_github("green-striped-gecko/dartR", ref = "beta", dependencies = TRUE), but I still get the same error.
- I have compliance checked gl.filtered.data.6.

Any insight into how I can get around this error would be greatly appreciated.

Lana 

Jose Luis Mijangos

Feb 25, 2024, 11:34:46 PM
to dartR

Hi Lana,

 

Could you please send me a subset of your data that replicates the error to my personal e-mail: luis.m...@gmail.com. The code below shows how you can subset your data and save it. Please send me the file “test.rds” as in the code example below and also the code you used.  

 

library(dartR)

# subset loci
# (you can increase the number of loci to subsample, for example n = 400)
test <- gl.subsample.loci(your_data.gl, n = 200, method = "random")

# subset individuals
# (you can increase the number of individuals, for example indNames(test)[1:20])
test <- gl.keep.ind(test, ind.list = indNames(test)[1:10])

# save the object in the working directory
saveRDS(test, "test.rds")

 

Cheers,

Luis

Jose Luis Mijangos

Feb 27, 2024, 3:37:22 PM
to dartR
Hi Lana,

Thank you for reporting this bug, which has been fixed in the development version of dartR. You can install it as shown below.

# install the development version of dartR
devtools::install_github("green-striped-gecko/dartR@dev")
library(dartR)
test <- gl.load("test.rds")
gl.report.rdepth(test)
gl.filter.rdepth(test)
gl.report.reproducibility(test)
gl.filter.reproducibility(test)

Cheers,
Luis

Lana Austin

Feb 29, 2024, 8:47:26 PM
to dartR
Thanks Luis for this. 

The reproducibility functions work as expected now.

Unfortunately, I am still having the same trouble with gl.filter.rdepth(), with the same test.rds I sent you. 

Starting gl.filter.rdepth
Processing genlight object with SNP data
Removing loci with rdepth <= 5 and >= 50
Completed: gl.filter.rdepth
********************
*** DARTR OBJECT ***
********************
** 100 genotypes, NA SNPs , size: 18.1 Mb

Thanks for looking into this,
Lana

Lana Austin

Mar 16, 2024, 7:05:13 PM
to dartR
Thanks Luis, 

these all work perfectly now!

Lana
