Error during gl.LDNe

148 views
Skip to first unread message

Elspeth McLennan

unread,
Apr 4, 2023, 12:51:42 AM4/4/23
to dartR
Hi dartR team, 

Firstly a huge thank you for implementing a function to use NeEstimator via R! I am trying to get estimates of Ne from a rather large whole genome dataset subsetted down to 50,000 SNPs. 

NeEstimator appears to run fine going through the iterations for each population but at the end gives the following error:
Error in `$<-.data.frame`(`*tmp*`, "color", value = c("#F8766D", "#F8766D",  :
  replacement has 15 rows, data has 18
Calls: gl.LDNe -> $<- -> $<-.data.frame
In addition: Warning messages:
1: In rbind(freq, as.numeric(harmonic_mean[[i]]), as.numeric(comparisons[[i]]),  :
  number of columns of result is not a multiple of vector length (arg 3)
2: In rbind(freq, as.numeric(harmonic_mean[[i]]), as.numeric(comparisons[[i]]),  :
  number of columns of result is not a multiple of vector length (arg 3)
3: In rbind(freq, as.numeric(harmonic_mean[[i]]), as.numeric(comparisons[[i]]),  :
  number of columns of result is not a multiple of vector length (arg 3)

There could be something wrong with my genlight object - as this is not DArT data there may have been an issue with how it has been converted but all other functions from dartR have worked. I first read in my filtered vcf file (generated via vcftools) with vcfR, converted to genind using adegenet and then to genlight using dartR.

Here is my complete code:
vcf <- read.vcfR("wild_413_100cr_1mac_0.005maf_15-60depth_unlinked.recode.vcf")
subset <- sample(size = 50000, replace = F, x = c(1:nrow(vcf)))
vcf_subset <- vcf[subset, ]
vcf_subset

### by cluster
gen_cluster <- vcfR2genind(vcf_subset, sep = "/", return.alleles = F)
pops_cluster <- read.table("wild_413_populations_faststructure.txt", header = T)
gen_cluster@pop <- as.factor(pops_cluster$cluster)
gen_cluster

# convert to genlight for NeEstimator
gl_cluster <- gi2gl(gen_cluster)
gl_cluster

nes <- gl.LDNe(gl_cluster, outpath = "/mnt/sdd/filteredvcf_analysis", neest.path = "/mnt/sdd/filteredvcf_analysis", critical = c(0,0.05), singleton.rm = T, mating = "random")
nes

Any help you could provide is greatly appreciated!

Thank you so much, 
Elle

Bernd.Gruber

unread,
Apr 4, 2023, 1:50:38 AM4/4/23
to da...@googlegroups.com

Hi Elspeth,

 

Looks most likely something with the plot at the end goes wrong. Not sure why. Can you check you have no spaces in your indNames and popNames and lociNames

 

 

And or run the gl.LDNe with the option   plot.out=FALSE (this might work), but still would be good to know what is going wrong here.

 

Also not sure why you go via genind there is a function

 

vcfR2genlight which should work as well, but maybe you tried that.

 

Another test would be to run

 

 

gl.compliance.check(gl_cluster) to try to make sure it is a complete genlight/dartR object.

 

 

Cheers, Bernd

 

 

 

 

 

==============================================================================

Dr Bernd Gruber                                              )/_         

                                                         _.--..---"-,--c_    

Professor Ecological Modelling                      \|..'           ._O__)_     

Tel: (02) 6206 3804                         ,=.    _.+   _ \..--( /          

Fax: (02) 6201 2328                           \\.-''_.-' \ (     \_          

Institute for Applied Ecology                  `'''       `\__   /\          

Faculty of Science and Technology                          ')                

University of Canberra   ACT 2601 AUSTRALIA

Email: bernd....@canberra.edu.au

WWW: bernd-gruber

 

Australian Government Higher Education Provider Number CRICOS #00212K 

NOTICE & DISCLAIMER: This email and any files transmitted with it may contain
confidential or copyright material and are for the attention of the addressee
only. If you have received this email in error please notify us by email
reply and delete it from your system. The University of Canberra accepts
no liability for any damage caused by any virus transmitted by this email.

==============================================================================

--
You received this message because you are subscribed to the Google Groups "dartR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dartr+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dartr/e6303d11-1cb1-420d-9e8b-5f8badb73407n%40googlegroups.com.

Message has been deleted

Jose Luis Mijangos

unread,
Apr 6, 2023, 1:31:10 AM4/6/23
to dartR
Hi Elle,

The problem was that when the output file of LDNe is too large the function had some problems reading it. 

I fix this problem and the updated version of the function is available in the developing version of dartR which can be installed as shown below:

library(dartR)
gl.install.vanilla.dartR(flavour = "dev")

Cheers,
Luis 

Elspeth McLennan

unread,
Apr 13, 2023, 8:09:00 PM4/13/23
to da...@googlegroups.com
Hi Luis, 

Unfortunately, I'm still having problems with this analysis. I downloaded the developing version of dartR which seem to solve the problem - the printed output had numbers rather than NaN but before it printed out the final output with the estimates and 95% CIs it output the following error:
Error in strt[y]:stp[y] : NA/NaN argument

Do you think it is still struggling to output the results?

Thanks again for your help!

Elle

On Thu, 6 Apr 2023 at 15:44, Elspeth McLennan <elle.mc...@gmail.com> wrote:
Hi Luis, 

Thank you so much!!

Have a lovely weekend, 

All my best, 
Elle

Elspeth McLennan

unread,
May 9, 2023, 8:46:56 PM5/9/23
to dartR
Hi Luis

My apologies in the delay getting back to you. Unfortunately I have tried updating R, R studio and all packages and still receiving the same error both when using an AWS VM and on my local computer. The only thing I've noticed is that sometimes when it is outputting the results (before the Error in strt[y]:stp[y] : NA/NaN argument) the 95% confidence intervals do not seem to include the final estimate for Ne:
Actual number of r^2-values evaluated = 55270323 Initial estimate of Ne: 205.9 Final estimate of Ne: 58.3 Parameter CI: 13.5 13.5 
Jackknife CI: 12.2 14.9

Could that be causing the error I'm seeing?

The other thing is that I'm not sure the developing version is downloading properly, this is the output I'm receiving:
> gl.install.vanilla.dartR(flavour = "dev") Starting gl.install.vanilla.dartR Installing dartR from github (dev) Downloading GitHub repo green-striped-gecko/dartR@dev Skipping 1 packages not available: ggtern ── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ✔ checking for file 'C:\Users\emcl2913\AppData\Local\Temp\Rtmp044LTI\remotes6e0c315e9a6\green-striped-gecko-dartR-2d623f2/DESCRIPTION' ... preparing 'dartR': (1.2s) ✔ checking DESCRIPTION meta-information ... checking for LF line-endings in source and make files and shell scripts (1.7s) checking for empty or unneeded directories building 'dartR_2.9.4.tar.gz' Installing package into ‘C:/Users/emcl2913/AppData/Local/R/win-library/4.2’ (as ‘lib’ is unspecified) * installing *source* package 'dartR' ... ** using staged installation ** R ** data *** moving datasets to lazyload DB ** inst ** byte-compile and prepare package for lazy loading ** help *** installing help indices *** copying figures ** building package indices ** installing vignettes ** testing if installed package can be loaded from temporary location ** testing if installed package can be loaded from final location ** testing if installed package keeps a record of temporary installation path * DONE (dartR) Installing package ggtern from provisional GitHub Repository Downloading GitHub repo mijangos81/ggtern@HEAD ── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ✔ checking for file 'C:\Users\emcl2913\AppData\Local\Temp\Rtmp044LTI\remotes6e0c9ca4daa\mijangos81-ggtern-8bfb500/DESCRIPTION' (394ms) preparing 'ggtern': (983ms) ✔ checking DESCRIPTION meta-information ... installing the package to process help pages (669ms) ----------------------------------- installing *source* package 'ggtern' ... files 'build/partial.rdb', 'man/approved_layers.Rd', 'man/geom_smooth_tern.Rd', 'man/ggtern_package.Rd', 'man/mahalanobis_distance.Rd', 'man/position_jitter_tern.Rd', 'man/position_nudge_tern.Rd', 'man/theme_gridsontop.Rd', 'man/theme_ticklength.Rd' are missing file 'R/gg-internal.R' has the wrong MD5 checksum ** using staged installation ** R ** data ** demo ** inst ** byte-compile and prepare package for lazy loading ** help Registered S3 methods overwritten by 'ggtern': method from grid.draw.ggplot ggplot2 plot.ggplot ggplot2 print.ggplot ggplot2 Error : theme_elements.Rd:21: the condition has length > 1 ERROR: installing Rd objects failed for package 'ggtern' ─ removing 'C:/Users/emcl2913/AppData/Local/Temp/RtmpQRMbNw/Rinst74d44c9b270e/ggtern' ----------------------------------- ERROR: package installation failed Error: Failed to install 'ggtern' from GitHub: ! System command 'Rcmd.exe' failed


Thank you again for your help!

All my best,
Elle 

Jose Luis Mijangos

unread,
May 16, 2023, 10:13:24 PM5/16/23
to dartR
Hi Elle,

I have fixed the bug in the function gl.LDNe. To use the update function please install the developing version of dartR as shown below.

> devtools::install_github("green-striped-gecko/dartR@dev”)

Cheers,
Luis 

Elspeth McLennan

unread,
May 17, 2023, 8:27:21 PM5/17/23
to da...@googlegroups.com
Hi Luis, 

Thank you so much! The bug fix has enabled me to run the analysis for 15,000 SNPs but unfortunately, the 95% CI values do not include the estimate? Is this something you saw when you ran my data?
image.png

Thank you again so much for your help!
Elle

Elspeth McLennan

unread,
May 17, 2023, 8:28:40 PM5/17/23
to da...@googlegroups.com
Opps sorry I attached the wrong picture - here are the populations where the estimate of Ne is outside the 95% CIs:
image.png

Jose Luis Mijangos

unread,
May 18, 2023, 12:15:48 AM5/18/23
to dartR
Hi Elle,

I couldn't replicate the strange results you see when I run the code and the dataset you sent me (see pic attached). Probably because one of the steps in your code is subsampling at random 15,000 SNPs. 

One way to make this step reproducible would be to set a seed number before using the function, for example:

> set.seed(32)
> gl_cluster_3 <- gl.subsample.loci(gl_cluster_2,n=5,method = "random")
> locNames(gl_cluster_3)

Maybe this is just a "hiccup" of the program, maybe if you run it again, you will get results that make more sense. Maybe there are some loci that might be causing this. You could do a bit more filtering, e.g. Hardy Weinberg, sex chromosomes, monomorphs, etc. 

Please let us know if that solved the issue. Otherwise, send me your code (using a seed number) that replicates the behavior you see. Also, it would help if you could send the information of your session using:

> sessionInfo()

Cheers,
Luis

NE_results.png

Bernd.Gruber

unread,
May 18, 2023, 12:31:27 AM5/18/23
to da...@googlegroups.com

Hi,

 

Not sure if helpful. We found in one of our student a “quirk” in the Neestimator. If there is a loci that is 100% heterozygous for all individuals (all ones) then the interval estimates become INF. Therefore we filtered for that loci “manually” before running it. This kind of loci do not get filtered by the monomorph filter and in your case it could be that by subsampling you might get such cases.

 

The way to filter is for example:

 

gl <- possums.gl

drop <- locNames(gl)[apply(as.matrix(gl), 2, function(x) all(as.numeric(x)==1, na.rm=T))]

if (length(drop)>0) gl <- gl.drop.loc(gl,loc.list = drop)

 

should work I think (but not tested)

 

 

 

 

==============================================================================

Dr Bernd Gruber                                              )/_         

                                                         _.--..---"-,--c_    

Professor Ecological Modelling                      \|..'           ._O__)_     

Tel: (02) 6206 3804                         ,=.    _.+   _ \..--( /          

Fax: (02) 6201 2328                           \\.-''_.-' \ (     \_          

Institute for Applied Ecology                  `'''       `\__   /\          

Faculty of Science and Technology                          ')                

University of Canberra   ACT 2601 AUSTRALIA

Email: bernd....@canberra.edu.au

WWW: bernd-gruber

 

Australian Government Higher Education Provider Number CRICOS #00212K 

NOTICE & DISCLAIMER: This email and any files transmitted with it may contain
confidential or copyright material and are for the attention of the addressee
only. If you have received this email in error please notify us by email
reply and delete it from your system. The University of Canberra accepts
no liability for any damage caused by any virus transmitted by this email.

==============================================================================

 

Elspeth McLennan

unread,
May 19, 2023, 2:29:14 AM5/19/23
to da...@googlegroups.com
Hi Luis, 

I am still getting the strange results even with set.seed(32). Here is my code 
rm(list=ls())
graphics.off()
setwd("C:/Users/emcl2913/OneDrive - The University of Sydney (Staff)/Documents - AWGG/KOALAS/GenomeSurvey/Population Genomics/Analysis/Effective Population Size")

### read in genlight object created via Ronin
load("gl_cluster_wild_413_100cr_1mac_0.005maf_15-60depth_unlinked_50000.Rdata")
gl_cluster

library(dartR)

set.seed(32)
gl_cluster_subset <- gl.subsample.loci(gl_cluster,n=15000,method = "random")
gl_cluster_subset

nes <- gl.LDNe(gl_cluster_subset, outpath = "./", neest.path = "./", critical = c(0,0.05), singleton.rm = T, mating = "random")
nes <- nes

And here if my session info:
R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8    LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.utf8    

attached base packages:
[1] tcltk     stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] sf_1.0-12            maps_3.4.1           maptools_1.1-6       plotrix_3.8-2        plotfunctions_1.4   
 [6] scales_1.2.1         shapefiles_0.7.2     foreign_0.8-84       gstat_2.1-1          fields_14.1         
[11] viridis_0.6.3        viridisLite_0.4.2    spam_2.9-1           patchwork_1.1.2      ggrepel_0.9.3       
[16] BiodiversityR_2.15-1 vegan_2.6-4          lattice_0.21-8       permute_0.9-7        ggmap_3.0.2         
[21] rgdal_1.6-6          geodata_0.5-8        terra_1.7-29         ggpubfigs_0.0.1      lubridate_1.9.2     
[26] forcats_1.0.0        stringr_1.5.0        purrr_1.0.1          readr_2.1.4          tidyr_1.3.0         
[31] tibble_3.2.1         tidyverse_2.0.0      stringi_1.7.12       AlleleShift_1.1      raster_3.6-20       
[36] sp_1.6-0             dartR_2.9.5          dartR.data_1.0.2     dplyr_1.1.2          ggplot2_3.4.2       
[41] adegenet_2.1.10      ade4_1.7-22         

loaded via a namespace (and not attached):
  [1] utf8_1.2.3          R.utils_2.12.2      tidyselect_1.2.0    lme4_1.1-33         htmlwidgets_1.6.2  
  [6] grid_4.2.3          combinat_0.0-8      StAMPP_1.6.3        munsell_0.5.0       units_0.8-1        
 [11] codetools_0.2-19    withr_2.5.0         gdsfmt_1.34.1       colorspace_2.1-0    pegas_1.2          
 [16] knitr_1.42          rstudioapi_0.14     stats4_4.2.3        labeling_0.4.2      RgoogleMaps_1.4.5.3
 [21] farver_2.1.1        gap.datasets_0.0.5  vctrs_0.6.2         generics_0.1.3      xfun_0.39          
 [26] timechange_0.2.0    R6_2.5.1            doParallel_1.0.17   bitops_1.0-7        reshape_0.8.9      
 [31] promises_1.2.0.1    nnet_7.3-18         gtable_0.3.3        sandwich_3.0-2      rlang_1.1.1        
 [36] calibrate_1.7.7     splines_4.2.3       checkmate_2.1.0     yaml_2.3.7          reshape2_1.4.4     
 [41] abind_1.4-5         backports_1.4.1     httpuv_1.6.11       Hmisc_5.0-1         tools_4.2.3        
 [46] Rcmdr_2.8-0         ellipsis_0.3.2      RColorBrewer_1.1-3  proxy_0.4-27        Rcpp_1.0.10        
 [51] plyr_1.8.8          base64enc_0.1-3     classInt_0.4-9      rpart_4.1.19        zoo_1.8-12         
 [56] haven_2.5.2         cluster_2.1.4       survey_4.1-1        magrittr_2.0.3      data.table_1.14.8  
 [61] visreg_2.7.0        spacetime_1.3-0     genetics_1.3.8.1.3  effects_4.2-2       mvtnorm_1.1-3      
 [66] hms_1.1.3           RcmdrMisc_2.7-2     mime_0.12           evaluate_0.21       xtable_1.8-4       
 [71] jpeg_0.1-10         readxl_1.4.2        gridExtra_2.3       MuMIn_1.47.5        compiler_4.2.3     
 [76] KernSmooth_2.23-20  crayon_1.5.2        minqa_1.2.5         gdistance_1.6.2     R.oo_1.25.0        
 [81] htmltools_0.5.5     mgcv_1.8-42         later_1.3.1         tzdb_0.3.0          Formula_1.2-5      
 [86] DBI_1.1.3           PopGenReport_3.0.7  MASS_7.3-59         boot_1.3-28.1       relimp_1.0-5       
 [91] Matrix_1.5-4        car_3.1-2           cli_3.6.1           mitools_2.4         R.methodsS3_1.8.2  
 [96] gdata_2.19.0        parallel_4.2.3      insight_0.19.1      dotCall64_1.0-2     igraph_1.4.2       
[101] pkgconfig_2.0.3     foreach_1.5.2       SNPRelate_1.32.2    digest_0.6.31       rmarkdown_2.21     
[106] cellranger_1.1.0    intervals_0.15.3    htmlTable_2.4.1     nortest_1.0-4       gap_1.5-1          
[111] shiny_1.7.4         gtools_3.9.4        nloptr_2.0.3        lifecycle_1.0.3     nlme_3.1-162       
[116] dismo_1.3-9         carData_3.0-5       seqinr_4.2-30       fansi_1.0.4         pillar_1.9.0       
[121] GGally_2.1.2        fastmap_1.1.1       httr_1.4.6          survival_3.5-5      glue_1.6.2         
[126] xts_0.13.1          FNN_1.1.3.2         mmod_1.3.3          png_0.1-8           iterators_1.0.14   
[131] tcltk2_1.2-11       class_7.3-21        e1071_1.7-13        ape_5.7-1

You received this message because you are subscribed to a topic in the Google Groups "dartR" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/dartr/qIgrNXlBgkQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to dartr+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dartr/d4f7c300-626c-4584-be8f-95d5ce9307a8n%40googlegroups.com.

Bernd.Gruber

unread,
May 19, 2023, 2:46:26 AM5/19/23
to da...@googlegroups.com

Hi Elsbeth,

 

Can you try to run it locally and not from your onedrive with the “funny” characters and spaces in the path.

 

I know that R and oneDrive can have issues and that might be the case here.

Dylan Westaway

unread,
May 16, 2024, 11:53:46 PMMay 16
to dartR
Hi Elle, 

I'm having a similar issue with Ne estimates falling outside of CIs. I'm wondering if you found a solution to this?

Cheers,

Dylan 

Jose Luis Mijangos

unread,
May 17, 2024, 12:00:19 AMMay 17
to dartR
Hi Dylan,

I ran the gl.LDNe function on a PC and I was able to replicate the error of  Ne estimates falling outside of CIs. Then, I ran the the function on a Mac using the same input files that were used on the PC and the error did not appear. 

So, this means that the PC version of the LDNe program has a bug. I guess that the short-term solution would be to use the Linux or Mac version of LDNe.

Cheers,
Luis 

Dylan Westaway

unread,
May 17, 2024, 12:48:59 AMMay 17
to dartR
Hi Luis, 

Thanks for your response. 

Ok, fair enough. I'll try running it on a Mac.

Cheers,

Dylan 

Bernd.Gruber

unread,
May 17, 2024, 1:19:22 AMMay 17
to da...@googlegroups.com
This is just a guess, but maybe sub sampling random chooses loci that are identical and that might throw the algorithm off. 

So maybe filter pairs that are identical and see if this works. 



---------


On 17 May 2024, at 14:49, Dylan Westaway <dwest...@gmail.com> wrote:

Hi Luis, 
Reply all
Reply to author
Forward
0 new messages