gl2vcf troubleshooting errors


David Tork

May 8, 2023, 5:41:42 PM
to dartR
Hello,

I am trying to convert my genlight file to vcf for analysis in TASSEL, but am having trouble with a few errors.

I would like to include SNP and chromosome positions from the reference genome, but I cannot tell if I am calling the right field names. I keep getting the error:
gl2vcf(May23Vibf, snp_pos = May23Vibf@other[["loc.metrics"]][["ChromPosSnp_viburnum_lautum_v031"]], snp_chr = May23Vibf@other[["loc.metrics"]][["Chrom_viburnum_lautum_v031"]], pos_cM = May23Vibf@other[["loc.metrics"]][["ChromPosSnp_viburnum_lautum_v031"]],plink_path = "/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116", outfile = "May23Vibf_vcf", outpath = getwd(), verbose = 5)
Starting gl2vcf [dartR vers. 2.7.2 Build = Jody ]
  Processing genlight object with SNP data
Error in if (snp_pos == "0") { : the condition has length > 1

If I remove SNP/chr information, it returns a different error, and I am still not able to generate a VCF file: 
gl2vcf(May23Vibf,plink_path = "/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116", outfile = "May23Vibf_vcf", outpath = getwd(), verbose = 5)
Starting gl2vcf [dartR vers. 2.7.2 Build = Jody ]
  Processing genlight object with SNP data
  Chromosome information is not present in the slot 'chromosome'. Setting '0' as the name chromosome for all the SNPs.
Starting gl2plink
  Processing genlight object with SNP data
Completed: gl2plink
Error: --out only accepts 1 parameter.
For more information, try "plink --help <flag name>" or "plink --help | more".
Warning: running command '/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116/plink --file /Users/davidtork/Desktop/R projects/Vib_May23Hyb/gl_plink_temp --recode vcf --allow-no-sex --reference-allele /var/folders/4v/9qc45m6x7_97f72bwk135_y00000gn/T//RtmpJhxHoE/mylist.txt --out /Users/davidtork/Desktop/R projects/Vib_May23Hyb/May23Vibf_vcf --aec' had status 5
----------Output of function start:
PLINK v1.90b7 64-bit (16 Jan 2023) www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
----------Output of function finished...
Completed: gl2vcf

I attached a screenshot of my loc.metrics for reference. I apologize if this is a basic question. I am not familiar with PLINK, and have had no luck finding solutions to this problem elsewhere. 

Final note -- I saw this mentioned elsewhere on this forum, but the "plink_path" argument does not accept spaces in the path name. It would be helpful if this was mentioned in the gl2vcf() documentation. A simple workaround is to put single quotes around the directory level with a space: 
"/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116"

Thank you,
David
Screenshot 2023-05-08 at 4.11.39 PM.png

Jose Luis Mijangos

May 9, 2023, 3:44:32 AM
to dartR
Hi David,

- It seems there was a bug. Can you try the code below:

library(dartR)
# installing the development version of dartR
gl.install.vanilla.dartR(flavour = "dev")
library(dartR) # should be Version 2.9.5
# using your genlight object (e.g. as read from the DArT report)
t1 <- your_genlight
# checking compliance
t1 <- gl.compliance.check(t1)
# filtering loci with all missing data
t1 <- gl.filter.allna(t1)
# as described in the function documentation,
# the parameters "snp_chr" and "snp_pos" require the name of the field in the slot loc.metrics where the chromosome and SNP positions are stored
gl2vcf(t1, outpath = getwd(), snp_chr = "Chrom_viburnum_lautum_v031", snp_pos = "ChromPosSnp_viburnum_lautum_v031")

- The function documentation has been updated to say that the path should not contain spaces, and that quotes can be used around a directory name with a space instead. Thanks for that!

Cheers,
Luis 
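
Note for readers hitting the same "the condition has length > 1" error: the call in the first post passed whole columns extracted from loc.metrics, whereas, as Luis describes, "snp_chr" and "snp_pos" take the field name as a single string. Below is a minimal base-R sketch for checking that the names you intend to pass actually exist. `missing_fields` is a hypothetical helper (not part of dartR), and the toy data frame with made-up values stands in for `t1@other$loc.metrics`.

```r
# Hypothetical helper (not part of dartR): report which of the requested
# field names are absent from a loc.metrics data frame.
missing_fields <- function(loc_metrics, fields) {
  setdiff(fields, names(loc_metrics))
}

# Toy stand-in for t1@other$loc.metrics; the values are made up.
loc_metrics <- data.frame(
  Chrom_viburnum_lautum_v031 = c("chr1", "chr1", "chr2"),
  ChromPosSnp_viburnum_lautum_v031 = c(101L, 2050L, 77L)
)

# Correctly spelled names: nothing is reported as missing.
ok <- missing_fields(
  loc_metrics,
  c("Chrom_viburnum_lautum_v031", "ChromPosSnp_viburnum_lautum_v031")
)

# A misspelled name is reported back.
bad <- missing_fields(loc_metrics, "Crom_viburnum_lautum_v031")

print(ok)
print(bad)
```

With a real genlight object, `names(t1@other$loc.metrics)` lists the available fields, so the same check can be run on it before calling gl2vcf().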

David Tork

May 9, 2023, 10:59:15 AM
to da...@googlegroups.com
Hi Luis,

After a bit more troubleshooting I believe I found a solution. First, I tried the code you mentioned and still no luck:
gl2vcf(t1, outpath = getwd(), snp_chr = "Chrom_viburnum_lautum_v031", snp_pos = "ChromPosSnp_viburnum_lautum_v031")
Error in system(..., intern = TRUE) : error in running command

The above error is the same one I received when the Plink path was not specified correctly, so I added this back in, but now I get the same error as my original post:
gl2vcf(t1, outpath = getwd(), snp_chr = "Chrom_viburnum_lautum_v031", snp_pos = "ChromPosSnp_viburnum_lautum_v031", plink_path = "/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116")
Starting gl2vcf 
  Processing genlight object with SNP data
  Using the SNP position information in the field Chrom_viburnum_lautum_v031 from loc.metrics.
  Using the chromosome information in the field Chrom_viburnum_lautum_v031 from loc.metrics.
Starting gl2plink 
  Processing genlight object with SNP data
Completed: gl2plink 
Error: --out only accepts 1 parameter.
For more information, try "plink --help <flag name>" or "plink --help | more".
Warning: running command '/Users/davidtork/Desktop/'R projects'/Vib_May23Hyb/plink_mac_20230116/plink --file /Users/davidtork/Desktop/R projects/Vib_May23Hyb/gl_plink_temp --recode vcf  --allow-no-sex --allow-extra-chr --reference-allele /var/folders/4v/9qc45m6x7_97f72bwk135_y00000gn/T//RtmpJhxHoE/mylist.txt --out /Users/davidtork/Desktop/R projects/Vib_May23Hyb/gl_vcf ' had status 5

----------Output of function start:

PLINK v1.90b7 64-bit (16 Jan 2023)             www.cog-genomics.org/plink/1.9/

(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3

----------Output of function finished...


Completed: gl2vcf 

As you can see from the plink_path, I had all the plink files within a folder in my working directory (Vib_May23Hyb). The solution was to simply move the plink files from "plink_mac_20230116" to "Vib_May23Hyb". Once they were in my working directory, your original code worked just fine:
gl2vcf(t1, outpath = getwd(), snp_chr = "Chrom_viburnum_lautum_v031", snp_pos = "ChromPosSnp_viburnum_lautum_v031")

Thanks,
David



--
Researcher
M.S. Plant Breeding and Molecular Genetics
Dept. of Horticultural Science, University of Minnesota

David Tork

May 9, 2023, 11:46:35 AM
to da...@googlegroups.com
A quick follow-up that may be helpful for future users:
 
An important point I accidentally left out of my previous post was that I completely removed all spaces from my working directory file path. After more testing, I think this was actually the primary issue. 

Once the spaces were removed from all file paths I was able to call the plink binaries from a separate folder, as long as the Plink file path was specified:
gl2vcf(May23Vibf, outpath = getwd(), snp_chr = "Chrom_viburnum_lautum_v031", snp_pos = "ChromPosSnp_viburnum_lautum_v031", plink_path = "/Users/davidtork/Desktop/R_projects/Vib_May23Hyb/plink_mac_20230116")

This invalidates my suggestion that the Plink files need to be in the working directory. It also means that using the 'single quotes' workaround is not valid. It seems that, no matter what, all spaces need to be removed from the file path for Plink to function properly. 

David
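
Note for future readers: the warnings quoted in this thread show the working-directory path being passed to PLINK unquoted, so a space in it would split the `--out` value into two arguments, which is consistent with "--out only accepts 1 parameter". The base-R sketch below illustrates that and shows one way to flag such paths before calling gl2vcf(). `paths_with_spaces` is a hypothetical helper, not part of dartR, and the whitespace split is only an approximation of how the command line is tokenised.

```r
# Hypothetical helper (not part of dartR): return the paths that contain a space.
paths_with_spaces <- function(paths) {
  paths[grepl(" ", paths, fixed = TRUE)]
}

# Working-directory paths taken from this thread (before and after renaming).
old_dir <- "/Users/davidtork/Desktop/R projects/Vib_May23Hyb"
new_dir <- "/Users/davidtork/Desktop/R_projects/Vib_May23Hyb"

# Only the path with a space is flagged.
flagged <- paths_with_spaces(c(old_dir, new_dir))
print(flagged)

# Approximation: splitting the unquoted argument string on spaces shows that
# "--out" would be followed by two tokens instead of one.
tokens <- strsplit(
  paste("--out", file.path(old_dir, "gl_vcf")),
  " ",
  fixed = TRUE
)[[1]]
print(tokens)
```

In practice the same check could be run as `paths_with_spaces(c(getwd(), plink_path))`, where `plink_path` is whatever you intend to pass to gl2vcf().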