Issues with my VCF file

Jean Rodrigue Sangaré

Aug 5, 2025, 8:46:42 AM
to dartR

Dear Jose,

I am encountering a significant issue with the VCF file generated from my genlight object. Despite my expectations, the file appears to contain only Chromosome 1 and '0' as the positions for all SNPs. This limitation severely hinders any subsequent analysis. I have attached a screenshot of the file format for your review. Could you please provide guidance on the potential cause and a solution for this problem?

Regards

vcf.png

Renee Catullo

Aug 5, 2025, 9:21:03 PM
to da...@googlegroups.com
Hi Jean,

Right now there is no automated way to get a VCF into a genlight object that replicates all the columns and information dartR expects. This includes position information, as well as metrics like reproducibility.

Attached is the script I developed (including code from others) for my lab to move from a VCF to a genlight. It adds in all the position information and calculates all the missing columns.

The purpose of this script is to get all the correct information into the genlight and do very light filtering to get rid of very bad individuals, duplicate individuals and very bad loci. This script is not the place to do hard filtering: it generates the broad genlight that you then filter specifically for your question. We go through this script to get our formatted genlight, then write a new script with question-specific filtering and analyses.

It expects:

1. a VCF file that includes your technical replicates
2. a metadata file with the columns “id”, “pop”, “lat”, “lon”, “genotype”, plus any additional columns such as sex. “id” is the name of the individual as it appears in the VCF file, and “genotype” is the name of the individual. This is important for technical replicates, as id must be unique but genotype can be duplicated.
3. a techreps file that has two columns, “genotype” and “ID”, including only the individuals that are duplicated, as indicated by the ID column. The script uses this to calculate reproducibility.
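
The expectations listed above can be checked before running anything heavier. The following is only a rough base-R sketch and is not taken from the attached script: the tables are made-up stand-ins, and the helper name check_inputs is hypothetical. The column names are the ones given in the list.

```r
# Illustrative stand-ins for the metadata and techreps tables
metadata <- data.frame(
  id = c("s1", "s1_rep", "s2"),
  pop = c("A", "A", "B"),
  lat = c(-35.1, -35.1, -34.9),
  lon = c(149.0, 149.0, 149.2),
  genotype = c("ind1", "ind1", "ind2"),
  stringsAsFactors = FALSE
)
techreps <- data.frame(
  genotype = c("ind1", "ind1"),
  ID = c("s1", "s1_rep"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a character vector describing any problems found
check_inputs <- function(metadata, techreps) {
  problems <- character(0)
  missing_meta <- setdiff(c("id", "pop", "lat", "lon", "genotype"), names(metadata))
  if (length(missing_meta) > 0) {
    problems <- c(problems, paste("metadata is missing columns:",
                                  paste(missing_meta, collapse = ", ")))
  }
  missing_rep <- setdiff(c("genotype", "ID"), names(techreps))
  if (length(missing_rep) > 0) {
    problems <- c(problems, paste("techreps is missing columns:",
                                  paste(missing_rep, collapse = ", ")))
  }
  # id must be unique, while genotype is allowed to repeat
  if ("id" %in% names(metadata) && anyDuplicated(metadata$id) > 0) {
    problems <- c(problems, "metadata id values are not unique")
  }
  # every technical replicate should also appear in the metadata
  if (all(c("ID") %in% names(techreps)) && "id" %in% names(metadata) &&
      !all(techreps$ID %in% metadata$id)) {
    problems <- c(problems, "some techreps IDs are not found in metadata id")
  }
  problems
}

check_inputs(metadata, techreps)
```

An empty result means none of these particular checks failed; it says nothing about whether the IDs match the sample names inside the VCF itself.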

If you use this, you can cite some of the very recent papers from my lab, like the one on Philoria frogs, or upcoming ones on Uperoleia daviesae, Antechinus argentus, and Mixophyes.

Cheers,

Renee
VCF_filtering_Aug25.R

Jose Luis Mijangos

Aug 6, 2025, 6:00:00 AM
to dartR
Hi Jean,

Please try the code below to generate a VCF file of your dataset. Note that we are using dartRverse instead of dartR.

Cheers,
Luis 

library(dartRverse)
t1 <- readRDS("test.rds")
# here is the Chromosome information
t1$other$loc.metrics$Chrom_Rice_RGAP_v7
# here is the SNP position information
t1$other$loc.metrics$ChromPosSnp_Rice_RGAP_v7
# Plink binary is in the working directory

gl2vcf(t1,
       plink.bin.path = getwd(),
       snp.pos = "ChromPosSnp_Rice_RGAP_v7",
       snp.chr = "Chrom_Rice_RGAP_v7",
       outfile = "test_vcf",
       outpath = getwd())
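
One way to confirm that the chromosome and position columns made it into the output is to tabulate the first two columns of the VCF. This is a minimal base-R sketch under stated assumptions: the VCF content is made-up and held in memory, so it runs without any file. For a real file, the path would be passed to read.table(file = ...), and the exact output file name written by gl2vcf is not assumed here.

```r
# Illustrative VCF content (tab-separated, made-up values)
vcf_text <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1",
  "Chr1\t1203\tsnp1\tA\tG\t.\t.\t.\tGT\t0/1",
  "Chr2\t992\tsnp2\tC\tT\t.\t.\t.\tGT\t1/1"
)

# Header lines start with '#', so read.table skips them as comments
vcf <- read.table(text = vcf_text, sep = "\t", comment.char = "#",
                  stringsAsFactors = FALSE)
chrom <- vcf[[1]]  # CHROM is the first VCF column
pos <- vcf[[2]]    # POS is the second VCF column

table(chrom)   # number of SNPs per chromosome
sum(pos == 0)  # number of SNPs with position 0
```

If every SNP falls on a single chromosome or the count of zero positions equals the number of SNPs, the snp.chr and snp.pos columns were probably not picked up.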

Jean Rodrigue Sangaré

Aug 9, 2025, 7:34:33 AM
to dartR

Thank you, Luis and Renee. I successfully converted my genlight object into a VCF file. Initially, I used the following code, specifying the chromosome and SNP position information:

gl2vcf(gl,
       plink.bin.path = getwd(),
       snp.pos = "ChromPosSnp_Rice_RGAP_v7",
       snp.chr = "Chrom_Rice_RGAP_v7",
       outfile = "gl_vcf",
       outpath = getwd())

However, this attempt resulted in the following error:

Error in checkSlotAssignment(object, name, value) : assignment of an object of class “numeric” is not valid for slot ‘position’ in an object of class “dartR”; is(value, "intOrNULL") is not TRUE

To resolve this, I converted the SNP position information to an integer data type, and then reran the gl2vcf function:

gl$other$loc.metrics$Chrom_Rice_RGAP_v7
gl$other$loc.metrics$ChromPosSnp_Rice_RGAP_v7

gl$other$loc.metrics$ChromPosSnp_Rice_RGAP_v7 <- as.integer(gl$other$loc.metrics$ChromPosSnp_Rice_RGAP_v7)

gl2vcf(gl,
       plink.bin.path = getwd(),
       snp.pos = "ChromPosSnp_Rice_RGAP_v7",
       snp.chr = "Chrom_Rice_RGAP_v7",
       outfile = "gl_vcf",
       outpath = getwd())
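
For anyone hitting the same error: the message says the position slot only accepts integer (or NULL) values, and R stores numbers read from a file as double ("numeric") by default. Below is a minimal base-R sketch of a more cautious conversion. It uses a plain data frame with made-up values as a stand-in for gl$other$loc.metrics, and the helper name to_integer_positions is hypothetical, not part of dartR. It refuses to convert when a position is not a whole number, because as.integer() would silently truncate such values.

```r
# Stand-in for gl$other$loc.metrics (illustrative values only)
loc_metrics <- data.frame(
  Chrom_Rice_RGAP_v7 = c("Chr1", "Chr1", "Chr2"),
  ChromPosSnp_Rice_RGAP_v7 = c(1203, 45871, 992),  # stored as double
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert a numeric vector to integer only when it is safe
to_integer_positions <- function(x) {
  if (is.integer(x)) return(x)
  stopifnot(is.numeric(x))
  # NA is allowed; everything else must be a whole number within integer range
  ok <- is.na(x) | (x == floor(x) & abs(x) <= .Machine$integer.max)
  if (!all(ok)) {
    stop("positions contain non-whole or out-of-range values")
  }
  as.integer(x)
}

pos <- to_integer_positions(loc_metrics$ChromPosSnp_Rice_RGAP_v7)
loc_metrics$ChromPosSnp_Rice_RGAP_v7 <- pos
class(loc_metrics$ChromPosSnp_Rice_RGAP_v7)
```

If the position column was read in as character or factor, the helper stops at the is.numeric() check, which is a cue to inspect the column for non-numeric entries before converting.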

Renee Catullo

Aug 11, 2025, 7:49:27 PM
to da...@googlegroups.com
Hi Jean,

That works to get a properly formatted genlight, but I would advise you to think through the filters in the script I sent. They are more robust than what can be achieved using just a genlight. 

Renee

Jean Rodrigue Sangaré

Aug 12, 2025, 6:43:39 PM
to da...@googlegroups.com
Well noted, Renee.
Thank you very much.
