Merge datasets


Lyndal Hulse

Jun 26, 2025, 1:26:42 AM
to dartR
Hi dartR Team,

Firstly, it was wonderful meeting most of the dartR team at the ICCB 2025 workshop.

Secondly, I was hoping you could help me with my query.  I have two vcf files (with associated metadata files) containing loci generated from the same SNP panel which I have uploaded into dartRverse.  The two vcf files contain genotype data from different populations but each file has a different number of loci, although there will be some overlapping loci between the two files.

Is it possible to merge the two vcf files (with attached individual metrics) into one file and then identify sex-linked markers based on the sex identification metadata available for some individuals?  I then want to use gl.infer.sex to determine the sex of individuals whose sex is unknown.

If it helps, I can email you the files.  I really hope this is feasible...

Kind regards,
Lyndal

nanisrobledo

Jun 26, 2025, 10:52:36 PM
to dartR
Hi Lyndal,

It was lovely meeting you at ICCB!

I think there are a couple of options for what you want to do: (1) merge the vcf files first, with bcftools merge for example, then convert the merged vcf into a dartR genlight object with gl.read.vcf, or (2) convert each vcf into a dartR genlight object and merge them with dartR.base::gl.join. Remember to apply gl.keep.sexlinked before gl.infer.sex. For example:

LBP_sexLinked <- dartR.sexlinked::gl.keep.sexlinked(x = LBP, system = "xy", plot.display = TRUE, ncores = 1)
inferred.sexes <- dartR.sexlinked::gl.infer.sex(gl_sexlinked = LBP_sexLinked, system = "xy", seed = 100)
inferred.sexes # The new sexes will be in this data frame

Best,
Diana

Lyndal Hulse

Jul 1, 2025, 7:49:15 PM
to dartR
Hi Diana,

Thank you for the advice.  However, I'm having issues merging two dartR genlight objects.
This is the error I'm receiving:

> Watergum_new <- gl.join(Watergum, Val)
Starting gl.join
Error in gl.join(Watergum, Val) :
Fatal Error: the two genlight objects do not have data for the same individuals in the same order
 
My two genlight objects have different individuals, but the same loci.  I want to merge the individuals from both genlight objects into one genlight object.  Is this possible?

Cheers,
Lyndal

Jose Luis Mijangos

Jul 1, 2025, 8:08:09 PM
to da...@googlegroups.com

Hi Lyndal,

 

Could you please send your datasets to my personal e-mail (luis.m...@gmail.com) so I can have a closer look?

 

Cheers,

Luis

 


Jose Luis Mijangos

Jul 3, 2025, 12:21:04 AM
to dartR
Hi Lyndal,

The issue has now been resolved. To use the updated version of the function, please install the development version of dartR.base—see the first line in the code snippet below.

I've also included an example of how to join two datasets with the same loci but different individuals.

Let me know if you run into any issues.

Cheers,
Luis

devtools::install_github("green-striped-gecko/dartR.base@dev")
library(dartRverse)
# loading datasets
t1 <- readRDS("dataset1.rds")
t2 <- readRDS("dataset2.rds")
# getting common loci for t1
loc_common_t1 <- which(locNames(t1) %in% locNames(t2) == TRUE)
t1a <- gl.keep.loc(t1,loc.list = locNames(t1)[loc_common_t1])
# getting common loci for t2
loc_common_t2 <- which(locNames(t2) %in% locNames(t1) == TRUE)
t2a <- gl.keep.loc(t2,loc.list = locNames(t2)[loc_common_t2])
# ordering by locus name so both datasets list their loci in the same order
t1a <- t1a[,order(t1a$loc.names)]
t2a <- t2a[,order(t2a$loc.names)]
# joining datasets by loci in common
t3 <- gl.join(x1 = t1a,
              x2 = t2a,
              method = "join.by.loc")
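
To see what the subsetting and ordering steps above accomplish, here is a minimal base R sketch on toy genotype matrices. It is only an illustration of the idea (keep the shared loci, put them in the same order, then stack the individuals) and is not how gl.join works internally. The matrices g1 and g2, and all individual and locus names in them, are made up for this example.

```r
# Toy genotype matrices: rows are individuals, columns are loci
# (0/1/2 = count of the alternate allele). All names and values are invented.
g1 <- matrix(c(0, 1, 2,
               2, 1, 0),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("indA", "indB"), c("loc3", "loc1", "loc2")))
g2 <- matrix(c(1, 1, 0,
               0, 2, 2,
               1, 0, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("indC", "indD", "indE"), c("loc2", "loc4", "loc1")))

# Loci present in both datasets, sorted so both matrices share one column order
common <- sort(intersect(colnames(g1), colnames(g2)))

# Subset each matrix to the shared loci; indexing by name also reorders the columns
g1a <- g1[, common, drop = FALSE]
g2a <- g2[, common, drop = FALSE]

# Stack the individuals: the columns now line up locus by locus
joined <- rbind(g1a, g2a)
print(joined)
```

Loci found in only one dataset (loc3 and loc4 here) are dropped, so it is worth checking how many loci remain in common before joining the real datasets.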