Obtain call rate value for each individual


Nadia Faure

Apr 7, 2023, 5:03:07 AM
to dartR

Hi!

I have a dataset of 174 genotypes with 38,081 SNPs. Of these 174 genotypes, 108 come from individuals genotyped once and 66 come from 33 individuals genotyped twice (we put the same tissue sample in 2 different wells of the DArT plate for DNA extraction) to explore genotyping differences.

For each of the 33 individuals genotyped twice, I would like to keep the replicate with the best call rate and remove the other one before running my analyses of genetic diversity and differentiation.

How can I extract the call rate value for each individual from the gl.report.callrate(gl, method="ind") function? Is it possible to obtain from this function a data frame with the call rate of each individual (with its ID), so that I can filter my duplicated individuals according to their "sequencing success"?

Thanks!

Nadia

Jose Luis Mijangos

Apr 10, 2023, 9:41:51 PM
to dartR
Hi Nadia,

Can you please try the code below?
Cheers,
Luis

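library(dartR)
# Capture the printed report as a character vector, one element per line
t1 <- capture.output(gl.report.callrate(platypus.gl, method = "ind"))
# `:` binds tighter than `-`, so every index is shifted down by one:
# this selects lines 39 to length(t1) - 1
t2 <- t1[(40:length(t1)) - 1]
t3 <- as.data.frame(t2)
# Keep only the first and last whitespace-separated fields of each line
t4 <- read.table(text = sub("^(\\S+)\\s+.*\\s+(\\S+)$", "\\1 \\2", t3$t2),
           header = FALSE, stringsAsFactors = FALSE)
# Promote the first row to column names, then drop it
colnames(t4) <- t4[1, ]
t4 <- t4[-1, ]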

Nadia Faure

Apr 12, 2023, 3:50:36 AM
to dartR
Hi Luis!
Thank you for your quick answer!
Unfortunately, your code does not work for me.
When I run it, t1 has only 38 lines, not 40; and even when I replace 40 with 38, t4 comes out empty.
Do you have any idea why?
Cheers,
Nadia

Jose Luis Mijangos

Apr 12, 2023, 9:54:30 PM
to dartR
Hi Nadia,

Can you try the code below?

Cheers,
Luis

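library(dartR)
# Capture the printed report as a character vector, one element per line
t1 <- capture.output(gl.report.callrate(platypus.gl, method = "ind"))
# Locate the header line of the per-individual table instead of hard-coding it
srow <- grep(pattern = " ind_name", x = t1)
# `:` binds tighter than `-`, so every index is shifted down by one:
# this selects lines srow - 1 to length(t1) - 1
t2 <- t1[(srow:length(t1)) - 1]

t3 <- as.data.frame(t2)
# Keep only the first and last whitespace-separated fields of each line
t4 <- read.table(text = sub("^(\\S+)\\s+.*\\s+(\\S+)$", "\\1 \\2", t3$t2),
           header = FALSE, stringsAsFactors = FALSE)
# Promote the first row to column names, then drop it
colnames(t4) <- t4[1, ]
t4 <- t4[-1, ]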
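
Because the approach above parses printed output, it depends on the report layout of the installed dartR version. A possible alternative, offered here only as a sketch and not as part of the code posted above, is to compute each individual's call rate directly from the genotype matrix. It assumes that as.matrix() on a genlight object returns individuals in rows (named by individual ID) and loci in columns, with missing calls stored as NA. The helper name ind_callrate and the toy matrix are illustrative.

```r
# Hypothetical helper: per-individual call rate from an individuals x loci
# matrix in which missing genotype calls are NA.
ind_callrate <- function(geno) {
  data.frame(
    id = rownames(geno),
    callrate = rowMeans(!is.na(geno)),  # share of loci with a call
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data standing in for as.matrix(gl): 3 individuals x 4 SNPs
geno <- matrix(
  c( 0,  1, NA, 2,
    NA, NA,  1, 0,
     2,  2,  2, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("indA", "indB", "indC"), paste0("snp", 1:4))
)

cr <- ind_callrate(geno)
print(cr)

# With a real genlight object (assumption: individuals are in rows):
# cr <- ind_callrate(as.matrix(gl))
```

The result is an ordinary data frame with one row per individual, so it can be sorted, merged with sample metadata, or filtered like any other table.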

Nadia Faure

Apr 13, 2023, 4:52:39 AM
to dartR
Hi Luis,
Sorry, the problem was that I still had dartR version 2.0.4 installed, which is why your code did not work.
It now works perfectly with version 2.7.2!
Thanks for your help!
Cheers,
Nadia
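
Once a per-individual call rate table exists, keeping the better replicate of each duplicated sample could look like the sketch below. The column names (id, sample, callrate) and the example values are hypothetical; in particular, the sample column that links the two wells of the same tissue sample has to come from your own metadata.

```r
# Hypothetical input: one row per genotyped well
cr <- data.frame(
  id       = c("s1_a", "s1_b", "s2_a", "s3_a", "s3_b"),
  sample   = c("s1",   "s1",   "s2",   "s3",   "s3"),
  callrate = c(0.91,   0.97,   0.88,   0.95,   0.93),
  stringsAsFactors = FALSE
)

# Sort by sample, then by decreasing call rate, so that the best replicate
# of each sample comes first
cr_sorted <- cr[order(cr$sample, -cr$callrate), ]

# Keep the first (best) row per sample; singletons are kept as they are
keep <- cr_sorted[!duplicated(cr_sorted$sample), ]

# IDs of the replicates to remove from the genlight object
drop_ids <- setdiff(cr$id, keep$id)
print(drop_ids)

# Assumption: the listed individuals can then be removed with dartR, e.g.
# gl2 <- gl.drop.ind(gl, ind.list = drop_ids)
```

Ties in call rate are resolved by row order after sorting, so add a second criterion if that matters for your data.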

Peter Kriesner

Apr 13, 2023, 10:18:53 PM
to da...@googlegroups.com
Hi folks,
Further to Nadia's question below, I also have a few DArT datasets where there are several technical replicates among the genotypes. In theory, given perfect DNA extraction, etc., these genotypes should be identical, but of course there are usually at least minor discrepancies. Is there a convenient way to use the differences that arise between each pair of such technical replicates to produce an estimate of the sequencing error rate (with standard error)?

Thanks,
Peter
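
One simple way to approach this, sketched here under stated assumptions rather than as an established dartR workflow, is to compute for each replicate pair the proportion of loci with discordant calls (among loci called in both replicates), then take the mean across pairs with its standard error. The helper pair_discordance, the pairs table and the toy genotypes are all hypothetical. Note that this measures discordance between two replicates, which reflects errors in either of them, so it is an upper-level proxy for the per-genotype error rate rather than the error rate itself.

```r
# Hypothetical helper: proportion of discordant calls between two replicates,
# computed over loci that were called in both.
pair_discordance <- function(g1, g2) {
  both <- !is.na(g1) & !is.na(g2)
  if (!any(both)) return(NA_real_)
  mean(g1[both] != g2[both])
}

# Toy individuals x loci matrix standing in for as.matrix(gl)
geno <- matrix(
  c(0, 1, 2, 0, NA,
    0, 1, 2, 1,  2,
    2, 2, 0, 0,  1,
    2, 2, 0, 0,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("s1_a", "s1_b", "s2_a", "s2_b"), NULL)
)

# Hypothetical table listing which rows are technical replicates of each other
pairs <- data.frame(
  rep1 = c("s1_a", "s2_a"),
  rep2 = c("s1_b", "s2_b"),
  stringsAsFactors = FALSE
)

# One discordance value per replicate pair
d <- mapply(function(a, b) pair_discordance(geno[a, ], geno[b, ]),
            pairs$rep1, pairs$rep2)
d <- d[!is.na(d)]

est <- mean(d)                  # mean discordance across pairs
se  <- sd(d) / sqrt(length(d))  # standard error across pairs
print(c(estimate = est, se = se))
```

Treating each pair as one observation keeps the standard error honest when pairs differ in DNA quality; pooling all loci across pairs would give a tighter but less conservative interval.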
