Genotype calling thresholds from DArT allele count data


Jenny Evans

Sep 13, 2022, 7:01:45 PM
to dartR
Hi dartR team, 

A question related to Gabriella's recent thread on allele balance filtering - does dartR have any functions to handle raw allele count data from DArT, rather than the genotype call data? I'm hoping to apply some read depth & allele balance thresholds for each genotype call (rather than filtering on average read-depth per SNP). I've currently DIYed some scripts to do this, but wanted to check whether there is current functionality to do this within dartR that I've missed, or if there will be in future? 

Much appreciated,
Jenny

Berry, Olly (NCMI, IOMRC Crawley)

Sep 13, 2022, 7:54:37 PM
to da...@googlegroups.com
Hi Jenny,

I think if you search the forum you may find some commentary on this topic from others more expert than me. Presently dartR doesn’t have functionality to deal with read depth per sample, locus, allele etc. I know it has been discussed by the team, but don't know if it's been decided on as a project and if it has, whether a timeline has been set.
Cheers,
Olly

CSIRO Environomics Future Science Platform

From: 'Jenny Evans' via dartR <da...@googlegroups.com>
Sent: Wednesday, September 14, 2022 7:02 am
To: dartR
Subject: [dartR] Genotype calling thresholds from DArT allele count data
 
--
You received this message because you are subscribed to the Google Groups "dartR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dartr+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dartr/e31a6dc0-b40f-4dfb-9bd7-e86763a8cea9n%40googlegroups.com.

Arthur Georges

Sep 13, 2022, 8:37:45 PM
to da...@googlegroups.com
Hi Jenny,

At the moment dartR does not have the capability to work with the count data. You will need to turn to radiator (an R package for RADseq Data Exploration, Manipulation and Visualization) for this.

We have a policy of not writing scripts for dartR that delve into DArT processes below what is provided to users by DArT in their routine reports. Users can request the count data specifically, but this is not routine. I understand that DArT is considering providing the count data in their routine reporting (via OneDArT), so this might change. It would certainly be good for some applications, such as detecting and filtering sex-linked markers.

If you have DIY'ed some scripts for this, perhaps consider contributing them to dartR :)

Sorry we cannot be of more help at this time.

A


Jenny Evans

Sep 14, 2022, 8:20:42 PM
to dartR
Hi Arthur and Olly, 

Thank you both for the responses! I'll have a closer look at radiator. I suspect it has a better process for handling count data than my DIY version, but I'll get in touch if my scripts could be useful for dartR :)

Much appreciated, 
Jenny

Carlo Pacioni

Sep 14, 2022, 8:50:14 PM
to da...@googlegroups.com
Hi Jenny,
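something that may get you closer to what you want (if I'm not misreading this) is filtering on the AvgCountRef and AvgCountSnp locus metrics provided by DArT, like so:
# keep loci whose average reference-allele read count is between 5 and 100
glRef <- gl.filter.locmetric(gl, lower=5, upper=100, metric="AvgCountRef")
# then apply the same bounds to the average alternate (SNP) allele read count
glAlt <- gl.filter.locmetric(glRef, lower=5, upper=100, metric="AvgCountSnp")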

This is clearly the average of the count for each allele across all samples, so it doesn't guarantee that you won't be including samples that have a low read count, but it could be better than nothing.

cheers,
carlo
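
Editor's sketch: to make the contrast between the locus-average filter above and the per-genotype thresholds asked about at the top of the thread concrete, here is a minimal base-R illustration. It is not a dartR or radiator function. The toy matrices `ref_counts` and `alt_counts`, the helper `call_genotypes`, its `min_depth` and `min_balance` parameters, and the 0/1/2 coding are all hypothetical choices made for this example; the layout of a real DArT counts report may differ.

```r
# Hypothetical toy data: rows are samples, columns are SNP loci.
# Each cell is the read count for one allele in one sample at one locus.
ref_counts <- matrix(c(12,  0, 3,
                        8,  9, 1,
                        0, 15, 2),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(paste0("ind", 1:3), paste0("snp", 1:3)))
alt_counts <- matrix(c( 0, 14, 1,
                        7,  1, 0,
                       11,  0, 2),
                     nrow = 3, byrow = TRUE,
                     dimnames = dimnames(ref_counts))

# Call one genotype per sample and locus from the two count matrices.
# Coding (an assumption for this sketch): 0 = homozygous reference,
# 1 = heterozygous, 2 = homozygous alternate, NA = no confident call.
call_genotypes <- function(ref, alt, min_depth = 5, min_balance = 0.2) {
  depth <- ref + alt
  # Fraction of reads carrying the less frequent allele (NaN where depth is 0)
  balance <- pmin(ref, alt) / depth
  geno <- matrix(NA_integer_, nrow(ref), ncol(ref), dimnames = dimnames(ref))
  enough <- depth >= min_depth
  geno[enough & alt == 0] <- 0L
  geno[enough & ref == 0] <- 2L
  # Both alleles observed: call a heterozygote only if the reads are balanced
  geno[enough & ref > 0 & alt > 0 & balance >= min_balance] <- 1L
  geno
}

geno <- call_genotypes(ref_counts, alt_counts)
print(geno)
```

With these toy counts, every genotype at snp3 is set to missing because no sample reaches five reads there, and ind2 at snp2 is set to missing because only 1 of its 10 reads carries the alternate allele. The averages for snp2 across samples (8 reference reads, 5 alternate reads) would still pass a `lower=5` filter on the locus metrics, which is the limitation of the average-based approach noted above.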



