MMnGM Genotype Cleaning

krij...@ucdavis.edu

Jul 10, 2022, 2:30:51 PM

to R/qtl2 discussion

Dr. Broman and community,

I am combining several DO studies to refine a locus of interest. Two of the studies were genotyped on the MegaMUGA array, while the others were genotyped on the GigaMUGA array. I can successfully create a cross object using the MMnGM files provided (thank you!). I am now considering how to approach cleaning the genotypes given the different arrays. In the past, I have only worked with one experiment at a time, with all mice genotyped on the same array. My specific questions are:

- Should I be concerned about genotype cleaning when the data come from multiple types of arrays?
- I can create cross objects for the individual experiments and clean them. Is it then possible to combine the cross objects into a master object?
- As another strategy, I could get a list of problematic markers for each individual study, combine the lists, and remove those markers from the master MMnGM multi-study cross object.

Thank you for your input.

Sincerely,

Kristen

Karl Broman

Jul 10, 2022, 10:24:18 PM

to R/qtl2 discussion

I'd be inclined to clean them separately and then combine. The main thing that can be confusing is the percent missing data. I find this to be a useful diagnostic for sample quality, but it is misleading for samples that were typed on the MegaMUGA when you're looking at the combined MM and GM markers, since most of those markers were never attempted for those samples.
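To make the percent-missing point concrete, here is a minimal base R sketch on a made-up genotype matrix. The object names (geno, attempted, sample_array) and the marker-to-array assignments are hypothetical and not part of the MMnGM files; the idea is only to compute missingness over the markers that were actually on each sample's array.

```r
# Toy genotype matrix: rows are samples, columns are markers, NA = no call.
# All names and values are invented for illustration.
geno <- matrix(c("A", "B", NA,  NA,
                 "A", NA,  NA,  NA,
                 "A", "B", "H", "A",
                 "B", NA,  "H", "A"),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("mm1", "mm2", "gm1", "gm2"),
                               c("m1", "m2", "m3", "m4")))

# Assumption: m1 and m2 are on both arrays; m3 and m4 are GigaMUGA-only
attempted <- list(MM = c("m1", "m2"),
                  GM = c("m1", "m2", "m3", "m4"))

# Which array each sample was typed on
sample_array <- c(mm1 = "MM", mm2 = "MM", gm1 = "GM", gm2 = "GM")

# Percent missing over the combined marker set (inflated for MM samples)
pct_missing_all <- 100 * rowMeans(is.na(geno))

# Percent missing over only the markers attempted on each sample's array
pct_missing_attempted <- vapply(rownames(geno), function(id) {
  mk <- attempted[[sample_array[[id]]]]
  100 * mean(is.na(geno[id, mk]))
}, numeric(1))

print(rbind(all = pct_missing_all, attempted = pct_missing_attempted))
```

In this toy data the two MegaMUGA samples appear 50% and 75% missing over the combined markers, but 0% and 50% missing over the markers that were actually attempted, so only the second would stand out as a possible problem sample.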

I personally don't put much effort into trying to identify problem markers, and focus primarily on problem samples, because it seems like for MegaMUGA and GigaMUGA data, we can just smooth over any genotyping errors when calculating genotype probabilities, by using, say, error_prob=0.002 in calc_genoprob().
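As a rough illustration of the error_prob setting, the sketch below runs calc_genoprob() on the small grav2 example data that ships with qtl2. That dataset is a RIL cross rather than DO data; it is used here only as a stand-in so the call can be tried without the MMnGM files.

```r
library(qtl2)

# Example RIL data bundled with the qtl2 package (a stand-in, not DO data)
grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))

# Allow for a small genotyping error rate so that isolated miscalls are
# smoothed over when the genotype probabilities are calculated
pr <- calc_genoprob(grav2, error_prob = 0.002)
```

For a real DO cross object, the same error_prob argument would be passed in the same way, typically along with a map that includes pseudomarkers.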

There's a function c.cross2() for combining cross2 objects, for the case of different sets of samples with the exact same set of markers.

karl

Kristen James

Jul 11, 2022, 9:37:56 PM

to rqtl2...@googlegroups.com

Yes, it makes sense that the different markers between arrays would make the cleaning more difficult compared to working with a consistent array type. This is very helpful information.

Thank you for the timely response and insight.
