QTL analysis in MAGIC population

Giovanni Gabelli

Jun 12, 2023, 10:16:45 AM
to R/qtl2 discussion
Hi, I have a MAGIC population of 480 samples derived from 8 parental lines, and I would like to use qtl2 for a QTL analysis. I have the physical positions and genotypes of about 35,000 SNP markers, but I don't have a genetic map. Does qtl2 have the tools to build one? If not, can you suggest a suitable tool?
Moreover, I have some doubts about building the input. My data contain some heterozygosity: must those positions be deleted beforehand, or will the program simply ignore them? I built the genotype codes in the control file like so:

genotypes:
  AA: 1
  BB: 2
  CC: 3
  DD: 4
  EE: 5
  AB: 6
  AC: 7
  AD: 8
  AE: 9
  BC: 10
  BD: 11
  BE: 12
  CD: 13

The function read_cross2() didn't raise any warnings, so I wonder whether the program interpreted the heterozygous positions correctly.

Thank you for the assistance
Giovanni Gabelli

Karl Broman

Jun 12, 2023, 12:27:20 PM
to R/qtl2 discussion
I think you can use the cross type “riself8” for these lines. You need genotypes of the individuals plus the genotypes of the 8 founder lines. The genotypes would be coded something like A/H/B or AA/AB/BB, and as you thought the heterozygotes will end up being ignored and treated as missing values. I’m not sure what the AA/BB/CC/DD/EE/AB/etc codes you have in mind are. Inference of the founder origin of the genotype is done in calc_genoprob() or sim_geno() rather than in advance in building the data files.

Regarding the genetic map, I would start by interpolating from your physical map (for example, by using some estimate of the cM:Mbp recombination rate). You can then use est_map() to estimate cM distances from your data.
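As a rough illustration of the interpolation step, here is a minimal Python sketch that scales base-pair positions on one chromosome to approximate cM positions. The function name `physical_to_genetic`, the marker names, and the rate of 2.0 cM/Mbp are all hypothetical placeholders; a constant genome-wide rate is an assumption, and the real rate should come from an estimate for your species.

```python
def physical_to_genetic(pos_bp, cm_per_mbp):
    """Scale physical positions (bp) on one chromosome to approximate cM.

    pos_bp: dict of marker name -> position in base pairs.
    cm_per_mbp: assumed constant recombination rate (cM per Mbp).
    The first marker is placed at 0 cM; markers are returned in map order.
    """
    if not pos_bp:
        return {}
    start = min(pos_bp.values())
    ordered = sorted(pos_bp.items(), key=lambda kv: kv[1])
    return {marker: (bp - start) / 1e6 * cm_per_mbp for marker, bp in ordered}


# Hypothetical markers on a single chromosome, and a placeholder rate.
chr1_bp = {"snp_b": 3_500_000, "snp_a": 1_200_000, "snp_c": 9_800_000}
for marker, cm in physical_to_genetic(chr1_bp, cm_per_mbp=2.0).items():
    print(marker, round(cm, 2))
```

The resulting positions would be written out as the genetic map file and serve only as a starting point for est_map().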

karl

Giovanni Gabelli

Jun 13, 2023, 5:06:46 AM
to R/qtl2 discussion
Thank you for your kind and quick answer.
Many of my loci have more than two alleles, so I assigned "A" to the reference allele, "B" to the first alternative allele appearing in the parental lines, "C" to the second one, and so on. Since they are progressively rarer, I don't have all of their combinations, and the 13 classes are the ones that are present at some point.
If I assign an NA value to all the heterozygous positions and keep only the 5 homozygous classes, can qtl2 handle multi-allelic markers?

If I understand correctly, your suggestion is, rather than building a genetic map with R/qtl, to proceed with the analysis without supplying a genetic map and then estimate the cM distances with est_map(), right?

Giovanni Gabelli

Jun 13, 2023, 6:04:34 AM
to R/qtl2 discussion
Secondly, in order to use calc_genoprob() or sim_geno(), the program asks me for a genetic map. Is this the one that you suggest building by interpolating the physical map, or is there another way around it?

Thanks!

Karl Broman

Jun 13, 2023, 5:30:55 PM
to R/qtl2 discussion
R/qtl2 can only handle markers with two alleles. You could try turning your multi-allelic markers into multiple biallelic markers at the same location.
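One possible way to do the split, sketched in Python under stated assumptions: each non-reference allele becomes its own biallelic marker at the same position, coded "B" where a sample carries that allele and "A" where it carries any other allele. The function name `split_multiallelic`, the `_B`/`_C` naming scheme, and the "-" missing code are hypothetical, and heterozygous calls are assumed to have been set to missing beforehand. The same recoding would need to be applied to the founder genotypes and the progeny genotypes alike.

```python
def split_multiallelic(marker, calls, ref="A", missing="-"):
    """Recode one multi-allelic marker as several biallelic markers.

    calls: dict of sample name -> single-letter homozygous allele call,
           or the missing code.
    Returns dict of new marker name -> dict of sample -> "A", "B", or missing.
    A marker with k observed alleles yields k - 1 biallelic markers;
    a monomorphic marker yields none.
    """
    alts = sorted({c for c in calls.values() if c not in (ref, missing)})
    recoded = {}
    for alt in alts:
        recoded[f"{marker}_{alt}"] = {
            sample: missing if call == missing else ("B" if call == alt else "A")
            for sample, call in calls.items()
        }
    return recoded


# Hypothetical three-allele SNP typed in three founders and two progeny lines.
calls = {"founder1": "A", "founder2": "B", "founder3": "C",
         "line1": "C", "line2": "-"}
for name, genotypes in split_multiallelic("snp1", calls).items():
    print(name, genotypes)
```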

R/qtl2 needs an initial genetic map, even for est_map(), but you could use a scaled version of your physical map as the starting point.

karl

Dan Gatti

Dec 6, 2023, 4:33:08 PM
to rqtl2...@googlegroups.com

I’m working on an experiment in BXD-F1 strains with multiple covariates: sex, diet, and transgene genotype. A strain carrying a dominant mutation is mated to a BXD strain, and the F1 progeny are phenotyped. Each strain is repeated in 6 lines of the genoprobs (M/F, chow/high-fat diet, non-transgenic/mutant). I’m also mapping the transgene as an interactive covariate.

I’m running a scan with additive covariates: y ~ sex + diet + transgene + genotype

And a scan with additive and interactive covariates: y ~ sex + diet + transgene + genotype + transgene*genotype.

Then I’m taking the difference between the interactive and additive genome scans to look for QTL relating to the transgene.

I’m not sure how to run the permutations to get a threshold for the (interactive – additive) scan. I don’t think it’s correct to shuffle rows in the phenotypes, because the samples aren’t all exchangeable. Instead, I think I need to permute the strain labels, run the additive and interactive scans, take the difference, and get the maximum LOD. My question is: do I just permute the phenotypes and covariates, or should I permute the kinship values as well? Thanks for any thoughts on this.

Karl Broman

Dec 6, 2023, 4:43:53 PM
to R/qtl2 discussion
Great question!

I would permute the rows of the genotype data across the BXD strains and then percolate down to the F1 progeny.

I think this is equivalent to shuffling the kinship matrix along with the phenotypes and covariates. The way scan1perm() works now, when you are fitting an LMM with a kinship matrix, is to preserve the phenotype/covariate/kinship relationships, so that you'll end up with an identical estimate of the residual heritability.
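To make the strain-level shuffle concrete, here is a minimal Python sketch of one permutation replicate. It is not how scan1perm() is implemented; the function name `permute_strain_genotypes` and the sample and strain names are hypothetical. The idea shown is only that genotypes are reassigned between BXD strains as whole units, so every F1 sample from the same strain receives the same permuted genotype row while its phenotype and covariates stay in place.

```python
import random


def permute_strain_genotypes(sample_strain, seed=None):
    """Shuffle genotype assignments across strains, then percolate to samples.

    sample_strain: dict of sample id -> BXD strain of its parent.
    Returns dict of sample id -> strain whose genotype row that sample
    should use in this permutation replicate.
    """
    rng = random.Random(seed)
    strains = sorted(set(sample_strain.values()))
    shuffled = strains[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(strains, shuffled))  # one-to-one strain reassignment
    return {sample: relabel[strain] for sample, strain in sample_strain.items()}


# Hypothetical design: several F1 samples per BXD strain.
sample_strain = {"s1": "BXD1", "s2": "BXD1", "s3": "BXD2",
                 "s4": "BXD5", "s5": "BXD5"}
print(permute_strain_genotypes(sample_strain, seed=1))
```

For each replicate, the additive and interactive scans would be rerun with the reassigned genotype rows, and the maximum of the LOD difference recorded to build the null distribution.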

karl