Simulating genotype skew


Tiago Ribeiro

Oct 27, 2022, 4:13:48 PM
to R/qtl discussion
Hi,

I'm working with a data set that has a pretty heavy genotype skew. Instead of the 1:1 ratio expected for RILs, the data are closer to 3:1 (the number of heterozygotes is negligible). I'd like to simulate QTL with this genotype structure to calculate the power I would have to detect QTL of different effect sizes.

When I use sim.cross, the simulated genotype distribution is the expected 1:1. I considered using "x$geno = original$geno" to make the genotypes identical to the original. In that case, I think the only thing sim.cross would be doing is simulating the phenotypes. Is this an appropriate way to tackle this issue? Is there another way to generate a 3:1 distribution? Maybe simulating an F2 and somehow converting the heterozygotes to one of the parental genotypes?

Best,

Karl Broman

Oct 28, 2022, 11:34:46 AM
to R/qtl discussion
In trying to simulate RILs with a 3:1 skew, the main question is how to model the skew. Is it that there's selection at a single locus, or some set of loci? You could accommodate that by simulating many more RILs than you need and then subsampling based on genotypes at those sites.
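
The oversample-and-subsample idea can be sketched as follows. This is a minimal illustration, assuming selection at a single locus; the map, the choice of marker 6 on chromosome 1 as the selected site, the retention probability of 1/3, and all object names are hypothetical and not taken from the data discussed here.

```r
library(qtl)
set.seed(1)

# Hypothetical map: 5 chromosomes of 100 cM, 11 equally spaced markers each
map <- sim.map(len = rep(100, 5), n.mar = 11, include.x = FALSE,
               eq.spacing = TRUE)

# Simulate many more selfed RILs than are needed
big <- sim.cross(map, n.ind = 2000, type = "riself")

# Assumed selection model: at one locus, genotype 1 is always retained and
# genotype 2 is retained with probability 1/3, giving roughly 3:1 at that site
g_sel <- pull.geno(big, chr = "1")[, 6]
keep_prob <- ifelse(g_sel == 1, 1, 1/3)
kept <- which(runif(length(g_sel)) < keep_prob)

# Subsample down to the target number of lines
n_target <- 200
skewed <- subset(big, ind = sample(kept, n_target))

# Genotype counts at the selected locus
table(pull.geno(skewed, chr = "1")[, 6])
```

Under this model the skew is strongest at the selected locus and weakens with genetic distance, so unlinked chromosomes stay near 1:1. A genome-wide skew would need a different model.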

Alternatively, were the RILs derived by first backcrossing the F1 to one of the parents and then doing selfing?

Simulating data with sim.cross but then substituting your observed genotypes with x$geno <- original$geno is not going to be effective, since those original genotypes will be independent of the simulated phenotypes.

You could instead take your observed genotype data as fixed, pick one marker to be treated as a QTL, and simulate new phenotype data.
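
A minimal base-R sketch of that idea, assuming a normally distributed phenotype and a purely additive difference between the two homozygotes. The genotype vector below is a simulated stand-in; the function name, the effect size, and the residual SD are illustrative assumptions.

```r
set.seed(2)

# Stand-in for the observed genotypes (coded 1/2) at the chosen marker.
# With real data this vector would be taken from the cross object, e.g. with
# pull.geno(), rather than simulated.
n <- 300
g <- sample(1:2, n, replace = TRUE, prob = c(0.75, 0.25))

# Hypothetical helper: genotype 1 has mean mu1, genotype 2 has mean mu1 + effect
simulate_pheno <- function(g, mu1 = 0.5, effect = 0.3, sigma = 1) {
  mu <- ifelse(g == 2, mu1 + effect, mu1)
  rnorm(length(g), mean = mu, sd = sigma)
}

y <- simulate_pheno(g)

# Observed genotype means in this one simulated data set
tapply(y, g, mean)
```

Because the genotypes are held fixed, the observed skew and linkage structure carry through to every simulation replicate automatically.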

karl

Tiago Ribeiro

Oct 28, 2022, 6:26:07 PM
to rqtl...@googlegroups.com
It is an intercross (F12). I suspect that one of the two parental inbred strains might have produced more offspring in the first generations, leading to a skew toward the genotype of that parent.

I am trying to pick one marker to treat as a QTL and run a power analysis. My approach was to estimate the expected average phenotype for each genotype and then sample from a random distribution to populate the phenotype vector according to each individual's genotype at the marker where the QTL is placed. I calculated the averages from the formula in the book: fixing the mean phenotype of one homozygote at 0.5, I solved the formula for the mean phenotype of the other homozygote at different effect sizes. I am also simulating phenotypes with effect = 0 (overall mean = 0.5) to obtain a null distribution.
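
The book formula is not quoted in this thread, so the following is only one possible reading: it assumes effect size means the proportion of phenotypic variance explained (h2), two homozygous classes with frequencies p and 1 - p, and residual SD sigma. Under those assumptions the QTL variance is p(1 - p) * delta^2, which can be solved for the difference delta between the homozygote means. The function name is hypothetical.

```r
# Difference between the two homozygote means needed for a QTL to explain a
# proportion h2 of the phenotypic variance, given genotype frequency p:
#   h2 = p(1-p) delta^2 / (p(1-p) delta^2 + sigma^2)
delta_for_h2 <- function(h2, p, sigma = 1) {
  sigma * sqrt(h2 / ((1 - h2) * p * (1 - p)))
}

h2 <- c(0.02, 0.05, 0.10)
d_balanced <- delta_for_h2(h2, p = 0.50)  # 1:1 segregation
d_skewed   <- delta_for_h2(h2, p = 0.75)  # 3:1 segregation

# Mean of the second homozygote when the first is fixed at 0.5
data.frame(h2,
           mean2_balanced = 0.5 + d_balanced,
           mean2_skewed   = 0.5 + d_skewed)
```

For the same variance explained, a 3:1 skew requires a larger difference between the means than a 1:1 ratio, because p(1 - p) drops from 0.25 to 0.1875.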

The approach is to use makeqtl() at the position of interest with the new phenotypes in place and then fitqtl() to obtain a LOD score. Then I will compare LOD scores from the two sets of simulations (no effect and ## effect). I'm still working on the code, but does this seem like a better approach?
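
A rough sketch of this plan, under several assumptions: the cross below is a simulated (unskewed) stand-in for the real data, the QTL marker is chosen arbitrarily, phenotypes are normal, Haley-Knott regression is used in fitqtl(), and the effect size of 0.5 and the 200 replicates are placeholders. The helper name one_lod is hypothetical.

```r
library(qtl)
set.seed(3)

# Stand-in cross; with real data, the observed cross object would be used
map <- sim.map(len = rep(100, 2), n.mar = 11, include.x = FALSE,
               eq.spacing = TRUE)
cross <- sim.cross(map, n.ind = 200, type = "riself")
cross <- calc.genoprob(cross, step = 0)

# Marker treated as the QTL (arbitrary choice for illustration)
marker <- markernames(cross, chr = "1")[6]
mpos <- find.markerpos(cross, marker)
g <- pull.geno(cross)[, marker]
qtl <- makeqtl(cross, chr = as.character(mpos$chr), pos = mpos$pos,
               what = "prob")

# LOD score at the fixed position for one simulated phenotype vector
one_lod <- function(effect, mu1 = 0.5, sigma = 1) {
  cross$pheno$y <- rnorm(length(g), mean = mu1 + effect * (g == 2), sd = sigma)
  fit <- fitqtl(cross, pheno.col = "y", qtl = qtl, method = "hk",
                dropone = FALSE, get.ests = FALSE)
  fit$result.full[1, "LOD"]
}

n_sim <- 200
lod_null <- replicate(n_sim, one_lod(effect = 0))
lod_alt  <- replicate(n_sim, one_lod(effect = 0.5))

# Power: fraction of non-null LOD scores above the 95th percentile of the null
threshold <- quantile(lod_null, 0.95)
power <- mean(lod_alt > threshold)
power
```

Since the LOD score is evaluated at one known position, this threshold is pointwise. Power for a genome scan would call for a genome-wide threshold, which is higher.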

--
Tiago da Silva Ribeiro
Ph.D. Candidate
Pool Lab
Department of Integrative Biology, UW-Madison
Laboratory of Genetics, UW-Madison