Negative LOD scores with scanone


Anji Ballerini

Jun 2, 2020, 11:08:51 PM
to R/qtl discussion
Hello Dr. Broman and the R/qtl community -

I was hoping to get some insight into a negative LOD score produced by the scanone function with a binary trait model fit by Haley-Knott regression (HK).

Specifically, I have one marker with a negative LOD score that is sandwiched between two markers with LOD scores > 30. There might be a recombination event or two between the consecutive markers (markers are 500 kb genomic bins genotyped with low coverage sequencing), but the genotypes aren't that different from one marker to the next. Given how I did the genotyping, if a recombination event happened in a bin, I scored that bin as missing data rather than assigning it a genotype, so maybe that particular marker has more missing data points, but I don't think that it varies much (I realize that I could get counts on the missing data, but I don't think it's likely to be substantially different). Would that likely cause the negative LOD score?

I have room to improve in my statistical knowledge and I tried looking into what the code for the scanone function (using getAnywhere) is to see if I could understand it better and figure out what it's actually doing, but I got lost in all of the if/thens. I don't think that the problem occurs if the data aren't analyzed with a binary model.

In a manual somewhere I saw that LOD is calculated as n/2 * log10(RSS0/RSS1). In my head, I read this as if the null hypothesis of no QTL (RSS0) is more likely than the alternative of a single QTL (RSS1), then the ratio would be greater than 1 and taking the log should generate a positive value and that a negative LOD would be generated when a single QTL is more likely than no QTL. What am I missing as it seems like it should be the other way around?
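On the formula quoted above: RSS0 and RSS1 are residual sums of squares, not likelihoods, so a smaller value means a better fit. In a least-squares fit on the same individuals, the single-QTL model contains the null model, so RSS1 cannot exceed RSS0 and the ratio is at least 1. Note also that this formula belongs to the normal model; for a binary trait the LOD is a log10 likelihood ratio from logistic regression. A minimal Python sketch of the normal-model arithmetic, with made-up phenotypes and hypothetical helper names (this is not R/qtl code):

```python
import math

def rss(values, fitted):
    """Residual sum of squares of values around their fitted values."""
    return sum((v - f) ** 2 for v, f in zip(values, fitted))

def lod_from_rss(n, rss0, rss1):
    """LOD = (n/2) * log10(RSS0 / RSS1), the normal-model formula from the post."""
    return (n / 2) * math.log10(rss0 / rss1)

# Made-up phenotypes for two genotype groups at one marker (illustration only).
pheno_by_genotype = {"AA": [1.0, 2.0, 3.0], "BB": [5.0, 6.0, 7.0]}

all_pheno = [v for group in pheno_by_genotype.values() for v in group]
n = len(all_pheno)

# Null model: every individual is fitted by the grand mean.
grand_mean = sum(all_pheno) / n
rss0 = rss(all_pheno, [grand_mean] * n)

# Single-QTL model: each individual is fitted by its genotype-group mean.
rss1 = sum(
    rss(group, [sum(group) / len(group)] * len(group))
    for group in pheno_by_genotype.values()
)

lod = lod_from_rss(n, rss0, rss1)
print(rss0, rss1, round(lod, 3))
```

Group means always fit at least as well as the grand mean, so in exact arithmetic this LOD cannot be negative. A negative value in practice points to something else, such as a numerical problem in the fit.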

Also, the trait that I'm mapping appears to be a single locus Mendelian trait (based on phenotype frequencies), but when I map it, I get a second peak that appears to be fairly substantial. Looking more closely at the data, I realize that this is likely caused by some sort of DMI - for example, there are no individuals homozygous for one parent at one locus and homozygous for the other parent at the other locus. In order to show that the second locus is most likely due to a genetic incompatibility, I used the genotype at the main locus as a covariate. This eliminated the association between genotype and phenotype at the second locus and also resulted in lots of negative LOD scores at other parts of the genome. Spoiler alert, I already ID'ed the gene that's causing the phenotype (yeah), but am trying to make a nice figure of what happens when I control for the genotype at the main locus. For good measure, I did see what happened when I used the genotype at the second locus as a covariate for the main locus and we're all good.

Any help with understanding what is happening would be greatly appreciated! I'm attaching an image as an example.

Thanks, Anji
pop.covar.bin.pdf

Karl Broman

Jun 2, 2020, 11:32:10 PM
to rqtl...@googlegroups.com
My guess is that this is a convergence problem, perhaps due to perfect genotype/phenotype association at this position. Binary trait analysis requires an iterative algorithm: you start with initial estimates of the QTL effects and get improved estimates in each successive iteration, until the algorithm converges.

Perhaps try messing around with the maxit and tol parameters. You might need to do a custom analysis to get correct results for that one position.
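To illustrate the kind of convergence problem described above, here is a minimal Python sketch of an iterative logistic regression fit (Newton-Raphson) with one 0/1 genotype covariate. It is not R/qtl's implementation. The function name and the data are made up, the `maxit` and `tol` arguments only borrow the names of the scanone parameters, and the stopping rule (largest coefficient change below `tol`) is an assumption.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, maxit=25, tol=1e-6):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x.

    Returns (b0, b1, iterations_used, converged).
    """
    b0 = b1 = 0.0
    for it in range(1, maxit + 1):
        # Accumulate the gradient (g) and the information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            # Information matrix is numerically singular; give up.
            return b0, b1, it, False
        # Newton step: solve the 2x2 system h * d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1, it, True
    return b0, b1, maxit, False

genotype = [0, 0, 0, 0, 1, 1, 1, 1]
noisy    = [0, 0, 0, 1, 0, 1, 1, 1]   # imperfect association
perfect  = [0, 0, 0, 0, 1, 1, 1, 1]   # phenotype equals genotype

fit_noisy = fit_logistic(genotype, noisy)
fit_perfect = fit_logistic(genotype, perfect)
print("noisy:  ", fit_noisy)
print("perfect:", fit_perfect)
```

With imperfect association the estimates settle after a few iterations. With perfect association the slope keeps growing at every step and the fit stops at `maxit` without converging, which is the situation where results at a single position can go wrong.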

Congrats on finding the gene!
karl

Anji Ballerini

Jun 5, 2020, 7:45:27 PM
to R/qtl discussion
Thank you for the quick reply - I'll look at your suggestions!

Jason Johns

Jul 13, 2023, 1:13:28 PM
to R/qtl discussion
Hi Karl and group,

I'm having a related yet different issue, and I'm wondering if you can help guide me. Funny enough, Anji was in the same lab I work in when she asked her question and I just happened to notice this thread as I was searching my issue.

I too have a major QTL (and a minor QTL that interacts with the major) for a binary trait: are stomata made on the upper leaf surface (1) or not (0). In addition to phenotyping 0/1 for this trait, I also counted the number of stomata made on the upper leaf surface, as I would like to know if the same locus that controls presence/absence also controls the number of stomata made, and if there are other loci involved. 

When I mapped the number of stomata on the upper leaf surface using presence/absence as a covariate, the same QTL showed up along with a couple of new ones (and a new interaction). The issue is that the presence/absence covariate term has a negative LOD & PVE ("amphi_covar" term in attached output).  Per your suggestion to Anji, I tried modifying the maxit and tol parameters and neither made any impact on the result. I can't say I quite understand what a convergence issue means, nor would I ask you to explain it to me, but I may have one here as well?

Anyway, it may well be that your advice to Anji applies directly to my case (custom analysis), but I thought I would inquire to see if you had any other ideas. Thank you for your wonderful work and guidance!

Jason
stomata multipleqtl model output.png

Karl Broman

Jul 13, 2023, 2:05:26 PM
to R/qtl discussion
Is it possible to share the data? I'm not entirely sure what might be going wrong.

Do you get the same result if you use Haley-Knott regression rather than imputation?

maxit and tol won't have any influence here, since you're using a normal model and so linear regression (maxit and tol are used just for the binary trait model, in logistic regression).

karl

Karl Broman

Jul 13, 2023, 2:20:18 PM
to R/qtl discussion
Rather than include the 0/1 phenotype as a covariate, I'd be inclined to look at the quantitative outcome among those with at least one stoma.
Similar to this 20-yr-old paper: https://doi.org/10.1093/genetics/163.3.1169
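
The suggestion above amounts to splitting the count phenotype into two separate phenotypes: presence/absence for everyone, and the count itself only for individuals with a nonzero count. A minimal Python sketch of that split, with made-up counts and a hypothetical function name, assuming no counts are missing to begin with (this is not R/qtl code):

```python
def two_part_split(counts):
    """Split a count phenotype with many zeros into two phenotypes.

    Returns (present, positive):
      present  - 1 if the count is above zero, else 0, for every individual
      positive - the count where it is above zero, None (missing) otherwise
    """
    present = [1 if c > 0 else 0 for c in counts]
    positive = [c if c > 0 else None for c in counts]
    return present, positive

# Made-up stomata counts for six individuals (illustration only).
counts = [0, 0, 12, 30, 0, 7]
present, positive = two_part_split(counts)
print(present)
print(positive)
```

The first phenotype would be mapped with a binary model and the second with a normal model restricted to the individuals that have a value, so the 0/1 phenotype never enters as a covariate.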

karl

Jason Johns

Jul 13, 2023, 8:05:30 PM
to R/qtl discussion
Hi Karl,

I would be happy to share the data with you but the problem is fixed. Thank you for your brilliant idea, which I think is especially brilliant because it's how I initially analyzed the data months ago before I overcomplicated it. The output makes much more sense now.

Thanks again!
Jason
