out_snps <- scan1(snpprobs, pheno = obj$pheno[, measure_name, drop = FALSE],
                  kinship_loco, Xcovar = Xcovar, addcovar = addcovar,
                  intcovar = intcovar, cores = 4)
rint.RetroFat
EE3B9ACF78 1.4351520
EE3B9ACF79 -0.8117843
EE3B9ACF7C -1.1212322
EE3B9ACF7E 0.7566598
EE3B9ACF7F -1.9093349
EE3B9ACF86 -0.1584156
> head(Xcovar)
sex
EE3B9ACF78 1
EE3B9ACF79 0
EE3B9ACF7C 0
EE3B9ACF7E 1
EE3B9ACF7F 0
EE3B9ACF86 1
> head(intcovar)
(Intercept) SexM DietLOW.FAT
EE3B9ACF78 1 1 0
EE3B9ACF79 1 0 0
EE3B9ACF7C 1 0 1
EE3B9ACF7E 1 1 0
EE3B9ACF7F 1 0 1
EE3B9ACF86 1 1 1
> head(addcovar)
(Intercept) Litter.Size PrimaryGMG PrimaryOYKS PrimaryTVB
EE3B9ACF78 1 7 0 1 0
EE3B9ACF79 1 9 0 1 0
EE3B9ACF7C 1 9 0 1 0
EE3B9ACF7E 1 9 0 1 0
EE3B9ACF7F 1 9 0 1 0
EE3B9ACF86 1 13 0 1 0
I would like to ask:
1. For interpretation: am I correct that I cannot conclude from this that these QTLs interact with Diet (and Sex) in affecting the trait?
Would a simple ANOVA, trait ~ QTL * Diet * Sex, be good enough to test for an interaction between the QTL and Diet/Sex?
2. For the significance threshold, can I just use the following? (apr = allele probabilities)
operm <- scan1perm(apr, pheno = obj$pheno[, measure_name, drop = FALSE],
                   kinship_loco, addcovar = addcovar, intcovar = intcovar,
                   Xcovar = Xcovar, n_perm = 1000, cores = 4)
Thank you!
Thu
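[Editor's note: a minimal sketch of the per-locus ANOVA interaction test asked about in question 1. All data here (`geno`, `Diet`, `Sex`, `trait`) are simulated stand-ins, not from this thread; note that such an ANOVA on a single imputed genotype ignores genotype uncertainty and kinship, which the scan1() interactive-vs-additive LOD comparison accounts for.]

```r
# Simulated stand-in data for one locus (hypothetical, for illustration only)
set.seed(1)
n <- 200
geno  <- factor(sample(c("AA", "AB", "BB"), n, replace = TRUE))
Diet  <- factor(sample(c("HIGH.FAT", "LOW.FAT"), n, replace = TRUE))
Sex   <- factor(sample(c("F", "M"), n, replace = TRUE))
trait <- rnorm(n) + 0.5 * (geno == "BB") * (Diet == "LOW.FAT")

# Full-factorial model; the geno:Diet and geno:Sex rows of the ANOVA
# table give F-tests for the QTL-by-Diet and QTL-by-Sex interactions
fit <- lm(trait ~ geno * Diet * Sex)
anova(fit)
```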
> head(out_snps)
rint.fat.dif
chr1_290509 0.6144129
chr1_842375 0.2042367
chr1_1113500 0.6651967
chr1_1421425 0.2553162
chr1_1780364 1.7111909
chr1_1891852 0.7033188
> head(out_snps_intD)
rint.fat.dif
chr1_290509 1.2329448
chr1_842375 0.9172207
chr1_1113500 1.3402388
chr1_1421425 0.9308256
chr1_1780364 2.6829722
chr1_1891852 1.0829739
> head(out_com)
add intD dif
chr1_290509 0.6144129 1.2329448 0.6185319
chr1_842375 0.2042367 0.9172207 0.7129840
chr1_1113500 0.6651967 1.3402388 0.6750421
chr1_1421425 0.2553162 0.9308256 0.6755094
chr1_1780364 1.7111909 2.6829722 0.9717813
chr1_1891852 0.7033188 1.0829739 0.3796552
> class(out_snps)
[1] "scan1" "matrix"
> class(out_com)
[1] "data.frame"
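[Editor's note: the dif column above is just the LOD difference between the interactive and additive scans. A minimal base-R sketch, with the first two rows copied in as stand-ins for the full scan1 objects:]

```r
# Toy stand-ins for the first rows of out_snps (additive scan) and
# out_snps_intD (scan with intcovar including Diet)
add  <- c(chr1_290509 = 0.6144129, chr1_842375 = 0.2042367)
intD <- c(chr1_290509 = 1.2329448, chr1_842375 = 0.9172207)

# LOD difference: evidence for the QTL-by-Diet interaction at each SNP
out_com <- data.frame(add = add, intD = intD, dif = intD - add)
out_com
```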
Many thanks,
Thu
Karl answered your question about scan1_blup().
eQTL mapping is largely the same as mapping a few phenotypes, with at least two differences: compute time and multiple testing. You can’t easily run 1000 permutations on each of 20,000 genes. So we transform the data to make all of the genes have the same distribution, split the mapping up on a computing cluster, and use a p-value adjustment to set the significance threshold.
We transform all of the expression values by ranking them and then taking quantiles from the normal distribution. We call this “RankZ” transforming the data.
rankZ <- function(x) {
  # convert to ranks (keeping NAs), scale into (0, 1), then map
  # through the standard-normal quantile function
  x <- rank(x, na.last = "keep", ties.method = "average") / (sum(!is.na(x)) + 1)
  qnorm(x)
}
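[Editor's note: a quick illustration of the transform; the expr matrix here is invented (genes in columns), not from the thread.]

```r
# rankZ as defined above
rankZ <- function(x) qnorm(rank(x, na.last = "keep", ties.method = "average") /
                             (sum(!is.na(x)) + 1))

# Made-up expression matrix: 6 samples x 3 genes on very different scales
set.seed(1)
expr <- cbind(g1 = rexp(6), g2 = rnorm(6, 100, 20), g3 = runif(6))

# Apply column-wise: every gene now has the same (normal) marginal distribution
expr_rz <- apply(expr, 2, rankZ)
```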
This makes the distribution of each phenotype the same and allows you to run permutations on one gene to get thresholds for all of the genes. I usually break the job up into batches of 100 genes and kick off each batch on the cluster as a separate job, then harvest QTL above the significance threshold. We then get a p-value for each LOD score from the permutation LOD scores and compute an FDR for each gene; there is a package called “qvalue” that I’ve used for this. Or sometimes we’ll select a LOD threshold and plot QTL position vs. gene position. This adjustment is a bit tricky: if gene A has a QTL at a given position and gene B has a QTL in the exact same location, you might relax the significance threshold there. But you can make plots at different FDR values.
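[Editor's note: the permutation-to-p-value step might look roughly like this. The LOD scores are simulated stand-ins, and base R's p.adjust(..., method = "BH") is shown in place of the qvalue package Dan mentions.]

```r
# Stand-ins: 1000 permutation maximum LOD scores, and peak LODs for 3 genes
set.seed(2)
perm_lods <- rexp(1000, rate = 2)
obs_lods  <- c(g1 = 6.2, g2 = 0.9, g3 = 4.8)

# Empirical p-value: fraction of permutation LODs >= the observed LOD
# (the +1 terms keep p strictly above 0)
pvals <- sapply(obs_lods,
                function(lod) (sum(perm_lods >= lod) + 1) / (length(perm_lods) + 1))

# FDR adjustment across genes; qvalue::qvalue(pvals) would be used instead
fdr <- p.adjust(pvals, method = "BH")
```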
That’s a very rough overview with lots of details left out.
Dan
--
You received this message because you are subscribed to the Google Groups "R/qtl2 discussion" group.
To view this discussion on the web visit
https://groups.google.com/d/msgid/rqtl2-disc/1a9059fb-c639-40c6-b62e-41968fa238bdn%40googlegroups.com.