CC F2 Cross - gmap required? plots look different? QTL effects are off

176 views
Skip to first unread message

Aravindh Nagarajan

unread,
Mar 4, 2021, 3:15:48 PM3/4/21
to R/qtl2 discussion
Hi Dr. Broman, 

First of all, thanks for the rqtl2 software, manual and this google group. It has helped me a lot. 
I have data from an F2 cross between two CC lines. I have attached all the data and codes in the google drive link. I have few questions? 
1. Is gmap required? I don't have a genetic map, I am using the same physical map for both gmap and pmap in the control file. If I run without gmap, I get a warning initially and then I get the following error 
f2_kinship <- calc_kinship(f2_pr)
- Error in probs[[1]] : subscript out of bounds
f2_out <- scan1(genoprobs = f2_pr, pheno = f2$pheno, cores =1, Xcovar = Xcovar)
- Error in genoprob_Xcol2drop[[chrnam]] : 
  attempt to select less than one element in get1index
When I assign the physical map for both (gmap and pmap), the same code runs without any error. 
2. The plot_scan1 plots look like a column of lines ( I have attached an example) instead of a curve? is that something to do with the number of markers or map positions? 
3. QTL effect plots -  The effect plot doesn't match with the plot_scan1 plot for my peaks. It doesn't look anything similar to the plots in the manual either https://kbroman.org/qtl2/assets/vignettes/user_guide.html#Estimated_QTL_effects
is there a way to fix this? 

Any advice on these questions with be helpful. I have attached all the data and codes. 

Thanks again, 

- Aravindh 




3309a62c-104f-4760-a2e9-05ac09f015e1 (2).png
Broman_question.zip
6ac38bce-a364-4461-91d0-f2c943651906.png

Karl Broman

unread,
Mar 4, 2021, 3:31:54 PM3/4/21
to R/qtl2 discussion
1. Is gmap required? I don't have a genetic map, I am using the same physical map for both gmap and pmap in the control file. If I run without gmap, I get a warning initially and then I get the following error 
f2_kinship <- calc_kinship(f2_pr)
- Error in probs[[1]] : subscript out of bounds
f2_out <- scan1(genoprobs = f2_pr, pheno = f2$pheno, cores =1, Xcovar = Xcovar)
- Error in genoprob_Xcol2drop[[chrnam]] : 
  attempt to select less than one element in get1index
When I assign the physical map for both (gmap and pmap), the same code runs without any error.

calc_genoprob needs a genetic map. The hidden Markov model needs recombination fractions between markers, which it gets by converting from cM distances with a genetic map. Mbp positions would be a better approximation than basepairs; by providing basepairs you end up with the odd LOD curves that look more like a Manhattan plot than a LOD curve.

2. The plot_scan1 plots look like a column of lines ( I have attached an example) instead of a curve? is that something to do with the number of markers or map positions?

Yes this is due to providing basepair positions where we expect cM.

3. QTL effect plots -  The effect plot doesn't match with the plot_scan1 plot for my peaks. It doesn't look anything similar to the plots in the manual either https://kbroman.org/qtl2/assets/vignettes/user_guide.html#Estimated_QTL_effects
is there a way to fix this? 

I think this is another a symptom of the same problem.

karl

Aravindh Nagarajan

unread,
Mar 4, 2021, 4:25:42 PM3/4/21
to rqtl2...@googlegroups.com
Thanks for the quick reply. 
1. I changed the physical map from bp tp Mbp (dividing it by 1000000), it did help the QTL effect plot
3309a62c-104f-4760-a2e9-05ac09f015e1 (2).png
2. Just to clarify - it is wrong to use to the Mbp version of the physical map for both gmap and pmap and get plots? 

In order to convert my physical map to a genetic map
- I tried estmap(), it returns NULL 
-  http://cgd.jax.org/mousemapconverter/  - There are too many options (cox, GF21) is one method preferred over the other for rqtl2? 
Please let me know if there are other ways to convert pmap to gmap. 

Thank you, 

- Aravindh 





--
You received this message because you are subscribed to a topic in the Google Groups "R/qtl2 discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/rqtl2-disc/EtpCloQg288/unsubscribe.
To unsubscribe from this group and all its topics, send an email to rqtl2-disc+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rqtl2-disc/77691693-adf9-40c5-a185-9af4b4cd54bbn%40googlegroups.com.

Karl Broman

unread,
Mar 4, 2021, 5:11:44 PM3/4/21
to R/qtl2 discussion
I don't know about "wrong". Taking Mbp and dividing by 2 would get you approximately cM, assuming that the mouse genome is 3000 Mbp and 1500 cM.
It'd be a bit better to get a map that more closely reflects the recombination rate in your cross. Either the Cox et al map or the G2F1 map is probably fine.

Your data has a lot of monomorphic markers, where individuals are all AA, all AB, or all BB. You'll want to remove those. There are a bunch of markers that aren't quite monomorphic but are close.

karl

Karl Broman

unread,
Mar 5, 2021, 12:39:37 AM3/5/21
to R/qtl2 discussion
Try the following:

library(qtl2)
f2 <- read_cross2(file = "f2.yaml")
f2$pmap <- lapply(f2$pmap, "*", 1e-6) # physical map -> Mbp
f2$gmap <- lapply(f2$pmap, "*", 0.5)  # genetic map = Mbp/2

# remove some markers
g <- do.call("cbind", f2$geno)
no_data <- colnames(g)[colSums(g == 0)==nrow(g)]
monomorphic <- colnames(g)[apply(g, 2, function(a) length(unique(a[a!=0]))==1)]
f2 <- drop_markers(f2, c(no_data, monomorphic))

pr <- calc_genoprob(f2, err=0.01)
out <- scan1(pr, f2$pheno, cores=0)
plot(out, f2$pmap)

qtlpos <- max(out, f2$pmap)
qtlgeno <- maxmarg(pr, f2$pmap, chr=qtlpos$chr, pos=qtlpos$pos,
                   return_char=TRUE)

plot_pxg(qtlgeno, f2$pheno[,1], SEmult=2, sort=FALSE)
                                                                                                                      
karl

Aravindh Nagarajan

unread,
Mar 5, 2021, 11:15:53 AM3/5/21
to rqtl2...@googlegroups.com
Dr. Broman, 

Thank you so much for taking the time to run the code. It works well. I have attached the improved plot. 

I have a couple more general questions? 

1. How important are the pseudo markers? I have always assumed that you need at least one pseuduomarker between the genotyped markers for the qtl to work, 
But I see that you didn't use it and get the same peak. Is there a place I can read about them?
2. None of my phenotypes are normal. normality doesn't get any better if I do, log, sqrt, or box-cox transformations. Is it unethical if I try all the transformations and pick the one with the highest LOD value? Also, what are your thoughts on rankZ transformation (sometimes I get a peak on a different chromosome compared to all the other transformations and untransformed data for the same phenotype)? 
3.  the LOD scores are different when I do the QTL effect plot - 
  #gives a peak around 120 /LOD 4)  
plot(out, lodcolumn = "w", f2$pmap, chr=10)
# the peak shifts a bit to the left and LOD values are less than one
c2eff <- scan1coef(pr[,'10'], f2$pheno[,"w"])
plot_coef(c2eff, f2$pmap, columns = 1:3, scan1_output =  out, main = "Chr 10 QTL effects ",legend = "topright")
Is this something to do with the plot settings?

Thanks again, 

- Aravindh 




You received this message because you are subscribed to the Google Groups "R/qtl2 discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rqtl2-disc+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rqtl2-disc/82fc3f7e-f4c7-435a-8264-3f4c02a73e91n%40googlegroups.com.
d90fdb73-7437-494c-b257-30428cbfdaa5.png

Karl Broman

unread,
Mar 5, 2021, 11:40:11 AM3/5/21
to R/qtl2 discussion
1. How important are the pseudo markers? I have always assumed that you need at least one pseuduomarker between the genotyped markers for the qtl to work, 
But I see that you didn't use it and get the same peak. Is there a place I can read about them?

Pseudomarkers are places where you calculate the LOD scores but don't have a marker. If markers are really dense, there's no real need for them. If there are gaps between markers, it can be useful to insert some.
See chapter 4 of the R/qtl book https://rqtl.org/book

2. None of my phenotypes are normal. normality doesn't get any better if I do, log, sqrt, or box-cox transformations. Is it unethical if I try all the transformations and pick the one with the highest LOD value? Also, what are your thoughts on rankZ transformation (sometimes I get a peak on a different chromosome compared to all the other transformations and untransformed data for the same phenotype)?

It's only unethical if you don't mention what you did and don't take account of the search over possible transformations.
For the particular phenotype you provided, I would not transform.
The rankZ transformation is particularly useful when you have a very large number of phenotypes and can't look at each individually.

3.  the LOD scores are different when I do the QTL effect plot - 
  #gives a peak around 120 /LOD 4)  
plot(out, lodcolumn = "w", f2$pmap, chr=10)
# the peak shifts a bit to the left and LOD values are less than one
c2eff <- scan1coef(pr[,'10'], f2$pheno[,"w"])
plot_coef(c2eff, f2$pmap, columns = 1:3, scan1_output =  out, main = "Chr 10 QTL effects ",legend = "topright")
Is this something to do with the plot settings?

I don't see the shift you mention. Chromosome 10 looks to start at position 13, so plots of the individual chromosome will start there.

karl

Aravindh Nagarajan

unread,
Mar 5, 2021, 12:08:22 PM3/5/21
to rqtl2...@googlegroups.com
Thanks for the reply. 
3. This is weird,  I run the same code ( I have a attached the images )

plot(out, lodcolumn = "w", f2$pmap, chr=10) #w1 and weight1
c2eff <- scan1coef(pr[,'10'], f2$pheno[,"w"])
plot_coef(c2eff, f2$pmap, columns = 1:3, scan1_output =  out, main = "Chr 10 QTL effects ",legend = "topright") #w2 and weight2

The only difference between the w and weight run is that I am using a different phenotype file with 16 more phenotypes and transformations on there than the one I sent, 
 w1 and weight1 look the same, but between w2 (LOD axis is 4) and weight 2 (LOD axis is1), the LOD values change in the lower small box of the LOD plot. 

Does having too many phenotype columns affect the plots? 
In General, will the analysis at any point differ or give a different value if I run the same phenotype alone ( just one column in a phenotype file) compared to running 
the same phenotype but with a file with multiple different columns and phenotypes?


Aravindh Nagarajan \ AH-rah-'vind\

Genetics Ph.D. Candidate | College of Medicine | Linkedin

Advocacy Committee | SGA Diversity Commission | Website

Student Advisor | Desi Aggies | Facebook

Texas A&M University, College Station, Texas. 

Tel: 979.985.8586aravindhna...@gmail.com

"Be the change you wish to see in this world" - Unknown 



weight2.png
w2.png
weight1.png
w1.png

Karl Broman

unread,
Mar 5, 2021, 12:59:33 PM3/5/21
to R/qtl2 discussion
The scan1 output provided to plot_coef should just have a single LOD score column; so you need to pick off the particular one you want plotted. I can't tell, but I expect that's the problem.

From the help file for plot_coef():

scan1_output

If provided, we make a two-panel plot with coefficients on top and LOD scores below. Should have just one LOD score column; if multiple, only the first is used.

karl

Aravindh Nagarajan

unread,
Mar 5, 2021, 2:39:36 PM3/5/21
to rqtl2...@googlegroups.com
That makes sense. Thank you. 




Aravindh Nagarajan \ AH-rah-'vind\

Genetics Ph.D. Candidate | College of Medicine | Linkedin

Advocacy Committee | SGA Diversity Commission | Website

Student Advisor | Desi Aggies | Facebook

Texas A&M University, College Station, Texas. 

Tel: 979.985.8586aravindhna...@gmail.com

"Be the change you wish to see in this world" - Unknown 


Reply all
Reply to author
Forward
0 new messages