genotypes R/qtl output

Boryana Koseva

Apr 15, 2015, 1:45:18 PM
to stacks...@googlegroups.com
Hi Stacks team,

I have a large backcross population that I want to use to construct a genetic map. I am having a hard time importing the genotypes output into R/qtl, and am unsure if I am doing something wrong along the pipeline [ustacks all individuals, cstacks parents, sstacks all individuals, genotypes]. The output file doesn't look like what I know R/qtl likes. I am encountering a couple of problems with it, and I have attached the first 10 lines of the file.

The first problem is the first 4 lines of the file (where basic info about the map is recorded) and line 7 (I don't quite know what the purpose of that one is at all); R/qtl doesn't seem to know what to do with those.

Another problem is that there are missing genotypes. By that I mean that some "cells" are completely empty (not even a '-' character).

Finally, when I run genotypes with the -c option, some of the genotypes are coded with capital letters and some with lowercase letters. I understand the meaning of that, but how is R/qtl supposed to deal with that variation? I keep thinking that I am missing some sort of file processing step, but I can't figure out what it is.

I am happy to provide any additional information that you need.

Thank you!
 - Boryana
sample.rqtl.output.txt

Julian Catchen

Apr 20, 2015, 5:27:30 PM
to stacks...@googlegroups.com, bory...@gmail.com
Hi Boryana,

The R/QTL program can read data in multiple formats:

http://www.rqtl.org/sampledata/

You are likely used to seeing the common MapMaker format, but we use the
CSV format, the first example given on the link above.

As in your example, the first four lines of the output are
comments and should be ignored by R/QTL (they start with the "#" comment
character). The 5th line lists the markers, the 6th gives the chromosome (all
'1' because we don't have this information), and the 7th gives the
position of each marker (just numbered incrementally because we don't have
this information). You could re-export these data aligned to a reference
genome and these fields would look like normal chromosome/bp
coordinates. One individual per line after that.
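
The layout described above can be sketched as a small Python reader. This is only an illustration: the function name read_stacks_rqtl and the sample text are made up for the example (only the "# Map Type: BC1" comment line appears in this thread), and it assumes that comment lines start with "#" and that the marker, chromosome, and position rows follow in that order.

```python
import csv
import io


def read_stacks_rqtl(text):
    """Split an R/qtl-style CSV export into its header rows and data rows."""
    # Drop blank lines and the leading comment lines (those starting with '#').
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = list(csv.reader(io.StringIO("\n".join(lines))))
    # Assumed order: marker names, chromosome, position, then genotype rows.
    markers, chroms, positions = rows[0], rows[1], rows[2]
    data = rows[3:]
    return markers, chroms, positions, data


# Invented example for illustration; this is not actual Stacks output.
example = """# Exported: hypothetical
# Map Type: BC1
# Num Loci: 2
# Num Samples: 2
id,101,102
,1,1
,1,2
1,h,b
2,b,-
"""

markers, chroms, positions, data = read_stacks_rqtl(example)
print(markers[1:], chroms[1:], positions[1:], len(data))
```

Checking the row counts and the first few cells this way is a quick test of whether a given file really follows the described layout before handing it to R/QTL.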

The OneMap software also uses MapMaker format, so if you prefer that more
familiar format, you can export for OneMap and then load into R/QTL as
MapMaker.

The uppercase genotypes indicate corrections; this is just for your
information. No post-processing is required.
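
If you would still rather collapse the two cases into one before loading the file, a minimal Python sketch might look like the following. The helper name normalize_case and the sample rows are hypothetical, the sketch assumes the first column holds a sample ID rather than a genotype, and lowercasing discards the record of which calls were corrected.

```python
def normalize_case(genotype_rows, n_id_columns=1):
    """Lowercase genotype codes, leaving the leading ID column(s) untouched."""
    return [
        row[:n_id_columns] + [cell.lower() for cell in row[n_id_columns:]]
        for row in genotype_rows
    ]


# Hypothetical rows: sample ID followed by genotype codes.
rows = [["1", "h", "B", "-"], ["2", "H", "b", ""]]
print(normalize_case(rows))
```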

Best,

julian

Boryana Koseva

Apr 20, 2015, 5:52:40 PM
to stacks...@googlegroups.com
Hi Julian,

Thanks for taking the time to respond. I am used to a different version of the CSV file (more specifically, this - http://www.rqtl.org/tutorials/mapthis.csv). In retrospect, I should have done a better job looking up the file types R/qtl likes. The biggest problem at this point is that R/qtl fails to import the file due to the missing genotypes that I mentioned. Any thoughts on that?

Thank you.
 - Boryana

Julian Catchen

Apr 21, 2015, 3:36:22 PM
to stacks...@googlegroups.com, bory...@gmail.com
Hi Boryana,

What is the exact error from R/QTL? Is the problem a lack of data, or is
R/QTL complaining about a formatting error regarding missing genotypes?

julian

Boryana Koseva

Apr 23, 2015, 3:10:02 PM
to stacks...@googlegroups.com, bory...@gmail.com, jcat...@illinois.edu
Hi Julian,

* There was a typo in my previous post that could make the whole thing more confusing, so I am reposting with a correction.

Here is what I am doing:

$ head batch_0.genotypes_50.rqtl.tsv > test.genotypes.output.csv

In R:
> mothData <- read.cross(file="test.genotypes.output.csv","csv",estimate.map=FALSE,genotypes=c('h','b'),na.string=c('-'))
Error in read.cross.csv(dir, file, na.strings, genotypes, estimate.map,  :
  You must include at least one phenotype (e.g., an index). There was this value in the first column of the second row '# Map Type: BC1' where was supposed to be nothing.

Then I remove the first 4 lines and rerun the command:

> mothData <- read.cross(file="test.genotypes.output.csv","csv",estimate.map=FALSE,genotypes=c('h','b'),na.string=c('-'))
Error in read.cross.csv(dir, file, na.strings, genotypes, estimate.map,  :
  There are missing marker positions.
   In particular, we see these value(s): "" at position(s): 18572,18573,18574,18575,18576,18577,18578,18579,18580,18581,18582,18583,18584,18585,18586,18587,18588,18589,18590,18591,18592,18593,18594,18595,18596,18597,18598,18599,18600,18601,18602,18603,18604,18605,18606,18607,18608,18609,18610,18611,18612,18613,18614,18615,18616,18617,18618,18619,18620,18621,18622,18623,18624,18625,18626,18627,18628,18629,18630,18631,18632,18633,18634,18635,18636,18637,18638,18639,18640,18641,18642,18643,18644,18645,18646,18647,18648,18649,18650,18651,18652,18653,18654,18655,18656,18657,18658,18659,18660,18661,18662,18663,18664,18665,18666,18667,18668,18669,18670,18671,18672,18673,18674,18675,18676,18677,18678,18679,18680,18681,18682,18683,18684,18685,18686,18687,18688,18689,18690,18691,18692,18693,18694,18695,18696,18697,18698,18699,18700,18701,18702,18703,18704,18705,18706,18707,18708,18709,18710,18711,18712,18713,18714,18715,18716,18717,18718,18719,18720,

If you open the file I have attached, you can see that some "cells" are just empty (no '-', 'h', or 'b').

Thanks again.
 - Boryana
test.genotypes.output.csv
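
To tell whether the empty cells sit in the position row or in the genotype rows, they can be listed by row and column. The Python sketch below is one hedged way to do that; find_empty_cells and the sample text are invented for illustration, and it assumes column 1 is the sample/phenotype column, which is expected to be blank in the chromosome and position rows.

```python
import csv


def find_empty_cells(text):
    """Return 1-based (row, column) pairs for empty cells outside column 1.

    Comment lines starting with '#' are skipped but still counted, so the
    row numbers match the line numbers of the original file.
    """
    empty = []
    for row_no, line in enumerate(text.splitlines(), start=1):
        if not line or line.startswith("#"):
            continue
        cells = next(csv.reader([line]))
        empty.extend(
            (row_no, col_no)
            for col_no, cell in enumerate(cells, start=1)
            if col_no > 1 and cell.strip() == ""
        )
    return empty


# Invented example: one empty position cell and one empty genotype cell.
sample = "# Map Type: BC1\nid,101,102,103\n,1,1,1\n,1,2,\n1,h,,b\n"
print(find_empty_cells(sample))
```

Empty cells reported in the position row would match the "missing marker positions" error, while empty cells in later rows would be missing genotypes.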

Matthew Gibson

Apr 1, 2017, 2:42:20 PM
to Stacks, bory...@gmail.com, jcat...@illinois.edu
Hello,
I realize this was over two years ago, but did you ever solve your problem? I am having the same issue.

Best,
MG

Darwin Pinzón

Dec 29, 2017, 3:49:38 AM
to Stacks
I had this issue and found that you need to grep -v "#" the output file. Julian mentions that it should be ignored by R since it is a commented line, but that wasn't the behavior that I observed.
I should say that there are other issues with the output for both rqtl and onemap, as both programs refuse these files for various reasons.
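
For anyone who prefers to do the comment-stripping step in a script, here is a minimal Python sketch. The function name strip_comment_lines is made up for the example. Note one difference: grep -v "#" drops every line that contains "#" anywhere, while this drops only lines that begin with it; the two agree as long as "#" appears only in the header comments, which is an assumption.

```python
def strip_comment_lines(lines):
    """Keep only the lines that do not begin with '#'."""
    return [ln for ln in lines if not ln.lstrip().startswith("#")]


# Hypothetical input lines; only the '# Map Type: BC1' line comes from this thread.
lines = ["# Map Type: BC1", "id,101,102", ",1,1", "1,h,b"]
print(strip_comment_lines(lines))
```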

For onemap, there is the problem of no sample names. I tried a very hacky way of inserting sample names assuming the order is the same as my input, but that failed. Thoughts?

For rqtl, I get a stack limit error and the files are way larger than they should be. 72 individuals genotyped for 10 markers has an output with hundreds of columns. Thoughts?

Boryana Koseva

Feb 7, 2018, 12:07:45 PM
to Stacks
I ended up having to output a onemap file and then wrote a Python script to convert it to something R/qtl likes.

 - Boryana
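
The conversion script itself is not shown in this thread, so the following Python sketch is not that script. It only illustrates the general idea of turning marker-per-line data into an individual-per-line R/qtl CSV. The function to_rqtl_csv, the "id" column name, and the sample data are all hypothetical; parsing the exported file is left out; and the sketch assumes the sample names are supplied in the same order as the genotypes. Every marker is placed on chromosome '1' as a placeholder.

```python
import csv
import io


def to_rqtl_csv(marker_rows, sample_names, missing="-"):
    """Build R/qtl 'csv' text from marker-major genotype data.

    marker_rows: list of (marker_name, [genotype per sample]) tuples.
    sample_names: sample IDs in the same order as the genotype lists.
    """
    for name, genos in marker_rows:
        if len(genos) != len(sample_names):
            raise ValueError(
                f"marker {name}: expected {len(sample_names)} genotypes, "
                f"got {len(genos)}"
            )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Row 1: phenotype column name followed by the marker names.
    writer.writerow(["id"] + [name for name, _ in marker_rows])
    # Row 2: blank under the phenotype column, placeholder chromosome '1'.
    writer.writerow([""] + ["1"] * len(marker_rows))
    # One row per individual; empty genotypes become the missing code.
    for i, sample in enumerate(sample_names):
        writer.writerow([sample] + [genos[i] or missing for _, genos in marker_rows])
    return out.getvalue()


# Hypothetical data: two markers scored in three samples.
marker_rows = [("101", ["h", "b", ""]), ("102", ["b", "b", "h"])]
result = to_rqtl_csv(marker_rows, ["s1", "s2", "s3"])
print(result)
```

Writing an explicit missing code for every empty genotype avoids the completely empty cells described earlier in the thread.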

Matthew Gibson

Feb 7, 2018, 4:03:47 PM
to stacks...@googlegroups.com

I did the same as Boryana. Export to onemap and manually convert to R/QTL with a Python script.

Best,
Matt

Thayer

Feb 16, 2018, 12:44:45 AM
to Stacks
Hi Boryana and Matt,

I'm having all the same glitches as everyone else with outputting rqtl format. Is sample order in the onemap file just alphabetical? If either of your Python scripts is shareable on GitHub, etc., I would love not to have to reinvent it :)

Thanks,

Rachel

Matthew Gibson

Feb 16, 2018, 10:12:05 AM
to stacks...@googlegroups.com
So it turns out I actually exported to JoinMap format and converted that to R/qtl. My script is here: https://github.com/gibsonMatt/jaltomataQTL/blob/master/mapping_3/filter_jp_file.py

Just look at the script's help output to see how to use it. It can also do some filtering and output back to JoinMap format if you like.

python filter_jp_file.py -v -i joinmap -ot rqtl -f batch_1.genotypes_20.loc -t 0 -o rqtlTest.csv
