workflow


Chris

Nov 19, 2012, 11:08:33 AM
to qiime...@googlegroups.com
I recently read the "Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms" manuscript out of your lab. I have used this approach and have just received the MiSeq data back. However, I have not been able to find the workflow used in the manuscript on the QIIME website. Could you point me to it?

I ask only because it must be quite different from the current Illumina data processing pipeline (http://qiime.org/tutorials/processing_illumina_data.html), since no barcode file was generated and because it is a paired-end run.

Thanks for your help, it is greatly appreciated!

Greg Caporaso

Nov 19, 2012, 1:05:09 PM
to Qiime Forum
Hi Chris,
Do you know where the barcodes are? That tutorial describes the workflow that we use for processing this data, although more recently we're using open reference OTU picking - see this additional document.

Also, we don't have very extensive support for processing paired-end read data in QIIME. You can use rtax (via assign_taxonomy.py) for performing taxonomy assignment, and these posts discuss some strategies for assembling the reads for use with QIIME: "using PandaSeq data with QIIME" and "using ea-utils with QIIME".

Greg




Tony Walters

Nov 19, 2012, 1:15:50 PM
to qiime...@googlegroups.com
Hello Chris,

The best visual explanation for the three reads (index, read 1, and read 2) is here: http://www.pnas.org/content/108/suppl.1/4516/F1.large.jpg
The bottom of the figure shows the index read (the reverse complement of one of the two reads, whichever one was barcoded).

I believe that by default, the sequencing software is set to read two reads (read 1 and 2). I've attached an example runparameters.txt file from the MiSeq at our facility (*don't* use this one, it's just an example for comparison with the settings at your sequencing facility). In it, there are lines like this:
<Reads>
  <RunInfoRead NumCycles="151" Number="1" IsIndexedRead="N" />
  <RunInfoRead NumCycles="13" Number="2" IsIndexedRead="Y" />
  <RunInfoRead NumCycles="151" Number="3" IsIndexedRead="N" />
</Reads>
You may be missing the Number 2 (index) read above.
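
If it helps to check this programmatically rather than by eye, here is a minimal Python sketch that lists the reads declared in such a fragment and reports whether any is an index read. It assumes the element and attribute names shown above (RunInfoRead, NumCycles, Number, IsIndexedRead) and that the file uses no XML namespace; the function names are made up for illustration, and the file from your facility may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above; a real run parameters file
# contains many more elements around it.
EXAMPLE = """
<Reads>
  <RunInfoRead NumCycles="151" Number="1" IsIndexedRead="N" />
  <RunInfoRead NumCycles="13" Number="2" IsIndexedRead="Y" />
  <RunInfoRead NumCycles="151" Number="3" IsIndexedRead="N" />
</Reads>
"""


def summarize_reads(xml_text):
    """Return a sorted list of (read number, cycles, is_index_read) tuples."""
    root = ET.fromstring(xml_text)
    reads = []
    # iter() walks the whole tree, so this works on the fragment
    # or on a larger document that contains it.
    for read in root.iter("RunInfoRead"):
        reads.append((
            int(read.get("Number")),
            int(read.get("NumCycles")),
            read.get("IsIndexedRead") == "Y",
        ))
    return sorted(reads)


def has_index_read(xml_text):
    """True if at least one declared read is flagged as an index read."""
    return any(is_index for _, _, is_index in summarize_reads(xml_text))


if __name__ == "__main__":
    for number, cycles, is_index in summarize_reads(EXAMPLE):
        kind = "index" if is_index else "sequence"
        print(f"read {number}: {cycles} cycles ({kind})")
    print("index read present:", has_index_read(EXAMPLE))
```

If has_index_read comes back False for your run's file, the index read was probably never sequenced, which is a different problem from it simply not being written out as a fastq.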

There is an alternative possibility that the index read was done, but rather than being processed into a third read, it was added to one or both of the reads, or perhaps written to the fastq labels of your read(s).

You might look at the first few of your reads by typing:
head Path_To_Fastq_file
and see if there are barcodes written to the labels. If not, you might search for barcodes in the fastq files by doing the following:
grep -c AATTCCGG fastq_file
replacing AATTCCGG with some of your barcodes. It is also worth testing the reverse complement of each barcode, as the reads can come out in that orientation.
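
The same check can be sketched in Python so that the barcode and its reverse complement are tested together, and label hits are counted separately from sequence hits. This is only a rough sketch: it assumes plain four-line FASTQ records, the function names are hypothetical, and the records in the demo are invented for illustration rather than real MiSeq output.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T/N)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def count_barcode_hits(fastq_lines, barcode):
    """Count records whose label or sequence contains the barcode
    or its reverse complement.

    Assumes every record is exactly four lines:
    label, sequence, '+', quality.
    """
    forward = barcode.upper()
    targets = (forward, reverse_complement(forward))
    counts = {"label": 0, "sequence": 0}
    for i, line in enumerate(fastq_lines):
        text = line.strip().upper()
        position = i % 4  # 0 = label line, 1 = sequence line
        if position == 0 and any(t in text for t in targets):
            counts["label"] += 1
        elif position == 1 and any(t in text for t in targets):
            counts["sequence"] += 1
    return counts


if __name__ == "__main__":
    # Invented records, for illustration only.
    demo = [
        "@read1 1:N:0:AATTCCGG", "ACGTACGTACGT", "+", "IIIIIIIIIIII",
        "@read2 1:N:0:0", "TTCCGGAATTAA", "+", "IIIIIIIIIIII",
        "@read3 1:N:0:0", "ACGTACGTACGT", "+", "IIIIIIIIIIII",
    ]
    print(reverse_complement("AATTCCGG"))
    print(count_barcode_hits(demo, "AATTCCGG"))
```

For a real file, an open file handle can be passed as fastq_lines, since it is iterated line by line. Many label hits suggest the barcodes were written to the fastq labels; many sequence hits suggest they were left attached to the reads.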

Hope this helps,
Tony Walters

Attachment: runparameters.txt

Chris

Nov 19, 2012, 2:14:07 PM
to qiime...@googlegroups.com
Hi Greg,

Thank you for the reply. I read in the QIIME tutorial that there is not extensive support for paired-end runs, but the manuscript from your lab (referenced above) seems to say to run the MiSeq as paired end. If I am reading this correctly, I am curious which workflow the Knight lab used, since it is not well documented in the QIIME tutorials. Did you use just one read, or did you merge the reads at some point? If you merged them, how did you do it, and at what point in the process?

Thanks again Greg,
Chris

Sarah Owens

Nov 20, 2012, 11:29:45 AM
to qiime...@googlegroups.com
Hi Chris,

If you aren't seeing an index read file, that's because the MiSeq doesn't generate one automatically. A change needs to be made to the MiSeqReporter.exe.config file, which can be found in C:\Illumina\MiSeq Reporter.

The following needs to be included in the top portion of the file (the <appSettings> section):

<add key="CreateFastqForIndexReads" value="1" />

You will then want to restart the service. This can be done by right-clicking the taskbar in Windows, selecting "Start Task Manager", selecting the "Services" tab, finding MiSeq Reporter in the list, and then stopping and starting the service.

Then, you'll want the sequencing facility to requeue the run in MiSeq Reporter to generate the third read. Make sure they aren't putting all of the barcodes in the sample sheet, because this will cause the MiSeq to attempt to demultiplex your samples (and you want to do this in QIIME).
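
Before requeuing, the facility may want to confirm that the setting described above actually made it into the config file. Here is a minimal Python sketch of that check. It assumes the usual .NET layout, with <appSettings> sitting directly under a <configuration> root; the function name and the sample config text are made up for illustration, and the real MiSeqReporter.exe.config will contain many other entries.

```python
import xml.etree.ElementTree as ET

KEY = "CreateFastqForIndexReads"

# Hypothetical, stripped-down config used only to exercise the check.
SAMPLE_CONFIG = """
<configuration>
  <appSettings>
    <add key="CreateFastqForIndexReads" value="1" />
  </appSettings>
</configuration>
"""


def index_fastq_enabled(config_text):
    """True if <appSettings> contains the key with value "1"."""
    root = ET.fromstring(config_text)
    # Assumes <appSettings> is a direct child of the root element.
    for entry in root.findall("./appSettings/add"):
        if entry.get("key") == KEY:
            return entry.get("value") == "1"
    return False


if __name__ == "__main__":
    print("index fastq enabled:", index_fastq_enabled(SAMPLE_CONFIG))
```

To check the real file, read its contents and pass the text to index_fastq_enabled. A False result means the key is missing, sits outside <appSettings>, or is set to something other than "1".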

Thanks.
Sarah

Chris

Nov 20, 2012, 12:42:14 PM
to qiime...@googlegroups.com
Thank you Sarah and Tony! I've passed this along to our sequencing core. It was very helpful.

Chris