How can Ion Torrent data be used with QIIME?


wei liu

Dec 10, 2012, 8:53:48 PM
to qiime...@googlegroups.com
Dear all my friends,

These days I want to use QIIME to analyze my biological data, which was generated on an Ion Torrent. But the data format is a little different from the 454 data format. What changes should I make so that QIIME can work with my data smoothly?

My sample is the V3 region of 16S rRNA. The PCR primers are like this; the first eight bases are the barcode and the remainder is the primer:

Forward: ATATGCTGCCTACGGGAGGCAGCAG
Reverse: ATATGCTGATTACCGCGGCTGCT

After sequencing on the Ion Torrent, I got data like this, with the barcode and the primer both at the start of each read:

>IXU1Y:12:188_V3-6F
GATATATGCTGCCTACGGGAGGCAGCAGCAGCAGTTTGCTCACAGCTACATGCTAACATTCGTTAGTGTTGAAGCGGAGAGTTACCTGCTGAACACGAACTTTTTAAAGAAGTTAAGAAGCAAAGGCAGCTGAAGTTAGTGA
>IXU1Y:145:622_V3-6R
GATATGCTGATTACCGCGGCTGCTTCCGATCTGTTACTGTAGACGGTGACGGGGTGTCGTGATAATCGACGAAGACCACTCGCCGCTGCTGCCTCCCGTAGGCAGCATAT

So I want to know what changes I should make so that QIIME can work with my data. Thank you very much! I look forward to your reply.

Tony Walters

Dec 10, 2012, 9:07:10 PM
to qiime...@googlegroups.com
Hello Wei Liu,

Do the reads consistently have the bases preceding the barcodes that you listed in your example sequences here? If so, you could create forward and reverse versions of your mapping file, with the forward barcodes preceded by the GAT sequence and the reverse barcodes preceded by G.
So for the forward barcodes:
#SampleID	BarcodeSequence	LinkerPrimerSequence	...
XXX.forward	GATATATGCTG	CCTACGGGAGGCAGCAG

and for the reverse barcodes, a mapping file like so:

#SampleID	BarcodeSequence	LinkerPrimerSequence	...
XXX.reverse	GATATGCTG	ATTACCGCGGCTGCT

Then use split_libraries.py to demultiplex with each mapping file, writing to separate output directories (one for forward reads, one for reverse), which can then be processed independently.
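As an illustration only (not a replacement for split_libraries.py, which also does quality filtering and logging), the demultiplexing step amounts to matching each read's leading bases against barcode+primer and trimming them off. This minimal sketch uses the barcodes and primers from this thread; the XXX.* sample IDs are placeholders from the mapping files above.

```python
# Minimal sketch of the demultiplexing that split_libraries.py performs:
# assign a read to a sample when it begins with that sample's barcode and
# primer, then trim both before downstream analysis.

# Barcodes/primers as given in this thread (XXX.* are placeholder sample IDs).
MAPPING = [
    ("XXX.forward", "GATATATGCTG", "CCTACGGGAGGCAGCAG"),
    ("XXX.reverse", "GATATGCTG", "ATTACCGCGGCTGCT"),
]

def demultiplex(seq, mapping=MAPPING):
    """Return (sample_id, trimmed_seq), or None if no barcode+primer matches."""
    for sample_id, barcode, primer in mapping:
        prefix = barcode + primer
        if seq.startswith(prefix):
            return sample_id, seq[len(prefix):]
    return None

# The two example reads from the original post (truncated):
fwd = "GATATATGCTGCCTACGGGAGGCAGCAGCAGCAGTTTGCTCACAGCTACATGCTAACATT"
rev = "GATATGCTGATTACCGCGGCTGCTTCCGATCTGTTACTGTAGACGGTGACGGGGTGTCGT"

print(demultiplex(fwd))  # assigned to XXX.forward
print(demultiplex(rev))  # assigned to XXX.reverse
```

In practice you would still run split_libraries.py once per mapping file, as suggested above, so that quality filtering and per-sample logging are handled properly.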

Hope this helps,
Tony


wei liu

Dec 10, 2012, 9:17:11 PM
to qiime...@googlegroups.com
Thanks a lot. I will follow your instructions, and I think this will work.

On Tuesday, December 11, 2012 at 10:07:10 AM UTC+8, Tony Walters wrote:

wei liu

Dec 11, 2012, 12:57:51 AM
to qiime...@googlegroups.com
Dear Tony,

I went ahead and ran the analysis with the two mapping files: the forward mapping file and the reverse mapping file. In my opinion the two results should be similar (I do not know whether that inference is right), but the phylogenetic trees from the two results are very different from each other. Can you give me some more explanation? Is it reliable to use the reverse-primer mapping file for the analysis, and can the alignment step produce correct alignment results?

Thank you very much for your patience.

Wei Liu


On Tuesday, December 11, 2012 at 10:07:10 AM UTC+8, Tony Walters wrote:

Tony Walters

Dec 11, 2012, 9:34:20 AM
to qiime...@googlegroups.com
Hello Wei Liu,

There are multiple factors to consider. One is the quality and length of the reads; the other is the V3 hypervariable region itself. The forward and reverse reads each cover only part of the V3 region (I couldn't tell from your post whether your reads are long enough to span the whole region; if they are, you should also make sure you aren't reading through the reverse primer at the ends of your reads). Because of that, they may not be equivalent in their ability to cluster, give reliable taxonomic assignments, or recapitulate the phylogeny of a tree built from the full-length alignment.
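One quick sanity check along these lines: the reverse reads cover the amplicon on the opposite strand, so reverse-complementing a trimmed reverse read puts it in the same orientation as the forward reads before comparing alignments. A minimal sketch of a standard reverse-complement (nothing QIIME-specific):

```python
# Reverse-complement a DNA sequence so that reverse-strand reads can be
# compared with forward reads in the same orientation.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

# Tail of the example reverse read from this thread. Its reverse complement
# begins with the forward barcode+primer (ATATGCTG + CCTACGGGAGG...), which
# suggests that read extends through the whole amplicon -- the read-through
# situation mentioned above.
tail = "CCTCCCGTAGGCAGCATAT"
print(reverse_complement(tail))  # ATATGCTGCCTACGGGAGG
```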

One change I'd recommend, if you haven't done so already, is to use a reference sequence set such as Greengenes when doing OTU picking: add -r <path to Greengenes database> -m uclust_ref -z to the parameters used. This can have a big impact on clustering, which will affect all downstream steps, and it might help make the forward and reverse read results more similar (I wouldn't expect them to be identical, though).

-Tony