RE: Using QIIME with the Ion Torrent Metagenomics 16s Kit

Anna I

May 29, 2015, 4:56:57 PM
to qiime...@googlegroups.com

Hello Qiimers!

I have been exploring the Ion Torrent Metagenomics 16S Kit, and would like to use QIIME to further analyze the data from this kit.

(This kit sequences 16S rRNA data in 2 batches: the first batch looks at variable regions V2, V4, V8, while the second batch looks at V3, V6-7, V9.)


So far I have been working with sample data (a metagenomics mock community provided by the Ion Community), which is in the form of a BAM file. Running the BAM file through the IonReporter workflow ‘Metagenomics 16S beta’ checks the data against 2 reference databases (MicroSeq ID and GreenGenes), lists the species found in each variable region, and outputs 3 txt files (with taxonomy, % ID, and % total and valid reads) and 3 FASTA files (with the taxonomy and rRNA seqs associated with each classification).

I would like to use the data from the kit to perform further QIIME analyses (making charts, such as through core_diversity_analyses.py).

My question is: which files would I need to input into QIIME scripts? Would I take the FASTA files output from the IonReporter workflow? I don’t think this is the answer, because they don’t resemble any FASTA I’ve used with QIIME before or in tutorials. Is there a way to split the BAM file generated by the Ion Torrent machine into FASTA file(s) and run the analysis on those? Also, how would I create a mapping file for this situation (I do not know what primers are used in the kit)?

Sorry for the long post; hopefully some of you more experienced QIIME users can point me in the right direction :)

Anna

Jai Ram Rideout

Jun 1, 2015, 2:22:59 PM
to qiime...@googlegroups.com
Hi Anna,

I don't have any experience with Ion Torrent data, so I'm pretty limited in my ability to help you. There are other users on the forum who have imported this type of data into QIIME; maybe they'll weigh in. I also recommend searching the forum for "ion torrent" to see what others are doing with this kind of data.

In general, we recommend obtaining the most "raw" form of data available to you so that you can use QIIME's standardized demultiplexing/quality control, OTU picking, and taxonomy assignment workflows.

If you can get your data into the format expected by QIIME at a particular step in the pipeline, things should work. For example, if you want to start using QIIME at the OTU picking stage (e.g., pick_open_reference_otus.py), you'll need to format your demultiplexed sequence data in a FASTA file that follows these specifications. If you instead want to start at the diversity analyses stage (e.g., core_diversity_analyses.py), you'll need an OTU table in the BIOM file format (optionally including taxonomy assignment metadata). See here for more details about the BIOM file format and how QIIME uses it.
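
As a rough illustration of the demultiplexed FASTA layout mentioned above, here is a minimal Python sketch. It assumes the usual QIIME 1 convention that each header starts with SampleID_N (sample ID, underscore, running sequence number) and that sample IDs contain only letters, digits and periods. The function names, the sample ID "Mock.1" and the reads are made up for the example.

```python
import re

def qiime_label(sample_id, seq_number, original_id):
    """Build a header of the form '>SampleID_N original_id'."""
    # Assumption: sample IDs may contain only letters, digits and periods,
    # so the underscore unambiguously separates ID and sequence number.
    if not re.fullmatch(r"[A-Za-z0-9.]+", sample_id):
        raise ValueError(f"invalid sample ID for QIIME: {sample_id!r}")
    return f">{sample_id}_{seq_number} {original_id}"

def to_qiime_fasta(sample_id, records):
    """Format (read_id, sequence) pairs as one FASTA string."""
    lines = []
    for n, (read_id, seq) in enumerate(records, start=1):
        lines.append(qiime_label(sample_id, n, read_id))
        lines.append(seq.upper())
    return "\n".join(lines) + "\n"

# Hypothetical reads, e.g. taken from a FASTQ or BAM export.
reads = [("readA", "acgtacgt"), ("readB", "TTGACA")]
fasta_text = to_qiime_fasta("Mock.1", reads)
print(fasta_text, end="")
```

With several samples, each sample's reads would get its own sample ID prefix and the results would be concatenated into a single file.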

Hope this helps,
Jai

Anna I

Jun 3, 2015, 4:13:06 PM
to qiime...@googlegroups.com
Hi Jai,

Thank you for your reply!

So far I've been working with the sample data from IonCommunity, which only contains one sample. The workflow I've been using is:
- make and validate a mapping file
- add qiime labels
- closed reference otu picking (against the GreenGenes database)
- alpha rarefaction
- summarize taxa by plot
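
For reference, the steps above might map onto QIIME 1 scripts roughly as in the following Python sketch, which only builds and prints the command lines (a dry run, nothing is executed). The script names and flags are written from memory of the QIIME 1 documentation and should be checked with each script's -h output. All file and directory names are hypothetical, and reading "summarize taxa by plot" as summarize_taxa_through_plots.py is an assumption.

```python
import shlex

# Hypothetical file names; replace with your own paths.
MAP = "map.txt"
FASTA_DIR = "ionreporter_fasta/"
REF_SEQS = "gg_97_otus.fasta"
REF_TAX = "gg_97_otu_taxonomy.txt"
REF_TREE = "gg_97_otus.tree"

def build_commands():
    """Return the five workflow steps as argument lists (nothing is run)."""
    otu_table = "closed_ref/otu_table.biom"
    return [
        ["validate_mapping_file.py", "-m", MAP, "-o", "map_check/"],
        # -c names the mapping-file column that lists each FASTA file name.
        ["add_qiime_labels.py", "-i", FASTA_DIR, "-m", MAP,
         "-c", "InputFileName", "-o", "labeled/"],
        ["pick_closed_reference_otus.py", "-i", "labeled/combined_seqs.fna",
         "-r", REF_SEQS, "-t", REF_TAX, "-o", "closed_ref/"],
        ["alpha_rarefaction.py", "-i", otu_table, "-m", MAP,
         "-t", REF_TREE, "-o", "alpha/"],
        ["summarize_taxa_through_plots.py", "-i", otu_table, "-m", MAP,
         "-o", "taxa_plots/"],
    ]

for cmd in build_commands():
    print(" ".join(shlex.quote(arg) for arg in cmd))
```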

This seems to work; however, I'm getting drastically different results for the identified species than the output from the IonReporter workflow. I assume this is because they blast against 2 databases: the MicroSeq ID 16S rRNA reference database AND the GreenGenes database. Do you know if I can use QIIME to also blast against the MicroSeq ID 16S rRNA database to try to reproduce the results? I can't find where I would download the database from.

Although I can assign taxonomy with QIIME to the sample, unfortunately I still can't compare the results by variable region because the primers are already trimmed in the fasta files I'm using from IonReporter. The next step will be testing this workflow (and core_diversity_analyses.py) once I have some real data (aka multiple samples) to work with in the coming weeks!

Anna

Colin Brislawn

Jun 3, 2015, 5:06:42 PM
to qiime...@googlegroups.com
Hello Anna,

You can totally use QIIME with the MicroSeq ID 16S rRNA database, if you can get a copy of this database. I did a quick Google search and did not find the 'MicroSeq ID 16S rRNA database' in a downloadable form. Do you have a copy of it in a FASTA file now? Do you know where I could download one?


<personal soapbox>
Ion Torrent has been doing some sketchy, closed-source, proprietary stuff with their primers. I understand their desire to protect their novel methods and keep competitors from stealing their magic primers. However, I do not respect or trust methods which are hidden from me... If they are doing something similar with their database, I would avoid it and use a database which is publicly available, verifiable, and trusted.
</soapbox over>


Which regions do the primers amplify? Is this a mix of primers or a single pair?

Thanks for helping me learn more about Ion Torrent!
Colin

Anna I

Jun 4, 2015, 5:02:51 PM
to qiime...@googlegroups.com
Hi Colin,

Unfortunately no, I have not been able to find the MicroSeq database for download. I think it has to be purchased from Life Technologies.

The Ion Metagenomics Kit compares 7 (well, technically 6, since V6 and V7 are clumped together) different regions in the 16S gene (V2, V3, V4, V6-7, V8, V9), and proprietary primers are used for each variable region, so there is a set of 6 primers used. I'd like to be able to use QIIME to compare the data from each primer as well as between samples, but since the primers for each region are not disclosed, I'm not sure yet how we will compare the results from different primers. Output from the IonReporter Metagenomics workflow lists the reads and taxa found for each primer, so key information (species found, % reads) can be taken from there; I'm just not sure how to reproduce or further analyze these results using QIIME.

I'll post again when we're further into the analyses and have any new questions/information :)

Best,
Anna

Colin Brislawn

Jun 4, 2015, 5:43:22 PM
to qiime...@googlegroups.com
Hello Anna,

I've heard of this kit before, though I've never seen the data. A user of the QIIME forums called JenB has worked with this... take a look at this thread: https://groups.google.com/forum/#!searchin/qiime-forum/jenb/qiime-forum/ZZm6E7_uCN0/i5vvenBQ67oJ

What is the rawest form of data you can get from the Ion Torrent machine? Fastq files? If you can get those, you can demultiplex and quality filter them like normal. Then you can enter OTU picking... but I'm not sure how this will work with reads from multiple regions. I guess reads from different regions of the same species will cluster into different OTUs, but those OTUs should be assigned the same taxonomy and you can pool OTUs with the same taxonomy using summarize_taxa.py.
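
To make the pooling idea concrete, here is a minimal Python sketch of summing counts for OTUs that share a taxonomy string. It is not summarize_taxa.py itself, only an illustration of the same idea. The OTU IDs, counts and the shortened three-rank lineages are invented for the example.

```python
from collections import defaultdict

def pool_by_taxonomy(otu_counts, otu_taxonomy, level):
    """Sum read counts of OTUs that share a lineage down to `level` ranks."""
    pooled = defaultdict(int)
    for otu_id, count in otu_counts.items():
        ranks = [r.strip() for r in otu_taxonomy[otu_id].split(";")]
        pooled[";".join(ranks[:level])] += count
    return dict(pooled)

# Hypothetical OTUs: two come from different regions of the same genus.
counts = {"otu_V2": 120, "otu_V4": 80, "otu_V3": 50}
taxonomy = {
    "otu_V2": "k__Bacteria; p__Firmicutes; g__Staphylococcus",
    "otu_V4": "k__Bacteria; p__Firmicutes; g__Staphylococcus",
    "otu_V3": "k__Bacteria; p__Proteobacteria; g__Escherichia",
}

pooled = pool_by_taxonomy(counts, taxonomy, level=3)
for lineage, total in sorted(pooled.items()):
    print(lineage, total)
```

Here the V2 and V4 OTUs end up in one Staphylococcus row, which is the behavior described above for reads from different regions of the same organism.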

Try pushing your Ion Torrent data through the pipeline as if it were 'normal' data from a single region. It may work out fine...


Good luck!
Colin

JenB

Jun 12, 2015, 8:21:23 AM
to qiime...@googlegroups.com
Hello Anna,
I saw your post and we have been working with data from this kit for a little under a year now.

Have you made any progress with your data?  

We did not want to use the Ion Reporter software, so we have been working on a method to analyze the data using QIIME and UPARSE.

Jen

Anna I

Jun 17, 2015, 4:25:31 PM
to qiime...@googlegroups.com
Hi Jen,

We will be working with this kit in the coming weeks, so we haven't been able to test it yet. For now, the plan is to trim primers and discard short reads using the Ion Reporter software, then pick OTUs, assign taxonomy, and perform diversity analysis with QIIME.
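
For anyone who wants to do the short-read filtering outside Ion Reporter, a minimal Python sketch of that one step might look like the following. The length cutoff and the example reads are hypothetical, and primer trimming is not shown because the primer sequences are not disclosed.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them per record.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def drop_short_reads(records, min_length):
    """Keep only records whose sequence has at least min_length bases."""
    return [(h, s) for h, s in records if len(s) >= min_length]

MIN_LENGTH = 8  # hypothetical; a real cutoff would be much longer
example = ">r1\nACGTACGTAC\n>r2\nACG\n>r3\nTTGACATT\nGG\n"
kept = drop_short_reads(parse_fasta(example), MIN_LENGTH)
print([header for header, _ in kept])
```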

You mention you're not using the Ion Reporter software. What workflow are you using with QIIME and UPARSE? I wonder, have you found a workaround for removing the proprietary primers that (to the best of my understanding) are in the raw data output from the Ion Torrent?

Looking forward to hearing back on this,
Anna