Combine output from pick_open_reference_otus.py and Ion Reporter in PCoA


Patrick S.

Jul 17, 2018, 8:08:00 AM
to Qiime 1 Forum
Dear all,

I am trying to compare 8 samples that were sequenced on both the Illumina MiSeq and the Ion Torrent PGM platforms. The Illumina data were processed with the QIIME script pick_open_reference_otus.py (QIIME 1.9.1); the Ion Torrent data were processed with the Ion Reporter web platform for taxonomic annotation.

Now I would like to compare the data after taxonomic annotation using a Principal Coordinates Analysis (PCoA).
In general, is it possible to integrate Ion Torrent's data into the QIIME results so they can be further analysed together?

Ion Torrent produces FASTA files for every sample at the family and genus/species level, but I am completely unsure which of the many result files in the script's output folder they should be integrated into for PCoA.
Another point is that different taxonomic databases were used (GreenGenes for QIIME, "Curated MicroSEQ(R)" and "Curated Greengenes" for Ion Reporter), which complicates things further.

So to sum up, is it possible at all to create a PCoA plot from these data, and which approach would you recommend?

I already tried a lot of QIIME scripts, but without success. Any advice is really appreciated.

Best regards

Patrick

Colin Brislawn

Jul 17, 2018, 4:05:31 PM
to Qiime 1 Forum
Good morning Patrick,

If you are just getting started with QIIME, you might have better luck using QIIME 2. The developer support is great, and folks are using it with Ion Torrent data with great success.

> In general, is it possible to integrate Ion Torrent's data into the QIIME results so they can be further analyzed together?
Yes! Given that the same samples were sequenced twice, you can simply repeat your Illumina analysis after you have added your new Ion Torrent samples.

So you might demultiplex the Ion Torrent FASTA files using the add_qiime_labels.py script.
Then concatenate your MiSeq and Ion Torrent reads into a single seqs.fna file:
cat miseq_reads.fasta ion_torrent_reads_from_add_qiime_labels.fasta > seqs.fna

Then you reprocess your seqs.fna through OTU picking and taxonomy assignment, just like you did when using only MiSeq data. 
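
To make the labelling and concatenation steps concrete, here is a minimal Python sketch of the idea. It is not the QIIME implementation, only a stand-in for what add_qiime_labels.py followed by cat achieves: every read header gets a "SampleID_N" prefix so that downstream OTU picking can tell which sample each read came from. The sample IDs and toy reads are hypothetical.

```python
def label_reads(fasta_lines, sample_id, start=0):
    """Yield FASTA lines with headers rewritten as '>SampleID_N original_header'."""
    count = start
    for line in fasta_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            yield f">{sample_id}_{count} {line[1:]}"
            count += 1
        else:
            yield line  # sequence lines pass through unchanged


def combine(samples):
    """samples: dict mapping sample ID -> iterable of FASTA lines."""
    combined = []
    for sample_id, lines in samples.items():
        combined.extend(label_reads(lines, sample_id))
    return combined


# Toy reads standing in for the real per-sample FASTA files (hypothetical data).
samples = {
    "S1.miseq": [">read_a", "ACGT", ">read_b", "GGCC"],
    "S1.iontorrent": [">read_x", "ACGA"],
}
seqs = combine(samples)
print("\n".join(seqs))
```

The sample IDs here use only letters, digits and periods, which is what QIIME 1 mapping files expect; the platform suffix keeps the two runs of the same biological sample apart.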


A few things to keep in mind:
  • Make sure your MiSeq and Ion Torrent samples have different names, even if they are exactly the same biological samples. The unique names will let you compare them in the PCoA plot. If you give them the same name, the samples will be merged, making comparison impossible.
  • In order to make a PCoA plot, you need a distance matrix; when doing PCoA, you are Analyzing the Principal Coordinates of a distance matrix, for example one composed of UniFrac distances between all pairs of samples. Note that a PCoA of UniFrac distances characterizes phylogeny (relatedness of microbes) and does not directly consider taxonomy (names of microbes). (This is good! Phylogeny is more useful than taxonomy.)
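
One way to guard against the name clash described in the first point is to check the combined reads before OTU picking. The sketch below assumes headers in the '>SampleID_N ...' form used for QIIME 1 seqs.fna files; the function names and example headers are made up for illustration.

```python
def sample_ids(fasta_headers):
    """Collect sample IDs from headers of the form '>SampleID_N ...'."""
    ids = set()
    for header in fasta_headers:
        label = header.lstrip(">").split()[0]  # 'SampleID_N'
        ids.add(label.rsplit("_", 1)[0])       # drop the trailing read counter
    return ids


def shared_sample_ids(headers_a, headers_b):
    """Return the sample IDs that appear in both sets of headers, sorted."""
    return sorted(sample_ids(headers_a) & sample_ids(headers_b))


# Hypothetical headers from the two platforms.
miseq_headers = [">S1.miseq_0 read_a", ">S2.miseq_0 read_b"]
ion_headers = [">S1.iontorrent_0 read_x", ">S2.iontorrent_0 read_y"]

clashes = shared_sample_ids(miseq_headers, ion_headers)
print("clashing sample IDs:", clashes)
```

An empty list means the two platforms use distinct sample names and will show up as separate points in the PCoA plot.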

Let me know if this helps!
Colin

Patrick S.

Jul 18, 2018, 4:46:26 AM
to Qiime 1 Forum
Hi Colin,

thank you very much for your quick answer. I already planned to switch over to Qiime 2, but since only the PCoA plots are still missing, I think I have to stick with Qiime 1 for now. Nevertheless, thanks for the hint!

The problem is that I would like to keep the assigned taxa from Ion Reporter, as this is also part of the analysis. So if I integrated the Ion Torrent sequences into the MiSeq sequences for reanalysis, I would lose that information.
I am looking for a way to integrate the Ion Torrent reads + taxa into the results from QIIME's pick_open_reference_otus pipeline to create a PCoA plot in the end.

Best regards

Patrick

Colin Brislawn

Jul 18, 2018, 12:12:40 PM
to qiime...@googlegroups.com
Hello Patrick,

Yes, this problem is a little tricky. There are a few ways forward, but some of them are dead ends. 

> The problem is that I would like to keep the assigned taxa from Ion Reporter, as this is also part of the analysis. So if I integrated the Ion Torrent sequences into the MiSeq sequences for reanalysis, I would lose that information.
Correct. And not only would you lose these taxonomic classifications, the Ion Torrent and Illumina reads would mix together to form new OTUs. 

Forming common OTUs between the two sequencing types makes the comparison by PCoA possible. If the two sequencing types had no OTUs in common, the distance between every Ion Torrent sample and every Illumina sample would be 1 (== 100% different).
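
A small worked example of why shared OTUs matter. It uses Bray-Curtis dissimilarity purely as an illustration (the metric choice and the toy counts are assumptions, not something from your data): when two samples have no OTU IDs in common the distance is exactly 1, and it only drops below 1 once they share OTUs.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {OTU ID: count} dictionaries."""
    otus = set(a) | set(b)
    diff = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    total = sum(a.values()) + sum(b.values())
    return diff / total


# Hypothetical counts for the same biological sample on both platforms.
illumina = {"OTU1": 10, "OTU2": 5}

# Analysed separately: the Ion Torrent OTU IDs never match the Illumina ones.
ion_separate = {"OTU_A": 8, "OTU_B": 7}

# Reanalysed together: both platforms are counted against the same OTUs.
ion_shared = {"OTU1": 8, "OTU2": 7}

d_separate = bray_curtis(illumina, ion_separate)
d_shared = bray_curtis(illumina, ion_shared)
print(d_separate, d_shared)
```

With separate OTU tables every Illumina vs. Ion Torrent comparison hits that ceiling, so the PCoA can only show the platform split and nothing about how similar the matching samples are.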

So a PCoA plot needs a distance matrix, a distance matrix needs common OTUs, and common OTUs are created by reanalysis.

Does this make sense? 

Colin

EDIT: You mentioned,
> I would like to keep the assigned taxa from Ion Reporter
You can still do that! Even though the PCoA requires common OTUs, for the rest of the paper you can use the separate analysis.  