I only have experience so far with quite a poor dataset (i.e. low read length and number of reads).
My samples are pretty similar to Tony's: 3 samples from an anaerobic digester, with a very low median read length (58 bp) and a mean of ~100 bp. RDP left a lot of reads unassigned, but the major archaea were identified (this was using universal bacterial primers; the archaeal primer runs failed completely). My colleague who performed the sequencing thinks she has identified the problem (the PCR step). The platform certainly seems a lot more sensitive than Ion Torrent would have you believe, but I reckon we can at least get to a decent level of sequencing performance once these issues have been resolved.
Regarding the data analysis:
- The Ion Torrent Suite produces two output files, SFF and FASTQ
- The SFF is the raw data and has key, barcode, adapter and primer retained with each sequence
- The Fastq has those elements removed
- I used a Galaxy workflow to create fasta and qual files from the SFF. However, running split_libraries.py on these files results in no reads being assigned, because the primer cannot be located (possibly an adapter issue)
- I used fasta_convert.pl to create fasta and qual files from the FASTQ file. However, this script requires some amendment to get the right encoding for Ion Torrent. Additionally, barcodes and primer are missing. I have my own script for reintroducing those, sample by sample.
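For anyone wanting to try the FASTQ route, here is a rough Python sketch of the conversion and re-prefixing step. It is not my actual script or fasta_convert.pl. It assumes the FASTQ uses a Phred+33 quality offset and plain 4-line records, and the function names, the `prefix` argument (barcode + primer for one sample) and the placeholder score of 40 given to the reintroduced bases are all made up for illustration.

```python
def read_fastq(lines):
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        # Drop the leading '@' and anything after the first whitespace.
        yield header[1:].split()[0], seq, qual


def to_fasta_qual(record, prefix="", offset=33, prefix_score=40):
    """Return (fasta_text, qual_text) for one read with `prefix` prepended.

    offset=33 is an assumption about the quality encoding; prefix_score is
    an arbitrary placeholder quality for the reintroduced bases.
    """
    read_id, seq, qual = record
    scores = [prefix_score] * len(prefix) + [ord(c) - offset for c in qual]
    fasta_text = ">%s\n%s\n" % (read_id, prefix + seq)
    qual_text = ">%s\n%s\n" % (read_id, " ".join(str(s) for s in scores))
    return fasta_text, qual_text
```

You would call this once per sample, passing that sample's barcode + primer as `prefix`, and append the two returned strings to the .fna and .qual files.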
Better option:
- In Ion Torrent administration, I removed the barcode identifier from the experiment and ran a re-analysis. This produced a FASTQ file with barcode-adapter-primer intact (key removed).
- My mapping file contained barcode & adapter-primer sequence (and reverse primer).
This seemed to work ok.
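To show what I mean by the mapping file, here is a small Python sketch that writes one with the adapter and forward primer joined in the LinkerPrimerSequence column. The column names are the standard QIIME mapping file ones. The function name and any sequences you pass in are placeholders, not my real barcodes or primers, and I am reusing the sample ID as the Description just to fill the column.

```python
import csv
import io

HEADER = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
          "ReversePrimer", "Description"]


def mapping_text(samples, adapter, forward_primer, reverse_primer):
    """Build tab-separated mapping file text.

    samples: list of (sample_id, barcode) pairs.
    The adapter is concatenated in front of the forward primer so that the
    linker-primer column matches what follows the barcode in each read.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for sample_id, barcode in samples:
        writer.writerow([sample_id, barcode, adapter + forward_primer,
                         reverse_primer, sample_id])
    return out.getvalue()
```

Write the returned text to a file and pass that to split_libraries.py as the mapping file, along with the fasta and qual files.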
It would be good to test this properly with a mock community with known identities and with a good Ion Torrent run.
Similarly, for denoising, there is a possibility that Chris Quince, ourselves and Liverpool University are going to work on adapting AmpliconNoise for Ion Torrent. I believe this should involve determining the noise profile for the instrument (likelihoods etc). I understand that PCR error may not be an issue with Ion Torrent.
For denoiser, I have not looked into the algorithm or its mechanism, so I cannot comment on what would be required, but I guess Rob Knight and others could easily investigate given access to Ion Torrent data.
As mentioned in my post, Acacia has been released by University of Queensland (Hugenholtz's group). At present it doesn't seem to work with Ion Torrent data (I have just been playing, nothing serious), but Lauren Bragg says she is aiming to work on the Ion Torrent option.