Hi Jen,
There are obviously many different ways, but I use pretty much the same pipeline as the Brazilian Microbiome Project (http://www.brmicrobiome.org/#!16sprofilingpipeline/cuhd). It will make your life a lot easier to get a non-demultiplexed fastq file from the Torrent Suite. If you have barcodes that are all the same length, you could follow the entire BMP protocol. I stupidly decided to use the IonXpress barcodes, which are variable length, so I need to do the demultiplexing step in QIIME. If you need help with that part, let me know.
I’m interested in hearing what others are doing.
Lisa
--
---
You received this message because you are subscribed to a topic in the Google Groups "Qiime Forum" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qiime-forum/b7DrGWrXbuU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Yes! I am getting great results with that pipeline. I started using UPARSE last year because the OTU picking options in QIIME weren't working well with my PGM data. I optimized UPARSE with some mock communities and then recently saw that the BMP is using the same parameters.
It'll be much easier if you ask them for the non-demultiplexed data, especially if you have a lot of samples. Just ask them to rerun the analysis without the barcode option selected. If you don't want to wait, you'll have to rename the reads and then join all the files together.
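If you do end up renaming reads and joining files yourself, here's a rough sketch of what I mean (the sample names and file layout are made up; this assumes one demultiplexed FASTQ per sample, four lines per record):

```python
# Sketch: prefix every read ID with its sample name, then merge files.
# Adjust sample names and paths to match your own data.

def relabel_fastq(lines, sample):
    """Prefix each FASTQ header line (@...) with the sample name."""
    out = []
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.startswith("@"):
            out.append("@" + sample + "_" + line[1:])
        else:
            out.append(line)
    return out

def merge_samples(sample_files):
    """sample_files: dict mapping sample name -> path to its FASTQ file."""
    merged = []
    for sample, path in sample_files.items():
        with open(path) as fh:
            merged.extend(relabel_fastq([l.rstrip("\n") for l in fh], sample))
    return merged
```

The point of the relabeling is just that, once everything is in one file, you can still tell which sample each read came from by its header.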
Here it is. Good luck!
Hi again Lisa,

So I got the data back from our lab with barcodes intact, thanks to your help!!

I started the pipeline, but I think I might be filtering out too much data: I'm getting a small number of sequences back after running the fastq_filter step. What value do you use for the fastq_maxee argument? I used the BMP pipeline default of 0.5 and I am getting 30% converted. Not sure if this seems OK?

Also, at the abundance sort and discard singletons step, it seems as though I am losing a lot of my unique sequences.

I need to go through and thoroughly read what each of these scripts is doing, but after I get down to the chimera filtering step, I am getting 98 non-chimeras. Doesn't that seem quite small when starting with 254,000 sequences? Or does this seem accurate?

The BMP pipeline has been great so far, but there seems to be a step missing just before step 8 on the website. He specifies a reads_uparse.fa file as input but does not list the step to create it. Do you happen to know it?

Thank you again!
Jen
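On the fastq_maxee question: USEARCH's filter works on expected errors, i.e. it sums the per-base error probabilities implied by the Phred scores and discards any read whose total exceeds the threshold. That means long PGM reads with noisy tails fail a maxee of 0.5 very easily, which could explain a low conversion rate. A quick sketch of the calculation (assuming Phred+33 quality encoding):

```python
def expected_errors(qual_string, offset=33):
    """Sum of per-base error probabilities for an ASCII quality string.
    A Phred score Q corresponds to error probability p = 10^(-Q/10)."""
    return sum(10 ** (-(ord(c) - offset) / 10) for c in qual_string)

# Example: a 200-base read at a uniform Q20 ('5' in Phred+33) has an
# error probability of 0.01 per base, so 200 * 0.01 = 2.0 expected
# errors -- well above maxee 0.5, even though Q20 looks decent per base.
```

So before raising maxee, it's worth checking where your quality drops off; truncating earlier (a shorter -fastq_trunclen) can rescue reads that only fail because of their tails.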
assign_taxonomy.py -i $PWD/otus.fa -o output -r $PWD/rep_set/97_otus.fasta -t $PWD/taxonomy/97_otu_taxonomy.txt
Grep (note: in grep, `*` repeats the previous character, so the degenerate primer positions are written here as `.`, which matches any single character):
grep "GGACTAC..GGGT.TCTAAT" ./ames/pilotDataV4/fastq_bc/first.fastq > revprimer.fq
Now I will pull out one sample by barcode: CTAAGGTAAC
Adapter sequence is GAT...
So these are just a couple of the reads that I pulled out. I am trying to get more info from our sequencing lab, but do you have any idea why I am seeing this?
>
CTAAGGTAACGATGGACTACCCGGGTTTCTAATCCTGTTCGCTCCCCACGCTTTCGAGCCTCAGCGTCAGTTACAGACCAGAGAGCCGCTTTCGCCACCGGTGTTCCTCCATATATCTACGCATTTCACCGCTACACATGGAATTCCACTCTCCCTTCTGCACTCAAGTTTGACAGTTTTCCAAAGCGAACTATGGTTGAGCCACAGCCTTTTAACTTCAGACTTATCAAACCTGCCTGCGCTCGCTTTACGCCC
>
CTAAGGTAACGATGGACTACCCGGGTTTCTAATCCTGTTTGCTCCCCACGCTTTCGCACATGAGCGTCAGTACATTCCCAAGGGGCTGCCTTCGCCTTCGGTATTCCTCCACATCTCTACGCATTTCACCGCTACACGTGGAATTCTACCCTCCTAAGTACTCTAGCGACCCAGTACT
1 - Strip barcodes ("Ex" is a prefix for the read labels, can be anything you like) <<<USING USEARCH 7>>>
python $PWD/fastq_strip_barcode_relabel2.py $PWD/reads.fastq ATTACCGCGGCTGCTGG $PWD/barcodes.fa Ex > reads2.fastq
2 - Quality filtering, length truncate, and convert to FASTA <<<USING USEARCH 7>>>
$u -fastq_filter $PWD/reads2.fastq -fastq_maxee 0.5 -fastq_trunclen 200 -fastaout reads.fa
3 - Dereplication <<<USING USEARCH 7>>>
$u -derep_fulllength $PWD/reads.fa -output derep.fa -sizeout
4 - Abundance sort and discard singletons <<<USING USEARCH 7>>>
$u -sortbysize $PWD/derep.fa -output sorted.fa -minsize 2
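To see where reads are actually being lost between these steps, it helps to count records after each command. A small sketch (the step names below match the files above, but the counting logic is the only thing that matters):

```python
def count_records(lines):
    """Count sequences in a FASTA (headers start with '>') or FASTQ
    (4 lines per record) file, judged by the first line."""
    if lines and lines[0].startswith(">"):
        return sum(1 for l in lines if l.startswith(">"))
    return len(lines) // 4

# Usage idea: read each intermediate file (reads2.fastq, reads.fa,
# derep.fa, sorted.fa) and print its count -- the step with the biggest
# drop is where to focus. Note that after -derep_fulllength with
# -sizeout, each record represents size=N identical reads, so the raw
# record count understates how many input reads each one stands for.
```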
Then I can concatenate my fasta files for the forward and reverse results at this point and proceed with the rest of the pipeline.
There is a step further down in the pipeline that asks you to specify the strand (usearch_global:
http://drive5.com/usearch/manual/usearch_global.html),
where maybe I can change it to -strand both
and then just keep moving forward.
What do you think about this? Might it work?
Jen