Pooling/concatenation question


Hadley Horch

Feb 1, 2018, 8:25:52 PM
to corset-project
Hi Nadia,
  I have a question about your Corset pipeline. You recommend pooling data: "You will have more power to detect transcripts if all the samples are pooled, therefore the first step is to concatenate the fastq files together." However, in your Corset paper, you imply that each sample's transcriptome should be assembled separately, because you suggest that reads be mapped back to each transcriptome separately.
  So, does this mean that you would recommend pooling data within each condition? For example, if I have 3 control and 3 experimental data sets, should I pool the controls and then pool the experimental samples, ending up with two files? Or should I pool all 6 samples together into one file before assembly?
  As may be obvious from my question, I'm new to all of this, and I wanted to make sure to thank you for writing such a clear, easy-to-understand paper. It was super helpful!
Thanks,
Hadley

Nadia Davidson

Feb 5, 2018, 10:25:44 PM
to corset-project
Hi Hadley,

We usually pool all samples together for the assembly step (so the input is multiple .fastq files and the output is a single .fasta), but then apply the mapping step by sample (so map each sample's .fastq file/s back to the pooled .fasta separately). You should then end up with multiple .bam files, one per sample. Hope this makes sense. Note that you might not have to actually pool the reads into a single file for assembly. Some assemblers let you list the set of files you want to use and will pool them for you.

Cheers,
Nadia.
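
A minimal sketch of the workflow described in the reply above, for readers who want to see the bookkeeping spelled out: pool every sample's reads for one assembly, then plan one mapping job per sample against that single assembly. The function names (`pool_fastq`, `plan_mapping`), the file names, and the job layout are all hypothetical and are not part of Corset. The assembler and aligner invocations are deliberately left out, since they depend on the tools you choose.

```python
import shutil
from pathlib import Path


def pool_fastq(fastq_paths, pooled_path):
    """Concatenate FASTQ files byte-for-byte into one pooled file.

    Assumes every input file ends with a newline, so that records
    from consecutive files do not run together.
    """
    with open(pooled_path, "wb") as out:
        for path in fastq_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(pooled_path)


def plan_mapping(samples, assembly_fasta):
    """Return one mapping job per sample, all against the same assembly.

    `samples` maps a sample name to the list of FASTQ files for that
    sample. Each job names the per-sample BAM file it should produce.
    """
    jobs = []
    for name, reads in samples.items():
        jobs.append({
            "sample": name,
            "reads": list(reads),
            "reference": assembly_fasta,  # the single pooled assembly
            "bam": f"{name}.bam",         # hypothetical output name
        })
    return jobs
```

With 3 control and 3 experimental samples, `pool_fastq` would be called once on all 6 files (or skipped, if the assembler accepts a list of files), and `plan_mapping` would return 6 jobs that share one reference and produce 6 separate BAM files.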

