fastq-mcf barcode file


Brittany Demmitt

Sep 12, 2014, 3:34:00 PM
to ea-u...@googlegroups.com
Hi Erik,

Thank you so much for the great fastq-mcf tools! It was great to quality filter both my forward and reverse reads so that each file ends up with the same number of reads. I was wondering if there is a way to also pass my barcode fastq file through the script, so that the barcodes corresponding to reads removed due to quality filtering are removed as well?

The reason I ask is that I am getting errors when demultiplexing the data, since the barcode fastq file contains more records than the filtered read files.
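A quick way to confirm this kind of mismatch is to count the records in each file. The following is a minimal Python sketch, assuming uncompressed FASTQ files with exactly four lines per record; the function name `count_fastq_records` is hypothetical and not part of ea-utils.

```python
import sys


def count_fastq_records(path):
    """Count records in an uncompressed FASTQ file, assuming 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A line count that is not a multiple of 4 suggests a truncated
        # or multi-line FASTQ file, which this sketch does not handle.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


if __name__ == "__main__":
    # Print one "path<TAB>record count" line per file given on the command line.
    for fastq_path in sys.argv[1:]:
        print(fastq_path, count_fastq_records(fastq_path), sep="\t")
```

If the script is saved under a name of your choosing and run with the filtered read files and the barcode file as arguments, the counts should all agree before demultiplexing.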

Thanks!

Brittany

Aronesty, Erik

Sep 12, 2014, 4:50:12 PM
to ea-u...@googlegroups.com

Yes, you can pass barcodes through the script as a "3rd read". By default, it will be ignored for the purposes of quality filtering, since barcode reads have known quality estimate issues.

If you're doing adapter removal, you should demultiplex first. Otherwise adapter prevalence levels will be assessed "in aggregate", and this doesn't reflect the underlying chemistry correctly… adapter levels are sample specific.

(If you're only filtering on quality it shouldn't matter either way.)


brittany

Sep 15, 2014, 11:47:44 AM
to ea-u...@googlegroups.com
Hi Erik,

Thank you for the help; however, I have run into a problem. I added the barcodes as a third read, but now the number of sequences removed has almost doubled. Did I misplace the barcode reads?

Here is my new script, with the barcode file as a third read:

fastq-mcf -0 --qual-gt 113,25 -o forward_filtered.fastq -o reverse_filtered.fastq -o barcode_filtered.fastq n/a forward.fastq reverse.fastq barcode.fastq

Original script:

fastq-mcf -0 --qual-gt 113,25 -o forward_filtered.fastq -o reverse_filtered.fastq n/a forward.fastq reverse.fastq

Thanks!

Brittany

Aronesty, Erik

Sep 15, 2014, 1:17:37 PM
to ea-u...@googlegroups.com

Are your barcodes longer than 10 bases? If so, the code doesn't ignore them for filtering, and I can imagine this would have an effect. If not, they should be completely ignored.
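To check which side of that 10-base cutoff a barcode file falls on, the sequence lengths can be tallied directly. This is a rough Python sketch, assuming an uncompressed FASTQ file with four lines per record; the function name `barcode_length_counts` is hypothetical and not part of ea-utils.

```python
from collections import Counter


def barcode_length_counts(path):
    """Tally sequence lengths in an uncompressed 4-line-per-record FASTQ file."""
    counts = Counter()
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            # In a 4-line record, the sequence is the second line (index 1).
            if line_number % 4 == 1:
                counts[len(line.strip())] += 1
    return counts
```

Calling `barcode_length_counts("barcode.fastq")` returns a mapping from length to number of records; if any key is greater than 10, the barcode read would not be ignored for filtering according to the description above.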

brittany

Sep 15, 2014, 1:22:33 PM
to ea-u...@googlegroups.com
Ahh, that would be the problem: my barcodes are 12 bp long.

Thanks,

Britt

Aronesty, Erik

Sep 15, 2014, 1:50:59 PM
to ea-u...@googlegroups.com

Well, there should be an option in fastq-mcf to mark a file as "barcode" aside from the 10bp heuristic.  

brittany

Sep 17, 2014, 6:58:18 PM
to ea-u...@googlegroups.com
I am having trouble finding the flag to mark the files as barcode in fastq-mcf.  Do you know what the flag is?

Thanks :)

Brittany

Aronesty, Erik

Sep 18, 2014, 11:20:18 AM
to ea-u...@googlegroups.com

Sorry, when I said "should be", I meant… I haven't added it. You can pretty easily pop open the code and change everything that says "< 11" (less than 11) to "< 13" (less than 13) and solve your problem.

But realistically, with sequence lengths getting longer and capacity growing, it's only a matter of time before the longest barcodes exceed the shortest reads that people use, and this flag becomes mandatory. Probably I could change it to 16. No one runs at 16bp. But 35bp is something we do all the time. And some Fluidigm codes get up there (24bp)… so the flag needs to be added.
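For anyone who would rather not edit and rebuild the source, a different approach (not something fastq-mcf does itself) is to filter only the forward and reverse reads, then subset the barcode file to the read names that survived. Below is a minimal Python sketch of that idea. It assumes uncompressed four-line FASTQ records, and read names that match across files once the leading "@", any comment after whitespace, and a trailing /1, /2 or /3 are dropped. The function names are hypothetical.

```python
def read_id(header):
    """Return the read name from a FASTQ header line.

    Drops the leading '@', anything after the first whitespace,
    and a trailing '/<digit>' mate suffix if present.
    """
    name = header[1:].split()[0]
    if len(name) > 2 and name[-2] == "/" and name[-1].isdigit():
        name = name[:-2]
    return name


def fastq_records(handle):
    """Yield (header, sequence, plus, quality) tuples, 4 lines at a time."""
    while True:
        header = handle.readline()
        if not header.strip():
            # End of file (or a blank line, treated here as the end).
            return
        yield header, handle.readline(), handle.readline(), handle.readline()


def sync_barcodes(filtered_reads_path, barcodes_path, out_path):
    """Write only the barcode records whose read name is in the filtered file.

    Records are written in the order they appear in the barcode file.
    Returns the number of barcode records written.
    """
    with open(filtered_reads_path) as reads:
        keep = {read_id(record[0]) for record in fastq_records(reads)}
    written = 0
    with open(barcodes_path) as barcodes, open(out_path, "w") as out:
        for record in fastq_records(barcodes):
            if read_id(record[0]) in keep:
                out.writelines(record)
                written += 1
    return written
```

With the file names from this thread, that would be `sync_barcodes("forward_filtered.fastq", "barcode.fastq", "barcode_filtered.fastq")`. The set of names is held in memory, and the output only lines up record-for-record with the filtered reads if the filter kept them in their original order, so it is worth comparing record counts and the first few headers afterwards.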

brittany

Sep 18, 2014, 5:35:38 PM
to ea-u...@googlegroups.com
I see, thanks! :)

Britt