Yes, you can pass barcodes through the script as a "3rd read". By default, the barcode read is ignored for the purposes of quality filtering, since barcode reads have known quality estimate issues.
If you're doing adapter removal, you should demultiplex first. Otherwise adapter prevalence levels will be assessed "in aggregate", and this doesn't reflect the underlying chemistry correctly: adapter levels are sample-specific.
(If you're only filtering on quality it shouldn't matter either way.)
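To make the "in aggregate" point concrete, here's a rough Python sketch with made-up counts (the sample names and numbers are hypothetical, not from any real run). It only shows that a pooled adapter fraction can sit far from each sample's own fraction; it is not how fastq-mcf computes anything internally.

```python
# Hypothetical per-sample counts: (reads containing adapter, total reads).
# These numbers are invented purely for illustration.
counts = {
    "sample_A": (50, 10_000),
    "sample_B": (2_000, 10_000),
}


def adapter_fraction(with_adapter, total):
    """Fraction of reads in which the adapter was seen."""
    return with_adapter / total


# Fraction computed separately for each demultiplexed sample.
per_sample = {name: adapter_fraction(a, t) for name, (a, t) in counts.items()}

# Fraction computed over all reads pooled together (no demultiplexing).
pooled_adapter = sum(a for a, _ in counts.values())
pooled_total = sum(t for _, t in counts.values())
pooled = adapter_fraction(pooled_adapter, pooled_total)

for name, frac in per_sample.items():
    print(f"{name}: {frac:.2%}")
print(f"pooled: {pooled:.2%}")
```

With numbers like these, the pooled figure describes neither sample well, which is the reason to demultiplex before adapter removal.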
Are your barcodes longer than 10 bases? If so, the code doesn't ignore them for filtering, and I can imagine this would have an effect. If not, they should be completely ignored.
Well, there should be an option in fastq-mcf to mark a file as "barcode" aside from the 10bp heuristic.
Sorry, when I said "should be", I meant… I haven't added it. You can pretty easily pop open the code and change everything that says "< 11" (less than 11) to "< 13" (less than 13) and solve your problem.
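If you want to check ahead of time which side of that cutoff a file lands on, here's a minimal Python sketch. It is not code from fastq-mcf: the function names are made up, and it assumes the heuristic amounts to "longest read in the file is shorter than the limit", which is my reading of the "< 11" check rather than something verified against the source.

```python
import io

# Assumption: mirrors the "< 11" comparison mentioned in this thread.
DEFAULT_LIMIT = 11


def max_read_length(fastq_handle):
    """Return the longest sequence length in a 4-line-per-record FASTQ stream."""
    longest = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the sequence
            longest = max(longest, len(line.strip()))
    return longest


def looks_like_barcode_read(fastq_handle, limit=DEFAULT_LIMIT):
    """Hypothetical check: True if every read is shorter than `limit`."""
    return max_read_length(fastq_handle) < limit


# Tiny in-memory examples: an 8 bp barcode and a 12 bp barcode.
fq_8bp = "@r1\nACGTACGT\n+\nIIIIIIII\n"
fq_12bp = "@r1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"

print(looks_like_barcode_read(io.StringIO(fq_8bp)))             # 8 < 11
print(looks_like_barcode_read(io.StringIO(fq_12bp)))            # 12 < 11
print(looks_like_barcode_read(io.StringIO(fq_12bp), limit=13))  # 12 < 13
```

Under that assumption, a 12 bp barcode falls outside the default cutoff and only counts as a barcode once the limit is raised to 13, which is what the manual edit above does.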
But realistically, with sequence lengths getting longer and capacity growing, it's only a matter of time before the longest barcodes exceed the shortest reads that people use, and this flag becomes mandatory. Probably I could change it to 16. No one runs at 16bp. But 35bp is something we do all the time. And some Fluidigm codes get up there (24bp)… so the flag needs to be added.