fastq-mcf's window option for quality trimming


Gang Chen

Apr 28, 2015, 2:38:57 PM
to ea-u...@googlegroups.com
Hi All,

Could anybody explain how the -w option works? I tried combining -q and -w to trim low-quality nucleotides from the 3' end:

fastq-mcf -o trimmed.fastq -q 20 -w 10 n/a untrimmed.fastq

I assumed the program slides a 10-base window, one base at a time, from the 3' end, but I couldn't figure out what criterion is used to trim the bases within the window. The trimmed results seem to rule out:
1) trimming all bases in the window when the base at its 5' end is < 20
2) trimming all bases in the window when the average quality of the window is < 20

Thanks in advance,
Gang 
 

Gang Chen

Apr 28, 2015, 11:23:42 PM
to ea-u...@googlegroups.com
I just read the source code, though I'm not familiar with the language. It appears to slide a window in from both the front and the end of the read: if the average quality score within the window is below the threshold, it trims the first/last base and then slides the window; otherwise, it trims only the bases at the very front/end that are below the threshold. Could anybody confirm this? Not all of my resulting reads look as if they were trimmed this way, so I'm afraid something is wrong.

Erik Aronesty

Jul 6, 2015, 9:31:04 AM
to ea-u...@googlegroups.com, gangch...@gmail.com
This is correct: it's a sliding-average window. This prevents a single good-quality base from stopping a trim. It is mostly useful on very long (>300 bp) reads, where Illumina's quality can tail off and then spike up randomly within the low-quality tail.
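
For readers who want to check individual reads by hand, here is a minimal Python sketch of the behavior as it is described in this thread. It is not fastq-mcf's actual code. The function names (trim_end, trim_start, trim_both) are made up for illustration, the input is assumed to be a list of already-decoded Phred scores, and shrinking the window when fewer than `window` bases remain is an assumption that may differ from what the tool really does.

```python
def trim_end(quals, threshold, window):
    """Return how many bases to keep after trimming the 3' end."""
    end = len(quals)
    # While the mean quality of the last `window` bases is below the
    # threshold, drop one base from the end and look again.
    while end > 0:
        win = quals[max(0, end - window):end]  # assumption: window shrinks near the edge
        if sum(win) / len(win) >= threshold:
            break
        end -= 1
    # The window now passes; still remove individual trailing bases
    # that are below the threshold.
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return end


def trim_start(quals, threshold, window):
    """Return the index of the first base kept after trimming the 5' end."""
    start = 0
    n = len(quals)
    while start < n:
        win = quals[start:start + window]
        if sum(win) / len(win) >= threshold:
            break
        start += 1
    while start < n and quals[start] < threshold:
        start += 1
    return start


def trim_both(quals, threshold=20, window=10):
    """Trim the 3' end first, then the 5' end; return (start, end) to keep."""
    end = trim_end(quals, threshold, window)
    start = trim_start(quals[:end], threshold, window)
    return start, end


if __name__ == "__main__":
    # A good read whose low-quality tail contains one spike (25).
    quals = [30] * 20 + [5, 5, 25, 5, 5]
    print(trim_both(quals, threshold=20, window=5))  # (0, 20)
```

With window=1 the same input keeps 23 bases, because the single base of quality 25 stops the trim; with window=5 the averaged window carries the trim past the spike and the whole tail is removed.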

