High-depth coverage filter


Abhimanyu Krishna

Dec 18, 2014, 2:22:12 PM
to strelka...@googlegroups.com
Hi,

We are working on a cancer genome project and we use Strelka to call somatic mutations.
We have a question about the high-depth coverage filter. Below is a list of nearby positions output by Strelka
where the coverage in the tumor sample is very high (>3 times the chromosomal mean depth) but the coverage in the normal sample is not.

VCF output (the last 2 columns are normal and tumor samples):
2       89877959        .       G       T       .       PASS    NT=ref;QSS=35;QSS_NT=35;SGT=GG->GT;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    79:7:0:0:1,4:0,0:71,228:0,0     463:88:0:0:0,4:2,16:349,1031:24,60
2       89877975        .       A       T       .       QSS_ref NT=ref;QSS=6;QSS_NT=6;SGT=AA->AT;SOMATIC;TQSS=1;TQSS_NT=1       DP:FDP:SDP:SUBDP:AU:CU:GU:TU    63:7:0:0:55,141:0,0:0,0:1,5     346:36:0:0:284,710:0,0:0,1:26,79
2       89877976        .       A       T       .       QSS_ref NT=ref;QSS=1;QSS_NT=1;SGT=AT->AT;SOMATIC;TQSS=2;TQSS_NT=2       DP:FDP:SDP:SUBDP:AU:CU:GU:TU    63:8:0:0:54,138:0,0:0,0:1,8     346:36:0:0:279,674:3,9:2,26:26,81
2       89877979        .       G       A       .       PASS    NT=ref;QSS=27;QSS_NT=24;SGT=GG->AG;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    66:8:0:0:0,1:0,0:55,147:3,3     354:37:0:0:28,65:3,5:286,722:0,0
2       89877988        .       G       A       .       PASS    NT=ref;QSS=32;QSS_NT=32;SGT=GG->AG;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    68:2:0:0:0,3:2,2:64,145:0,0     357:26:0:0:26,56:0,0:305,694:0,0
2       89878003        .       G       A       .       PASS    NT=ref;QSS=24;QSS_NT=24;SGT=GG->AG;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    61:0:0:0:0,0:0,0:61,66:0,0      214:24:1:0:14,64:0,0:176,239:0,0
2       89878013        .       G       A       .       QSS_ref NT=ref;QSS=2;QSS_NT=2;SGT=AG->AG;SOMATIC;TQSS=1;TQSS_NT=1       DP:FDP:SDP:SUBDP:AU:CU:GU:TU    48:0:0:0:0,0:0,0:48,52:0,0      224:28:0:0:13,64:0,0:182,243:1,1
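For readers who want to reproduce the depth comparison, here is a minimal sketch of pulling the `DP` values out of the normal and tumor columns of one record. The helper name `sample_depths` is hypothetical and not part of Strelka; the sketch assumes the column order stated above (normal first, then tumor) and splits on any whitespace, since the fields here contain no spaces.

```python
def sample_depths(vcf_line):
    """Return (normal_dp, tumor_dp) from one Strelka somatic SNV record.

    Assumes column 9 is FORMAT, column 10 the normal sample, and
    column 11 the tumor sample, as in the records quoted above.
    """
    fields = vcf_line.split()
    format_keys = fields[8].split(":")
    normal = dict(zip(format_keys, fields[9].split(":")))
    tumor = dict(zip(format_keys, fields[10].split(":")))
    return int(normal["DP"]), int(tumor["DP"])


# First record from the list above, copied verbatim.
record = (
    "2 89877959 . G T . PASS "
    "NT=ref;QSS=35;QSS_NT=35;SGT=GG->GT;SOMATIC;TQSS=1;TQSS_NT=1 "
    "DP:FDP:SDP:SUBDP:AU:CU:GU:TU "
    "79:7:0:0:1,4:0,0:71,228:0,0 "
    "463:88:0:0:0,4:2,16:349,1031:24,60"
)

normal_dp, tumor_dp = sample_depths(record)
print(normal_dp, tumor_dp, round(tumor_dp / normal_dp, 2))
```

For this record the tumor depth (463) is several times the normal depth (79), which is the pattern the question is about.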

These positions are not filtered out by Strelka. In the paper, the only high-depth filter described is:
"All calls with normal sample depth >3 times the chromosomal mean (meant to remove pericentromeric regions)... [are filtered]"
(Saunders et al., Bioinformatics 2012)

Why is a high-depth filter not used on the tumor sample too?
Would it make sense to filter these calls out as well and, if so, which threshold should we choose?

Regards,
Abhimanyu Krishna

Sean Davis

Dec 18, 2014, 2:51:34 PM
to strelka...@googlegroups.com
Biologically speaking, tumors often harbor copy number alterations, some of which result in high copy number gains.  Filtering out variants in these regions could filter out some of the most interesting variants (for example, mutated and amplified oncogenes).

Sean



Chris Saunders

Jan 5, 2015, 12:15:20 PM
to strelka...@googlegroups.com
Hi Abhimanyu,

To follow up on Sean's comments: we aren't interested in filtering regions because of high depth per se. High sequencing depth in the normal sample is a proxy for reference compressions and other read-mapping noise driven by repeats and low-complexity sequence.

-Chris
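
The normal-sample depth filter discussed in this thread can be sketched as follows. This is only an illustration of the rule quoted from the paper (normal depth > 3 times the chromosomal mean), not Strelka's own code. The function name `exceeds_depth_cutoff` and the value of `CHROM_MEAN_DEPTH` are assumptions made for the example; the thread does not state the actual chromosomal mean depth. The depth pairs are the `DP` values from the seven records in the original post.

```python
def exceeds_depth_cutoff(depth, chrom_mean_depth, factor=3.0):
    """True when depth is more than `factor` times the chromosomal mean."""
    return depth > factor * chrom_mean_depth


# Hypothetical chromosomal mean depth for the normal sample; the real
# value is not given in the thread.
CHROM_MEAN_DEPTH = 40.0

# (normal DP, tumor DP) for the seven records quoted in the question.
depths = [(79, 463), (63, 346), (63, 346), (66, 354),
          (68, 357), (61, 214), (48, 224)]

for normal_dp, tumor_dp in depths:
    # Only the normal depth is tested, matching the rule in the paper.
    flagged = exceeds_depth_cutoff(normal_dp, CHROM_MEAN_DEPTH)
    print(normal_dp, tumor_dp, "filtered" if flagged else "kept")
```

Because only the normal depth is tested, a call in a region amplified in the tumor is kept as long as the normal sample shows ordinary coverage there, which is consistent with both replies above.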

