Troubleshooting Germline Performance

Feb 20, 2018, 3:59:41 PM
to strelka-discuss

I'm evaluating Strelka2 for small variant calling on WES data, and I'm having trouble replicating results from literature. I'm comparing Strelka2 to both HaplotypeCaller and Sentieon for known SNVs/Indels. BQSR has been applied to the mapped reads, but no indel realignment has been performed. 

I'm configuring the Strelka workflow with the `--exome` flag enabled. When I run the workflow, I obtain results that are approximately correct, but fall short of both HaplotypeCaller and Sentieon. For example: (~ for approximate)

                     Sensitivity    Specificity
HaplotypeCaller:     ~.99           ~1
Sentieon:            ~.99           ~1
Strelka2:            .971           ~1

                     Sensitivity    Specificity
HaplotypeCaller:     ~.92           ~1
Sentieon:            ~.92           ~1
Strelka2:            .874           ~1

My results with Strelka2 are not completely incorrect, but they are not in the ballpark of published results, which appear to meet or exceed the performance of tools like HaplotypeCaller. Is this performance expected from Strelka2 on exome data? If not, are there any troubleshooting steps I could try to improve performance?



Saunders, Chris

Feb 23, 2018, 1:21:15 PM

Hi Nick,


Sorry for the delay. Note that we are better able to support queries through issues filed on the GitHub repo here:


Regarding your question, Strelka2 is optimized for whole genome sequencing, and all of our client projects are WGS. We do support exome but have put less optimization effort into this case, so your result is not necessarily surprising, although the indel recall you report is certainly a bit lower than expected. It is hard to tell whether this is just a difference in PASS levels, because only specificity rather than precision is visible in your results, so the corresponding difference in FP counts between these tools is not obvious.
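
To illustrate why specificity alone can hide a difference in FP counts, here is a minimal sketch in Python. All counts are hypothetical and chosen only for illustration (they are not taken from any real run), and `metrics` is just a helper name made up for this example. Because the number of true negative positions in an exome target is so large, specificity stays near 1 for both callers even when their FP counts differ tenfold, whereas precision separates them.

```python
# Hypothetical counts for illustration only; not taken from any real run.
def metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, precision) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # a.k.a. recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return sensitivity, specificity, precision


# Assumed number of non-variant positions in the evaluated regions (made up).
NEGATIVE_POSITIONS = 30_000_000

# Two made-up callers evaluated against 20,000 true variants.
callers = {
    "caller_A": dict(tp=19_800, fp=100, fn=200),
    "caller_B": dict(tp=19_420, fp=1_000, fn=580),
}

for name, c in callers.items():
    tn = NEGATIVE_POSITIONS - c["fp"]
    sens, spec, prec = metrics(c["tp"], c["fp"], c["fn"], tn)
    print(f"{name}: sensitivity={sens:.3f} "
          f"specificity={spec:.5f} precision={prec:.3f}")
```

Reporting precision (or raw FP counts) next to recall for each tool would make it easier to judge whether the gap is mostly a matter of where each tool sets its PASS cutoff.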


To optimize for exome, we would recommend training a scoring model specifically for the exome case. The scoring model training procedure for Strelka2 is described here:


…we’ve done some quick prototypes on this for exome and seen good results, but when you use the `--exome` flag today for germline analysis, Strelka2 disables its rescoring model and reverts to simple hard filters.


Hope that helps,



