FILTER 'REJECT' is not defined in the header


Xiaopeng Bian

Sep 19, 2018, 2:50:31 PM
to biovalidation
My recent bcbio jobs failed with a "FILTER 'REJECT' is not defined in the header" error (YAML attached).
Here is the tail of the error message. Can you help me fix this, or do you need more information?
Traceback (most recent call last):
  File "/fdb/bcbio-nextgen/current/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 23, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/fdb/bcbio-nextgen/current/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 103, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; vcfcat <(/fdb/bcbio-nextgen/current/anaconda/bin/bcftools filter -m '+' -O v --soft-filter 'CHI2FILTER' -e 'INFO/CHI2 > 20.0'  /gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/scalpel/12/dream_set4-12_0_32963615-scalpel-work/main/somatic.indel.vcf.gz) <(/fdb/bcbio-nextgen/current/anaconda/bin/bcftools filter -m '+' -O v --soft-filter 'REJECT' -e '%TYPE="indel"'  /gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/scalpel/12/dream_set4-12_0_32963615-scalpel-work/main/common.indel.vcf.gz) |  awk -F$'\t' -v OFS='\t' '{if ($0 !~ /^#/) gsub(/[KMRYSWBVHDXkmryswbvhdx]/, "N", $4) } {print}' | /fdb/bcbio-nextgen/current/anaconda/bin/vcfstreamsort | grep -v ^##contig | bcftools annotate -h /gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/bcbiotx/tmpe44KMf/dream_set4-12_0_32963615-contig_header.txt | bgzip -c > /gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/bcbiotx/tmpe44KMf/dream_set4-12_0_32963615.vcf.gz
[W::vcf_parse] FILTER 'REJECT' is not defined in the header
Encountered error, cannot proceed. Please check the error output above.
grep: write error
' returned non-zero exit status 255


dream_set4.yaml

Xiaopeng Bian

Sep 20, 2018, 12:23:56 PM
to biovalidation
OK, after an intensive search I fixed it by removing Scalpel from the variant callers, since bcbio does not support Scalpel as a standalone caller.

Brad Chapman

Sep 21, 2018, 12:57:36 PM
to Xiaopeng Bian, biovalidation

Xiaopeng;
Thanks for the detailed report and apologies about the problem. You're exactly
right that scalpel is the underlying cause here. We haven't done much work on
scalpel and haven't used it as a stand alone caller, so haven't really dug
into this issue. Practically, I'd suggest using vardict, strelka2 and mutect2
for indel calling on somatic samples. Hope this helps,
Brad
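
For anyone who wants to confirm this kind of failure on their own files: below is a minimal sketch, not part of bcbio, that lists FILTER values used in VCF records but never declared in a `##FILTER` header line. The function name `undeclared_filters` and the example records are hypothetical. It assumes the warning arises because the combined VCF carries `REJECT` in its FILTER column without a matching header declaration, which is an inference from the log above rather than something confirmed in this thread.

```python
import re


def undeclared_filters(vcf_lines):
    """Return FILTER IDs used in records but not declared in ##FILTER lines."""
    # PASS is treated here as always declared (assumption of this sketch).
    declared = {"PASS"}
    used = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##FILTER="):
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            # Column 7 (index 6) is FILTER; "." means no filter was applied.
            if len(fields) > 6 and fields[6] != ".":
                used.update(fields[6].split(";"))
    return sorted(used - declared)


# Hypothetical records that mimic the situation in the log above.
example = [
    "##fileformat=VCFv4.2",
    '##FILTER=<ID=CHI2FILTER,Description="hypothetical description">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "12\t100\t.\tA\tAT\t.\tCHI2FILTER\t.",
    "12\t200\t.\tG\tGC\t.\tREJECT\t.",
]
print(undeclared_filters(example))  # ['REJECT']
```

For a real bgzipped VCF, the same function can be fed the lines from `gzip.open(path, "rt")`.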
> --
> You received this message because you are subscribed to the Google Groups "biovalidation" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to biovalidatio...@googlegroups.com.
> To post to this group, send email to bioval...@googlegroups.com.
> Visit this group at https://groups.google.com/group/biovalidation.
> To view this discussion on the web visit https://groups.google.com/d/msgid/biovalidation/fcf74f23-4f26-4b96-a297-b78ea7646984%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Xiaopeng Bian

Sep 21, 2018, 3:45:40 PM
to biovalidation
Hi, Brad:
Thanks for your reply.
I took Scalpel out and reran the job, but it failed again at Mutect2. So I took Mutect2 out as well, and the job is now running. I would still like to include Mutect2 in the process, though. Can you have a look at the log file (attached) and let me know how I can fix it?
Thanks.
Xiaopeng


On Wednesday, September 19, 2018 at 2:50:31 PM UTC-4, Xiaopeng Bian wrote:
dream_4_9747338_0.e

Brad Chapman

Sep 22, 2018, 1:58:54 PM
to Xiaopeng Bian, biovalidation

Xiaopeng;
Thanks for the detailed report. It looks like GATK doesn't like your input
panel VCF:

```
--panel-of-normals /gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/mutect2/panels/merged_all.vcf.gz
```
Specifically the `##contig` lines don't have a `length` field:

```
12:36:15.542 INFO FeatureManager - Using codec VCFCodec to read file
file:///gpfs/gsfs4/users/nextgen/Xiaopeng/DCEG/bcbio/dream_indel/mutect2/panels/merged_all.vcf.gz
[...]
htsjdk.tribble.TribbleException: Contig 1 does not have a length field.
```
You'll either need to remove those lines or correct them so GATK is happy
reading the VCF.

Hope this helps,
Brad
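
As a rough illustration of the "correct them" option: below is a minimal sketch, not taken from bcbio or GATK, that finds `##contig` header lines without a `length` field and fills them in from a contig-to-length mapping. The function names `contigs_missing_length` and `add_contig_lengths` are hypothetical, and so are the example lengths. In practice the mapping would have to come from the reference used for the run (for example its `.fai` index), which this thread does not specify. The sketch also assumes `ID=` is the first key in each `##contig` line.

```python
import re


def contigs_missing_length(header_lines):
    """Return IDs of ##contig header lines that have no length= field."""
    missing = []
    for line in header_lines:
        if not line.startswith("##contig="):
            continue
        if re.search(r"[<,]length=\d+", line):
            continue
        match = re.search(r"[<,]ID=([^,>]+)", line)
        missing.append(match.group(1) if match else line.strip())
    return missing


def add_contig_lengths(header_lines, lengths):
    """Insert length= into ##contig lines that lack it, using {contig: length}."""
    fixed = []
    for line in header_lines:
        text = line.rstrip("\n")
        match = re.match(r"##contig=<ID=([^,>]+)", text)
        has_length = re.search(r"[<,]length=\d+", text)
        if match and not has_length and match.group(1) in lengths:
            # Place length= right after the ID, keeping any other keys intact.
            insert = ",length=%d" % lengths[match.group(1)]
            text = text[:match.end()] + insert + text[match.end():]
        fixed.append(text)
    return fixed


# Hypothetical header and lengths, for illustration only.
header = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=1>",
    "##contig=<ID=2,length=243199373>",
]
lengths = {"1": 249250621}
print(contigs_missing_length(header))  # ['1']
print(add_contig_lengths(header, lengths)[1])  # ##contig=<ID=1,length=249250621>
```

The corrected header lines would still need to be written back into the panel VCF, and the file re-compressed and re-indexed, before GATK reads it.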