Failure (and simple fix) running against hs38DH due to colons and asterisks in sequence names


Tim Fennell

Feb 20, 2015, 10:13:41 AM
to strelka...@googlegroups.com
Hi,

I'm posting this here because I ran into trouble running strelka on human data aligned to hs38DH, the reference built by bwakit for aligning to human build 38 plus decoys plus alternative sequences.  The reference has many HLA-derived sequences whose names contain colons and asterisks, e.g. "HLA-A*01:01:01:01".

Initially I got the following error message:
    BAM headers and reference fasta disagree on chromosome: HLA-A*01

To get things working I made two small changes to the configureStrelkaWorkflow.pl script.  Firstly I modified line 281 from:
     my @vals = split(':');
to
     my @vals = split(':', $_, 2);

Secondly I added a one-liner at around line 354:
     @chroms = grep(!/[\*\:]/, @chroms);

This causes strelka not to make calls on any chromosomes with colons or asterisks in the names, which is fine for my purposes and is a lot easier than trying to make everything work on those reference sequences.
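Before applying a filter like this, it may be worth checking which contigs it would drop. The following is a minimal sketch, not part of strelka: the function name `list_problem_contigs` is made up for illustration, and it assumes the SAM header text is supplied on stdin.

```shell
# Hypothetical helper: print every @SQ sequence name that contains ':' or '*'.
# Reads SAM header text on stdin.
list_problem_contigs() {
    awk -F'\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)          # strip the leading "SN:"
                    if (name ~ /[*:]/) print name
                }
            }
        }'
}
```

Assuming samtools is installed, it could be fed with `samtools view -H tumor.bam | list_problem_contigs` (the BAM file name here is a placeholder).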

With those two changes I could make it through configuration and got a Makefile that works.

Cheers,

-Tim Fennell

Fouad Yousif

Apr 12, 2016, 5:25:43 PM
to strelka-discuss
Thanks Tim, were you able to run strelka successfully on those samples? I made the modifications above but I am still running into issues (error 139) in certain bins after the tool has been running for hours. Did you ever run into the same issue?

Thank you,

jshe...@nygenome.org

Apr 7, 2017, 9:01:56 AM
to strelka-discuss
I also ran into this issue with the bwa mem recommended version of GRCh38 plus decoys:

ERROR: Tumor and normal BAM file headers disagree on chromosome: 'HLA-A*01'

 at /strelka/strelka-1.0.15/bin/../lib/Utils.pm line 44.
Utils::errorX('Tumor and normal BAM file headers disagree on chromosome: \'H...') called at /strelka/strelka-1.0.15/bin/configureStrelkaWorkflow.pl line 316

Are there any plans to fix this bug in Strelka? It would be great if we could report the use of an actual strelka release in our pipeline.

Thanks for your help,

-Jennifer Shelton


On Friday, February 20, 2015 at 10:13:41 AM UTC-5, Tim Fennell wrote:

Benjamin Schuster-Böckler

May 12, 2017, 1:48:21 PM
to strelka-discuss
I also ran into this issue with hg38. I was able to fix the initial issue with the following change: I replaced line 289 of configureStrelkaWorkflow.pl with

    $h{$vals[0]} = join(':', @vals[1..$#vals]);

which will keep the full chromosome name, even if the name contains colons. However, this does not ultimately work, because make doesn't like the generated makefile. I get the "target pattern contains no %" error when running make, presumably because colons in target names or make variables don't work.
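The make error can be reproduced outside strelka. The sketch below uses a made-up one-rule Makefile, not the rules strelka actually generates (those are not shown in this thread), and assumes GNU make. GNU make treats a rule line with a second colon as a static pattern rule, so it takes the text after the first colon as a target pattern, and a target pattern must contain a `%`.

```shell
# Reproduce the error with a made-up rule (not strelka's actual output).
demo_dir=$(mktemp -d)

# A rule whose target name contains colons. GNU make takes "HLA-A*01" as
# the target and "01" as a static-pattern target pattern.
printf '%s\n\t%s\n' 'HLA-A*01:01:01:01:' '@echo done' > "$demo_dir/Makefile"

# -n: dry run. Capture stderr as well; "|| true" keeps the script going
# when make exits with an error (or is not installed).
make_output=$(make -n -f "$demo_dir/Makefile" 2>&1 || true)
printf '%s\n' "$make_output"
```

With GNU make, the printed message should be the same "target pattern contains no %" error reported above.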

Since HLA*1:01 etc are now included in most BAM files aligned to hg38, it would be really important to update strelka to deal with this issue!

Andrej Benjak

Apr 16, 2019, 12:37:13 PM
to strelka-discuss
Thanks for the fix to configureStrelkaWorkflow.pl.

I managed to run Strelka1 by removing all entries with 'HLA-' in the Makefile and run.config.ini:

sed -i '/HLA\-/d' $ANALYSIS_DIR/Makefile

sed -r -i -e 's/(\tHLA\-[^\t]+)+//g' -e '/^chrom_HLA\-/d' $ANALYSIS_DIR/config/run.config.ini


This will make Strelka1 omit the HLA contigs, which was fine for me as I did not have any reads mapping to them anyway. 
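After in-place edits like these, a quick check that no HLA entries remain can catch a pattern that missed something. A minimal sketch, with a made-up function name (`check_no_hla`), assuming the files to check are passed as arguments:

```shell
# Hypothetical helper: fail if any of the given files still mentions "HLA-".
check_no_hla() {
    # grep -n prints each offending line with its line number.
    if grep -n -e 'HLA-' "$@"; then
        echo "HLA entries still present" >&2
        return 1
    fi
    echo "no HLA entries left"
}
```

For example: `check_no_hla "$ANALYSIS_DIR/Makefile" "$ANALYSIS_DIR/config/run.config.ini"`.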