sam_parse1: SEQ and QUAL are of different length

Ivan De Dios

Jan 3, 2017, 2:45:15 PM
to biovalidation
I get the following error when processing a Nextera Rapid Capture library. I've processed this type of library in bcbio before with no problems.

Here's the template YAML used:
# Template for paired (tumor/normal) variant calling
---
details:
  - analysis: variant2
    genome_build: GRCh37
    # In order to do paired variant calling, samples should belong to the
    # same batch ("batch" under "metadata" below") and have a "phenotype"
    # field stating either "normal" or tumor". For each batch there
    # should be a sample with "tumor" phenotype and a sample with "normal"
    # phenotype (no more than two samples per batch)
    metadata:
      batch: your-batch-name
      phenotype: tumor # or "normal"
    algorithm:
      aligner: bwa
      clinical_reporting: true
      coverage: /isilon/bcbio-manifests/DISC.bed
      ensemble:
        numpass: 1
      indelcaller: false
      mark_duplicates: true
      min_allele_fraction: 5
      realign: false
      recalibrate: false
      remove_lcr: true
      svcaller: [lumpy, manta, cnvkit]
      variant_regions: /isilon/bcbio-manifests/DISC.bed
      variantcaller: [vardict]
      # for targetted projects, set the region
      # variant_regions: /path/to/your.bed

Here's the debug log:
[fputs] Broken pipe
[W::sam_read1] parse error at line 126140
[bam_sort_core] truncated file. Aborting.
[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1] parse error at line 174842
[bam_sort_core] truncated file. Aborting.
/bin/bash: line 1:  8885 Exit 1                  /usr/local/share/bcbio/galaxy/../anaconda/bin/bwa mem -c 250 -M -t 16 -R '@RG\tID:NTP16-006160D_S7_R1_001\tPL:illumina\tPU:6_2016-12-30_DISCOVERY-121316-BG\tSM:NTP16-006160D_S7_R1_001' -v 1 /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/bwa/GRCh37.fa /isilon/scratch/DISCOVERY-121316-BG/work/align_prep/NTP16-006160D_S7_R1_001.fastq.gz /isilon/scratch/DISCOVERY-121316-BG/work/align_prep/NTP16-006160D_S7_R2_001.fastq.gz
                    8886 Segmentation fault      | /usr/local/share/bcbio/galaxy/../anaconda/bin/samblaster --addMateTags -M --splitterFile >(/usr/local/share/bcbio/galaxy/../anaconda/bin/samtools sort -@ 16 -m 682M -T /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpaaxtwg/NTP16-006160D_S7_R1_001-sort-sorttmp-spl -o /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpo0I2xz/NTP16-006160D_S7_R1_001-sort-sr.bam /dev/stdin) --discordantFile >(/usr/local/share/bcbio/galaxy/../anaconda/bin/samtools sort -@ 16 -m 682M -T /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpaaxtwg/NTP16-006160D_S7_R1_001-sort-sorttmp-disc -o /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpmS_IpW/NTP16-006160D_S7_R1_001-sort-disc.bam /dev/stdin)
                    8887 Done                    | /usr/local/share/bcbio/galaxy/../anaconda/bin/samtools view -b -S -u -
                    8888 Done                    | /usr/local/share/bcbio/galaxy/../anaconda/bin/sambamba sort -t 16 -m 682M --tmpdir /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpaaxtwg/NTP16-006160D_S7_R1_001-sort-sorttmp-full -o /isilon/scratch/DISCOVERY-121316-BG/work/align/NTP16-006160D_S7_R1_001/tx/tmpaaxtwg/NTP16-006160D_S7_R1_001-sort.bam /dev/stdin
' returned non-zero exit status 139
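
For reading the last line of the log: by shell convention, an exit status above 128 usually means the process was killed by signal number (status - 128), so 139 corresponds to signal 11. That is consistent with the "Segmentation fault" reported for process 8886 (samblaster). A minimal Python sketch of that decoding, where the helper name `describe_exit_status` is made up for illustration:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status.

    Shells report 128 + N when a command was terminated by signal N.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(139))  # the status from the log above
    print(describe_exit_status(1))    # the bwa stage reported "Exit 1"
```

On Linux, signal 11 is SIGSEGV, so the first call should name the segfault rather than an ordinary error code.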

Could this be a problem with the fastqs? I've gotten this same error on two separate bcbio-1.0.0 installations.
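
One way to rule the FASTQ files in or out is to scan them for records whose sequence and quality lines differ in length, which is the condition the sam_parse1 message describes. The sketch below is only an illustration: it assumes plain four-line FASTQ records (no wrapped sequences), and the function names `mismatched_records` and `check_fastq` are hypothetical, not part of bcbio.

```python
import gzip
from typing import Iterator, List, TextIO, Tuple


def mismatched_records(handle: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (record_number, header) for records where len(SEQ) != len(QUAL).

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    A record truncated at end of file is reported as a mismatch too.
    """
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\r\n")
        record += 1
        if len(seq) != len(qual):
            yield record, header.strip()


def check_fastq(path: str) -> List[Tuple[int, str]]:
    """Scan a FASTQ file (gzip-compressed if the name ends in .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return list(mismatched_records(handle))
```

Calling `check_fastq` on each of the two files in work/align_prep and getting an empty list back would suggest the inputs are intact and the corruption happens later in the pipeline.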

Brad Chapman

Jan 3, 2017, 8:18:20 PM
to Ivan De Dios, biovalidation

Ivan;
Thanks for the report and sorry about the problems.

> I get the following error when processing a nextera rapid capture library.
> [fputs] Broken pipe
> [W::sam_read1] parse error at line 126140
> [bam_sort_core] truncated file. Aborting.
> [E::sam_parse1] SEQ and QUAL are of different length
> [W::sam_read1] parse error at line 174842
> [bam_sort_core] truncated file. Aborting.
> /bin/bash: line 1: 8885 Exit 1 /usr/local/share/bcbio/galaxy/../anaconda/bin/bwa mem -c 250 -M -t 16 -R '@RG\tID:NTP16-006160D_S7_R1_001\tPL:illumina\tPU:6_2016-12-30_DISCOVERY-121316-BG\tSM:NTP16-006160D_S7_R1_001' -v 1 /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/bwa/GRCh37.fa /isilon/scratch/DISCOVERY-121316-BG/work/align_prep/NTP16-006160D_S7_R1_001.fastq.gz /isilon/scratch/DISCOVERY-121316-BG/work/align_prep/NTP16-006160D_S7_R2_001.fastq.gz
> 8886 Segmentation fault | /usr/local/share/bcbio/galaxy/../

We identified a segfault issue in samblaster, which Greg fixed, so you might have
an older version. You can update to 0.1.24, the fixed version, with:

bcbio_conda install -c bioconda samblaster

If that updates the version from something earlier, hopefully re-running will
fix the issue.
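
To decide whether an installation still predates the fix, the installed version string can be compared numerically against 0.1.24. The sketch below assumes plain dotted-integer version strings; how you obtain the installed version (for example from your conda package listing) is left to you, and the names `parse_version` and `needs_update` are hypothetical.

```python
FIXED_VERSION = "0.1.24"  # the samblaster release with the segfault fix


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '0.1.24' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed: str, fixed: str = FIXED_VERSION) -> bool:
    """Return True when the installed version is older than the fixed one."""
    return parse_version(installed) < parse_version(fixed)


if __name__ == "__main__":
    for version in ("0.1.9", "0.1.22", "0.1.24"):
        print(version, "needs update:", needs_update(version))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "0.1.9" above "0.1.24".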

If you still have issues you could try updating to the latest development
version:

bcbio_nextgen.py upgrade -u development

and removing lumpy from your svcaller specification:

> svcaller: [lumpy, manta, cnvkit]

bcbio will then use a different approach for marking duplicates, which should
avoid the problem.
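
If there are many sample entries to change, the svcaller edit can be scripted. This is a rough sketch that works on the configuration after it has been loaded into a Python dictionary with the same layout as the template above (details, then algorithm, then svcaller); loading and writing the YAML file is left out, and the function name `drop_svcaller` is made up for illustration.

```python
import copy


def drop_svcaller(config: dict, caller: str = "lumpy") -> dict:
    """Return a copy of the config with one caller removed from every svcaller list."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    for sample in updated.get("details", []):
        algorithm = sample.get("algorithm", {})
        callers = algorithm.get("svcaller")
        if isinstance(callers, list):
            algorithm["svcaller"] = [name for name in callers if name != caller]
    return updated


if __name__ == "__main__":
    config = {
        "details": [
            {
                "analysis": "variant2",
                "algorithm": {"svcaller": ["lumpy", "manta", "cnvkit"]},
            }
        ]
    }
    print(drop_svcaller(config)["details"][0]["algorithm"]["svcaller"])
    # prints ['manta', 'cnvkit']
```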

Hope one of these fixes it for you,
Brad

Ivan De Dios

Jan 4, 2017, 1:09:25 PM
to biovalidation, ided...@gmail.com
Hi Brad,

Thanks, the samblaster update worked and the analysis completed with no errors.

Ivan