Error: quality string length is not equal to sequence length

Komal Rathi

Sep 27, 2016, 6:46:00 PM

to rna-star

Hello,

I am using STAR v2.5.2b.

This is the genome-generate step:

/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STAR \
--runThreadN 4 \
--runMode genomeGenerate \
--genomeDir star_hg38_no_alt \
--genomeFastaFiles hg38.fa \
--sjdbGTFfile gencode.v23.annotation.gtf \
--sjdbOverhang 99

This is my mapping command line (I have tried both **STAR** and **STARlong**):

/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong --version
STAR_2.5.2b


/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong \
--runThreadN 4 \
--genomeDir star_hg38_no_alt \
--readFilesIn CHP134_R1.fastq.gz CHP134_R2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix CHP134_ \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--quantMode TranscriptomeSAM \
--outSAMattributes NH HI AS NM MD \
--outFilterType BySJout \
--outFilterMultimapNmax 20 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverReadLmax 0.04 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1 \
--sjdbScore 1 \
--limitBAMsortRAM 50000000000

But I am getting this error for one sample:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@NB501069:27:HY32VBGXX:1:12111:11340:4339
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC


SOLUTION: fix your fastq file

This is the corresponding read in the two FASTQ files. The read in the error message matches R1, but in the file its quality string length is equal to its sequence length:

$ zcat CHP134_R1.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 1:N:0:GCCAAT
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC
+
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEE<EEA


$ zcat CHP134_R2.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 2:N:0:GCCAAT
GTTTTCCTGGTGGCCCGGCCGTGCCTGAGGTTTCTCCCCGAGCCGCCGCCTCTGCGGGCTCCCGGGTGCCCTTGCCCTCGCGGTCCCCGGCCCTCGCCCGC
+
AAAAAEEEEEEEEEEEEEEEEEE<EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEE<AEE<
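The two `head | tail` commands above inspect a single record. As a rough sketch, the whole file could be scanned for records whose quality and sequence lengths differ. `check_fastq_lengths` is a hypothetical helper, not part of STAR, and it assumes plain 4-line FASTQ records (no line-wrapped sequences).

```shell
# check_fastq_lengths FILE.fastq.gz
# Hypothetical helper (not part of STAR): reports every record whose quality
# string length differs from its sequence length, assuming 4-line records.
# Returns 0 if the file looks consistent, 1 otherwise.
check_fastq_lengths() {
  gzip -dc "$1" | awk '
    NR % 4 == 1 { name = $0 }             # header line
    NR % 4 == 2 { seqlen = length($0) }   # sequence line
    NR % 4 == 0 && length($0) != seqlen { # quality line
      printf "record %d %s: seq=%d qual=%d\n", NR / 4, name, seqlen, length($0)
      bad++
    }
    END {
      # A line count that is not a multiple of 4 suggests a truncated file.
      if (NR % 4 != 0) { printf "incomplete final record: %d lines total\n", NR; bad++ }
      exit (bad > 0)
    }'
}

# Usage with the file names from this post:
#   check_fastq_lengths CHP134_R1.fastq.gz
#   check_fastq_lengths CHP134_R2.fastq.gz
```

If this reports nothing for both files, the FASTQ input itself is probably fine and the problem is more likely in how the files are being read.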

I saw some earlier replies where you suggest compiling STARlong (though this shouldn't be necessary, because my reads are not long). When I try that, make reports that there is no rule for the STARlong target:

$ pwd
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b


$ ls
bin  CHANGES.md  doc  extras  LICENSE  README.md  RELEASEnotes.md  source


$ make STARlong
make: *** No rule to make target 'STARlong'.  Stop.

I am getting this error in a lot of samples but there are other samples that aligned just fine. I have attached the STAR log for this sample.

star_log.txt

Alexander Dobin

Sep 28, 2016, 5:11:11 PM

to rna-star

Hi Komal,

If you restart the mapping, does it break on the same read every time?

Please try to map just this one read extracted into Read1 Read2 files.
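A minimal sketch of that single-read test, reusing the line number and file names quoted earlier in the thread. `extract_record`, the `one_R1.fastq` / `one_R2.fastq` outputs, and the `one_read_` prefix are hypothetical names chosen for illustration.

```shell
# extract_record FILE.fastq.gz LAST_LINE
# Hypothetical helper: prints the 4-line FASTQ record that ends at LAST_LINE.
extract_record() {
  gzip -dc "$1" | head -n "$2" | tail -n 4
}

# Usage with the file names and line number quoted in this thread:
#   extract_record CHP134_R1.fastq.gz 17136372 > one_R1.fastq
#   extract_record CHP134_R2.fastq.gz 17136372 > one_R2.fastq
#
# Then map only that pair. The extracted files are uncompressed, so
# --readFilesCommand is not needed here:
#   STAR --runThreadN 1 --genomeDir star_hg38_no_alt \
#        --readFilesIn one_R1.fastq one_R2.fastq \
#        --outFileNamePrefix one_read_
```

If the single pair maps without the error, the record itself is unlikely to be the cause.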

Cheers

Alex

Komal Rathi

Sep 29, 2016, 12:19:16 PM

to rna-star

I think STAR or STARlong wasn't compiled properly. I used conda to run STAR and no longer see the error.
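On the earlier "No rule to make target" message: in the STAR release tree the Makefile sits in the `source` subdirectory, so `make` has to be run from there rather than from the top-level directory. A hedged sketch follows; `build_star_target` is a hypothetical wrapper, and the conda command in the comments assumes the Bioconda package name `star`.

```shell
# build_star_target STAR_ROOT TARGET
# Hypothetical wrapper: runs make for TARGET (e.g. STAR or STARlong) inside
# STAR_ROOT/source, where the STAR Makefile is expected to live.
build_star_target() {
  star_root=$1
  target=$2
  if [ ! -f "$star_root/source/Makefile" ]; then
    echo "no Makefile under $star_root/source" >&2
    return 1
  fi
  # Subshell so the caller's working directory is left unchanged.
  ( cd "$star_root/source" && make "$target" )
}

# Usage with the path from this thread:
#   build_star_target /mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b STARlong
#
# Alternative that avoids compiling (assumes the Bioconda channel and the
# package name "star"):
#   conda install -c bioconda star
```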