Error: quality string length is not equal to sequence length


Komal Rathi

Sep 27, 2016, 6:46:00 PM
to rna-star
Hello,

I am using STAR v2.5.2b. 

This is the genome-generate step:

/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STAR \
--runThreadN 4 \
--runMode genomeGenerate \
--genomeDir star_hg38_no_alt \
--genomeFastaFiles hg38.fa \
--sjdbGTFfile gencode.v23.annotation.gtf \
--sjdbOverhang 99


This is my command line (I have tried both **STAR** and **STARlong**):

/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong --version
STAR_2.5.2b


/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong \
--runThreadN 4 \
--genomeDir star_hg38_no_alt \
--readFilesIn CHP134_R1.fastq.gz CHP134_R2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix CHP134_ \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--quantMode TranscriptomeSAM \
--outSAMattributes NH HI AS NM MD \
--outFilterType BySJout \
--outFilterMultimapNmax 20 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverReadLmax 0.04 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1 \
--sjdbScore 1 \
--limitBAMsortRAM 50000000000


But I am getting this error for one sample:


EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@NB501069:27:HY32VBGXX:1:12111:11340:4339
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC


SOLUTION: fix your fastq file


This is the corresponding read in the two fastq files. The sequence in the error matches R1, but there the quality string length is equal to the sequence length:


$ zcat CHP134_R1.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 1:N:0:GCCAAT
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC
+
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEE
<EEA


$ zcat CHP134_R2.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 2:N:0:GCCAAT
GTTTTCCTGGTGGCCCGGCCGTGCCTGAGGTTTCTCCCCGAGCCGCCGCCTCTGCGGGCTCCCGGGTGCCCTTGCCCTCGCGGTCCCCGGCCCTCGCCCGC
+
AAAAAEEEEEEEEEEEEEEEEEE
<EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEE<AEE<
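
The spot check above looks at a single record. A whole-file scan can show whether any record in the compressed file really has mismatched lengths. The following is only a minimal sketch: the function name `check_fastq_lengths` is made up for illustration, and it assumes strict four-line FASTQ records (no wrapped sequence or quality lines).

```shell
# Hypothetical helper (not part of STAR): report FASTQ records whose
# quality string length differs from the sequence length.
# Assumes strict four-line records: header, sequence, "+", quality.
check_fastq_lengths() {
    # $1: path to a gzip-compressed FASTQ file
    gzip -dc "$1" | awk '
        NR % 4 == 1 { name = $0 }               # header line of the record
        NR % 4 == 2 { seqlen = length($0) }     # sequence line
        NR % 4 == 0 && length($0) != seqlen {   # quality line with wrong length
            printf "line %d: %s seq=%d qual=%d\n", NR, name, seqlen, length($0)
            bad++
        }
        END {
            printf "%d mismatched record(s)\n", bad + 0
            exit (bad > 0)                      # non-zero status if any mismatch
        }
    '
}
```

Calling `check_fastq_lengths CHP134_R1.fastq.gz` and then the same for R2 would list every offending record with its line number. If both files come back with zero mismatches, the input is probably fine and the problem lies elsewhere.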



I saw some solutions where you suggest building STARlong (however, this shouldn't be necessary because my reads are not long). When I try to do that, I get "No rule to make target 'STARlong'":


$ pwd
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b


$ ls
bin  CHANGES.md  doc  extras  LICENSE  README.md  RELEASEnotes.md  source


$ make STARlong
make: *** No rule to make target 'STARlong'.  Stop.


I am getting this error in a lot of samples but there are other samples that aligned just fine. I have attached the STAR log for this sample.

star_log.txt

Alexander Dobin

Sep 28, 2016, 5:11:11 PM
to rna-star
Hi Komal,

If you restart the mapping, does it break on the same read every time?
Please try to map just this one read extracted into Read1 Read2 files.

Cheers
Alex
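
One possible way to pull that single read pair out for a test run is sketched below. The helper name `extract_read` and the output file names are hypothetical, and the sketch assumes four-line FASTQ records where the read name is the header text up to the first space.

```shell
# Hypothetical helper (not part of STAR): print the four-line FASTQ record
# whose header starts with the given read name.
extract_read() {
    # $1: gzip-compressed FASTQ file
    # $2: read name without the leading "@"
    gzip -dc "$1" | awk -v id="@$2" '
        NR % 4 == 1 { split($0, f, " "); keep = (f[1] == id) }  # match name only
        keep { print }                                           # header + 3 lines
    '
}
```

For example, `extract_read CHP134_R1.fastq.gz NB501069:27:HY32VBGXX:1:12111:11340:4339 > one_R1.fastq`, and the same for R2 into `one_R2.fastq`. The two small files can then be passed to `--readFilesIn`, dropping `--readFilesCommand zcat` since they are uncompressed.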

Komal Rathi

Sep 29, 2016, 12:19:16 PM
to rna-star
I think STAR or STARlong wasn't compiled properly. I ran STAR from conda instead and don't see the errors anymore.