EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length

Laura Spector

Jul 19, 2019, 12:35:02 PM
to rna-star
I am running STAR-2.7.1a and get the error "EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length". A previous post suggested using --readMapNumber 1; with that option, my output is:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@M03525:380:000000000-CDJ38:1:1101:13087:1473_GGTGTGCGTGGA
+
@M03525:380:000000000-CDJ38:1:1101:15214:1492_CGTGACATGGGG
SOLUTION: fix your fastq file

It looks like the record is entirely missing both the sequence and the quality string. The two paired-end files are the same length, and that length is divisible by 4. Interestingly, STAR runs fine on a copy of the files taken before trimming and barcode extraction. (Trimming and barcode extraction modify the read IDs slightly: the sample index, i.e., 1:N:0:TCGCCTTA, is removed and the barcode is added.)

Here's what I'm running:

~/STAR-2.7.1a/bin/Linux_x86_64$ ./STAR --runMode alignReads --runThreadN 48 --limitBAMsortRAM 10000000000 --readMapNumber 1 --genomeDir STAR_GENOME_overhang100 --readFilesIn DU_1_R1_done.fastq DU_1_R2_done.fastq --alignIntronMax 1 --alignMatesGapMax 2500 --outFileNamePrefix lmpcr_4000/ --outSAMtype BAM SortedByCoordinate > lmpcr_4000/LS501.STARstdout.log
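
A line count that is divisible by 4 does not by itself show that every record is well formed, so it can help to check each record directly. The following is a minimal sketch, not part of STAR: it assumes a plain 4-line-per-record FASTQ, and the function name first_bad_record and the example reads are made up for illustration. It reports the first record whose header, separator, sequence, or quality line looks wrong.

```python
from itertools import islice


def first_bad_record(lines):
    """Return (record_number, reason, record_lines) for the first malformed
    4-line FASTQ record, or None if every record passes these checks."""
    it = (line.rstrip("\r\n") for line in lines)
    record_number = 0
    while True:
        record = list(islice(it, 4))
        if not record:
            return None  # reached the end without finding a problem
        record_number += 1
        if len(record) < 4:
            return record_number, "truncated record (fewer than 4 lines)", record
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return record_number, "header does not start with '@'", record
        if not plus.startswith("+"):
            return record_number, "third line does not start with '+'", record
        if not seq:
            return record_number, "empty sequence", record
        if len(seq) != len(qual):
            return record_number, "sequence and quality lengths differ", record


# Hypothetical example: the second record has empty sequence and quality lines.
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "", "+", "",
]
print(first_bad_record(example))
```

The checks are positional, so a quality string that happens to begin with "@" is not mistaken for a header. To run it on a real file, pass an open text file handle instead of the list.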

Alexander Dobin

Jul 22, 2019, 11:10:54 AM
to rna-star
Hi Laura,

I suspect there is some kind of formatting problem with the fastq files.
Could you post (or email me) the first 4 lines of both read1 and read2, i.e.
head -n4 DU_1_R1_done.fastq DU_1_R2_done.fastq

Cheers
Alex
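
Beyond eyeballing the first 4 lines, a general sanity check for paired files is to confirm that the two files list the same reads in the same order. The sketch below is an illustration under assumptions, not something STAR provides: it assumes 4 lines per record, treats the first whitespace-delimited token of the header as the read ID, and strips a trailing "/<digit>" mate suffix. The helper names read_id, headers, and first_id_mismatch are hypothetical.

```python
import re
from itertools import islice, zip_longest

# Trailing mate marker such as "/1", "/2" or "/3" (assumed naming convention).
_MATE_SUFFIX = re.compile(r"/\d+$")


def read_id(header):
    """Reduce a FASTQ header line to a mate-independent read ID."""
    parts = header.rstrip("\r\n")[1:].split()
    name = parts[0] if parts else ""
    return _MATE_SUFFIX.sub("", name)


def headers(lines):
    """Yield every 4th line (the header) of a 4-line-per-record FASTQ stream."""
    return islice(lines, 0, None, 4)


def first_id_mismatch(r1_lines, r2_lines):
    """Return (record_number, id1, id2) for the first pair whose IDs differ,
    or None if all pairs agree. A missing mate is reported as None."""
    pairs = zip_longest(headers(r1_lines), headers(r2_lines))
    for number, (h1, h2) in enumerate(pairs, start=1):
        id1 = read_id(h1) if h1 is not None else None
        id2 = read_id(h2) if h2 is not None else None
        if id1 != id2:
            return number, id1, id2
    return None
```

Called with two open file handles, for example first_id_mismatch(open("DU_1_R1_done.fastq"), open("DU_1_R2_done.fastq")), it returns None when the IDs line up. If a record is missing a line, the positional framing shifts and a mismatch is reported from that point on, which is itself a useful signal.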

XX Yang

Jul 25, 2019, 9:34:43 AM
to rna-star
Hi Alex,

    I'm running STAR in 2-pass mode on RNA-seq data and got the error 'EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length'. This confuses me, because I previously ran STAR on another RNA-seq dataset sequenced by the same company and it finished successfully.

    The command line I ran this time was (STAR version 2.7.1a):
STAR --runMode alignReads --runThreadN 20 --genomeDir /datapool/Index_files/STAR_hg38_index/Homo_sapiens_assembly38.pri  /
--readFilesIn /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R1.fastq.gz /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R2.fastq.gz  /
--readFilesCommand gunzip -c --twopassMode Basic --outFileNamePrefix /datapool/home/fck/yangxx/SCZ59/result/RNAseq/STAR/SCZ_01.early/  /
           --outSAMtype BAM SortedByCoordinate --outSAMattrIHstart 0 --quantMode TranscriptomeSAM

    The first 4 lines of the R1 file were extracted with 'less -S /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R1.fastq.gz| head -n4':

@A00808:11:HJKWNDSXX:1:1101:3332:1063 1:N:0:CAAGGAGC+TCTCGCGC
CGCAGCCCACCTCATAAAACCCAGCATTCCCCTCACAGCAGATCACCAGCTTCTGTCCCTGGGGCTCAGCTGTCCCCCGCCGGTCCACAAACATGGTGTCAATCTCATTGCCATCACAGGCCAGCAGCTTTGCCCGGCGCCCATTACACT
+
F:FF,FF:FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FF:FF

    and the first 4 lines of the R2 file:

@A00808:11:HJKWNDSXX:1:1101:3332:1063 2:N:0:CAAGGAGC+TCTCGCGC
GGCCAGTCGACTTCCACTGGGAAGAACCCAGCAGCCGGAAGGAGTCTCGAGGGGGCCCTTCCCGCCGGGGTGTGGCCCTGCTTCGCCCAGAGCCCCTGCACCGGGGGACCGCAGACACCCTCCTCAACCGGGTTAAGAAGCTGCCTTGTC
+
FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FFFFFF,FFFFFFFFF,FFFFFFFFFFF:FFFFFFFFFFF


    The command line that ran successfully on the other RNA-seq dataset was (STAR version 2.5.2b, without the 2-pass mode parameters):
STAR --runMode alignReads --runThreadN 20 --genomeDir /home2/yangxx/STAR-index/mm10  /
--readFilesIn /home2/yangxx/visual_cortex/seq_data/Hiseq/BDbi-1_L4_I375.R1.clean.fastq.gz /home2/yangxx/visual_cortex/seq_data/Hiseq/BDbi-1_L4_I375.R2.clean.fastq.gz  /
--outFileNamePrefix /home2/yangxx/visual_cortex/result/STAR/BDbi-1/BDbi-1. --outSAMtype BAM SortedByCoordinate --outSAMattrIHstart 0  /
--readFilesCommand gunzip -c --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical --alignEndsType EndToEnd

    The first 4 lines of the BDbi R1 file:

@HISEQ:782:C9Y3HANXX:4:1101:1322:2223 1:N:0:TGAAGCT
TTGCGGTGCACGATGGAGGGGCCGGACTCATCGTACTCCTGCTTGCTGATCCACATCTGCTGGAAGGTGGACAGTGAGGCCAGGATGGAGCCACCGATCCACACAGAGTACTTGCGCTCAGGAGG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFBFFFFFFFFFFFFF/<FF<B<</BFFBF<FFFFFFBFF/BBFF/FB/BB

    The first 4 lines of the BDbi R2 file:

@HISEQ:782:C9Y3HANXX:4:1101:1322:2223 2:N:0:TGAAGCT
GAAGTGTGACGTTGACGTCCGTAAAGACCTCTATGCCAACACAGTGCTGTCTGGTGGTACCACCATGTACCCAGGCATTGCTGACAGGATGCAGAAGGAGATTACTGCTCTGGCTCCTAGCACCA
+
BBBBBFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FBBBFFF/FFFFFFFFFFFFFFFFFBF/B/BFF<


    Although the read length and sequencing platform differ between these two datasets, I assume that should not affect how STAR runs. What should I do to solve this problem?



Best wishes,
Xiaoxue Yang

Alexander Dobin

Jul 25, 2019, 2:48:03 PM
to rna-star
Hi XX Yang,

What was the exact error message from STAR? It lists the read name, sequence and qualities of the specific read that it rejected.

Cheers
Alex
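
Once the error message names the offending read, that record can be pulled out of the input for inspection, including from a gzipped file. The sketch below is a rough Python counterpart of a grep-with-context search. The names open_fastq and find_record are hypothetical, and it assumes the read name in the message matches the start of the header line in the file.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a FASTQ file as text, decompressing on the fly if it ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_record(lines, read_name, context=3):
    """Return the first line starting with '@' + read_name plus the next
    `context` lines, or None if no line matches."""
    it = (line.rstrip("\r\n") for line in lines)
    target = "@" + read_name.lstrip("@")
    for line in it:
        if line.startswith(target):
            # Take the following lines as they are, even if the record is malformed.
            return [line] + list(islice(it, context))
    return None
```

For example, `with open_fastq(path) as handle: print(find_record(handle, name))` prints the matching header and the three lines after it. The search is line-based rather than record-based on purpose, so it still finds the read when an earlier record is missing a line.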

Jenny McGrady

Aug 21, 2019, 1:15:34 PM
to rna-star
Hi, I'm running into this issue as well.

I'm having trouble mapping my reads that were trimmed with Scythe. To troubleshoot, I tried mapping just the first read, but I still get an error. With the whole trimmed file (without the readMapNumber option), the error is that the read ID does not start with @ or <. With the single read, the error is that the quality string length is not equal to the sequence length. When I look at the trimmed fastq files, neither error appears to be correct. Also, the same command works fine on the untrimmed files. Here is the exact error that I'm getting:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@K00208:YBP029:8:1101:78.29:13.82#0/1
+
@K00208:YBP029:8:1101:88.44:13.82#0/1

SOLUTION: fix your fastq file

Aug 20 15:42:21 ...... FATAL ERROR, exiting


For reference, this is the read I was attempting to map that got the error:
read1

@K00208:YBP029:8:1101:78.29:13.82#0/1
NTCCCGGAGCCGGTCGCGGCGCACCGCCACGGTGGAAGTGCGCCCGGCNGCGNCCGGTNNCNNNCNNGNGNNNNNNNNNNNNNNNNNNNNNNNNCCGGNCC
+
@``eeiii`ieiiiiii`[eiiiiiii`iiiiiiiiii[ie`iiiiei@eii@iiii`@@i@@@i@@i@L@@@@@@@@@@@@@@@@@@@@@@@@V`i[@ee

read2

@K00208:YBP029:8:1101:78.29:13.82#0/3
NAAACCGTTAAGAGGTAAACGGGTGGAGTCCGCGCAGTCCGNCCGGAGGANTCANCACAGCGNNGAGCNANCNNNCNNGNCGNTGNTNCCNGCGGATCTTT
+
@`[[[`e`eeieieLe[iieLei`V`LVeeiii`eieii`V@`iLVLV[i@VeV@LLVLVVV@@VLV[@L@L@@@V@@L@VL@L[@e@eL@LVHL[[`[V`

And here was my command for mapping:
STAR --runMode alignReads --genomeDir ~/project-kmartin/Jenny/mm10_indices/mm10StarIndex/ --runThreadN 64 --readFilesIn ~/project-kmartin/Jenny/RNASeq/demulti/ACSF2_1_trimmed.fastq ~/project-kmartin/Jenny/RNASeq/demulti/ACSF2_2_trimmed.fastq --outFileNamePrefix ~/project-kmartin/Jenny/RNASeq/aligned/ACSF2 --outSAMtype BAM SortedByCoordinate --outWigType bedGraph --genomeLoad LoadAndKeep --limitBAMsortRAM 40000000000 --quantMode GeneCounts --readMapNumber 1

Any help would be greatly appreciated!

--Jenny

Alexander Dobin

Aug 22, 2019, 2:09:01 PM
to rna-star
Hi Jenny,

I think the problem is that the trimmed file is missing the sequence of this read. STAR will not process reads with empty sequences.
What is the output of
grep -A3 ^@K00208:YBP029:8:1101:78.29:13.82  ~/project-kmartin/Jenny/RNASeq/demulti/ACSF2_1_trimmed.fastq

Cheers
Alex
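
If the trimmed file does turn out to contain records with an empty sequence, one possible workaround is to drop those read pairs from both files before mapping. The sketch below is only an illustration under assumptions: it assumes both files still hold the same reads in the same order at 4 lines per record, and that discarding both mates is acceptable for the analysis. The names records, keep_nonempty_pairs, and filter_files are hypothetical.

```python
from itertools import islice


def records(lines):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ
    stream. A truncated final record is silently dropped."""
    it = (line.rstrip("\r\n") for line in lines)
    while True:
        rec = tuple(islice(it, 4))
        if len(rec) < 4:
            return
        yield rec


def keep_nonempty_pairs(r1_lines, r2_lines, min_length=1):
    """Yield (rec1, rec2) pairs in which both mates have at least
    min_length bases; pairs with a shorter or empty mate are skipped."""
    for rec1, rec2 in zip(records(r1_lines), records(r2_lines)):
        if len(rec1[1]) >= min_length and len(rec2[1]) >= min_length:
            yield rec1, rec2


def filter_files(r1_in, r2_in, r1_out, r2_out, min_length=1):
    """Write the surviving pairs to two new files and return how many were kept."""
    kept = 0
    with open(r1_in) as a, open(r2_in) as b, \
            open(r1_out, "w") as out1, open(r2_out, "w") as out2:
        for rec1, rec2 in keep_nonempty_pairs(a, b, min_length):
            out1.write("\n".join(rec1) + "\n")
            out2.write("\n".join(rec2) + "\n")
            kept += 1
    return kept
```

Filtering both files in lockstep keeps the mates paired, which filtering each file on its own would not guarantee. Where the trimming tool can discard too-short reads itself, that is likely the simpler route.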