Hi Alex,
I'm running STAR in 2-pass mode on RNA-seq data and got the error 'EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length'. This confuses me because I previously ran STAR on another RNA-seq dataset sequenced by the same company, and it finished successfully.
The command line I ran this time was (STAR version 2.7.1a):
STAR --runMode alignReads --runThreadN 20 --genomeDir /datapool/Index_files/STAR_hg38_index/Homo_sapiens_assembly38.pri \
--readFilesIn /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R1.fastq.gz /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R2.fastq.gz \
--readFilesCommand gunzip -c --twopassMode Basic --outFileNamePrefix /datapool/home/fck/yangxx/SCZ59/result/RNAseq/STAR/SCZ_01.early/ \
--outSAMtype BAM SortedByCoordinate --outSAMattrIHstart 0 --quantMode TranscriptomeSAM
The first 4 lines of the R1 file, extracted with 'less -S /datapool/home/fck/yangxx/SCZ59/seqdata/RNAseq/SCZ_01.early.R1.fastq.gz| head -n4':
@A00808:11:HJKWNDSXX:1:1101:3332:1063 1:N:0:CAAGGAGC+TCTCGCGC
CGCAGCCCACCTCATAAAACCCAGCATTCCCCTCACAGCAGATCACCAGCTTCTGTCCCTGGGGCTCAGCTGTCCCCCGCCGGTCCACAAACATGGTGTCAATCTCATTGCCATCACAGGCCAGCAGCTTTGCCCGGCGCCCATTACACT
+
F:FF,FF:FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FF:FF
and the first 4 lines of the R2 file:
@A00808:11:HJKWNDSXX:1:1101:3332:1063 2:N:0:CAAGGAGC+TCTCGCGC
GGCCAGTCGACTTCCACTGGGAAGAACCCAGCAGCCGGAAGGAGTCTCGAGGGGGCCCTTCCCGCCGGGGTGTGGCCCTGCTTCGCCCAGAGCCCCTGCACCGGGGGACCGCAGACACCCTCCTCAACCGGGTTAAGAAGCTGCCTTGTC
+
FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FFFFFF,FFFFFFFFF,FFFFFFFFFFF:FFFFFFFFFFF
The command line that ran successfully for the other RNA-seq dataset was (STAR version 2.5.2b, without the 2-pass mode parameters):
STAR --runMode alignReads --runThreadN 20 --genomeDir /home2/yangxx/STAR-index/mm10 \
--readFilesIn /home2/yangxx/visual_cortex/seq_data/Hiseq/BDbi-1_L4_I375.R1.clean.fastq.gz /home2/yangxx/visual_cortex/seq_data/Hiseq/BDbi-1_L4_I375.R2.clean.fastq.gz \
--outFileNamePrefix /home2/yangxx/visual_cortex/result/STAR/BDbi-1/BDbi-1. --outSAMtype BAM SortedByCoordinate --outSAMattrIHstart 0 \
--readFilesCommand gunzip -c --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical --alignEndsType EndToEnd
The first 4 lines of the BDbi R1 file:
@HISEQ:782:C9Y3HANXX:4:1101:1322:2223 1:N:0:TGAAGCT
TTGCGGTGCACGATGGAGGGGCCGGACTCATCGTACTCCTGCTTGCTGATCCACATCTGCTGGAAGGTGGACAGTGAGGCCAGGATGGAGCCACCGATCCACACAGAGTACTTGCGCTCAGGAGG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFBFFFFFFFFFFFFF/<FF<B<</BFFBF<FFFFFFBFF/BBFF/FB/BB
The first 4 lines of the BDbi R2 file:
@HISEQ:782:C9Y3HANXX:4:1101:1322:2223 2:N:0:TGAAGCT
GAAGTGTGACGTTGACGTCCGTAAAGACCTCTATGCCAACACAGTGCTGTCTGGTGGTACCACCATGTACCCAGGCATTGCTGACAGGATGCAGAAGGAGATTACTGCTCTGGCTCCTAGCACCA
+
BBBBBFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FBBBFFF/FFFFFFFFFFFFFFFFFBF/B/BFF<
Although the read length and sequencing platform differ between these two datasets, I would not expect that to affect how STAR runs. What should I do to solve this problem?
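The head output above covers only the first record of each file, so a malformed record further in would not show up there. A minimal sketch of a full scan follows, assuming standard 4-line FASTQ records and gzip compression; check_fastq is a hypothetical helper name, not part of STAR:

```shell
# Hypothetical helper (not part of STAR): report every FASTQ record whose
# quality string length differs from its sequence length.
# Assumes a gzip-compressed file with exactly 4 lines per record.
check_fastq() {
    gunzip -c "$1" | awk '
        NR % 4 == 1 { name = $0 }            # header line
        NR % 4 == 2 { seqlen = length($0) }  # sequence line
        NR % 4 == 0 && length($0) != seqlen {
            # quality line whose length does not match the sequence
            printf "record %d (%s): seq=%d qual=%d\n", NR / 4, name, seqlen, length($0)
            bad++
        }
        END {
            # a line count that is not a multiple of 4 suggests truncation
            if (NR % 4 != 0) { print "file ends mid-record: " NR " lines"; bad++ }
            exit (bad > 0)
        }'
}
```

Running check_fastq on SCZ_01.early.R1.fastq.gz and then on SCZ_01.early.R2.fastq.gz would print any mismatching records and return a non-zero status if it finds one. No output and a zero status would suggest the files are well-formed under these assumptions.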
Best wishes,
Xiaoxue Yang