Hello,
I am using STAR v2.5.2b.
This is the genome-generate step:
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STAR \
--runThreadN 4 \
--runMode genomeGenerate \
--genomeDir star_hg38_no_alt \
--genomeFastaFiles hg38.fa \
--sjdbGTFfile gencode.v23.annotation.gtf \
--sjdbOverhang 99
This is my command line (I have tried both **STAR** and **STARlong**):
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong --version
STAR_2.5.2b
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b/bin/Linux_x86_64/STARlong \
--runThreadN 4 \
--genomeDir star_hg38_no_alt \
--readFilesIn CHP134_R1.fastq.gz CHP134_R2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix CHP134_ \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--quantMode TranscriptomeSAM \
--outSAMattributes NH HI AS NM MD \
--outFilterType BySJout \
--outFilterMultimapNmax 20 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverReadLmax 0.04 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1 \
--sjdbScore 1 \
--limitBAMsortRAM 50000000000
But I am getting this error for one sample:
EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@NB501069:27:HY32VBGXX:1:12111:11340:4339
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC
SOLUTION: fix your fastq file
This is the corresponding read in the two FASTQ files. The read in the error message matches R1, but in the file its quality string is the same length as its sequence:
$ zcat CHP134_R1.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 1:N:0:GCCAAT
GCGGGCGGGGAAGAGGGCACAGACGGGCGGGCGAGGGCCGGGGACCGCGAGGGCAAGGGCACCCGGGAGCCCGCAGAGGCGGCGGCTCGGGGAGAAACCTC
+
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEE<EEA
$ zcat CHP134_R2.fastq.gz | head -n 17136372 | tail -4
@NB501069:27:HY32VBGXX:1:12111:11340:4339 2:N:0:GCCAAT
GTTTTCCTGGTGGCCCGGCCGTGCCTGAGGTTTCTCCCCGAGCCGCCGCCTCTGCGGGCTCCCGGGTGCCCTTGCCCTCGCGGTCCCCGGCCCTCGCCCGC
+
AAAAAEEEEEEEEEEEEEEEEEE<EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEE<AEE<
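Looking at one record by eye only rules out that record. A whole-file check along the following lines can show whether any other record has mismatched lengths or whether the file ends mid-record. This is a minimal sketch: `check_fastq_lengths` is a hypothetical helper name, and it assumes strict four-line FASTQ records (no wrapped sequences) read as uncompressed text on stdin.

```shell
# Hypothetical helper: report every FASTQ record whose sequence and quality
# strings differ in length. Assumes strict 4-line records on stdin.
check_fastq_lengths() {
  awk '
    NR % 4 == 1 { name = $0 }               # header line
    NR % 4 == 2 { seq_len = length($0) }    # sequence line
    NR % 4 == 0 && length($0) != seq_len {  # quality line
      printf "line %d\t%s\tseq=%d\tqual=%d\n", NR, name, seq_len, length($0)
      bad++
    }
    END {
      # A line count that is not a multiple of 4 means the last record is cut off.
      if (NR % 4 != 0) {
        printf "truncated: %d lines is not a multiple of 4\n", NR
        bad++
      }
      exit (bad > 0)
    }
  '
}
```

It could be run as `zcat CHP134_R1.fastq.gz | check_fastq_lengths` and again for R2. No output and exit status 0 would mean every record in that stream is consistent.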
I saw some solutions where you suggest building STARlong (although this shouldn't be necessary, because my reads are not long). When I try to do that, make reports that there is no rule to make STARlong:
$ pwd
/mnt/isilon/cbmi/variome/bin/star/STAR-2.5.2b
$ ls
bin CHANGES.md doc extras LICENSE README.md RELEASEnotes.md source
$ make STARlong
make: *** No rule to make target 'STARlong'. Stop.
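One possible explanation is that `make` was run from the top-level directory, which the `ls` output shows has no Makefile. In the standard STAR source distribution the Makefile is in the `source` subdirectory, but that has not been verified for this install. The sketch below searches for a Makefile that defines a given target; `find_make_target` is a hypothetical helper name, not part of STAR.

```shell
# Hypothetical helper: list Makefiles under a directory that define a target.
#   $1 = target name, $2 = directory to search (default: current directory)
find_make_target() {
  grep -rlE --include=Makefile "^$1[[:space:]]*:" "${2:-.}"
}
```

If `find_make_target STARlong` prints `./source/Makefile`, then running `make STARlong` from inside `source` would be the thing to try.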
I am getting this error for many samples, while other samples aligned just fine. I have attached the STAR log for this sample.
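Since only some samples fail, one thing worth ruling out is a damaged or truncated `.gz` file among them. The sketch below runs `gzip -t` over a list of files and flags any that fail the integrity test; `check_gz_integrity` is a hypothetical helper name, and whether corruption is actually the cause here is only an assumption.

```shell
# Hypothetical helper: test each gzip file and report which ones are damaged.
# Returns non-zero if at least one file fails `gzip -t`.
check_gz_integrity() {
  status=0
  for f in "$@"; do
    if gzip -t "$f" 2>/dev/null; then
      echo "OK      $f"
    else
      echo "CORRUPT $f"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_gz_integrity *_R1.fastq.gz *_R2.fastq.gz` would print one line per file, making it easy to compare the failing samples against the ones that aligned.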