--quantMode GeneCounts output: big differences between stranded and total counts

Magic_tm

Jul 8, 2019, 5:35:00 AM
to rna-star
Hi all,

I use STAR regularly, but I just encountered a really weird output with the --quantMode GeneCounts option.
For some genes, the count in column 2 is much lower than the sum of the stranded columns 3 + 4.

Here is my output:
N_unmapped 598099 598099 598099
N_multimapping 382731 382731 382731
N_noFeature 112716 10608161 10694861
N_ambiguous 1179474 44749 44951
AT1G01010 273 133 140
AT1G01020 249 123 127
AT1G01030 34 18 17
AT1G01040 1174 706 718
...
AT1G32630 26 11470 11613
AT1G32640 0 11604 11453

As you can see, for the vast majority of genes column 2 is roughly column 3 + column 4. I know it's not exactly the sum, but still. However, for some genes, like AT1G32640, the count is about 11k in column 3 and about 11k in column 4, yet column 2 is 0. Is this an issue with the GTF, even though I got it from the Arabidopsis website?
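To find every gene that shows this pattern rather than spotting them by eye, the counts file can be scanned with a short script. This is only a sketch: it assumes the usual four-column layout of the GeneCounts output (gene ID, unstranded count, and the two stranded counts), and the function name and both thresholds (`min_stranded`, `max_ratio`) are arbitrary choices made for illustration, not anything STAR defines.

```python
def flag_suspicious_genes(lines, min_stranded=100, max_ratio=0.1):
    """Return genes whose unstranded count is far below columns 3 + 4.

    min_stranded and max_ratio are illustrative thresholds, not STAR settings.
    """
    flagged = []
    for line in lines:
        fields = line.split()
        # Skip blank or truncated lines and the four N_* summary rows.
        if len(fields) != 4 or fields[0].startswith("N_"):
            continue
        gene = fields[0]
        unstranded, strand1, strand2 = (int(x) for x in fields[1:])
        stranded_sum = strand1 + strand2
        # Ignore weakly expressed genes, where the ratio is too noisy.
        if stranded_sum >= min_stranded and unstranded < max_ratio * stranded_sum:
            flagged.append((gene, unstranded, strand1, strand2))
    return flagged


# The rows quoted in this post, used as a small self-contained input.
example = """\
N_unmapped 598099 598099 598099
N_multimapping 382731 382731 382731
N_noFeature 112716 10608161 10694861
N_ambiguous 1179474 44749 44951
AT1G01010 273 133 140
AT1G01020 249 123 127
AT1G01030 34 18 17
AT1G01040 1174 706 718
AT1G32630 26 11470 11613
AT1G32640 0 11604 11453
"""

for row in flag_suspicious_genes(example.splitlines()):
    print(*row, sep="\t")
```

On the rows above this flags only AT1G32630 and AT1G32640. For a real run, the same function can be given an open file handle for the counts file instead of the example string.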

My command line:
$bin_DIR/STAR-2.6.1c/bin/Linux_x86_64/STAR \
--runThreadN 64 \
--readFilesCommand zcat \
--readFilesIn $Fastq_DIR/trimmed/"$Samples"_1.trim.gz $Fastq_DIR/trimmed/"$Samples"_2.trim.gz \
--genomeDir $Genome_DIR \
--outSAMstrandField intronMotif \
--outFilterType BySJout \
--outFilterIntronMotifs RemoveNoncanonical \
--quantMode TranscriptomeSAM GeneCounts \
--twopassMode Basic \
--outFileNamePrefix $Output_DIR/$Samples

I would be grateful for any tips you may have.
Cheers Thomas

Alexander Dobin

Jul 9, 2019, 6:56:32 PM
to rna-star
Hi @Magic_tm

If exons of these two genes overlap completely on *opposite* strands, the unstranded counting will yield 0, because all the reads will be considered "ambiguous".
The stranded counts, however, will be assigned to the corresponding gene.
The easiest way to check this is to view annotations on a genome browser.
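If a genome browser is not at hand, a rough scripted check of the annotation is also possible. The sketch below is not part of STAR; it simply reads the exon records of a GTF and reports pairs of genes that have overlapping exons on opposite strands. The function name, the toy records, and the gene IDs `GENE_A`, `GENE_B`, `GENE_C` are all made up for illustration, and it assumes standard nine-column, tab-separated GTF lines with a `gene_id` attribute.

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def opposite_strand_overlaps(gtf_lines):
    """Return sorted gene pairs whose exons overlap on opposite strands."""
    exons = defaultdict(list)  # chromosome -> [(start, end, strand, gene_id)]
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header or comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = GENE_ID.search(cols[8])
        if match:
            exons[cols[0]].append(
                (int(cols[3]), int(cols[4]), cols[6], match.group(1))
            )

    pairs = set()
    for chrom_exons in exons.values():
        chrom_exons.sort()  # by start coordinate
        for i, (start, end, strand, gene) in enumerate(chrom_exons):
            for o_start, o_end, o_strand, o_gene in chrom_exons[i + 1:]:
                if o_start > end:
                    break  # sorted by start, so no later exon can overlap
                if o_strand != strand and o_gene != gene:
                    pairs.add(tuple(sorted((gene, o_gene))))
    return sorted(pairs)


# Toy records with invented coordinates, only to show the expected input.
toy_gtf = [
    'Chr1\ttoy\texon\t100\t500\t.\t+\t.\tgene_id "GENE_A"; transcript_id "GENE_A.1";',
    'Chr1\ttoy\texon\t150\t450\t.\t-\t.\tgene_id "GENE_B"; transcript_id "GENE_B.1";',
    'Chr1\ttoy\texon\t900\t1200\t.\t+\t.\tgene_id "GENE_C"; transcript_id "GENE_C.1";',
]

print(opposite_strand_overlaps(toy_gtf))
```

The function reports any exon overlap between opposite-strand genes, not only complete overlap, so a hit is a hint to look at that locus more closely rather than proof that all reads there are ambiguous. For a real annotation, pass an open file handle for the GTF in place of `toy_gtf`.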

Cheers
Alex