Issue with my alignment or poor small RNA-seq dataset?

Mar 30, 2022, 9:04:21 AM
Hello Alex,

I am using STAR to analyze a small RNA-seq dataset for the first time. Sequencing depth was already a problem, so there are only about 5M reads per sample. I used ENCODE's small RNA parameters:

params=' --runThreadN 4
--genomeDir ~/genomes/star_indices/grch38.p13/index/
--readFilesCommand zcat
--sjdbGTFfile /home/blencowe/blencowe1/jlim/genomes/star_indices/grch38.p13/gencode.v38.primary_assembly.annotation.mirna.gtf
--alignEndsType EndToEnd
--outFilterMismatchNmax 1
--outFilterMultimapScoreRange 0
--quantMode TranscriptomeSAM GeneCounts
--outReadsUnmapped Fastx
--outSAMtype BAM SortedByCoordinate
--outFilterMultimapNmax 10
--outSAMunmapped Within
--outFilterScoreMinOverLread 0
--outFilterMatchNminOverLread 0
--outFilterMatchNmin 16
--alignSJDBoverhangMin 1000
--alignIntronMax 1
--outWigType wiggle
--outWigStrand Stranded
--outWigNorm RPM'

I have two major concerns about the alignment. Here are a few red flags from the log file:

                   Uniquely mapped reads number |        614121
                        Uniquely mapped reads % |        22.03%
                          Average mapped length |        50.49

The number of uniquely mapped reads is low (22%), and the average mapped length (about 50 nt) is too long for miRNA.

Furthermore, when running featureCounts, none of my aligned reads were assigned to mature miRNAs from the miRBase gff3 file.

Do you think these issues come from low depth and poor size selection?

Thank you in advance for your assistance!


Alexander Dobin

May 2, 2022, 2:46:03 PM
Hi Justin,

Sorry for the belated reply.
For small RNA-seq, it's important to trim the adapter before mapping, otherwise you may not be able to see microRNA.
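
As a rough sketch of what trimming could look like in practice: the first function below estimates what percentage of reads still carry the 3' adapter, and the second wraps a cutadapt call that removes it before running STAR. The adapter sequence (the Illumina TruSeq Small RNA 3' adapter), the 12-base search prefix, the function names, and the choice of cutadapt are all assumptions for illustration and are not from this thread; use the adapter that matches your library prep kit. The 16 nt minimum length is chosen only to mirror --outFilterMatchNmin 16 above.

```shell
#!/usr/bin/env bash
# ASSUMPTION: TruSeq Small RNA 3' adapter. Replace with your kit's adapter.
ADAPTER="TGGAATTCTCGGGTGCCAAGG"

# Print the percentage of reads in a FASTQ file (plain or gzipped) whose
# sequence contains the first 12 bases of the adapter (exact match only).
# A high percentage suggests the reads were not adapter-trimmed.
adapter_fraction() {
    local fastq="$1" adapter="$2"
    local prefix="${adapter:0:12}"
    # gzip -cdf decompresses gzipped input and passes plain text through.
    gzip -cdf "$fastq" | awk -v p="$prefix" '
        NR % 4 == 2 { total++; if (index($0, p) > 0) hits++ }
        END {
            if (total == 0) print "0.0"
            else printf "%.1f\n", 100 * hits / total
        }'
}

# Trim the 3' adapter with cutadapt (must be installed separately) and
# discard reads shorter than 16 nt after trimming.
trim_adapter() {
    local in_fastq="$1" out_fastq="$2" adapter="$3"
    cutadapt -a "$adapter" -m 16 -o "$out_fastq" "$in_fastq"
}
```

For example, `adapter_fraction sample.fastq.gz "$ADAPTER"` on the raw reads, followed by `trim_adapter sample.fastq.gz sample.trimmed.fastq.gz "$ADAPTER"`, where the file names are placeholders. The trimmed file would then be passed to STAR via --readFilesIn.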


