Parameters and expected output using STAR for mapping microRNA

Yuquan Tong

Mar 30, 2021, 10:40:22 AM

to rna-star

Hi,

I found a 2013 email thread about using STAR to map small RNA-seq data and have tested it with my own dataset. Since the parameters from that email are rather old, I want to post them and confirm whether I need to make any changes, and whether my outputs look good.

The parameters I used are below (partly from the previous email discussion and partly from the ENCODE project guidelines):

--sjdbGTFfile ~/STARfiles/GENCODE_miRNA_subset.gtf \
--readFilesCommand zcat \
--outFileNamePrefix ~/155micro/STARoutput/SA-463-01/463-01- \
--outSAMtype BAM SortedByCoordinate \
--outSAMattributes Standard \
--alignEndsType EndToEnd \
--outFilterMismatchNmax 1 \
--outFilterMultimapScoreRange 0 \
--quantMode TranscriptomeSAM GeneCounts \
--outReadsUnmapped Fastx \
--outFilterMultimapNmax 10 \
--outSAMunmapped Within \
--outFilterScoreMinOverLread 0 \
--outFilterMatchNminOverLread 0 \
--outFilterMatchNmin 16 \
--alignSJDBoverhangMin 1000 \
--alignIntronMax 1

The input files were trimmed with cutadapt using --minimum-length 1.
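Because the reads were trimmed with --minimum-length 1 while STAR was run with --outFilterMatchNmin 16, any read shorter than 16 nt cannot reach the required number of matched bases and will be reported as unmapped. A minimal Python sketch for estimating how many trimmed reads fall below that length is given here; it assumes plain 4-line FASTQ records, and the function names and toy records are hypothetical, not taken from this thread.

```python
from collections import Counter

# Value passed to --outFilterMatchNmin in the STAR command above.
MIN_MATCHED_BASES = 16


def read_lengths(fastq_lines):
    """Yield the sequence length of each record in an iterable of FASTQ lines.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the second line of every record
            yield len(line.strip())


def fraction_too_short(lengths, min_len=MIN_MATCHED_BASES):
    """Return the fraction of reads whose length is below min_len."""
    counts = Counter(lengths)
    total = sum(counts.values())
    short = sum(n for length, n in counts.items() if length < min_len)
    return short / total if total else 0.0


# Hypothetical toy records for illustration only.
seq_long = "TGAGGTAGTAGGTTGTATAGTT"
seq_short = "ACGTACGT"
example = [
    "@r1", seq_long, "+", "I" * len(seq_long),
    "@r2", seq_short, "+", "I" * len(seq_short),
]

print(fraction_too_short(read_lengths(example)))
```

For a real trimmed file, the same functions could be fed from `gzip.open(path, "rt")`, which yields text lines from a gzipped FASTQ. If the fraction is large, raising the cutadapt minimum length would remove reads that STAR cannot align under these settings anyway.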

The Log.out report screenshot is below.

Since this is my first time using STAR for small RNAs, could you tell me whether my output looks reasonable? If not, is there any parameter I should tweak to improve it?

Thanks a lot!

Yuquan

Alexander Dobin

Mar 31, 2021, 5:35:27 PM

to rna-star

Hi @yuquantong97

the parameters are still OK.

The results look good to me, with few unmappable reads and a nice proportion of unique mappers.
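The two quantities mentioned here (unmappable reads and unique mappers) are reported in the summary file that STAR writes as Log.final.out. As a rough sketch, the Python below pulls those percentages out of the file text. The field labels are written as I recall them from recent STAR versions and should be checked against your own file; the numbers in the example are hypothetical, not the poster's results.

```python
def parse_star_final_log(text):
    """Return a dict of the 'label | value' lines in a Log.final.out file."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers and blank lines carry no separator
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Convert a value such as '60.00%' to the float 60.0."""
    return float(stats[key].rstrip("%"))


# Hypothetical excerpt; labels assumed to match STAR's Log.final.out layout.
EXAMPLE_LOG = "\n".join([
    "                Number of input reads |\t1000000",
    "                         UNIQUE READS:",
    "              Uniquely mapped reads % |\t60.00%",
    "                 MULTI-MAPPING READS:",
    "   % of reads mapped to multiple loci |\t30.00%",
    "                      UNMAPPED READS:",
    "       % of reads unmapped: too short |\t5.00%",
])

stats = parse_star_final_log(EXAMPLE_LOG)
print(percent(stats, "Uniquely mapped reads %"))
print(percent(stats, "% of reads unmapped: too short"))
```

Collecting these values across samples makes it easy to spot a library whose unique-mapping rate is out of line with the others.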

Cheers

Alex

Yuquan Tong

Apr 4, 2021, 1:51:24 AM

to rna-star

Sorry, I think I replied only by email, not on the website. I have copied my reply below:

Thank you so much, Alex.

I have a follow-up question. To analyze the BAM files, I used featureCounts to generate a read count table for DESeq2 analysis.

Here is the command I ran for featureCounts, using the GFF file downloaded from miRBase:

featureCounts -a hsa.gff3 -t miRNA -g 'Name' -o 460-01_counts.txt /path/463-01-Aligned.sortedByCoord.out.bam

I attached a screenshot of my featureCounts result summary below. It seems only 26.3% of reads were successfully assigned. Is this normal, because many reads mapped not to miRNA loci but elsewhere in the genome?

If not normal, what can I try to improve the percentage of assigned reads?

Thanks!

Yuquan

Alexander Dobin

Apr 8, 2021, 6:28:47 PM

to rna-star

Hi @yuquantong97

This is probably normal - it means that only ~26% of your reads are miRNA. The rest are probably other non-coding RNA, or fragments of them.

A high % of miRNA is hard to achieve unless the protocol uses very careful size-selection steps that filter out RNAs longer than ~24 nt.
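To see where the unassigned reads went, the .summary file that featureCounts writes next to its count table can be broken down by category. The Python below is a minimal sketch that assumes the usual tab-separated layout (a header line, then one status per row); the counts in the example are hypothetical and only chosen to give a 26.3% assigned fraction.

```python
def summarize_featurecounts(summary_text):
    """Return (counts per status, fraction assigned) from a .summary file.

    Assumes a header line followed by 'Status<TAB>count' rows for one BAM.
    """
    counts = {}
    lines = summary_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row ("Status<TAB>bam path")
        status, n = line.split("\t")[:2]
        counts[status] = int(n)
    total = sum(counts.values())
    assigned = counts.get("Assigned", 0) / total if total else 0.0
    return counts, assigned


# Hypothetical summary for illustration only.
EXAMPLE_SUMMARY = "\n".join([
    "Status\tsample.bam",
    "Assigned\t263",
    "Unassigned_Unmapped\t37",
    "Unassigned_MultiMapping\t100",
    "Unassigned_NoFeatures\t600",
])

counts, assigned = summarize_featurecounts(EXAMPLE_SUMMARY)
print(f"assigned: {assigned:.1%}")
for status, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(status, n)
```

A large Unassigned_NoFeatures row is consistent with reads coming from other non-coding RNAs that are absent from the miRBase annotation. A large Unassigned_MultiMapping row would instead reflect that featureCounts skips multi-mapping reads unless the -M option is given.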