small RNA alignment

Ron

Feb 13, 2018, 3:50:23 PM
to rna-star
Hi Alex,

Could you suggest which parameters to use when aligning small RNA samples?

Here is a description of the samples:
"From the extracted RNA, small RNA libraries were prepared for HiSeq sequencing: 50 bp single reads, 10-15M reads per sample. All 12 samples were multiplexed together in one lane of 1x50 bp SR sequencing, giving an average of approximately 12.5M reads per sample."

I read this post, but it does not mention read length.

Also, do I have to index the genome separately for this, or does only the GTF file change compared to regular RNA-seq alignment?

Thanks,
Ron

Alexander Dobin

Feb 19, 2018, 10:43:28 AM
to rna-star
Hi Ron,

for small RNA you may want to prohibit splicing with --alignIntronMax 1 and, if you use indexes generated with annotations (GTF), also set --alignSJDBoverhangMin 10000.
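
As a rough illustration of the suggestion above, the two options can be appended to an otherwise ordinary STAR call. This is only a sketch: the helper name `small_rna_star_cmd`, the paths `genome_dir` and `reads.fastq`, and the output prefix are placeholders invented for the example, not anything from this thread.

```python
import shlex


def small_rna_star_cmd(genome_dir, fastq, out_prefix, index_has_gtf=True, threads=4):
    """Assemble a STAR command line that disallows spliced alignments."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,          # placeholder path to an existing index
        "--readFilesIn", fastq,             # placeholder single-end FASTQ
        "--outFileNamePrefix", out_prefix,
        "--alignIntronMax", "1",            # prohibit splicing
    ]
    if index_has_gtf:
        # Only relevant when the index was generated with annotations (GTF).
        cmd += ["--alignSJDBoverhangMin", "10000"]
    return cmd


cmd = small_rna_star_cmd("genome_dir", "reads.fastq", "sample1_")
print(shlex.join(cmd))
# To actually launch STAR: subprocess.run(cmd, check=True)
```

The command is only printed here, so it can be inspected before anything is run.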

Cheers
Alex

Ron

Apr 30, 2018, 6:19:54 PM
to rna-star
Hi Alex,

Here are my parameters and mapping stats from one of the small RNA samples:


    --alignIntronMax 1 \
    --clip3pAdapterSeq TGGAATTCTC \
    --clip3pAdapterMMp 0.1 \
    --alignSJDBoverhangMin 10000 \
    --runThreadN 4 \
    --readFilesIn {1} {2} \
    --outSAMattributes All \
    --outFilterMultimapNmax 20 \
    --outFilterMismatchNmax 6 \
    --outFilterScoreMinOverLread 0 \
    --outFilterMatchNminOverLread 0 \
    --outFilterMismatchNoverLmax 0.05 \
    --outFilterMatchNmin 16 \
    --genomeSAindexNbases 14 \
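
To see how the two mismatch filters above interact on short reads, here is a small arithmetic sketch. It assumes, based on my reading of the STAR manual, that `--outFilterMismatchNoverLmax` is a ratio relative to the mapped length and that an alignment has to satisfy both it and `--outFilterMismatchNmax`. The function name is made up for the example and is not part of STAR.

```python
import math


def max_allowed_mismatches(mapped_length, mismatch_nmax=6, mismatch_over_l_max=0.05):
    """Largest mismatch count that passes both filters (under the assumptions above)."""
    # Small epsilon guards against floating-point results like 1.9999999.
    by_ratio = math.floor(mismatch_over_l_max * mapped_length + 1e-9)
    return min(mismatch_nmax, by_ratio)


for length in (16, 22, 41, 76):
    print(length, max_allowed_mismatches(length))
```

Under these assumptions a 22-nt alignment tolerates one mismatch and a 16-nt alignment none, so at small RNA lengths the ratio is the binding limit rather than the absolute cap of 6.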

Mapping Stats:


                                 Started job on |    Apr 18 18:49:31
                             Started mapping on |    Apr 18 18:51:31
                                    Finished on |    Apr 18 19:07:38
       Mapping speed, Million of reads per hour |    111.55

                          Number of input reads |    29963840
                      Average input read length |    76
                                    UNIQUE READS:
                   Uniquely mapped reads number |    10015821
                        Uniquely mapped reads % |    33.43%
                          Average mapped length |    40.90
                       Number of splices: Total |    0
            Number of splices: Annotated (sjdb) |    0
                       Number of splices: GT/AG |    0
                       Number of splices: GC/AG |    0
                       Number of splices: AT/AC |    0
               Number of splices: Non-canonical |    0
                      Mismatch rate per base, % |    0.25%
                         Deletion rate per base |    0.00%
                        Deletion average length |    1.00
                        Insertion rate per base |    0.00%
                       Insertion average length |    1.07
                             MULTI-MAPPING READS:
        Number of reads mapped to multiple loci |    17831021
             % of reads mapped to multiple loci |    59.51%
        Number of reads mapped to too many loci |    60431
             % of reads mapped to too many loci |    0.20%
                                  UNMAPPED READS:
       % of reads unmapped: too many mismatches |    5.77%
                 % of reads unmapped: too short |    0.99%
                     % of reads unmapped: other |    0.11%
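
For anyone wanting to check such a report programmatically, the "label | value" layout is easy to parse. The sketch below uses a few lines copied from the stats above; the function names are my own, and a real `Log.final.out` would be read from disk instead of from a string.

```python
def parse_star_log(text):
    """Return {label: value} for every 'label | value' line of a STAR final log."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as 'UNIQUE READS:' have no '|'
        label, value = line.split("|", 1)
        stats[label.strip()] = value.strip()
    return stats


def percent(stats, label):
    """Convert a value such as '33.43%' to the float 33.43."""
    return float(stats[label].rstrip("%"))


sample = """\
                          Number of input reads |    29963840
                   Uniquely mapped reads number |    10015821
                        Uniquely mapped reads % |    33.43%
        Number of reads mapped to multiple loci |    17831021
             % of reads mapped to multiple loci |    59.51%
             % of reads mapped to too many loci |    0.20%
       % of reads unmapped: too many mismatches |    5.77%
                 % of reads unmapped: too short |    0.99%
                     % of reads unmapped: other |    0.11%
"""

stats = parse_star_log(sample)
n_input = int(stats["Number of input reads"])
# Recompute the unique-mapping rate from the raw counts.
unique_pct = 100 * int(stats["Uniquely mapped reads number"]) / n_input
# Sum every percentage category; it should come out near 100.
total_pct = sum(percent(stats, k) for k in stats if k.startswith("%") or k.endswith("%"))
print(f"unique: {unique_pct:.2f}%  all categories: {total_pct:.2f}%")
```

The percentage categories should sum to roughly 100%, up to rounding in the report.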

Do you think they look fine?

I also wanted to quantify them to get expression values/counts, but neither Cufflinks nor htseq-count is able to run on this. Is there a better alternative?

Thanks for your help.
Ron

Alexander Dobin

May 4, 2018, 3:50:52 PM
to rna-star
Hi Ron,

the parameters and the mapping stats look good to me.
htseq-count should work just fine with the BAM output - however, same as STAR's --quantMode GeneCounts, it will *not* count multi-mappers.
featureCounts can count multi-mappers with -M or --primary options.
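
A minimal sketch of how the featureCounts call mentioned above might be put together. The file names (`annotation.gtf`, `counts.txt`, `Aligned.sortedByCoord.out.bam`) and the helper name are placeholders chosen for the example; which of `-M` and `--primary` is appropriate depends on how multi-mappers should be treated in your analysis.

```python
import shlex


def featurecounts_cmd(bam, gtf, out_table, count_multimappers=True,
                      primary_only=False, threads=4):
    """Assemble a featureCounts command line for a STAR BAM file."""
    cmd = ["featureCounts", "-T", str(threads), "-a", gtf, "-o", out_table]
    if count_multimappers:
        cmd.append("-M")         # also count multi-mapping reads
    if primary_only:
        cmd.append("--primary")  # count primary alignments only
    cmd.append(bam)              # input file goes last
    return cmd


cmd = featurecounts_cmd("Aligned.sortedByCoord.out.bam", "annotation.gtf", "counts.txt")
print(shlex.join(cmd))
# To actually launch featureCounts: subprocess.run(cmd, check=True)
```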

Cufflinks (or other quantifiers such as RSEM, eXpress, etc.) will not work well with small RNA data, as their models presume that reads are fragments of long RNA molecules.

Cheers
Alex

rohan1925

Jul 26, 2018, 1:09:46 PM
to Alexander Dobin, rna-star
Hi Alex,

Can we use STAR to process 3' RNAseq samples?

Thanks
Ron



Alexander Dobin

Jul 27, 2018, 6:11:43 PM
to rna-star
Hi Ron,

what exactly do you mean by 3' RNA-seq? Is it Illumina sequencing of the 3' ends of the RNAs?
If so, STAR will map it fine.

Cheers
Alex


