How to read the Log.final.out file


KJ Lim

Jul 29, 2014, 11:56:07 AM
to rna-...@googlegroups.com
Dear rna-Star community,

Good day.

I mapped HiSeq reads to a transcriptome. The mapping statistics were
reported in a Log.final.out file, but I'm a bit confused by the
information in the file (below).

> Number of input reads 28786793
> Average input read length 101
>
> UNIQUE READS:
> Uniquely mapped reads number 16758293
> Uniquely mapped reads % 58.22%
> Average mapped length 99.97
> Number of splices: Total 0
> Number of splices: Annotated (sjdb) 0
> Number of splices: GT/AG 0
> Number of splices: GC/AG 0
> Number of splices: AT/AC 0
> Number of splices: Non-canonical 0
> Mismatch rate per base, % 0.64%
> Deletion rate per base 0.00%
> Deletion average length 0.00
> Insertion rate per base 0.00%
> Insertion average length 0.00
>
> MULTI-MAPPING READS:
> Number of reads mapped to multiple loci 9287804
> % of reads mapped to multiple loci 32.26%
> Number of reads mapped to too many loci 1101031
> % of reads mapped to too many loci 3.82%
>
> UNMAPPED READS:
> % of reads unmapped: too many mismatches 0.00%
> % of reads unmapped: too short 5.66%
> % of reads unmapped: other 0.04%


It says that 58.22% of the reads (16 758 293 reads) were uniquely mapped;
does that mean the remaining 12 028 500 reads are unmapped?

What about the reads mapped to multiple loci, the reads mapped to too
many loci, and the unmapped "too short" and "other" reads?

How should I interpret this information? Could someone please share
their experience with me?

Thank you very much for your help and time.

Best regards,
KJ Lim

Kipp A

Jul 30, 2014, 12:12:25 PM
to rna-...@googlegroups.com
Hi KJ,

58.22% of your reads mapped to just 1 place in your reference.
32.26% mapped to more than 1 place, but no more than the limit set by the STAR parameter --outFilterMultimapNmax (default is 10).
3.82% mapped to more places than that limit.
5.70% did not map (0.00% too many mismatches + 5.66% too short + 0.04% other).

It looks like there were no splices found, which may be why your uniquely mapped is a bit low and your multimapped is a bit high.  But it's hard to tell without knowing what parameters you used and what reference you aligned to.

I hope that helps!
Kipp
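
The breakdown above can be checked with a few lines of Python. This is only a sketch: the names LOG_TEXT and parse_log are made up for illustration, the excerpt contains just the counts and percentages quoted in the first post, and the " | " separator between label and value is an assumption about the file layout (the parser also accepts lines without it).

```python
import re

# Excerpt of the values quoted in the first post. Assumption: the real
# Log.final.out separates label and value with "|"; lines without the
# separator are parsed as well.
LOG_TEXT = """\
Number of input reads | 28786793
Uniquely mapped reads number | 16758293
Uniquely mapped reads % | 58.22%
Number of reads mapped to multiple loci | 9287804
% of reads mapped to multiple loci | 32.26%
Number of reads mapped to too many loci | 1101031
% of reads mapped to too many loci | 3.82%
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 5.66%
% of reads unmapped: other | 0.04%
"""

# The six categories that together should account for every input read.
PERCENT_KEYS = [
    "Uniquely mapped reads %",
    "% of reads mapped to multiple loci",
    "% of reads mapped to too many loci",
    "% of reads unmapped: too many mismatches",
    "% of reads unmapped: too short",
    "% of reads unmapped: other",
]


def parse_log(text):
    """Return {label: float} for every line that ends in a number."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*\|?\s+([\d.]+)%?\s*$", line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


stats = parse_log(LOG_TEXT)
total = int(stats["Number of input reads"])
unique = int(stats["Uniquely mapped reads number"])
multi = int(stats["Number of reads mapped to multiple loci"])
too_many = int(stats["Number of reads mapped to too many loci"])

# Whatever is not unique, multi-mapped or "too many loci" is unmapped.
unmapped = total - unique - multi - too_many
unmapped_pct = 100 * unmapped / total
percent_sum = sum(stats[key] for key in PERCENT_KEYS)

print("unmapped reads:", unmapped)
print("unmapped %:", round(unmapped_pct, 2))
print("sum of the six percentages:", round(percent_sum, 2))
```

With the numbers from the first post, 1,639,665 reads (about 5.70%) fall into the unmapped categories, and the six percentages add up to 100%. The STAR manual describes reads exceeding --outFilterMultimapNmax as considered unmapped, so whether the 3.82% "too many loci" reads count as mapped depends on the definition used.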

KJ Lim

Jul 30, 2014, 2:46:14 PM
to Kipp A, rna-...@googlegroups.com
Dear Kipp and rna-star community,


Thanks for your explanation.

1. If I understand correctly, the percentage of mapped reads is 94.3%.

This 94.3% is made up of:

58.22% reads uniquely mapped,
32.26% reads mapped to more than 1 place,
  3.82% reads mapped to too many loci.

So only about 5.7% of the reads were unmapped (5.66% of them "too short")?

2. The parameters I used for mapping:

STAR   --runThreadN 16 \
       --genomeDir /wrk/lim/refGenome \
       --readFilesIn /wrk/lim/hiSeq1/TGACCA_L002_R1_001.fastq.gz \
       --readFilesCommand zcat \
       --alignIntronMax 1 \
       --alignIntronMin 2 \
       --scoreDelOpen -10000 \
       --scoreInsOpen -10000 \
       --outFileNamePrefix /wrk/lim/STAR/test_02_ \
       --outReadsUnmapped Fastx \
       --outSAMattributes All

How can I increase the percentage of uniquely mapped reads?
 
Thanks for your time and help.

Best regards,
KJ Lim




--
You received this message because you are subscribed to the Google Groups "rna-star" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rna-star+u...@googlegroups.com.
Visit this group at http://groups.google.com/group/rna-star.

Kipp A

Jul 30, 2014, 4:06:15 PM
to rna-...@googlegroups.com, kip...@gmail.com
What is your reason for these parameters?
       --alignIntronMax 1
       --alignIntronMin 2

You've essentially turned off STAR's ability to find new splice junctions.  I suspect that if you put these back to their defaults, a number of your reads that span introns would suddenly map to one location rather than two.  Providing a splice junction database (a GTF file) can have the same effect.  See this thread for some discussion: https://groups.google.com/d/topic/rna-star/yIyfhyJaaR4/discussion

Cheers,
Kipp

KJ Lim

Jul 31, 2014, 1:20:29 AM
to Kipp A, rna-...@googlegroups.com
Dear Kipp,

Thanks for your prompt reply.

I used those parameters because I mapped my reads to a transcriptome, not a genome. I'm working on a non-model species.

Thanks for your time.

Best regards,
KJ Lim

Alexander Dobin

Jul 31, 2014, 5:43:20 PM
to rna-...@googlegroups.com, kip...@gmail.com
Hi KJ Lim,

In addition to the nice comments and suggestions from Kipp, I would point out that you can expect a large % of multimappers when you map to a transcriptome: many alternative isoforms differ by only a small portion of their sequence, so many reads will map to several isoforms equally well. If mapped to the genome (if you had one), these same reads would be unique mappers.

Cheers
Alex
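
The isoform effect described above can be illustrated with a toy example in Python. Everything here is invented for illustration: the exon and intron sequences, the two isoforms, and the exact-substring matching in count_hits, which is a simple stand-in and not how STAR aligns reads.

```python
# Toy sequences, made up for this illustration only.
EXON1 = "ATGGCCAAGT"
EXON2 = "TTCGGATCCA"
EXON3 = "GGTACCTGAA"
INTRON_A = "GTAAGTCTCTCTAG"
INTRON_B = "GTATGTCACACAAG"

# A one-gene "genome" and a two-isoform "transcriptome" built from it.
genome = EXON1 + INTRON_A + EXON2 + INTRON_B + EXON3
transcriptome = {
    "isoform_A": EXON1 + EXON2 + EXON3,
    "isoform_B": EXON1 + EXON3,  # skips exon 2
}


def count_hits(read, reference):
    """Count exact occurrences of read in reference (overlaps allowed)."""
    hits = 0
    start = reference.find(read)
    while start != -1:
        hits += 1
        start = reference.find(read, start + 1)
    return hits


def loci(read):
    """Return (hits in the genome, total hits over all isoforms)."""
    genome_hits = count_hits(read, genome)
    transcript_hits = sum(count_hits(read, seq) for seq in transcriptome.values())
    return genome_hits, transcript_hits


shared_read = EXON1[2:10]    # lies inside exon 1, present in both isoforms
specific_read = EXON2[1:9]   # lies inside exon 2, present only in isoform_A

print("shared exon read:", loci(shared_read))
print("isoform-specific read:", loci(specific_read))
```

The read from the shared exon has one location in the toy genome but one in each isoform, so against the transcriptome it would be reported as a multimapper. The read from the exon that only one isoform contains stays unique in both references.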

KJ Lim

Aug 8, 2014, 1:27:15 AM
to Alexander Dobin, rna-...@googlegroups.com, Kipp A
Dear Alexander, Kipp, and community,

Just back from vacation.

Thanks for the explanation.  I don't have a genome to map to, as this is not a model species.

Thanks a lot for your help and time. Have a nice weekend.

Best regards,
KJ Lim




