Uniquely mapped reads number

1,374 views
Skip to first unread message

Kirill Tsyganov

unread,
May 5, 2016, 2:22:24 AM5/5/16
to rna-star
Hi Alex, 

I think the answer is obvious, but I can't figure this out.

Here is Log.out.final

                                 Started job on |       May 02 09:49:31
                             Started mapping on |       May 02 09:54:03
                                    Finished on |       May 02 10:41:09
       Mapping speed, Million of reads per hour |       82.82

                          Number of input reads |       65014569
                      Average input read length |       287
                                    UNIQUE READS:
                   Uniquely mapped reads number |       59712033
                        Uniquely mapped reads % |       91.84%
                          Average mapped length |       286.87
                       Number of splices: Total |       83823919
            Number of splices: Annotated (sjdb) |       82346093
                       Number of splices: GT/AG |       83077845
                       Number of splices: GC/AG |       555665
                       Number of splices: AT/AC |       29995
               Number of splices: Non-canonical |       160414
                      Mismatch rate per base, % |       0.45%
                         Deletion rate per base |       0.02%
                        Deletion average length |       2.72
                        Insertion rate per base |       0.02%
                       Insertion average length |       2.50
                             MULTI-MAPPING READS:
        Number of reads mapped to multiple loci |       2494453
             % of reads mapped to multiple loci |       3.84%
        Number of reads mapped to too many loci |       70178
             % of reads mapped to too many loci |       0.11%
                                  UNMAPPED READS:
       % of reads unmapped: too many mismatches |       0.00%
                 % of reads unmapped: too short |       3.67%
                     % of reads unmapped: other |       0.54%
                                  CHIMERIC READS:
                       Number of chimeric reads |       0
                            % of chimeric reads |       0.00%

However when I run samtools view -c -q 255 myBamFile.bam I get 119421641 I feel like I need to dived this by two, because this is paired end, but I'd like to know why exactly..
echo $((119421641/2)) --> 59710820 This is rather close to uniquely mapped reads number from Log.final.out, but not quite..different of 1213 reads..

This is samtools flagstat myBamFile.bam

139622984 + 0 in total (QC-passed reads + QC-failed reads)
9593858 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
134004257 + 0 mapped (95.98%:-nan%)
130029126 + 0 paired in sequencing
65014569 + 0 read1
65014557 + 0 read2
124407826 + 0 properly paired (95.68%:-nan%)
124407826 + 0 with itself and mate mapped
2573 + 0 singletons (0.00%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

I think this has something to do with singletons but I can't figure this out..

I also struggle a little to figure out how to filter singletons out..I understand those are paired-end reads that have only one mate mapped and the other unmapped.. This is from the samtools docs singletons both 0x1 and 0x8 bits set and bit 0x4 not set

samtools view -c -f 9 -F 4 myBamFile.bam --> 2795 again just not quite what should be.

Any help will be much appreciated 

Kirill


Alexander Dobin

unread,
May 5, 2016, 10:38:24 AM5/5/16
to rna-star
Hi Kirill

you are on the right track, this difference is caused by singletons, since STAR counts paired-end alignment or single-end alignment as one read.
Note that you have to count only unique singletons/pairs/lines, i.e. with -q 255:

Nlines=samtools view -c -q 255 myBamFile.bam
Nsingle=samtools view -c -q 255 -f 9 -F 4 myBamFile.bam
Npaired=samtools view -c -q 255 -f 3 myBamFile.bam


Then:
Nlines=Nsingle+Npaired
Uniquely mapped reads number=Nsingle+Npaired/2

Cheers
Ales 

Kirill Tsyganov

unread,
May 5, 2016, 9:13:47 PM5/5/16
to rna-star
I knew I was missing something, thanks heaps!
Reply all
Reply to author
Forward
0 new messages