samtools flagstat


Joseph Dhahbi, PhD

Sep 7, 2010, 3:02:24 PM
to bedtools...@googlegroups.com
Hi
I have a fastq file with 8265138 reads:

myFile.fastq
class: ShortReadQ
length: 8265138 reads; width: 18..36 cycles

I mapped it with bowtie to generate myFile.bam:
# reads processed: 8265138
# reads with at least one reported alignment: 6554909 (79.31%)
# reads that failed to align: 1479977 (17.91%)
# reads with alignments suppressed due to -m: 230252 (2.79%)
Reported 12240725 alignments to 1 output stream(s)

I counted the number of alignments; it matches bowtie output:
bamToBed -i myFile.bam | wc -l
12240725

When I used samtools to look at the bam file, it gave me 13950954 in total;
but I don't see this number in the above bowtie output:
samtools flagstat myFile.bam
[bam_header_read] EOF marker is absent.
13950954 in total
12240725 mapped (87.74%)

Can you please explain the meaning of '13950954 in total'?


Regards,
Joseph

Joseph M. Dhahbi, PhD
Childrens Hospital Oakland Research Institute
5700 Martin Luther King Jr. Way
Oakland, CA 94609
USA
Ph.(510)428-3885 EXT.5743
Cell.(702)335-0795
Fax (510)450-7910
jdh...@chori.org

Aaron Quinlan

Sep 7, 2010, 3:08:46 PM
to bedtools...@googlegroups.com
Hi Joseph,

I interpret this to mean that you have 1710229 (13950954 - 12240725) reads (or "ends" if this is paired-end data) that could not be aligned to the genome (i.e., unmapped). If you run:

$ samtools view -f 0x4 myFile.bam | wc -l

my guess would be that the result is 1710229.
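The same figure can be cross-checked against the bowtie summary quoted at the top of the thread. The sketch below is only arithmetic on those numbers; it assumes that reads which failed to align and reads suppressed by -m are each written to the BAM file once, flagged as unmapped (0x4). The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Figures copied from the bowtie summary earlier in the thread.
reported_alignments=12240725   # "Reported 12240725 alignments"
failed_to_align=1479977        # "reads that failed to align"
suppressed_by_m=230252         # "reads with alignments suppressed due to -m"

# Assumption: every read in the last two categories appears as exactly
# one record with the unmapped flag (0x4) set.
unmapped_records=$((failed_to_align + suppressed_by_m))
total_records=$((reported_alignments + unmapped_records))

echo "unmapped records: $unmapped_records"   # 1710229
echo "total records:    $total_records"      # 13950954
```

Under that assumption the two sums reproduce both the difference computed above and the 'in total' line printed by flagstat.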

Best,
Aaron

Joseph Dhahbi, PhD

Sep 7, 2010, 3:51:13 PM
to bedtools...@googlegroups.com
Thanks Aaron;

samtools view -f 0x4 myFile.bam | wc -l

1710229.

I am still confused: the starting number of reads is 8265138, which is
less than 13950954.

These are the reporting parameters I used with bowtie:
-a -m 5


Regards,
Joseph

Aaron Quinlan

Sep 7, 2010, 3:57:03 PM
to bedtools...@googlegroups.com
Hi Joseph,

samtools flagstat reports statistics on the number of _alignments_ (records in the BAM file), not on the number of _reads_. In other words, it seems that some of your reads have more than one alignment in the BAM file. I'm not an expert with Bowtie, but I suspect the parameters you used (-a -m 5) request multiple alignments per read. In contrast, BWA attempts to report a single alignment for each read.
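To see the difference between records and reads directly, one option is to count distinct read names among the mapped records. The sketch below is a hypothetical helper (summarize_sam is not part of samtools or bedtools) that reads header-less SAM text on stdin, so it could be fed from `samtools view myFile.bam`. It assumes single-end data, where one read name corresponds to one read; for paired-end data both mates share a name and the count would need adjusting.

```shell
#!/usr/bin/env bash
# summarize_sam: hypothetical helper, not a samtools command.
# Reads SAM text on stdin and prints three numbers:
#   <records> <mapped_records> <distinct_mapped_read_names>
summarize_sam() {
  awk -F'\t' '
    /^@/ { next }                      # skip header lines, if present
    { records++ }
    # Column 2 is FLAG; bit 0x4 set means the record is unmapped.
    int($2 / 4) % 2 == 0 {
      mapped++
      if (!($1 in seen)) { seen[$1] = 1; reads++ }
    }
    END { printf "%d %d %d\n", records, mapped, reads }
  '
}

# Tiny synthetic example: read r1 has two alignments, read r2 is unmapped.
printf 'r1\t0\tchr1\t100\nr1\t16\tchr2\t500\nr2\t4\t*\t0\n' | summarize_sam
# prints: 3 2 1
```

On the real file, `samtools view myFile.bam | summarize_sam` would be expected to print 13950954 and 12240725 as the first two numbers, matching flagstat. The third number is the count of distinct aligned reads, which should correspond to the 6554909 reads with at least one reported alignment in the bowtie summary if the assumptions above hold.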

Joseph Dhahbi, PhD

Sep 7, 2010, 3:59:54 PM
to bedtools...@googlegroups.com
Thank you for your help.


Regards,
Joseph
