multiBamCov reports a count of alignments for each BAM at each interval.


Kevin Lam

Dec 22, 2011, 1:27:27 AM
to bedtools...@googlegroups.com
Hi Aaron, 
Thanks for the reply, I had indeed supplied 2 bams. 

I like that multiBamCov is very fast and works on the BAMs directly.
But am I wrong in saying that multiBamCov reports coverage as a count of alignments in each interval,
which is different from genomeCoverageBed, which reports per-base coverage?

Would it be wrong, then, to multiply the multiBamCov count by the read length (e.g. 75 bp for both SE and PE aligned reads, or shorter if I trimmed all the reads)
to get a number that represents the number of bases covered in each interval
(as another way to bin the per-base depth of coverage)?

Or am I missing something here? 

Cheers
Kevin

On Tue, Dec 20, 2011 at 7:59 PM, Aaron Quinlan <aaronq...@gmail.com> wrote:
Hi Kevin,

multiBamCov will report a count of alignments for each BAM at each interval.  Based on your output (not seeing your command line), I suspect you gave two BAM files as input.  Thus the last two columns are the count of alignments in the first and second BAM, respectively.
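
As an illustration of the column layout described above, here is a minimal Python sketch that splits multiBamCov-style rows into the interval fields and one count per BAM. It assumes a plain 3-column BED file was used for the intervals, so every column after the third is a count; the function name `parse_multibamcov` and the `bed_columns` parameter are made up for this example and are not part of bedtools.

```python
# Sketch: split multiBamCov-style rows into interval fields and per-BAM counts.
# Assumption: the interval file was a 3-column BED, so each column after the
# third holds the alignment count for one BAM, in the order the BAMs were given.
sample_output = """\
chr20 0 100000 19312 16844
chr20 100000 200000 43910 37579
"""


def parse_multibamcov(text, bed_columns=3):
    """Return a list of (chrom, start, end, [count_per_bam, ...]) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()  # accepts tabs or spaces
        if not fields:
            continue  # skip blank lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        counts = [int(value) for value in fields[bed_columns:]]
        rows.append((chrom, start, end, counts))
    return rows


rows = parse_multibamcov(sample_output)
for chrom, start, end, counts in rows:
    print(chrom, start, end, counts)
```

With two BAM files as input, each row carries two counts, which matches the two extra columns in the output quoted further down the thread.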

If you did not provide two BAM files as input, please let me know.

Best,
Aaron


On Dec 20, 2011, at 2:23 AM, Kevin Lam wrote:

Sorry, I found the offending BAM; it had a different header from the others.

The sample output looks like this:
chr20 0 100000 19312 16844
chr20 100000 200000 43910 37579
chr20 200000 300000 43245 43215
chr20 300000 400000 41653 47556
chr20 400000 500000 42929 43165
chr20 500000 600000 44265 45325

I know what the first three columns are, and I was expecting only one new column.
So what's in the last two columns?

Cheers
Kevin

On Tue, Dec 20, 2011 at 3:03 PM, Kevin Lam <abo...@gmail.com> wrote:
Hi, 
I can't seem to find more information on multiBamCov (it would be great if anyone could point me in the right direction).
Trying to run it, I encountered this problem:

multiBamCov BamMultiReader ERROR: mismatched number of references in SS600.bam expected 24 reference sequences but only found 25

I used samtools view -H to check the headers, and they look the same across the BAMs (which I presume is what matters most).
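
A mismatch in the number of reference sequences can be hard to spot by eye in long headers. The following Python sketch compares the @SQ lines of two headers and reports the sequences present in only one of them. It assumes the header text has already been dumped to a string (for example from `samtools view -H`); the function name `reference_sequences` and the two toy headers are hypothetical and only serve the example.

```python
# Sketch: compare the @SQ (reference sequence) lines of two SAM/BAM headers.
# Assumption: header text was obtained separately, e.g. via `samtools view -H`.
# The headers below are toy data, not taken from the BAMs in this thread.
header_a = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:2000\n"
)
header_b = header_a + "@SQ\tSN:chrM\tLN:16571\n"


def reference_sequences(header_text):
    """Return [(name, length), ...] for every @SQ line, in header order."""
    refs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each remaining field has the form TAG:VALUE (e.g. SN:chr1).
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        refs.append((tags.get("SN"), tags.get("LN")))
    return refs


refs_a = reference_sequences(header_a)
refs_b = reference_sequences(header_b)
print(len(refs_a), len(refs_b))
only_in_b = [ref for ref in refs_b if ref not in refs_a]
print(only_in_b)
```

Comparing the lists in order, and not just their lengths, also catches headers that have the same sequences in a different order.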

Does anyone have a sample output from multiBamCov? And what should I do to get the BAMs into a suitable format?

On a side note, I am trying out ways to compare evenness of coverage between whole-genome sequencing samples. Is there an established way of doing this?

Cheers
Kevin





Aaron Quinlan

Dec 29, 2011, 7:53:50 PM
to bedtools...@googlegroups.com
Hi Kevin,

> I like that multiBamCov is very fast and works on the BAMs directly.
> But am I wrong in saying that multiBamCov reports coverage as a count of alignments in each interval,
> which is different from genomeCoverageBed, which reports per-base coverage?
No, you are not wrong.  multiBamCov counts alignments.  I recognize that this may be confusing.

> Would it be wrong, then, to multiply the multiBamCov count by the read length (e.g. 75 bp for both SE and PE aligned reads, or shorter if I trimmed all the reads)
> to get a number that represents the number of bases covered in each interval
> (as another way to bin the per-base depth of coverage)?
This is an option, though the mean could obviously be very misleading in cases where all of the alignments stack up in a very small portion of a large interval.  With that as a caveat, it is certainly something you could do.
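
To make the arithmetic concrete, here is a small Python sketch of that approximation: alignment count times read length, divided by interval length. The function name `approx_mean_depth` is invented for the example, and the calculation assumes every alignment is a full-length 75 bp read lying entirely inside the interval, which will not hold for trimmed or clipped reads or for reads that straddle an interval boundary.

```python
# Sketch: approximate mean depth from an alignment count.
# Assumptions (not guaranteed by multiBamCov): every counted alignment spans
# exactly `read_length` bases and lies entirely inside the interval.
def approx_mean_depth(alignment_count, read_length, interval_start, interval_end):
    """Return alignment_count * read_length / interval_length."""
    interval_length = interval_end - interval_start
    if interval_length <= 0:
        raise ValueError("interval_end must be greater than interval_start")
    return alignment_count * read_length / interval_length


# First row of the sample output in this thread: chr20 0 100000 19312 16844
depth_bam1 = approx_mean_depth(19312, 75, 0, 100000)
depth_bam2 = approx_mean_depth(16844, 75, 0, 100000)
print(f"{depth_bam1:.3f} {depth_bam2:.3f}")
```

The result is only an average over the interval. It says nothing about how evenly the alignments are spread, which is exactly the caveat raised above.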

Another option is to run coverageBed individually on each BAM.  Yet another is for me to add a feature to multiBamCov that optionally produces the same type of output as coverageBed, but with stats for _each_ BAM file at a given interval.

Best,
Aaron