bedtools genomecov - new output?!

Seb

Apr 30, 2014, 12:22:11 PM
to bedtools...@googlegroups.com
hi all

I'm struggling to understand what the output I got means.
I'm running my command on bam files coming from tophat as:

samtools view -b accepted_hits.bam | bedtools genomecov -ibam stdin -g hg19.fa > g_cvrg.txt

and the g_cvrg.txt output file looks like this:

chr1 0 184679918 249250621 0.740941
chr1 1 10205478 249250621 0.0409446
chr1 2 6531063 249250621 0.0262028
chr1 3 4667464 249250621 0.018726
chr1 4 3622280 249250621 0.0145327
chr1 5 2723840 249250621 0.0109281
chr1 6 2066780 249250621 0.00829198
chr1 7 1683803 249250621 0.00675546
chr1 8 2096809 249250621 0.00841245
chr1 9 1581081 249250621 0.00634334


however, the documentation on here (http://bedtools.readthedocs.org/en/latest/content/tools/genomecov.html) doesn't show any similar output so I am trying to understand what the cols "should be":

1- chr
2- index on chr
3- length of fragment from position in col 2??
4- size chr
5- fraction of bases in 3 over size chr (col3/col4)


So, as I read it, I have 184679918 nt starting at position 0 (0 --> 184679918) with 0.74% coverage, and then one nt later (second row) a 10205478 nt long fragment (1 --> 10205479) with 0.04% coverage. Does that mean the fragments should look like this:

nt index          0        1        2        .......
chr                 ------------------------------------------------------------------------------------------------------------------------ .........
first row data   --------------------------------------------------------------------------------
2nd row data             --------------------


if this is true, please let me know and I'll add this to the github doc about the output.


thanks






Aaron Quinlan

Apr 30, 2014, 12:25:06 PM
to bedtools...@googlegroups.com

Seb

Apr 30, 2014, 12:35:16 PM
to bedtools...@googlegroups.com
that output says

1 chromosome (or entire genome)
2 depth of coverage from features in input file
3 number of bases on chromosome (or genome) with depth equal to column 2.
4 size of chromosome (or entire genome) in base pairs
5 fraction of bases on chromosome (or entire genome) with depth equal to column 2.

So col 2 increases from row to row; e.g. the first and last rows for chr1 are:
chr1 0 184679918  249250621 0.740941
[..]
chr1 36333 1 249250621 4.01203e-09

does that mean that I have 

first : depth 0 (i.e. no reads) for 184679918nt
last : 1 read (or nt) at 36333x depth

...I'm confused about how I can have a 36333x depth...
thanks

Aaron Quinlan

Apr 30, 2014, 12:44:16 PM
to bedtools...@googlegroups.com
Yes, that is what it means.  I don't know the details of your experiment, but this can occur in highly repetitive genomic regions.  You can try to isolate the specific regions with really high coverage (depth greater than 10000) with the following:

bedtools genomecov -ibam in.bam -bga | awk '$4 >10000'
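
For reference, -bga writes bedGraph-style lines (chrom, start, end, depth), so $4 in the filter above is the depth column. A minimal sketch of what the filter keeps, run on made-up intervals; the coordinates, the depths, and the helper name bga_sample are invented for illustration and do not come from any real BAM:

```shell
# Hypothetical -bga style lines: chrom, start, end, depth (all values invented).
bga_sample() {
  printf 'chr1\t0\t10000\t0\n'
  printf 'chr1\t10000\t10050\t12\n'
  printf 'chr1\t10050\t10120\t36333\n'
}

# Same filter as above: keep only intervals whose depth (column 4) exceeds 10000.
bga_sample | awk '$4 > 10000'
```

Only the last interval (chr1 10050 10120 36333) passes the filter, so the output is a list of the regions where the very high depth sits.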

The -bga option is described here:

aykaz....@gmail.com

Mar 13, 2019, 11:28:34 AM
to bedtools-discuss
I don't know why the manual is so misleading, but they give you a hint: "As an example, let's produce a histogram of coverage of the exons throughout the genome", so keep in mind that it is a histogram. The second value is not an index, it is actually the depth, and the third value is the number of bases with that depth. So the first row means that 74% of chr1 has no coverage at all.
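
Read as a histogram, the depth-0 row already gives the breadth of coverage: the covered fraction is (column 4 - column 3) / column 4 of that row, i.e. 1 minus its column 5. A minimal sketch with awk, using the chr1 depth-0 line from the first post; the function name covered_fraction is made up for illustration, and it assumes the default five-column output:

```shell
# covered_fraction CHROM: read genomecov histogram lines on stdin and print the
# fraction of CHROM with depth >= 1, computed as (size - bases at depth 0) / size.
# (Hypothetical helper; assumes the default five-column histogram output.)
covered_fraction() {
  awk -v chrom="$1" '$1 == chrom && $2 == 0 { printf "%.4f\n", ($4 - $3) / $4 }'
}

# The depth-0 row for chr1 from the first post:
printf 'chr1\t0\t184679918\t249250621\t0.740941\n' | covered_fraction chr1
```

With that row it prints 0.2591, so roughly 26% of chr1 is covered by at least one read. If a chromosome has no depth-0 row, the sketch prints nothing.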