how to create a genome file

1,768 views
Skip to first unread message

anish das

unread,
Jul 2, 2011, 5:44:06 PM7/2/11
to bedtools...@googlegroups.com
I am trying to use the genomecoverageBed. The genome file I created to
indicate the length of the chromosome is in the same folder as with the
bed file. But the genomecoverageBed can not find the chromosome in the
genome file. What am I doing wrong?

Anish

Aaron Quinlan

unread,
Jul 3, 2011, 8:12:12 AM7/3/11
to bedtools...@googlegroups.com
Could you post the contents of your genome file and the exact command you are using?

Best,
Aaron

anish

unread,
Jul 4, 2011, 11:26:11 AM7/4/11
to bedtools-discuss
Hi Aaron,
Thanks for the reply. I managed to solve that issue. I guess a header
line with chr# \t\ size was the culprit.

But now I have another question. I have a depth of coverage
(genomeCoverageBed -d) along one of the chromosome. Now I am
interested to group adjoining rows where I have counts above certain
default values. Like here, I want to find a way to pull out position
157869 to 157889 which has more counts than the neighbouring region
and possibly indicate a peak. Do you have any easy suggestion for
this?

Tb927_01_v4 157087 6
Tb927_01_v4 157088 6
Tb927_01_v4 157089 6
Tb927_01_v4 157090 6
Tb927_01_v4 157862 6
Tb927_01_v4 157863 7
Tb927_01_v4 157864 7
Tb927_01_v4 157869 50
Tb927_01_v4 157870 52
Tb927_01_v4 157871 55
Tb927_01_v4 157872 56
Tb927_01_v4 157873 57
Tb927_01_v4 157874 57
Tb927_01_v4 157875 57
Tb927_01_v4 157876 57
Tb927_01_v4 157877 57
Tb927_01_v4 157878 57
Tb927_01_v4 157879 57
Tb927_01_v4 157880 57
Tb927_01_v4 157881 57
Tb927_01_v4 157882 57
Tb927_01_v4 157883 57
Tb927_01_v4 157884 57
Tb927_01_v4 157885 57
Tb927_01_v4 157886 57
Tb927_01_v4 157887 52
Tb927_01_v4 157888 50
Tb927_01_v4 157889 50
Tb927_01_v4 157894 7
Tb927_01_v4 157949 6
Tb927_01_v4 157950 7
Tb927_01_v4 157951 8
Tb927_01_v4 157952 54
Tb927_01_v4 157953 60


Thanks again,
Anish

Stéphane Plaisance

unread,
Jul 5, 2011, 3:54:23 AM7/5/11
to bedtools-discuss
Hi Aaron,

Like Anish, I encountered quite often the problem with bed files
having a header line with '>chrom start end name' ... and blocking
bedtool's at run time

Could it be possible to add an option like --ignore n where the top n
lines of input are ignored?

This would also prove handy when merging beds and then sorting them as
the header can sometimes 'survive' and end up in the middle of the new
bed, crashing later processes.

This option would benefit if present in many of your tools, I do not
know how easy it is to implement this on a global way!

Stephane

Aaron Quinlan

unread,
Jul 5, 2011, 5:43:03 PM7/5/11
to bedtools...@googlegroups.com
Hi Anish,

There is no general purpose peak finding algorithm in BEDTools. You might try FindPeaks [1] or have a look at some of the other ChIP-seq peak finding algorithms. Unfortunately, I have zero expertise in this regard.

[1] http://www.bcgsc.ca/platform/bioinfo/software/findpeaks

Best,
Aaron

Aaron Quinlan

unread,
Jul 5, 2011, 5:45:38 PM7/5/11
to bedtools...@googlegroups.com
Hi Stephane,

This is an interesting idea, though headers are already handled so long as they are preceded by a hash symbol (#), which, historically, is the most common indicator for header lines. Is it not possible to use # symbols instead of > symbols?

Best,
Aaron

Kevin Lam

unread,
Jul 5, 2011, 11:10:34 PM7/5/11
to bedtools...@googlegroups.com
Sorry to regress,
but in case you did your genome file manually.
there's actually two ways you can do the genome file

using the .fai file
$  samtools faidx reference_genome.fasta

using the output from idxstats (contributed by Micheal)
samtools idxstats on the indexed BAM file

documented here

genomeCoverageBed to look at coverage of your WGS


Stéphane Plaisance

unread,
Jul 6, 2011, 3:56:31 AM7/6/11
to bedtools-discuss
Great to hear this.
I confess I did not see this in my readings and it makes a huge
difference
Of course I will from now use the hash to comment header lines out
(the > came from some other exports which inherited it from FASTA I
guess).

Thanks a lot Aaron for this info.
Stephane
Reply all
Reply to author
Forward
0 new messages