intersectBed for Multiple files


Manoj

Sep 28, 2010, 4:23:05 PM
to bedtools-discuss
Hello,
I was wondering whether we can intersect multiple BED files
using the intersectBed module.
For example, I have ChIP-seq datasets for Pol2 (or any TF) from several
different conditions, in multiple cell lines. Can we derive a set of
peaks that is common across all datasets?
Any help would be appreciated.
Thanks,
Manoj.

Davide Cittaro

Sep 28, 2010, 4:41:14 PM
to bedtools...@googlegroups.com
On Sep 28, 2010, at 10:23 PM, Manoj wrote:

> Hello,
> I was wondering if we can perform intersection of multiple bed files
> using the intersectBed module...
> For eg., I have ChIPSeq datasets for Pol2 (or any TF) done in several
> different conditions, in multiple cell lines. Can we derive a set of
> peaks which is common across datasets?

If you only need the common regions, you can try an intersect pipeline, e.g.:

intersectBed -a 1.bed -b 2.bed | intersectBed -a stdin -b 3.bed | intersectBed -a stdin -b 4.bed...

You may want to add a slopBed step between the pipe stages to extend the intervals, so that peaks that nearly overlap are still counted as shared.
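
For an arbitrary number of files, the chained pipeline above can be folded into a loop. This is only a minimal sketch: it assumes intersectBed is on the PATH and accepts "-a stdin" as in the command above, and intersect_all is a hypothetical helper name, not a BEDTools command.

```shell
# Sketch: intersect the first two BED files, then intersect the running
# result with each remaining file in turn.
# intersect_all is a hypothetical helper, not part of BEDTools.
intersect_all() {
    if [ "$#" -lt 2 ]; then
        echo "usage: intersect_all a.bed b.bed [c.bed ...]" >&2
        return 1
    fi
    first=$1
    second=$2
    shift 2
    tmp=$(mktemp) || return 1
    intersectBed -a "$first" -b "$second" > "$tmp" || { rm -f "$tmp"; return 1; }
    for f in "$@"; do
        next=$(mktemp) || { rm -f "$tmp"; return 1; }
        # Feed the running result back in through stdin, as in the pipeline above.
        intersectBed -a stdin -b "$f" < "$tmp" > "$next" || { rm -f "$tmp" "$next"; return 1; }
        mv "$next" "$tmp"
    done
    cat "$tmp"
    rm -f "$tmp"
}
```

It would be called as, for example, "intersect_all 1.bed 2.bed 3.bed 4.bed > common.bed" (the output file name is illustrative). Temporary files are used instead of one long pipe only so that the number of inputs does not have to be known in advance.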

d
/*
Davide Cittaro

Cogentech - Consortium for Genomic Technologies
via adamello, 16
20139 Milano
Italy

*/



Aaron Quinlan

Sep 28, 2010, 5:53:17 PM
to bedtools...@googlegroups.com
Alternatively, if your peaks are in BEDGRAPH format, you could use unionBedGraphs to find regions with peaks in all samples. You would then use awk to require that the value from each sample is >= some reasonable cutoff. See section 5.23 of the manual for an example.
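
The awk step could look roughly like the sketch below. It assumes the unionBedGraphs output has chrom, start, end in columns 1-3 followed by one value column per sample; keep_rows_at_or_above is a hypothetical helper name, and the cutoff is whatever suits your data.

```shell
# Sketch: keep only rows where every per-sample value (columns 4..NF)
# is >= the cutoff given as the first argument. Reads rows on stdin.
# keep_rows_at_or_above is a hypothetical helper, not part of BEDTools.
keep_rows_at_or_above() {
    awk -v cutoff="$1" '
        {
            keep = 1
            for (i = 4; i <= NF; i++) {
                # "+ 0" forces a numeric comparison
                if ($i + 0 < cutoff + 0) { keep = 0; break }
            }
            if (keep) print
        }'
}
```

For example: "unionBedGraphs -i s1.bedgraph s2.bedgraph s3.bedgraph | keep_rows_at_or_above 5", where the file names and the cutoff of 5 are placeholders.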

arq

Manoj

Sep 28, 2010, 6:07:03 PM
to bedtools-discuss
What I have are simple BED files with 4 columns (chr, start, end,
name).

Aaron Quinlan

Sep 28, 2010, 6:13:05 PM
to bedtools...@googlegroups.com
Both Davide's suggestion and unionBedGraphs should work in this case. Note that for the latter, you'll want to make sure that your files are sorted by chrom, then start position (i.e. sort -k1,1 -k2,2n).
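
As a small sketch of that sorting step: sort_bed is a hypothetical wrapper name, and setting LC_ALL=C is an added assumption to make the chromosome ordering the same on every machine.

```shell
# Sketch: sort a BED file by chromosome (column 1), then numerically by
# start position (column 2), and write the result to stdout.
# sort_bed is a hypothetical helper; LC_ALL=C is an assumption for
# reproducible byte-wise ordering of chromosome names.
sort_bed() {
    LC_ALL=C sort -k1,1 -k2,2n "$1"
}
```

For example: "sort_bed sample1.bed > sample1.sorted.bed" (file names are placeholders), repeated for each input file before running unionBedGraphs.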

Since you have names for your "value" column, you'll want to use the "-filler" option (for example, "-" or "N/A") with unionBedGraphs to define what a "non-peak" value will look like. You can then restrict your output to those intervals where none of your samples have the filler value. See section 5.23.6.
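
The filtering on the filler value could be sketched as follows, assuming tab-separated unionBedGraphs output with one column per sample after chrom, start, end. drop_rows_with_filler is a hypothetical helper name.

```shell
# Sketch: drop every row in which at least one sample column (4..NF)
# equals the filler string given as the first argument. Reads stdin.
# drop_rows_with_filler is a hypothetical helper, not part of BEDTools.
drop_rows_with_filler() {
    awk -F '\t' -v filler="$1" '
        {
            keep = 1
            for (i = 4; i <= NF; i++) {
                if ($i == filler) { keep = 0; break }
            }
            if (keep) print
        }'
}
```

For example: "unionBedGraphs -i a.bed b.bed c.bed -filler N/A | drop_rows_with_filler N/A", with placeholder file names. The rows that remain are the intervals covered by a peak in every sample.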

Aaron

Manoj

Sep 28, 2010, 7:17:02 PM
to bedtools-discuss
Thanks very much to both of you...
Regards,
Manoj.