Problem with intersect bed files: missing chr10-22,Y


JL

Jul 26, 2016, 4:52:14 PM
to bedops-discuss
Hi, I'm trying to intersect two BED files using bedops (v2.4.9) `--intersect` with the following command:
`bedops --intersect targets.bed query.all.bed > query.all.intersect.bed`

Only chr1-9 and X were in the result file. All the intervals in 10-22 and Y were missing.
But if I subset `targets.bed` and `query.all.bed` with only intervals on chr22, for example, it worked with no issues.

Another issue is that intersecting two overlapping single-interval files gives empty output.
For example:
query1.bed
`22 16287318 16287981`
query2.bed
`22 16287253 16287885`
When I ran the following command, the output was empty:
`bedops --intersect query1.bed query2.bed`

I shared the 2 bed files I used.

Thanks,
JL
debug.tar.gz

Brad Gulko

unread,
Jul 26, 2016, 4:59:41 PM7/26/16
to bedops-...@googlegroups.com
Hi JL,

Files must be sorted in lexicographic, not numerical, order.
Your BED files are sorted by chromosome in numerical order,
not lexicographically.

Lexicographic order would put positions from chr10 immediately after chr1 and before chr2.

Try using the sort-bed program in BEDOPS on both input files before intersecting.
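
To check whether a file is already in that order, something like the following may help. It is only a rough sketch using the standard `sort` utility in the C locale, not sort-bed itself. The function name `is_lex_sorted` is made up for illustration, and the ordering it checks (chromosome name byte-wise, then start and end numerically) is an assumption about what sort-bed produces.

```shell
# Hypothetical helper (not part of BEDOPS): succeeds only if the BED lines on
# stdin (or in the named file) are ordered by chromosome name byte-wise,
# then by start and end coordinate numerically.
is_lex_sorted() {
    LC_ALL=C sort -c -k1,1 -k2,2n -k3,3n "$@" 2>/dev/null
}

# Numerical chromosome order (1, 2, 10) fails the check ...
printf '1 5 9\n2 5 9\n10 1 4\n' | is_lex_sorted || echo "not sorted"

# ... while lexicographic order (1, 10, 2) passes.
printf '1 5 9\n10 1 4\n2 5 9\n' | is_lex_sorted && echo "sorted"
```

If the check fails, running sort-bed on the file as suggested above is still the safer fix, since the check is only an approximation.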

--Brad


JL

Jul 26, 2016, 5:03:44 PM
to bedops-discuss
Thanks Brad, I will try to sort the files first. I missed that part.
Do you know what's going on with the second issue?
Thanks,

JL

Alex Reynolds

Jul 26, 2016, 5:49:48 PM
to JL, bedops-discuss
In addition to sorting your inputs with sort-bed, you are using a fairly old version of BEDOPS (we're up to v2.4.19 now, soon to be v2.4.20). You may want to update your copy from v2.4.9 to v2.4.19 (assuming that wasn't a typo). 

Please see the "downloads" section in the documentation for more details: https://bedops.readthedocs.io/en/latest/

Regards,
Alex

Alex Reynolds

Jul 26, 2016, 5:54:17 PM
to JL, bedops-discuss
Once I sorted inputs, I was able to get results, e.g.:

$ sort-bed query.all.bed > query.all.bed.sorted
$ sort-bed targets.bed > targets.bed.sorted
$ bedops --intersect targets.bed.sorted query.all.bed.sorted | head
1 861319 861395
1 865606 865610
1 865611 865620
1 865621 865624
1 865625 865627
1 865628 865639
1 865640 865653
1 865659 865660
1 866416 866471
1 871149 871278

Keep in mind that --intersect generates new regions from where there are overlaps between input files. 

If you, instead, want the set of elements that overlap between targets and query, then use --element-of.
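
A minimal sketch of that difference, using the two single-interval examples from the first post. This is plain awk arithmetic, not BEDOPS code: the function names `shared_region` and `overlapping_element` are hypothetical, and treating the second one as similar to `--element-of 1` (at least one base of overlap) is an assumption.

```shell
# Hypothetical helpers, not BEDOPS code. Each reads lines of the form
# "chrom startA endA startB endB": two intervals on the same chromosome.

# Similar in spirit to --intersect: print the NEW region shared by A and B,
# i.e. [max(starts), min(ends)), or nothing if the intervals are disjoint.
shared_region() {
    awk '{ s = ($2+0 > $4+0) ? $2+0 : $4+0
           e = ($3+0 < $5+0) ? $3+0 : $5+0
           if (s < e) print $1, s, e }'
}

# Similar in spirit to --element-of 1 (assumed): print interval A UNCHANGED
# if it overlaps interval B by at least one base.
overlapping_element() {
    awk '{ s = ($2+0 > $4+0) ? $2+0 : $4+0
           e = ($3+0 < $5+0) ? $3+0 : $5+0
           if (s < e) print $1, $2, $3 }'
}

echo '22 16287318 16287981 16287253 16287885' | shared_region
echo '22 16287318 16287981 16287253 16287885' | overlapping_element
```

The first call prints the clipped region `22 16287318 16287885`, while the second prints the original first interval `22 16287318 16287981`.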

Please see the bedops documentation for more details about the difference.

Regards,
Alex

merckey

Jul 27, 2016, 9:06:18 AM
to Alex Reynolds, bedops-discuss
Thanks, Alex, for the detailed explanation. The version number was a typo; I'm using 2.4.19. Running sort-bed solved the problem.
Cheers!
JL

Brad Gulko

Jul 30, 2016, 8:41:13 AM
to bedops-...@googlegroups.com
Just to close the loop on the second issue, I cannot reproduce
your results.

When I create two BED files, each with one line:

a.bed:
22 16287318 16287981

b.bed:
22 16287253 16287885

and run
bedops --intersect a.bed b.bed

I get
22 16287318 16287885

as expected (bedops v2.4.19).

Be careful with field separators, though. It appears you use spaces to separate fields.
The BED standard allows either spaces or tabs as field separators. However,
I've encountered some tools that accept only tabs. Tabs (HT, TAB, ASCII 9, or \t in C)
are safer.

https://genome.ucsc.edu/FAQ/FAQformat.html#format1
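
One possible way to switch a space-delimited file to tabs is sketched below with awk. The function name `spaces_to_tabs` and the file names in the usage note are made up for illustration. Note the assumption that no field contains spaces of its own (for example a name column with embedded blanks), since every run of whitespace becomes a single tab.

```shell
# Hypothetical helper: rewrite whitespace-delimited BED lines from stdin as
# tab-delimited lines. Assigning $1 to itself makes awk rebuild each record
# with the output field separator (a tab here).
spaces_to_tabs() {
    awk 'BEGIN { OFS = "\t" } { $1 = $1; print }'
}

printf '22 16287318 16287981\n' | spaces_to_tabs
```

For a whole file this could be run as `spaces_to_tabs < query1.bed > query1.tab.bed`, where the output name is just a placeholder.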

Good luck!

--Brad