interval set operations

20 views
Skip to first unread message

Brent Pedersen

unread,
Feb 11, 2017, 12:04:32 PM2/11/17
to biogo...@googlegroups.com
Is there anything in biogo for interval sets? For example, I'd like to
be able to have a set of intervals A and another set B and to get,
e.g. the union,intersection,subtraction, etc.

I didn't see this; would there be a place for this somewhere if I were
to implement? Perhaps as functions on feat.Range and feat.Set?

thanks,
-Brent

Dan Kortschak

unread,
Feb 11, 2017, 4:59:51 PM2/11/17
to Brent Pedersen, biogo...@googlegroups.com
When I want unordered sets (i.e. most of the time) I use
map[struct{chr, start, end int}]struct{} or map[struct{chr string;
start, end int}]. If I want an ordered set I use interval.IntTree[1].

[1]https://godoc.org/github.com/biogo/store/interval#IntTree

Dan Kortschak

unread,
Feb 11, 2017, 5:03:20 PM2/11/17
to Brent Pedersen, biogo...@googlegroups.com
Actually, the first is more likely to be map[struct{chr, start, end
int}]T or map[struct{chr string;> start, end int}]T where T is the type
storing the interval. It depends on whether I'm keeping coverage or I
actually need access to the actual value that is part of the set. With
the T payload, it becomes necessary to do insertion uniqueness
validation.

Also note that interval.IntTree will give you a multiset, not a true
set unless insertion uniqueness validation is performed.

Brent Pedersen

unread,
Feb 11, 2017, 11:11:50 PM2/11/17
to Dan Kortschak, biogo...@googlegroups.com
I mean a set where e.g. I can union (1, 9) and (7, 14) and it will
give (1, 14) and then I can subtract (5, 8) and get (1, 5), (9, 14)

Dan Kortschak

unread,
Feb 11, 2017, 11:18:47 PM2/11/17
to Brent Pedersen, biogo...@googlegroups.com
Ahh, OK. biogo/store/step will do that for you.

Brent Pedersen

unread,
Feb 13, 2017, 12:18:21 PM2/13/17
to Dan Kortschak, biogo...@googlegroups.com
Great! So, does this seem reasonable for Intersection of multiple *step.Vector

https://gist.github.com/brentp/4419f6711917096bce83e7eddb648c85 ?

Or is there a more efficient (in LOC or CPU) way?
If this is generally close, then having a Vector.Clone() method would
be convenient.

thanks,
-Brent

Brent Pedersen

unread,
Feb 13, 2017, 4:22:52 PM2/13/17
to Dan Kortschak, biogo...@googlegroups.com
I ended up with this: https://godoc.org/github.com/brentp/rangeset
embedding step.Vector

Dan Kortschak

unread,
Feb 13, 2017, 4:51:46 PM2/13/17
to Brent Pedersen, biogo...@googlegroups.com
Is there a reason to have the step vector embedded anonymously?

Brent Pedersen

unread,
Feb 13, 2017, 5:00:50 PM2/13/17
to Dan Kortschak, biogo...@googlegroups.com
You mean instead of naming it, like:

type RangeSet struct {
v *step.Vector
}

?
I guess originally, I was thinking the user might want to access some
of the public methods from the step.Vector.

On Mon, Feb 13, 2017 at 2:51 PM, Dan Kortschak

Dan Kortschak

unread,
Feb 13, 2017, 5:59:50 PM2/13/17
to Brent Pedersen, biogo...@googlegroups.com

You've made it type safe by providing the comparer, so you may as well go the whole way and stop people doing the wrong thing. If there are things that are worth exposing, expose them explicitly.

--
afk
Dan


From: Brent Pedersen
Sent: Tuesday, 14 February, 08:30
Subject: Re: [biogo-user] interval set operations
To: Dan Kortschak
Cc: biogo...@googlegroups.com

You mean instead of naming it, like: type RangeSet struct { v *step.Vector } ? I guess originally, I was thinking the user might want to access some of the public methods from the step.Vector. On Mon, Feb 13, 2017 at 2:51 PM, Dan Kortschak wrote: > Is there a reason to have the step vector embedded anonymously? > > On Mon, 2017-02-13 at 14:22 -0700, Brent Pedersen wrote: >> I ended up with this: https://godoc.org/github.com/brentp/rangeset >> embedding step.Vector >> >> On Mon, Feb 13, 2017 at 10:18 AM, Brent Pedersen >> wrote: >> > >> > Great! So, does this seem reasonable for Intersection of multiple >> > *step.Vector >> > >> > https://gist.github.com/brentp/4419f6711917096bce83e7eddb648c85 ? >> > >> > Or is there a more efficient (in LOC or CPU) way? >> > If this is generally close, then having a Vector.Clone() method >> > would >> > be convenient. >> > >> > thanks, >> > -Brent >> > >> > On Sat, Feb 11, 2017 at 9:18 PM, Dan Kortschak >> > wrote: >> > > >> > > Ahh, OK. biogo/store/step will do that for you. >> > > >> > > On Sat, 2017-02-11 at 21:11 -0700, Brent Pedersen wrote: >> > > > >> > > > I mean a set where e.g. I can union (1, 9) and (7, 14) and it >> > > > will >> > > > give (1, 14) and then I can subtract (5, 8) and get (1, 5), (9, >> > > > 14)

Brent Pedersen

unread,
Feb 13, 2017, 6:23:49 PM2/13/17
to Dan Kortschak, biogo...@googlegroups.com
Agreed.
I made the change.
Reply all
Reply to author
Forward
0 new messages