Faster ways to unionize intersecting sets?

11 views
Skip to first unread message

lsha

unread,
Oct 4, 2006, 6:27:15 AM10/4/06
to
Hi,

I need to search a list of sets and unionize any sets that intersect. The
following functions seems to work but may be there are faster ways?

Thanks in advance.

intersectQ[s1_, s2_] := If[Intersection[s1, s2] != {}, True, False]

mergeSets[s_List] := Module[
{h, r, singles, club, cnt},
cnt = Length[s];
If[cnt < 2, Return[s]];
singles = {};
club = s;

While[cnt >= 2,
h = club[[1]];
r = Rest[club];
hit = 0;
club = If[intersectQ[h, #], hit = 1; Union[h, #], #] & /@ r;
If[hit == 0, singles = Append[singles, h]];
--cnt;
];
Join[singles, club]
]


Jean-Marc Gulliet

unread,
Oct 5, 2006, 4:01:28 AM10/5/06
to
Using functional programming, mergeSets2 (see In[5]) is 15 to 20 times
faster than the original procedural code.

In[1]:=


intersectQ[s1_, s2_] := If[Intersection[s1, s2] != {},
True, False]
mergeSets[s_List] := Module[{h, r, singles, club,
cnt}, cnt = Length[s]; If[cnt < 2, Return[s]];
singles = {}; club = s; While[cnt >= 2,
h = club[[1]]; r = Rest[club]; hit = 0;

club = (If[intersectQ[h, #1], hit = 1;
Union[h, #1], #1] & ) /@ r;


If[hit == 0, singles = Append[singles, h]];
--cnt; ]; Join[singles, club]]

In[3]:=
s = Table[Table[Random[Integer, {1, 1000}],
{Random[Integer, {1, 20}]}], {100}];

In[4]:=
t1 = Timing[res1 = mergeSets[s]; ][[1]]

Out[4]=
1.234 Second

In[5]:=
mergeSets2[s_List] := Module[{list, pos, singles,
club}, list = MapIndexed[Intersection[#1,
Flatten[Drop[s, #2]]] & , s];
pos = Position[list, {}]; singles =
Extract[s, pos]; club =
Union[Flatten[Complement[s, singles]]];
{singles, club}]

In[6]:=
t2 = Timing[res2 = mergeSets2[s]; ][[1]]

Out[6]=
0.063 Second

In[7]:=
t1/t2

Out[7]=
19.5873

Regards,
Jean-Marc

lsha

unread,
Oct 7, 2006, 7:07:29 AM10/7/06
to
Hi,

I tried mergeSets2[] and the answer is different from mergeSets[].
An example:
In[34]:=
s = {{1, 2}, {2, 3}, {4, 5}, {6, 7}, {7, 5}, {9, 10}, {11, 12}, {13, 14}};


In[35]:=
mergeSets[s]
Out[35]=
{{1, 2, 3}, {4, 5, 6, 7}, {9, 10}, {11, 12}, {13, 14}}


In[36]:=
mergeSets2[s]
Out[36]=
{{{9, 10}, {11, 12}, {13, 14}}, {1, 2, 3, 4, 5, 6, 7}}

The non-intersecting singles are correct but the intersecting sets are all
merged together with mergeSets2[].

Regards,
Ling Sha

"Jean-Marc Gulliet" <jeanmarc...@gmail.com> wrote in message
news:eg2e4o$70u$1...@smc.vnet.net...

dkr

unread,
Oct 11, 2006, 1:57:14 AM10/11/06
to
Here is a functional programming approach to your problem. It is not
necessarily any faster than your procedural approach, but is more in
the spirit of how Mathematica programming is normally done.

In[1]:=
intersectQ[s1_, s2_] := If[Intersection[s1, s2] =!= {}, True, False];
g[t1_List,t2_List]/;intersectQ[t1,t2]:=(hit=1;Union[t1,t2]);
g[t1_List,t2_List]:=t2;
mergeSetsAlt[s:{__List}]:=
Module[{z,part1x,restx},
Reap[Nest[
Function[x,hit=0;part1x=x[[1]];restx=Rest[x];
z=(g[part1x,#]&/@restx); If[hit==0,Sow[part1x];z,z]],s,
Length[s]]][[2,1]]];

In[5]:=
s1={{1,2},{2,3},{4,5},{6,7},{7,5},{9,10},{11,12},{13,14}};
mergeSetsAlt[s1]
Out[6]=
{{1,2,3},{4,5,6,7},{9,10},{11,12},{13,14}}

The g function essentially substitutes for your If function, the
Reap/Sow construct handles your saving of the singles via Append, and
Nest essentially substitutes for While. In cases where you have a
large number of sets, most containing a large number of elements, the
following alternative version of intersectQ may be faster:

intersectAltQ[s1_List, s2_List] := If[Cases[s1, a_ /; MemberQ[s2,
a], {1}, 1] =!= {}, True, False];

Since all you want to determine is whether the two sets intersect, it
is not necessary to compute the entire Intersection[s1,s2]. As soon as
intersectAltQ finds the first common element, it stops.

dkr

Ray Koopman

unread,
Oct 11, 2006, 2:03:24 AM10/11/06
to
I think this gives what you want, but in a different order.

In[1]:= mergeSets3[s_List] := Module[
{c = Outer[Sign@Length@Intersection@##&,s,s,1]},
While[c != (c = Sign[c.c])]; Union @@@ (Pick[s,#,1]& /@ Union @ c) ]

In[2]:= s = {{1,2},{2,3},{4,5},{6,7},{7,5},{9,10},{11,12},{13,14}};

In[3]:= mergeSets3[s]

Out[3]= {{13,14},{11,12},{9,10},{4,5,6,7},{1,2,3}}

Peter Pein

unread,
Oct 12, 2006, 5:49:53 AM10/12/06
to
lsha schrieb:

> Hi,
>
> I tried mergeSets2[] and the answer is different from mergeSets[].
> An example:
> In[34]:=
> s = {{1, 2}, {2, 3}, {4, 5}, {6, 7}, {7, 5}, {9, 10}, {11, 12}, {13, 14}};
>
>
> In[35]:=
> mergeSets[s]
> Out[35]=
> {{1, 2, 3}, {4, 5, 6, 7}, {9, 10}, {11, 12}, {13, 14}}
>
>
> In[36]:=
> mergeSets2[s]
> Out[36]=
> {{{9, 10}, {11, 12}, {13, 14}}, {1, 2, 3, 4, 5, 6, 7}}
>
> The non-intersecting singles are correct but the intersecting sets are all
> merged together with mergeSets2[].
>
> Regards,
> Ling Sha
>

Hi,

just my 2 cents:

s={{1,2},{2, 3},{4, 5},{6,7},{7,5},{9,10},{11,12},{13,14}};
s//.{a___,x_List,b___,y_List,c___}/;Intersection[x,y]=!={}:>{a,Union[x,y],b,c}

--> {{1, 2, 3}, {4, 5, 6, 7}, {9, 10}, {11, 12}, {13, 14}}

Peter

Andrzej Kozlowski

unread,
Oct 12, 2006, 6:05:06 AM10/12/06
to
Actually this question has already been considered on this list more
than once but with a different interpretation. It is actually
equivalent to finding the transitive closure of a set of equivalence
classes. You can find many interesting and very fast solutions in the
thread entitled: "Computing sets of equivalences" that run in 2004
(and there have been other essentially equivalent ones). Here I will
just quote the solution provided by Carl Woll, which may well have
been the fastest:


addequiv[a_, a_] := 1
addequiv[ptr[a_], ptr[b_]] := (ptr[a] = ptr[b] = class[a];
equivset[class[a]] = {a, b})
addequiv[ptr[a_], b_class] := (ptr[a] = b; equivset[b] = {equivset
[b], a})
addequiv[a_class, ptr[b_]] := (ptr[b] = a; equivset[a] = {equivset
[a], b})
addequiv[a_class, b_class] := (equivset[a] = {equivset[a], equivset[b]};
equivset[b] =.; b = a)
getequivs[eq_] := Block[{ptr, class, equivset},
Apply[addequiv, Map[ptr, eq, {2}], {1}];
Flatten /@
DownValues[equivset][[All, 2]]]

In your case:

s = {{1, 2}, {2, 3}, {4, 5}, {6, 7}, {7, 5}, {9, 10}, {11, 12}, {13,
14}};


getequivs[s]

{{1,2,3},{6,7,4,5},{9,10},{11,12},{13,14}}

test it on large examples and you will see how fast it is.

Andrzej Kozlowski


On 7 Oct 2006, at 20:06, lsha wrote:

> Hi,
>
> I tried mergeSets2[] and the answer is different from mergeSets[].
> An example:
> In[34]:=
> s = {{1, 2}, {2, 3}, {4, 5}, {6, 7}, {7, 5}, {9, 10}, {11, 12},
> {13, 14}};
>
>
> In[35]:=
> mergeSets[s]
> Out[35]=
> {{1, 2, 3}, {4, 5, 6, 7}, {9, 10}, {11, 12}, {13, 14}}
>
>
> In[36]:=
> mergeSets2[s]
> Out[36]=
> {{{9, 10}, {11, 12}, {13, 14}}, {1, 2, 3, 4, 5, 6, 7}}
>
> The non-intersecting singles are correct but the intersecting sets
> are all
> merged together with mergeSets2[].
>
> Regards,
> Ling Sha
>

> "Jean-Marc Gulliet" <jeanmarc...@gmail.com> wrote in message
> news:eg2e4o$70u$1...@smc.vnet.net...

Andrzej Kozlowski

unread,
Oct 12, 2006, 6:10:09 AM10/12/06
to
I wrote the post below in a hurry and as a result a mathematical non-sequitur
"the transitive closure of a set of equivalence classes" managed to
sneak in. In fact, there are two related mathematical problems that
got confused here: that of finding the equivalence classes of an
equivalence relation specified by a list of pairs, each consisting of
equivalent elements and that of finding the transitive closure of a
relation (not necessarily an equivalence relation), given in a
similar way. In the latter case one obtains a list of pairs which
defines a relation that is transitive rather than a list of
equivalence classes.

In this case, of course, the original problem was to find the
equivalence classes (as sets) of an equivalence relation generated by
a set of pairs. This is equivalent to the problem of "Unionizing
interesecting pairs", rather than sets, where each pair represents
equivalent elements. To make the method work on lists containing sets
with more than two elements we need to modify the method:


addequiv[a_,a_]:=1
addequiv[ptr[a_],ptr[b_]]:=(ptr[a]=ptr[b]=class[a];


equivset[class[a]]={a,b})

addequiv[ptr[a_],b_class]:=(ptr[a]=b;equivset[b]={equivset[b],a})
addequiv[a_class,ptr[b_]]:=(ptr[b]=a;equivset[a]={equivset[a],b})


addequiv[a_class,b_class]:=(equivset[a]={equivset[a],equivset[b]};
equivset[b]=.;b=a)
getequivs[eq_]:=Block[{ptr,class,equivset},Apply[addequiv,Map[ptr,eq,
{2}],{
1}];
Flatten/@DownValues[equivset][[All,2]]]

unionizeInteresectingSets[l_]:=getequivs[Flatten[Partition[#,2,1]&/@l,
1]]

Now, for example,


s={{1,2,3},{3,4},{5,6,7},{7,8},{9,10}};


unionizeInteresectingSets[s]


{{1,2,3,4},{5,6,7,8},{9,10}}

I would expect this to be still a very fast method, but I have not
made any tests.

Andrzej Kozlowski

ab_...@prontomail.com

unread,
Oct 17, 2006, 3:26:26 AM10/17/06
to

Each sublist can be viewed as a chain of vertices in a graph; then
mergeSets effectively finds the connected components:

<<discretemath`
mergeSetsCC[$LL_] := Module[
{LL = $LL, Lelem, Lpair, M, n},
n = Length[Lelem = Union @@ LL];
LL = LL /. Dispatch@ Thread[Lelem -> Range@ n];
Lpair = Flatten[Partition[#, 2, 1]& /@ LL, 1];
M = SparseArray[Lpair -> Array[1&, Length@ Lpair], {n, n}];
M += Transpose@ M;
Lelem[[#]]& /@ StrongComponents[M]
]

This is quite fast because of the efficient internal implementation of
StrongComponents:

In[3]:= data = Array[Random[Integer, {1, 2*10^5}]&, {10^4, 10}];
Timing[mergeSetsCC[data];]

Out[4]= {0.562*Second, Null}

Maxim Rytin
m...@inbox.ru

Adriano Pascoletti

unread,
Oct 17, 2006, 3:29:29 AM10/17/06
to
The problem of unionizing a collection s={s1,s2,...} of intersecting
sets si={i1,i2,...} posted by "lsha" <lsha at earthlink.net> on Sat,
7 Oct 2006 has been solved in several ways. Here is another one.

Consider the elements i1, i2,... of each set si as vertices of an
undirected graph with edges connecting every pair of vertices of si.
Then the unionization of intersecting sets amounts to finding the
simply connected components of the graph.

Define the function adjacencies s.t. adjacencies[s] returns the
adjacency lists of the graph
In[1]:=
adjacencies = Function[s, Reap[Function[i, (Sow[#1, s[[i]]] & ) /@ s
[[i]]] /@ Range[Length[s]],
_, {#1, Union[#2]} & ][[2]]];

findcomponents[s] will find the connected components
In[2]:=
findcomponents = Function[vertices, Scan[If[unvisited[#1], v[#1] = +
+m; dfsvisit[#1]] & ,
vertices]];
dfsvisit = Function[i, Scan[If[unvisited[#1], v[#1] = m; dfsvisit
[#1]] & , a[i]]];

with

In[4]:=
unvisited = Head[v[#1]] === v & ;

The main function is

In[5]:=
unionize[s:{___List}] := Block[{a, m = 0, v, L = Union[Flatten[s]]},
Scan[(a[#1[[1]]] = #1[[2]]) & , adjacencies[s]]; findcomponents[L];
Reap[(Sow[#1, v[#1]] & ) /@ L, _][[2]]];

Some examples,


In[6]:=
unionize[{{1, 2, 3}, {3, 4}, {5, 6, 7}, {7, 8}, {9, 10}}]

Out[6]=
{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10}}

In[7]:=
unionize[{{1, 2, 3}, {3, 4}, {5, 6, 7}, {7, 8}, {8, 3, 10}}]

Out[7]=
{{1, 2, 3, 4, 5, 6, 7, 8, 10}}

In[8]:=
unionize[{{1, 2}, {2, 3}, {4, 5}, {6, 7}, {7, 5}, {9, 10}, {11, 12},
{13, 14}}]

Out[8]=


{{1, 2, 3}, {4, 5, 6, 7}, {9, 10}, {11, 12}, {13, 14}}

Adriano Pascoletti

Reply all
Reply to author
Forward
0 new messages