Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Generalizing Complement to handle multiple occurrences

2 views
Skip to first unread message

Niels R. Walet

unread,
Sep 15, 2010, 8:05:29 PM9/15/10
to
Mark Coleman wrote:
> Greetings,
>
> I'm wondering how one can efficiently generalize the built-in
> Complement function to return multiple occurrences of elements. For
> instance, the current version generates
>
> a={1,2,2,3,4,4,4,4}
> b={3,5,7}
>
> Complement[a,b]={1,2,4}
>
> I'm looking for a way to define
>
> myComplement[a,b]={1,2,2,4,4,4,4}
>
> My current code is very inefficient!
>
> Thanks,
>
> Mark
>
>
Stupid oneliner:
c = a; Apply[(c = DeleteCases[c, #]) &, b]

--
Prof. Niels R. Walet Phone: +44(0)1613063693
School of Physics and Astronomy Fax: +44(0)1613064303
The University of Manchester Mobile: +44(0)7905438934
Manchester, M13 9PL, UK room 7.7, Schuster Building
email: Niels...@manchester.ac.uk
web: http://www.theory.physics.manchester.ac.uk/~mccsnrw


Leonid Shifrin

unread,
Sep 15, 2010, 8:05:40 PM9/15/10
to
Hi Mark,

The following will be pretty fast:

Clear[memberPositions];
memberPositions[x_List, y_List] :=
Module[{tag},
Pick[Range[Length[x]],
Replace[x, Dispatch[Thread[Rule[Intersection[x, y], tag]]], {1}],
tag]]

Clear[multComplement]
multComplement[x_, y_] :=
Delete[x, List /@ memberPositions[x, Union[y]]];

In[5]:=
a={1,2,2,3,4,4,4,4};
b={3,5,7};

In[7]:= multComplement[a,b]
Out[7]= {1,2,2,4,4,4,4}

Performance test:

In[8]:=

la = RandomInteger[{1,20000},100000];
lb = RandomInteger[{1,20000},100000];

In[10]:= multComplement[la,lb]//Short//Timing

Out[10]=
{0.156,{2319,10829,2465,13044,18567,15250,11226,18576,<<638>>,1836,9996,15929,9805,15732,7853,2919,2070}}

The result comes out unsorted, in the order in which the elements were
originally stored in the first list. If this is a problem, just sort the
result.

Hope this helps.

Regards,
Leonid

Christoph Lhotka

unread,
Sep 16, 2010, 5:57:54 AM9/16/10
to
Hello,

for 1 000 000 random elements on my computer a factor 3 slower than the
built in Complement function:

a = RandomInteger[{1, 100}, 1000000];
b = RandomInteger[{30, 70}, 1000000];

DeleteCases[a, Alternatives@@Union[b]]

Ciao,

Christoph

Leonid Shifrin

unread,
Sep 16, 2010, 5:58:05 AM9/16/10
to
Hi Mark,

This is a follow-up to my previous post. There is a much simpler
method than the one I posted before,
but with somewhat different performance characteristics:

Clear[unsortedComplement];
unsortedComplement[x_, y_] :=
x /. Dispatch[Thread[Union[y] -> Sequence[]]];

This one can be either somewhat (up to 2-3 times) faster or somewhat (up to
2-3 times) slower than
the previous one, depending on the fraction of deleted elements in a first
list and on the size of the
(union of) the second list. It will likely be faster if the more significant
part of the list is deleted, or if
the second list gets smaller, and slower otherwise.

Regards,
Leonid

Leonid Shifrin

unread,
Sep 17, 2010, 6:39:57 AM9/17/10
to
Christoph,

This is only so because your ranges for random integers are so much smaller
than the number
of generated integers. After Union in your code is done, you are left with
at most 41 numbers,
which is a small enough list so that Alternatives can do a pretty good job
to hide the true
Length[a]*Length[b] complexity of this approach. Try this:

a = RandomInteger[{1, 100000}, 1000000];
b = RandomInteger[{30, 70000}, 1000000];

I was not patient enough to see it finished. Of course, that matters mostly
for the second list,
in this method.

Regards,
Leonid


On Thu, Sep 16, 2010 at 1:58 PM, Christoph Lhotka <
christop...@univie.ac.at> wrote:

> Hello,
>
> for 1 000 000 random elements on my computer a factor 3 slower than the
> built in Complement function:
>
> a = RandomInteger[{1, 100}, 1000000];
> b = RandomInteger[{30, 70}, 1000000];
>
> DeleteCases[a, Alternatives@@Union[b]]
>
> Ciao,
>
> Christoph
>

0 new messages