Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Remove duplicate elements from a list

18 views
Skip to first unread message

Tony Tanzillo

unread,
Mar 20, 1998, 3:00:00 AM3/20/98
to

Just wondering if anyone has come up with a more
effecient way to do this (in AutoLISP, not VLISP),
assuming 1/3 of the total number of elements in
the list are unique e.g: '(1 2 3 4 5 5 6 6 7 7 8 8)

(defun remove-duplicates (inlist / dupes)
(apply 'append
(mapcar
'(lambda (x)
(if (not (member x dupes))
(list (car (setq dupes (cons x dupes))))
)
)
inlist
)
)
)

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

PTB

unread,
Mar 20, 1998, 3:00:00 AM3/20/98
to tony.t...@worldnet.att.net

Here is how I would do it:

(defun c:qq ( / x)
(setq
x '(1 2 3 4 5 5 6 6 7 7 8 8 8)
x (qq x)
)
(princ "\n")
(princ x)
(princ)
)

(defun qq (oldlst / newlst)
(foreach x oldlst
(if (not (member x newlst))
(setq newlst (cons x newlst))
)
)
(reverse newlst)
)

Tony Tanzillo wrote:

--
_____________________________________________
|
| Paul Boersma - Administrator
| Architecture, Engineering & Construction Online Network
| http://aec-online.net
| Phone: (608) 845-3202
| Fax: (608) 845-5409
| Email: beca...@execpc.com
| admini...@aec-online.net
|_____________________________________________

Patrick Hughes

unread,
Mar 20, 1998, 3:00:00 AM3/20/98
to

"Tony Tanzillo" <tony.t...@worldnet.att.net> wrote:

>Just wondering if anyone has come up with a more
>effecient way to do this (in AutoLISP, not VLISP),
>assuming 1/3 of the total number of elements in
>the list are unique e.g: '(1 2 3 4 5 5 6 6 7 7 8 8)

>(defun remove-duplicates (inlist / dupes)
> (apply 'append
> (mapcar
> '(lambda (x)
> (if (not (member x dupes))
> (list (car (setq dupes (cons x dupes))))
> )
> )
> inlist
> )
> )
>)


Here's my offering:

(defun my-rem-dups (inlist / dupes)
(setq dupes (list (car inlist)))
(foreach memb inlist
(if (not (member memb dupes))
(setq dupes (append dupes (list memb) ))
)
)
)

(defun test1 ()
(setq time-in (getvar "TDUSRTIMER"))
(repeat 1000 (remove-duplicates inlist))
(setq time-out (rtos (- (getvar "TDUSRTIMER") time-in) 2 10))
)


(defun test2 ()
(setq time-in (getvar "TDUSRTIMER"))
(repeat 1000 (my-rem-dups inlist))
(setq time-out (rtos (- (getvar "TDUSRTIMER") time-in) 2 10 ))
)

Command: (test1)
".0000152778"

Command: (test2)
".0000076389"

Interestly heres what happens under different senarios;

make "dupes" global, reset to nil between repeat passes

Command: (test1)
".0000120370"

Command: (test2)
".0000076389"


in my-rem-dups comment out (setq dupes (list (car inlist)))

Command: (test2)
".0000076389"

in my-rem-dups make "dupes" global and
comment out (setq dupes (list (car inlist)))

Command: (test2)
".0000031250"

Each routine was run individually in a new session


Patrick Hughes
Engineered Design Solutions
e-mail duhv...@inwave.com
homepage http://www.inwave.com/~duhvinci/
--------------------------------------------------
Fire-up AutoCAD with Ignite! - a sizeable
"file open" dialog box for AutoCAD r13 & r14
visit http://www.inwave.com/~duhvinci/Ignite!.htm
to download a copy.
--------------------------------------------------


Matt Stachoni

unread,
Mar 20, 1998, 3:00:00 AM3/20/98
to

Tony,

Here's my best try:

(defun remove-duplicates (inlist / dupes)

(foreach x inlist


(if (not (member x dupes))

(setq dupes (cons x dupes))
)
)
(reverse dupes)
)

I don't know if it's more efficient internally, but I could try benchmarking it :)

Matt
stac...@bellatlantic.net

Tony Tanzillo

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Matt (and the others who replied) - Thanks.

This is the fastest one I've seen so far, but by
only a very small margin compared to others posted.

(defun remove-dupes (inlist / out)
(setq inlist (cons nil inlist))
(while (setq inlist (cdr inlist))
(if (not (member (car inlist) out))
(setq out (cons (car inlist) out))
)
)
(reverse out)
)

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

Matt Stachoni wrote in message <3515e8c1....@adesknews.autodesk.com>...

Stephen Tate

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Hi Tony

You could try this (if it's not too late), in my test it's about twice as
fast as the best so far!

(Defun steves-remdup (inlist / out)
(foreach x inlist
(If (member x out) T ; This saves a NOT
(setq out (cons x out))
); If
); Foreach
(Reverse out)
)

Stephen Tate

Tony Tanzillo wrote in message <6evmg2$3g...@adesknews2.autodesk.com>...

Stephen Tate

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Apologies to Matt,

I've just read your solution which is almost identical to mine. I could have
saved myself the trouble!

Stephen Tate


Stephen Tate wrote in message <6f1hvl$3g...@adesknews2.autodesk.com>...

Tony Tanzillo

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Stephen - Thanks. I think you might want to
check your testing methodology. My testing
indicates that your version is ~4X slower
than some of the others. See test results
below.

This is the fastest I've seen to date:

(defun remove-dupes (inlist / out)
(setq inlist (cons nil inlist))
(while (setq inlist (cdr inlist))
(if (not (member (car inlist) out))
(setq out (cons (car inlist) out))
)
)
(reverse out)
)


Command: REMDUPETEST

Total number of elements: 1000

Number of unique elements: 250

Working...

-------------------------------------------------
All times in seconds
Benchmark expression: (QQ INPUT)
Iterations/sample: 3

Sample Hi Low Mean Avg Total
=================================================
1 0.99 0.82 0.91 0.88 2.63
2 0.83 0.77 0.80 0.81 2.42
3 0.83 0.82 0.83 0.82 2.47

Working...

-------------------------------------------------
All times in seconds
Benchmark expression: (REMOVE-DUPES INPUT)
Iterations/sample: 3

Sample Hi Low Mean Avg Total
=================================================
1 0.88 0.77 0.83 0.82 2.47
2 0.83 0.77 0.80 0.81 2.42
3 0.82 0.77 0.79 0.79 2.36

Working...

-------------------------------------------------
All times in seconds
Benchmark expression: (STEVES-REMDUP INPUT)
Iterations/sample: 3

Sample Hi Low Mean Avg Total
=================================================
1 2.80 2.58 2.69 2.71 8.13
2 2.80 2.58 2.69 2.73 8.18
3 2.75 2.58 2.67 2.67 8.02

Tony Tanzillo

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

David - It will be included with Maximizing AutoLISP.

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

David Garrigues wrote in message <35147451...@adesknews.autodesk.com>...
>Tony is this bench marking program something you created? If so do
>you give it away? I have a couple of things I would love to test from
>time to time nothing of course that would be of major significance,
>but would thrill me just the same to know that I took some hard edges
>off a program.
>
>Thanks
>David at the CADapult
>http://home1.gte.net/davidgus
>

Tony Tanzillo

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Stephen - My benchmark code has no 'mistakes' in it.

If there was something wrong with it, it would effect
all of the results for all versions, not just yours.

Feel free to send me the code you used, and I'll have
a look, but I'm confident that mine is indifferent.

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

Stephen Tate wrote in message <6f21gh$4f...@adesknews2.autodesk.com>...
>I re-ran the test 10 times using 1000 numbers/250 unique, for each of the
>solutions. Not surprisingly, Paul, Matt and I were very close, because they
>were almost identical. Pat was the slowest, because he used Append instead
>of Cons. The only minute difference in mine was eliminating the 'NOT'
>statement. Both your original file and your final choice will give varying
>results depending on the size of of the list. As the 'inlist' gets smaller
>the routine gets faster. The only explanations for the difference in our
>results is that one of us has made a mistake, or there is an enormous
>difference between hardware/software configurations.
>
>I can email you my bench mark routine if you'd like to check it!
>
>Stephen Tate
>
> MIN MAX MEAN TOTAL
>STEVE 0.753 0.730 0.744 2.227
>MATT 0.788 0.777 0.779 2.344
>PAUL 0.916 0.777 0.798 2.491
>TONY2 1.102 1.078 1.090 3.270
>TONY1 2.064 1.900 2.032 5.996
>PAT 7.105 7.059 7.081 21.245
>
>Tony Tanzillo wrote in message <6f1jo4$3g...@adesknews2.autodesk.com>...

>>--
>>/*********************************************************/
>>/* Tony Tanzillo Design Automation Consulting */
>>/* Programming & Customization for AutoCAD & Compatibles */
>>/* ----------------------------------------------------- */
>>/* Member of the OpenDWG Alliance */
>>/* Co-Author of Maximizing AutoCAD R13 and */
>>/* Maximizing AutoLISP for AutoCAD R13/R14 */
>>/* ----------------------------------------------------- */
>>/* tony.t...@worldnet.att.net */
>>/* http://ourworld.compuserve.com/homepages/tonyt */
>>/*********************************************************/

>>>>--
>>>>/*********************************************************/
>>>>/* Tony Tanzillo Design Automation Consulting */
>>>>/* Programming & Customization for AutoCAD & Compatibles */
>>>>/* ----------------------------------------------------- */
>>>>/* Member of the OpenDWG Alliance */
>>>>/* Co-Author of Maximizing AutoCAD R13 and */
>>>>/* Maximizing AutoLISP for AutoCAD R13/R14 */
>>>>/* ----------------------------------------------------- */
>>>>/* tony.t...@worldnet.att.net */
>>>>/* http://ourworld.compuserve.com/homepages/tonyt */
>>>>/*********************************************************/

Tony Tanzillo

unread,
Mar 21, 1998, 3:00:00 AM3/21/98
to

Stephen - Oops. I pasted your code directly
to the command line, and mis-placed a paren,
which put the (reverse out) inside of the
(foreach) body.

So, yours is basically the same as the others in
time, and there doesn't seem to be any noticable
difference between using (not (member...)) and
using an "else" expression:

Sample Hi Low Mean Avg Total
=================================================

1 3.19 3.13 3.16 3.17 9.51
2 3.19 3.07 3.13 3.13 9.39
3 3.19 3.13 3.16 3.17 9.50


Sample Hi Low Mean Avg Total
=================================================

1 3.19 3.13 3.16 3.17 9.50
2 3.19 3.13 3.16 3.15 9.45
3 3.24 3.13 3.19 3.18 9.55

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

David Garrigues

unread,
Mar 22, 1998, 3:00:00 AM3/22/98
to

Stephen Tate

unread,
Mar 22, 1998, 3:00:00 AM3/22/98
to

I re-ran the test 10 times using 1000 numbers/250 unique, for each of the
solutions. Not surprisingly, Paul, Matt and I were very close, because they
were almost identical. Pat was the slowest, because he used Append instead
of Cons. The only minute difference in mine was eliminating the 'NOT'
statement. Both your original file and your final choice will give varying
results depending on the size of of the list. As the 'inlist' gets smaller
the routine gets faster. The only explanations for the difference in our
results is that one of us has made a mistake, or there is an enormous
difference between hardware/software configurations.

I can email you my bench mark routine if you'd like to check it!

Stephen Tate

MIN MAX MEAN TOTAL
STEVE 0.753 0.730 0.744 2.227
MATT 0.788 0.777 0.779 2.344
PAUL 0.916 0.777 0.798 2.491
TONY2 1.102 1.078 1.090 3.270
TONY1 2.064 1.900 2.032 5.996
PAT 7.105 7.059 7.081 21.245

Tony Tanzillo wrote in message <6f1jo4$3g...@adesknews2.autodesk.com>...

Patrick Hughes

unread,
Mar 22, 1998, 3:00:00 AM3/22/98
to

"Stephen Tate" <Stephe...@dial.pipex.com> wrote:

>I re-ran the test 10 times using 1000 numbers/250 unique, for each of the
>solutions. Not surprisingly, Paul, Matt and I were very close, because they
>were almost identical. Pat was the slowest, because he used Append instead
>of Cons. The only minute difference in mine was eliminating the 'NOT'
>statement. Both your original file and your final choice will give varying
>results depending on the size of of the list. As the 'inlist' gets smaller
>the routine gets faster. The only explanations for the difference in our
>results is that one of us has made a mistake, or there is an enormous
>difference between hardware/software configurations.

>I can email you my bench mark routine if you'd like to check it!

>Stephen Tate

> MIN MAX MEAN TOTAL
>STEVE 0.753 0.730 0.744 2.227
>MATT 0.788 0.777 0.779 2.344
>PAUL 0.916 0.777 0.798 2.491
>TONY2 1.102 1.078 1.090 3.270
>TONY1 2.064 1.900 2.032 5.996
>PAT 7.105 7.059 7.081 21.245

ok ok ok - I concede my solution is dog slow (I was "duped" by Tony's
original post which had append in it) but I sheepishly point out that
while all contenders reversed the list to arrive at an ascending list
none actually returned it by (setq dupes (reverse...)

So in a way I still win don't I??

Stephen Tate

unread,
Mar 22, 1998, 3:00:00 AM3/22/98
to

Sorry Pat,

No criticism of your routine was intended, it's just the way it worked out.
Append is the right function for the job, but the cons/reverse workaround is
quicker. In practice the routine probably wouldn't be called a ten times and
the difference would be negligible.

If you read my results they're gibberish anyway - the mins are greater that
the max and so on, they got messed up in Excel some how. They should have
read like this, some of the names have been changed to protect the innocent
;)

Min Max Mean Total
STEVE 0.63 0.75 0.71 2.09
MATT 0.65 0.77 0.71 2.13
PAUL 0.67 0.80 0.71 2.18
TONY2 0.80 0.93 0.86 2.59
TONY1 1.56 1.90 1.68 5.14
MR-X 5.31 7.36 6.12 18.79

With the other solutions, because the (reverse list) is the last function
evaluated, it is the one returned.

Stephen Tate

Patrick Hughes wrote in message <6f3uvl$4f...@adesknews2.autodesk.com>...


>"Stephen Tate" <Stephe...@dial.pipex.com> wrote:
>
>>I re-ran the test 10 times using 1000 numbers/250 unique, for each of the
>>solutions. Not surprisingly, Paul, Matt and I were very close, because
they
>>were almost identical. Pat was the slowest, because he used Append instead
>>of Cons. The only minute difference in mine was eliminating the 'NOT'
>>statement. Both your original file and your final choice will give varying
>>results depending on the size of of the list. As the 'inlist' gets smaller
>>the routine gets faster. The only explanations for the difference in our
>>results is that one of us has made a mistake, or there is an enormous
>>difference between hardware/software configurations.
>
>>I can email you my bench mark routine if you'd like to check it!
>
>>Stephen Tate
>
>> MIN MAX MEAN TOTAL
>>STEVE 0.753 0.730 0.744 2.227
>>MATT 0.788 0.777 0.779 2.344
>>PAUL 0.916 0.777 0.798 2.491
>>TONY2 1.102 1.078 1.090 3.270
>>TONY1 2.064 1.900 2.032 5.996
>>PAT 7.105 7.059 7.081 21.245
>

Tony Tanzillo

unread,
Mar 22, 1998, 3:00:00 AM3/22/98
to

There are two basic differences:

1. Foreach is replaced by (while (setq list (cdr list))...).

2. The first element in the list is referenced directly
with (car) multiple times, rather than being assigned to,
and referenced through a variable.

I don't know if either or both are responsible, but in
any case, the speed difference is only very slight.

--
/*********************************************************/
/* Tony Tanzillo Design Automation Consulting */
/* Programming & Customization for AutoCAD & Compatibles */
/* ----------------------------------------------------- */
/* Member of the OpenDWG Alliance */
/* Co-Author of Maximizing AutoCAD R13 and */
/* Maximizing AutoLISP for AutoCAD R13/R14 */
/* ----------------------------------------------------- */
/* tony.t...@worldnet.att.net */
/* http://ourworld.compuserve.com/homepages/tonyt */
/*********************************************************/

Matt Stachoni wrote in message <3515ad75...@adesknews.autodesk.com>...
>Tony,


>
>>(defun remove-dupes (inlist / out)
>> (setq inlist (cons nil inlist))
>> (while (setq inlist (cdr inlist))
>> (if (not (member (car inlist) out))
>> (setq out (cons (car inlist) out))
>> )
>> )
>> (reverse out)
>>)
>

>I'm curious - did you write this to explicitly avoid a (foreach),
>because it is a slower subr? It looks like your code is basically a
>(foreach) written in longhand.
>
>In the interests of composing the fastest and most efficient AutoLISP
>code, I'd be interested in knowing what built-in functions are
>internally faster or slower than others. I know (entget)'s a dog but
>there's no way around that one, I guess.
>
>I suppose you could have coded it


>
>(defun remove-dupes (inlist / out)

> (while (setq x (car inlist))
> (if (not (member x out))


> (setq out (cons x out))
> )

> (setq inlist (cdr inlist))
> )
> (reverse out)
>)
>
>Didn't bother to benchmark it, though. Once was enough for me!
>
>Matt

David Garrigues

unread,
Mar 23, 1998, 3:00:00 AM3/23/98
to

Again I am like a little child impatiently waiting for you book....
Pump it out there big boy :-)

David at the CADapult
http://home1.gte.net/davidgus

Matt Stachoni

unread,
Mar 23, 1998, 3:00:00 AM3/23/98
to

Patrick Hughes

unread,
Mar 23, 1998, 3:00:00 AM3/23/98
to

"Stephen Tate" <Stephe...@dial.pipex.com> wrote:

>Sorry Pat,

>No criticism of your routine was intended, it's just the way it worked out.
>Append is the right function for the job, but the cons/reverse workaround is
>quicker. In practice the routine probably wouldn't be called a ten times and
>the difference would be negligible.

>If you read my results they're gibberish anyway - the mins are greater that
>the max and so on, they got messed up in Excel some how. They should have
>read like this, some of the names have been changed to protect the innocent
>;)

> Min Max Mean Total
>STEVE 0.63 0.75 0.71 2.09
>MATT 0.65 0.77 0.71 2.13
>PAUL 0.67 0.80 0.71 2.18
>TONY2 0.80 0.93 0.86 2.59
>TONY1 1.56 1.90 1.68 5.14
>MR-X 5.31 7.36 6.12 18.79

>With the other solutions, because the (reverse list) is the last function
>evaluated, it is the one returned.

>Stephen Tate

MR-X ... LOL

Not to worry, I wasn't offended or put off by any "criticism", it was
just an exercise. I did notice the Min/Max thingy but knowing someone
would see through my smoke screen about the way the list was returned
I decided not to pick nits.

0 new messages