Implementation limits when accessing circular lists

Adam Warner

unread,

Jun 29, 2003, 12:32:26 AM6/29/03

to

Hi all,

Today is a great day. I grokked internal list structures, conses and
dotted lists as I came to understand why association lists were the
preferred form for lists of name/value pairs. Until this BFO (Blinding
Flash of the Obvious) the list (("name1" . "value1") ("name2" . "value2"))
felt more verbose than (("name1" "value1") ("name2" "value2")). Of course
the second list is actually a larger structure, as can be seen by
explicitly writing out the conses:

(("name1" . ("value1" . nil)) ("name2" . ("value2" . nil)))

After these BFOs circular lists just fell into place:

(setf *print-circle* t)
(defparameter c (list "adam"))
(setf (cdr c) c)

So now I can take however many cdrs of c as I like and never reach the end
of the list. A few seconds later I test how far implementations can cope
with this. First CLISP:

(nth 100000000 c)

*** - nth: 100000000 is not a nonnegative fixnum and therefore not a valid index

Then CMUCL:

* (nth 1000000000 c)

Type-error in kernel::object-not-type-error-handler:
1000000000 is not of type (mod 536870911)

Restarts:
0: [abort] Return to Top-Level.

Debug (type H for help)

(nth 2 1000000000 #1=("adam" . #1#))[:external]
Source:
; File: target:code/list.lisp
(defun nth (n list)
"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))

And a similar result with SBCL. Do people concur that is not OK for these
implementations to choke? The HyperSpec states that nth can be supplied
with a "non-negative integer", not just a non-negative fixnum. Would
people approve of adding this as a conformance test?

While it is not presently efficient to access such high value cdrs of a
circular list, circular lists are a nice abstraction and I don't like to
see code break for no better reason than a fixnum becomes a bignum.

Regards,
Adam

Nick Levine

unread,

Jun 29, 2003, 6:18:20 AM6/29/03

to

It fails in LispWorks (Windows, version 4.2.7) too:

CL-USER 8 > (nth most-positive-fixnum x)
NIL

CL-USER 9 > (nth (1+ most-positive-fixnum) x)

Error: 8388608 is not a valid index for #1=(NIL . #1#).
1 (abort) Return to level 0.
2 Return to top loop level 0.

Type :b for backtrace, :c <option number> to proceed, or :? for other options

CL-USER 10 : 1 >

- nick

Paul F. Dietz

unread,

Jun 29, 2003, 9:11:32 AM6/29/03

to

Adam Warner wrote:

> And a similar result with SBCL. Do people concur that is not OK for these
> implementations to choke? The HyperSpec states that nth can be supplied
> with a "non-negative integer", not just a non-negative fixnum. Would
> people approve of adding this as a conformance test?

Yes, that's a bug. Standardized functions that take lists are not normally
required to handle circular lists, but NTH explicitly mentions that it should
be able to.

I'm not going to add a test for it to ansi-tests, though, since it would take
too long to run.

Paul

Adam Warner

unread,

Jun 29, 2003, 10:31:23 AM6/29/03

to

Hi Paul F. Dietz,

Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.

(nth (1+ most-positive-fixnum) '#1=(t . #1#))

*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

And even if it wasn't an instantaneous failure we're talking about
approximate a second when running interpreted on modest 32-bit hardware.
$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz : 530.046

CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))

Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes

I wouldn't recommend testing this on 64-bit implementations though :-)

With CMUCL/SBCL the test also fails with a fixnum:

* (nth most-positive-fixnum '#1=(t . #1#))

Type-error in kernel::object-not-type-error-handler:
536870911 is not of type (mod 536870911)

Restarts:
0: [abort] Return to Top-Level.

Debug (type H for help)

(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)

"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))

I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!

Regards,
Adam

Adam Warner

unread,

Jun 29, 2003, 10:35:15 AM6/29/03

to

Hi Paul F. Dietz,

Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.

(nth (1+ most-positive-fixnum) '#1=(t . #1#))

*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

And even if it wasn't an instantaneous failure we're talking about

approximately a second when running interpreted on modest 32-bit hardware.

$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz : 530.046

CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))

Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes

I wouldn't recommend testing this on 64-bit implementations though :-)

With CMUCL/SBCL the test also fails with a fixnum:

* (nth most-positive-fixnum '#1=(t . #1#))

Type-error in kernel::object-not-type-error-handler:
536870911 is not of type (mod 536870911)

Restarts:
0: [abort] Return to Top-Level.

Debug (type H for help)

(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)

"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))

Paul F. Dietz

unread,

Jun 29, 2003, 10:56:43 AM6/29/03

to

Adam Warner wrote:

> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
>
> And even if it wasn't an instantaneous failure we're talking about
> approximate a second when running interpreted on modest 32-bit hardware.
> $ cat /proc/cpuinfo | grep 'MHz'
> cpu MHz : 530.046

It's not going to take only 1 second if you're searching for the 536 millionth
element, though, using the naive algorithm.

It *is* possible to implement this in time proportional to the number
of cons cells in the circular list, rather than in the integer n passed
to NTH (or NTHCDR) (idea: detect circularity using the standard two pointer
trick and use MOD to reduce n to a manageable size.) I've suggested that
to the cmucl and sbcl implementors.

> I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!

Good question; I have MOST-POSITIVE-FIXNUM (and integer near it) in the 'universe'
of values that ansi-tests passes to various functions.

Paul

prunes...@attbi.com

unread,

Jun 29, 2003, 12:15:17 PM6/29/03

to

"Paul F. Dietz" <di...@dls.net> writes:

> Yes, that's a bug. Standardized functions that take lists are not normally
> required to handle circular lists, but NTH explicitly mentions that it should
> be able to.
>
> I'm not going to add a test for it to ansi-tests, though, since it would take
> too long to run.

You have your own opinion of what is `too long', but taking
most-positive-fixnum CDRs should be pretty damn quick. Especially if
the list in question is all in the cache.

Kent M Pitman

unread,

Jun 30, 2003, 2:41:17 AM6/30/03

to

Adam Warner <use...@consulting.net.nz> writes:

> Thanks for the reply Paul. At the moment it's an instantaneous failure on
> CLISP, CMUCL and SBCL.
>
> (nth (1+ most-positive-fixnum) '#1=(t . #1#))
>
> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index

Perhaps it should instead quickly determine the cycle length using the cute
algorithm integer-length uses and and then use modular arithmetic to figure
out what integer element to access...

> I wouldn't recommend testing this on 64-bit implementations though :-)

In practice, the likelihood of an actual list of distinct elements that
long is quite small, though yes, the theoretical problem couuld be large.

Adam Warner

unread,

Jun 30, 2003, 4:21:16 AM6/30/03

to

Impressively, Alexey Dejneka has reported to me that this is fixed in the
CVS version of SBCL (0.8.1.11). There has also been a short efficiency
discussion on the development mailing list:
<http://news.gmane.org/thread.php?group=gmane.lisp.steel-bank.devel>

Regards,
Adam

Rob Warnock

unread,

Jun 30, 2003, 4:59:07 AM6/30/03

to

Adam Warner <use...@consulting.net.nz> wrote:
+---------------
| Paul F. Dietz:

| > I'm not going to add a test for it to ansi-tests, though, since it
| > would take too long to run.
|
| Thanks for the reply Paul. At the moment it's an instantaneous failure on

| CLISP, CMUCL and SBCL. ...
...

+---------------

On CMUCL-18e, using (1- most-positive-fixnum), it's *far* faster than that!!

cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:

; Evaluation took:
; 0.0 seconds of real time
; 5.0e-6 seconds of user run time
; 7.0e-6 seconds of system run time
; 3,212 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>

Now, granted, I have a 1.4 GHz Athlon ("1600+"), but it's not *that*
much faster than Adam's machine. Hmmm... Let's try it on a lowly 133 MHz
Pentium II:

cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:

; Evaluation took:
; 0.0 seconds of real time
; 6.6e-5 seconds of user run time
; 10.0e-7 seconds of system run time
; 922 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>

Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
the circularity implies a constant result?

Answer: No! Looks like some artifact with the TIME macro instead.
When you run it this way you get *much* more reasonable results:

cmu> (defun foo (x)
(declare (fixnum x) (optimize (speed 3)))
(nth x '#1=(t . #1#)))

FOO
cmu> (compile 'foo)
; Compiling LAMBDA (X):
; Compiling Top-Level Form:

FOO
NIL
NIL
cmu> (time (foo (1- most-positive-fixnum)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:

; Evaluation took:
; 1.59 seconds of real time
; 1.572572 seconds of user run time
; 0.013583 seconds of system run time
; 2,234,478,100 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>

on the 1.4 GHz Athlon, and:

cmu> (time (foo (1- most-positive-fixnum)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:

; Evaluation took:
; 25.19 seconds of real time
; 24.542253 seconds of user run time
; 0.047189 seconds of system run time
; 3,317,991,740 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>

on the 133 MHz Pentium.

-Rob

-----
Rob Warnock, PP-ASEL-IA <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Christophe Rhodes

unread,

Jun 30, 2003, 5:12:25 AM6/30/03

to

rp...@rpw3.org (Rob Warnock) writes:

> cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))

> ; 0.0 seconds of real time

> Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
> the circularity implies a constant result?

No, the circularity has nothing to do with it. It's detecting that
both (1- most-positive-fixnum) and (quote <anything>) are constants,
and CL:NTH is a known function with defined semantics, so it's
constant-folding the call at compile-time.

> Answer: No! Looks like some artifact with the TIME macro instead.
> When you run it this way you get *much* more reasonable results:
>
> cmu> (defun foo (x)
> (declare (fixnum x) (optimize (speed 3)))
> (nth x '#1=(t . #1#)))

X here isn't a constant, so the compiler can't constant-fold the call
to NTH.

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%") (pprint #36rJesusCollegeCambridge)

Tim Daly, Jr.

unread,

Jun 30, 2003, 10:19:36 PM6/30/03

to

"Paul F. Dietz" <di...@dls.net> writes:

> It *is* possible to implement this in time proportional to the number
> of cons cells in the circular list, rather than in the integer n passed
> to NTH (or NTHCDR) (idea: detect circularity using the standard two pointer
> trick and use MOD to reduce n to a manageable size.) I've suggested that
> to the cmucl and sbcl implementors.
>

Could someone please clue me in on the "standard two pointer trick"?

Thanks,
Tim

--

Rob Warnock

unread,

Jun 30, 2003, 10:25:49 PM6/30/03

to

Christophe Rhodes <cs...@cam.ac.uk> wrote:
+---------------

| rp...@rpw3.org (Rob Warnock) writes:
| > cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
| > ; 0.0 seconds of real time
| > Now that's *really* weird! Is CMUCL's *compiler* somehow detecting
| > that the circularity implies a constant result?
|
| No, the circularity has nothing to do with it. It's detecting that
| both (1- most-positive-fixnum) and (quote <anything>) are constants,
| and CL:NTH is a known function with defined semantics, so it's
| constant-folding the call at compile-time.

+---------------

*DOH!!*

Thanks, that also explains the long silence (much longer on the
slower machine!) that I saw between the "Compiling LAMBDA NIL:"
and "Compiling Top-Level Form:" messages... ;-} ;-}

Paul F. Dietz

unread,

Jun 30, 2003, 10:38:23 PM6/30/03

to

Tim Daly, Jr. wrote:

> Could someone please clue me in on the "standard two pointer trick"?

It's used to detect if a list is circular. Two pointers are marched
down the list, one twice as fast as the other, and if they ever become
equal after time 0 the list is circular (and vice versa.)

(defun circular-list-p (x &aux (y x))
(loop
(when (or (atom y) (atom (cdr y))) (return nil))
(setf y (cddr y)
x (cdr x))
(when (eq x y) (return t))))

Paul

Rob Warnock

unread,

Jun 30, 2003, 10:43:32 PM6/30/03

to

Tim Daly, Jr. <t...@tenkan.org> wrote:
+---------------

| "Paul F. Dietz" <di...@dls.net> writes:
| > (idea: detect circularity using the standard two pointer trick
| > and use MOD to reduce n to a manageable size.)
|

| Could someone please clue me in on the "standard two pointer trick"?

+---------------

Do a web search for:

tortoise hare circular

or perhaps better:

tortoise hare "circular list"

Also see CLHS "Function LIST-LENGTH". As shown there, a common approach
is to have the hare move at twice the speed of the tortoise, but many
other winning strategies are possible [iff your termination tests are
correct!]. E.g., make the tortoise move at half the speed of the hare.
[It is useful to ask why/how this might be different than the previous,
and whether or not one might be "better" than the other in some sense.]

Tim Daly, Jr.

unread,

Jul 1, 2003, 5:24:42 PM7/1/03

to

Thanks guys!

-Tim

--