Today is a great day. I grokked internal list structures, conses and
dotted lists as I came to understand why association lists were the
preferred form for lists of name/value pairs. Until this BFO (Blinding
Flash of the Obvious) the list (("name1" . "value1") ("name2" . "value2"))
felt more verbose than (("name1" "value1") ("name2" "value2")). Of course
the second list is actually a larger structure, as can be seen by
explicitly writing out the conses:
(("name1" . ("value1" . nil)) ("name2" . ("value2" . nil)))
After these BFOs circular lists just fell into place:
(setf *print-circle* t)
(defparameter c (list "adam"))
(setf (cdr c) c)
So now I can take however many cdrs of c as I like and never reach the end
of the list. A few seconds later I test how far implementations can cope
with this. First CLISP:
(nth 100000000 c)
*** - nth: 100000000 is not a nonnegative fixnum and therefore not a valid index
Then CMUCL:
* (nth 1000000000 c)
Type-error in kernel::object-not-type-error-handler:
1000000000 is not of type (mod 536870911)
Restarts:
0: [abort] Return to Top-Level.
Debug (type H for help)
(nth 2 1000000000 #1=("adam" . #1#))[:external]
Source:
; File: target:code/list.lisp
(defun nth (n list)
"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))
And a similar result with SBCL. Do people concur that is not OK for these
implementations to choke? The HyperSpec states that nth can be supplied
with a "non-negative integer", not just a non-negative fixnum. Would
people approve of adding this as a conformance test?
While it is not presently efficient to access such high value cdrs of a
circular list, circular lists are a nice abstraction and I don't like to
see code break for no better reason than a fixnum becomes a bignum.
Regards,
Adam
CL-USER 8 > (nth most-positive-fixnum x)
NIL
CL-USER 9 > (nth (1+ most-positive-fixnum) x)
Error: 8388608 is not a valid index for #1=(NIL . #1#).
1 (abort) Return to level 0.
2 Return to top loop level 0.
Type :b for backtrace, :c <option number> to proceed, or :? for other options
CL-USER 10 : 1 >
- nick
> And a similar result with SBCL. Do people concur that is not OK for these
> implementations to choke? The HyperSpec states that nth can be supplied
> with a "non-negative integer", not just a non-negative fixnum. Would
> people approve of adding this as a conformance test?
Yes, that's a bug. Standardized functions that take lists are not normally
required to handle circular lists, but NTH explicitly mentions that it should
be able to.
I'm not going to add a test for it to ansi-tests, though, since it would take
too long to run.
Paul
Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.
(nth (1+ most-positive-fixnum) '#1=(t . #1#))
*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
And even if it wasn't an instantaneous failure we're talking about
approximate a second when running interpreted on modest 32-bit hardware.
$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz : 530.046
CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))
Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes
I wouldn't recommend testing this on 64-bit implementations though :-)
With CMUCL/SBCL the test also fails with a fixnum:
* (nth most-positive-fixnum '#1=(t . #1#))
Type-error in kernel::object-not-type-error-handler:
536870911 is not of type (mod 536870911)
Restarts:
0: [abort] Return to Top-Level.
Debug (type H for help)
(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)
"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))
I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!
Regards,
Adam
Thanks for the reply Paul. At the moment it's an instantaneous failure on
CLISP, CMUCL and SBCL.
(nth (1+ most-positive-fixnum) '#1=(t . #1#))
*** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
And even if it wasn't an instantaneous failure we're talking about
approximately a second when running interpreted on modest 32-bit hardware.
$ cat /proc/cpuinfo | grep 'MHz'
cpu MHz : 530.046
CLISP:
(time (nth most-positive-fixnum '#1=(t . #1#)))
Real time: 0.394791 sec.
Run time: 0.39 sec.
Space: 0 Bytes
I wouldn't recommend testing this on 64-bit implementations though :-)
With CMUCL/SBCL the test also fails with a fixnum:
* (nth most-positive-fixnum '#1=(t . #1#))
Type-error in kernel::object-not-type-error-handler:
536870911 is not of type (mod 536870911)
Restarts:
0: [abort] Return to Top-Level.
Debug (type H for help)
(nth 2 536870911 (t t t t t ...))[:external]
Source: (defun nth (n list)
"Returns the nth object in a list where the car is the zero-th element."
(car (nthcdr n list)))
> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
>
> And even if it wasn't an instantaneous failure we're talking about
> approximate a second when running interpreted on modest 32-bit hardware.
> $ cat /proc/cpuinfo | grep 'MHz'
> cpu MHz : 530.046
It's not going to take only 1 second if you're searching for the 536 millionth
element, though, using the naive algorithm.
It *is* possible to implement this in time proportional to the number
of cons cells in the circular list, rather than in the integer n passed
to NTH (or NTHCDR) (idea: detect circularity using the standard two pointer
trick and use MOD to reduce n to a manageable size.) I've suggested that
to the cmucl and sbcl implementors.
> I wonder if it's widespread that tests for a fixnum miss the most-positive-fixnum!
Good question; I have MOST-POSITIVE-FIXNUM (and integer near it) in the 'universe'
of values that ansi-tests passes to various functions.
Paul
> Yes, that's a bug. Standardized functions that take lists are not normally
> required to handle circular lists, but NTH explicitly mentions that it should
> be able to.
>
> I'm not going to add a test for it to ansi-tests, though, since it would take
> too long to run.
You have your own opinion of what is `too long', but taking
most-positive-fixnum CDRs should be pretty damn quick. Especially if
the list in question is all in the cache.
> Thanks for the reply Paul. At the moment it's an instantaneous failure on
> CLISP, CMUCL and SBCL.
>
> (nth (1+ most-positive-fixnum) '#1=(t . #1#))
>
> *** - nth: 16777216 is not a nonnegative fixnum and therefore not a valid index
Perhaps it should instead quickly determine the cycle length using the cute
algorithm integer-length uses and and then use modular arithmetic to figure
out what integer element to access...
> I wouldn't recommend testing this on 64-bit implementations though :-)
In practice, the likelihood of an actual list of distinct elements that
long is quite small, though yes, the theoretical problem couuld be large.
Regards,
Adam
On CMUCL-18e, using (1- most-positive-fixnum), it's *far* faster than that!!
cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.0 seconds of real time
; 5.0e-6 seconds of user run time
; 7.0e-6 seconds of system run time
; 3,212 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>
Now, granted, I have a 1.4 GHz Athlon ("1600+"), but it's not *that*
much faster than Adam's machine. Hmmm... Let's try it on a lowly 133 MHz
Pentium II:
cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.0 seconds of real time
; 6.6e-5 seconds of user run time
; 10.0e-7 seconds of system run time
; 922 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>
Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
the circularity implies a constant result?
Answer: No! Looks like some artifact with the TIME macro instead.
When you run it this way you get *much* more reasonable results:
cmu> (defun foo (x)
(declare (fixnum x) (optimize (speed 3)))
(nth x '#1=(t . #1#)))
FOO
cmu> (compile 'foo)
; Compiling LAMBDA (X):
; Compiling Top-Level Form:
FOO
NIL
NIL
cmu> (time (foo (1- most-positive-fixnum)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 1.59 seconds of real time
; 1.572572 seconds of user run time
; 0.013583 seconds of system run time
; 2,234,478,100 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>
on the 1.4 GHz Athlon, and:
cmu> (time (foo (1- most-positive-fixnum)))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 25.19 seconds of real time
; 24.542253 seconds of user run time
; 0.047189 seconds of system run time
; 3,317,991,740 CPU cycles
; 0 page faults and
; 64 bytes consed.
;
T
cmu>
on the 133 MHz Pentium.
-Rob
-----
Rob Warnock, PP-ASEL-IA <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
> cmu> (time (nth (1- most-positive-fixnum) '#1=(t . #1#)))
> ; 0.0 seconds of real time
> Now that's *really* weird! Is CMUCL's *compiler* somehow detecting that
> the circularity implies a constant result?
No, the circularity has nothing to do with it. It's detecting that
both (1- most-positive-fixnum) and (quote <anything>) are constants,
and CL:NTH is a known function with defined semantics, so it's
constant-folding the call at compile-time.
> Answer: No! Looks like some artifact with the TIME macro instead.
> When you run it this way you get *much* more reasonable results:
>
> cmu> (defun foo (x)
> (declare (fixnum x) (optimize (speed 3)))
> (nth x '#1=(t . #1#)))
X here isn't a constant, so the compiler can't constant-fold the call
to NTH.
Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%") (pprint #36rJesusCollegeCambridge)
> It *is* possible to implement this in time proportional to the number
> of cons cells in the circular list, rather than in the integer n passed
> to NTH (or NTHCDR) (idea: detect circularity using the standard two pointer
> trick and use MOD to reduce n to a manageable size.) I've suggested that
> to the cmucl and sbcl implementors.
>
Could someone please clue me in on the "standard two pointer trick"?
Thanks,
Tim
--
*DOH!!*
Thanks, that also explains the long silence (much longer on the
slower machine!) that I saw between the "Compiling LAMBDA NIL:"
and "Compiling Top-Level Form:" messages... ;-} ;-}
> Could someone please clue me in on the "standard two pointer trick"?
It's used to detect if a list is circular. Two pointers are marched
down the list, one twice as fast as the other, and if they ever become
equal after time 0 the list is circular (and vice versa.)
(defun circular-list-p (x &aux (y x))
(loop
(when (or (atom y) (atom (cdr y))) (return nil))
(setf y (cddr y)
x (cdr x))
(when (eq x y) (return t))))
Paul
Do a web search for:
tortoise hare circular
or perhaps better:
tortoise hare "circular list"
Also see CLHS "Function LIST-LENGTH". As shown there, a common approach
is to have the hare move at twice the speed of the tortoise, but many
other winning strategies are possible [iff your termination tests are
correct!]. E.g., make the tortoise move at half the speed of the hare.
[It is useful to ask why/how this might be different than the previous,
and whether or not one might be "better" than the other in some sense.]
-Tim
--