If I understand things, what one would like DEFCONSTANT to do is to
define a constant variable which is bound to object (of any class, not
just something that MAKE-LOAD-FORM works for) such that you always get
`the same' object, where `the same' really means EQL (and perhaps that
is the same as `similar' in the hyperspec?)
(And I, anyway, would like those objects to be mutable, if possible).
But this is hard, because of file compilation. If file A has a
DEFCONSTANT for some variable, and source files B & C refers to it,
then the compiled version of files B & C can't refer to the same
object because they don't share an address space (with each other or
file A). So you end up either having different objects perhaps, or
having references to constant variables go through the variable name
and thus do the whole looking-up-dynamic-bindings thing that you're
trying to avoid.
So I was thinking, how would Fortran do this? I think the answer is
`by fixing it up in the linker'. The unlinked object files would have
unsatisfied references in them to the constant, which would be fixed
up at link time to make a reference to its memory location. If the
constant is an immediate object you wouldn't do this, but then you
wouldn't have the problem in Lisp either.
Couldn't CL do the same thing? File compiling code that referenced a
constant would (for non-immediate constants), result in a fasl file
which had dangling references in it, which would be stitched up at
load time. This would require there to be a little table of
registered constants somewhere, but it shouldn't be hard to arrange
that, and the stitching things up at load time is just what linkers do
in any case. (Note that by `stitching up' I mean `wiring the address
into the code' and not going through the symbol value.) A naive
implementation would also cause there to be a load-order dependence
between the files that define the constant and any users of it, but a
more sophisticated implementation could cause code that would create
and register the constant if it did not exist to be dumped with each
client file (probably optionally, otherwise files that referred to
many constants would get big).
But perhaps this is no good because it won't allow you to optimise things like:
(defconstant foo '(a b))
(defun blib ()
(cdr foo))
(Can other languages deal with this kind of case, presumably by
running code at link time?). You can optimise away the common numeric
cases though.
Or maybe I'm just confused again.
--tim
That's what I would like and I would guess most non language lawyers would
expect. Most of the code that I've seen that uses defconstant expects that
anyway.
> (And I, anyway, would like those objects to be mutable, if possible).
I would too but that's a separate issue. It'd be nice if whether it's
immutable or not was spelled out a little clearer.
> But this is hard, because of file compilation. If file A has a
> DEFCONSTANT for some variable, and source files B & C refers to it,
> then the compiled version of files B & C can't refer to the same
> object because they don't share an address space (with each other or
> file A). So you end up either having different objects perhaps, or
> having references to constant variables go through the variable name
> and thus do the whole looking-up-dynamic-bindings thing that you're
> trying to avoid.
>
> So I was thinking, how would Fortran do this? I think the answer is
> `by fixing it up in the linker'.
> Couldn't CL do the same thing?
I don't see any obvious reason it couldn't. But since the spec doesn't
require it, you can't currently count on it. Maybe it could be fixed in the
next revision of the ANSI spec.
> But perhaps this is no good because it won't allow you to optimise things like:
>
> (defconstant foo '(a b))
>
> (defun blib ()
> (cdr foo))
There's an old cliche about getting it right first, then worry about
optimizations.
Mike McDonald
mik...@mikemac.com
Your analysis of what one might want defconstant to do makes sense to
me.
> But perhaps this is no good because it won't allow you to optimise things like:
>
> (defconstant foo '(a b))
>
> (defun blib ()
> (cdr foo))
I like to view this kind of optimization, which I refer to as partial
evaluation, as an issue of declaring how rigid a binding should be, and
that this is orthogonal to the other issues you describe.
For example, if you tell a compiler that the binding for CDR is
inlinable at compile time (which, in the case of cl:cdr, of course, it
already knows), AND if you tell it that the binding of FOO is inlinable
at compile time (which presumably would be the default for things
defined by DEFCONSTANT, then BLIB can be compiled to just return '(b).
Will (eq (blib) (cdr foo)) in all combinations of what was compiled
where? Perhaps not.
If you think that (eq '(a b) '(a b)) should be true in file-compiled
code, then you probably want to use a Lisp which coalesces literals in
the file compiler, in which case if FOO and BLIB are compiled in the
same file, then (eq (blib) (cdr foo)) will be true. But this has to do
with coalescing and nothing to do with the semantics of defconstant.
If your compiler doesn't do this, or if FOO and BLIB are in different
files, then maybe you want to declare that FOO cannot be inlined at
compile time, but at load time. In this case, perhaps (cdr foo) gets
transformed by the compiler into something like (load-time-value (cdr
foo)) -- if your compiler supports this sort of thing.
My point is that the semantics of these kinds of optimizations ought to
be separate issue (and separately controllable) from the issue of what
defconstant should be doing.
In my opinion, an implementation of defconstant should:
1. At compile time, and if the implementation so chooses, evaluate the
form and make the value available during compilation, providing some
suitable defaults for whatever implmentation-specific declaration
machinery is available, regarding the stability of the binding.
2. Arrange things such that at after load time:
- the value is made available through the symbol-value of the symbol
(as for
all global "variables"
- the value is available for any code which has been compiled to
reference it
(thought whatever other slick mechanism that the implementation
provides).
- Rebinding is forbidden.
Maybe there's bugs here, but the really important thing is that
regardless of whether the spec does or does not now say anything about
EQL'ness for 1 and 2, it would be possible for us to CREATE a definition
which does say something about it, one way or the other, and that this
issue does not necesarrilly cut off issues about controlling/allowing
optimizations, which could be decided/documented separately.
As I sketched in my message yesterday (which may be hard to
locate, since you started a new thread) stitching things up
is one of the capabilities of LOAD-TIME-VALUE.
[I wrote]
> > But perhaps this is no good because it won't allow you to optimise things like:
> >
> > (defconstant foo '(a b))
> >
> > (defun blib ()
> > (cdr foo))
>
> There's an old cliche about getting it right first, then worry about
> optimizations.
>
Yes, and Lisp wins big time here! Using the `fix it up at load time'
approach, you can obviously have the dangling constant reference
essentially be a *function* of the constant which will compute the
actual thing you want once, at load time (`link time'). The compiler
just has to work out the largest expression which depends on the
constant, convert it to a function (since it depends only on the
constant, there can be no variable capture, so it just amounts to
wrapping it in lambda, and arrange for the loader to call that
function, which will return the thing to splice in.
--tim
> > (defconstant foo '(a b))
> > (defun blib () (cdr foo))
>
> Will (eq (blib) (cdr foo)) in all combinations of what was compiled
> where? Perhaps not.
Right. This is an important case that shows that the internals of the
datastructure need to retain their identity as well in order for the
"magic" to be invisible. Making this work is tough.
> ... perhaps (cdr foo) gets
> transformed by the compiler into something like (load-time-value (cdr
> foo)) -- if your compiler supports this sort of thing.
This sounds scary and hard. Maybe someone who does a lot of compiler work
can convince me otherwise, but this is the kind of thing that pushes the
edge of what I see compilers being able to straightforwardly do, and so is
beyond the point where I'm prepared to insist the language should do all
the work itself.
> In my opinion, an implementation of defconstant should:
>
> 1. At compile time, and if the implementation so chooses, evaluate the
> form and make the value available during compilation, providing some
> suitable defaults for whatever implmentation-specific declaration
> machinery is available, regarding the stability of the binding.
I'm not sure I totally understand what you said here about stability,
but nothing that I did understand sounded wrong. :-)
> 2. Arrange things such that at after load time:
> - the value is made available through the symbol-value of the symbol
> (as for
> all global "variables"
> - the value is available for any code which has been compiled to
> reference it
> (thought whatever other slick mechanism that the implementation
> provides).
> - Rebinding is forbidden.
These all sound ok. Mike is asking for more than this, though, right?
> Maybe there's bugs here, but the really important thing is that
> regardless of whether the spec does or does not now say anything about
> EQL'ness for 1 and 2, it would be possible for us to CREATE a definition
> which does say something about it, one way or the other, and that this
> issue does not necesarrilly cut off issues about controlling/allowing
> optimizations, which could be decided/documented separately.
I certainly agree the spec could be more clear. To the extent that the
behavior isn't nailed down firmly, I blame J13 collectively. To the extent
that it's poorly explained what the present defaults are, I'll take the
blame for that. I knew it was a mess and probably should have said so
more clearly somewhere. To some degree, I raised it in discussions but
probably not pointedly enough. We were busy folks putting this thing together
and there never seemed to be time enough to handle every last loose end...
Right! If compile-file was required to use (load-time-value +FOO+) instead
of using a "similar" value, then EQLness of "constant variables" would be
preserved. This is what I want!
Mike McDonald
mik...@mikemac.com
This may be what you want, but it just pushes the bubble under the carpet
to a different place. Howard's example illustrates the full complexity of
the problem and the reason this is not so simply done: component structures.
If you inline a constant (another thing defconstant is for) and you inline
operations on the constant (another thing defconstant is for), then you
get in a place where the identity of the subparts isn't tagged and the
bookkeeping is hard. It is not nearly as easy to keep track of the fact,
as he cited that in
+---- file1.lisp ---
| (defconstant +abc+ '(A . #1=(B C)))
+-------------------
+---- file2.lisp ---
| (defun bc (x) (cdr +abc+))
+-------------------
If CDR is inlined and aggressively optimized, then this is the same as
+---- file2.lisp ---
| (defun bc (x) #1#)
+-------------------
but the question is what will make the compiler notice (since we're compiling
a separate file than file1.lisp) what will make the compiler know that this
constant came from +abc+ in order to assure it's unique. It's not like
it walks around with a tag on it saying "I am the cdr of +abc+'s list." so
the compiler might not reconsolidate it.
To make this work, unless I'm not seeing something, you have to presupose
that either the defconstant registers not only the object but all of its
subforms and you have to assume that at minimum coalescing is done for all
expressions that are subforms of declared constants, or (probably easier)
just for all expressions. But the language doesn't require coalescing,
it only permits it. And certainly requiring it would create a burden on
interpreters, virtually making them impossible to write. It would also make
a predictably strong division between non-file-compiled and file-compiled
things where the compiler was reliably providing a service that the
in-core compiler was not, and that in turn would mean that people would
write more fragile code like we did in Maclisp where people relied
on the file compiler being called and code randomly broke if it wasn't
written that way. In the present scheme you can get what you want with
a bit more work, and get it reliably, but you don't disempower the existing
language features.
Howard's example is easy to handle. What compile-file has t do is transform
(defun bc ()
(cdr +abc+))
into
(eval-when (:load-toplevel)
(let ((abc-value (load-time-value +abc+)))
; check to see if the optimization conditions still hold
(if (not (consp abc-value))
(error "The +ABC+ changed it's value incompatibly with the compile time
value. A CONS was expected."))
(patch-address XXXX (cdr abc-value))
))
(defun bc ()
(return-contents-of-address XXXX))
Then (eq (cdr +abc+) (bc)) will be T.
> If you inline a constant (another thing defconstant is for) and you inline
> operations on the constant (another thing defconstant is for), then you
> get in a place where the identity of the subparts isn't tagged and the
> bookkeeping is hard. It is not nearly as easy to keep track of the fact,
> as he cited that in
>
> +---- file1.lisp ---
> | (defconstant +abc+ '(A . #1=(B C)))
> +-------------------
>
> +---- file2.lisp ---
> | (defun bc (x) (cdr +abc+))
> +-------------------
>
> If CDR is inlined and aggressively optimized, then this is the same as
>
> +---- file2.lisp ---
> | (defun bc (x) #1#)
> +-------------------
Nope, it could be
(defun bc () (something-similar-to #1#))
instead.
Mike McDonald
mik...@mikemac.com
> But perhaps this is no good because it won't allow you to optimise things like:
>
> (defconstant foo '(a b))
>
> (defun blib ()
> (cdr foo))
I'd do
(defun blib ()
'#.(cdr foo))
I like to compute stuff at read time when DEFCONSTANT is involved.
Christopher
Sometimes this works; but not _always_, for the simple reason that
read time is earlier than compile time which is the time when
constants are expanded at the earliest.
Vassil Nikolov <vnik...@poboxes.com> www.poboxes.com/vnikolov
(You may want to cc your posting to me if I _have_ to see it.)
LEGEMANVALEMFVTVTVM (Ancient Roman programmers' adage.)
-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
Then don't do that. Surely if DEFCONSTANT is supposed to preserve
identity (and I must say it was a schock to me to hear that some
implementors thought it needn't), then the compiler is in error if
it constant-folds (I guess that's what you mean by "inlines an
operation") in a way that fails to preserve that identity.
> It is not nearly as easy to keep track of the fact,
> as he cited that in
>
> +---- file1.lisp ---
> | (defconstant +abc+ '(A . #1=(B C)))
>
> +---- file2.lisp ---
> | (defun bc (x) (cdr +abc+))
>
> If CDR is inlined and aggressively optimized, then this is the same as
>
> +---- file2.lisp ---
> | (defun bc (x) #1#)
>
> but the question is what will make the compiler notice (since we're compiling
> a separate file than file1.lisp) what will make the compiler know that this
> constant came from +abc+ in order to assure it's unique.
That's not hard: The compiler _doesn't_ constant-fold over references to
constant variables whose values are data structures. I know LispWorks
works like that. Steven M. Haflich indicated (in the "`fast' global
variables" thread) that ACL "only dereferences constants like numbers
and symbols". Is there any implementation except MCL who don't get it
right?
In any case, implementing constant identity is exactly the kind of
thing that should be left to compiler writers, the language should
just offer a way to declare a constant. I must say that if it was the
intention of the committee to let DEFCONSTANT break a basic language
feature like that, the standard should have included a clear warning.
--
Pekka P. Pirinen Harlequin Group plc, Cambridge, UK
"If you don't want to be replaced by a machine, don't act like one."
from _Ideas and Information_ by Arno Penzias
> Kent M Pitman <pit...@world.std.com> writes:
> > If you inline a constant (another thing defconstant is for) and you
> > inline operations on the constant (another thing defconstant is
> > for), then you get in a place where the identity of the subparts
> > isn't tagged and the bookkeeping is hard.
>
> Then don't do that. Surely if DEFCONSTANT is supposed to preserve
> identity (and I must say it was a schock to me to hear that some
> implementors thought it needn't), then the compiler is in error if
> it constant-folds (I guess that's what you mean by "inlines an
> operation") in a way that fails to preserve that identity.
No, the problem is you don't know what the compiler will inline, so you
can't know what to "not do".
Further, a useful property of programs is that if you attach a name
to something, it doesn't change its semantics. You're basically saying
in this case that it does. I think that's wrong.
I'm not concerned with the use of Lisp as an implementation language for
getting something done. Lisp is not about that. If I wanted that,
I'd use C. Lisp is about allowing me to express things in terms of my
high-level intent by saying things that I want clearly, not obliquely.
One of its strengths is that it allows you to freely use anonymous things.
If shifting between named and unnamed things starts to have mysticism
associated with it, that kills that property of the language.
> That's not hard: The compiler _doesn't_ constant-fold over references to
> constant variables whose values are data structures.
This may be a truth about some compiler (or many) but is not a truth
about Lisp.
> I know LispWorks
> works like that. Steven M. Haflich indicated (in the "`fast' global
> variables" thread) that ACL "only dereferences constants like numbers
> and symbols". Is there any implementation except MCL who don't get it
> right?
Defining that 60 compilers do it this way would be meaningless unless
you can convince me that there is a reason to do it this way. I don't
see the conceptual basis for saying it has to be done this way.
> In any case, implementing constant identity is exactly the kind of
> thing that should be left to compiler writers, the language should
> just offer a way to declare a constant. I must say that if it was the
> intention of the committee to let DEFCONSTANT break a basic language
> feature like that, the standard should have included a clear warning.
I'm missing what language feature you're saying DEFCONSTANT breaks.
> I'm not concerned with the use of Lisp as an implementation language for
> getting something done. Lisp is not about that. If I wanted that,
> I'd use C.
Do *you* mean that *you* never got anything done using Lisp? :)
(sorry, couldn't resist :) )
Cheers
--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
> Kent M Pitman <pit...@world.std.com> writes:
>
> > I'm not concerned with the use of Lisp as an implementation language for
> > getting something done. Lisp is not about that. If I wanted that,
> > I'd use C.
>
> Do *you* mean that *you* never got anything done using Lisp? :)
> (sorry, couldn't resist :) )
You're not the only person to catch on this, so I'll reply with some
seriousness.
No, I use Lisp for implementation of interesting things all the time.
But for many things (not all), the implementation could be done in any
language... but the "expression" could not.
Lisp is about rising above implementation to saying something of lasting
value.
Kent, I must disagree with you on this. I *am* concerned with the use
of Lisp as an implementation language, precisely *because* it allows me
to express things in terms of my high-level intent.
Modern Common Lisp compilers allow me to tune the critical parts of my
code to within a few percent of C code, but without the pain of using a
fragile, brittle quasi-portable assembly language.
Back to DEFCONSTANT: Is there a consensus that something about
DEFCONSTANT is counter-intuitive? With the revival of X3J13, now might
be the time to do something about that.
-- Chuck
--
Chuck Fry -- Jack of all trades, master of none
chu...@chucko.com (text only please) chuc...@home.com (MIME enabled)
Lisp bigot, mountain biker, car nut, sometime guitarist and photographer
The addresses above are real. All spammers will be reported to their ISPs.
> Back to DEFCONSTANT: Is there a consensus that something about
> DEFCONSTANT is counter-intuitive? With the revival of X3J13, now might
> be the time to do something about that.
>
> -- Chuck
I can only speak for myself but I find MCL's legal but perverse
interpretation of "same" to mean "similar" extremely counter-intuitive. And
based upon all of the code where I see other people checking DEFCONSTANT
symbols with EQ, I'd guess I'm not the only one.
Mike McDonald
mik...@mikemac.com
> I can only speak for myself but I find MCL's legal but perverse
> interpretation of "same" to mean "similar" extremely counter-intuitive. And
> based upon all of the code where I see other people checking DEFCONSTANT
> symbols with EQ, I'd guess I'm not the only one.
For myself I would replace counter-intuitive with unexpected. I can see how
and why they might have done it, and I think it makes sense. The existence
of load-time-value (and #.) solve most of the problems that I would have
had with constants. On the other hand, the spec doesn't make this pitfall
appropriately obvious, and something should certainly be done about that.
Sunil
I think it's pretty clear that IWBN to do something about it. To sum
up, in the very least the programmer should be given control over
the inlining of constants, similar to what is there for functions.
Another thing that might be useful would be control over the
mutability of the object that is the value of a constant (similar
to what is there for LOAD-TIME-VALUE).
> In article <7dr23c$2re$1...@shell5.ba.best.com>,
> chu...@best.com (Chuck Fry) wrote:
> (...)
> > Back to DEFCONSTANT: Is there a consensus that something about
> > DEFCONSTANT is counter-intuitive? With the revival of X3J13, now might
> > be the time to do something about that.
>
> I think it's pretty clear that IWBN to do something about it.
FWIW, it's not clear to me.
There might be a market for something else, but I don't see a problem
with DEFCONSTANT as it stands and is implemented.
I'll say it again: a "constant" is an abstract, not an object with
identity. It can't be an object with identity because it lives in two
different address spaces: the pre-compile-file-compiler address space
and the loaded addresss space. It's like "justice" or "pi", not like
a particular pebble on a particular beach--it's something you
recognize not by its object-pointer but by its description and use.
A "literal" in code is the same. It has identity only to the extent that
you arrange for it, and the reason coalescing is allowed isn't that
object identity isn't important but that any object which can be named
a compile time can't also live at runtime. There is no meaning to that
other than similarity.
Within the same address space (that is, speaking about COMPILE rather
than COMPILE-FILE) object identity is not and should not be violated
because it's not the act of compilation that is the problem--it's the
act of externalization and internalization, and that requires registration.
DEFCONSTANT has no problem with in-core compilation because that has no
way to and no reason to shift object identity. DEFCONSTANT has no problem
with out-of-core compilation because there is no "reasonable expectation
of identity" in a place where there are two address spaces involved other
than those identities which are arranged for. Therefore, IMO, DEFCONSTANT
has no problem period.
If you want to say there is a market for a registration thing, you are
welcome to fill it with an operator and to lobby for its standardization
but I will oppose it if it has the property described earlier, which
is that the object's "outer identity" is registered but the "inner identity"
is not. IMO, if I have identity, then my kidneys have identity, and you
can't change one without the other. To say that if I want the same
kidneys in two different environment, I have to register their name as well
is nonsensical to me. It's fine for engineers to use this kind of
practical accomodation to their cost/benefit trade-off, but it's not fine
for languages or language designers to make such short-sighted trades.
> To sum
> up, in the very least the programmer should be given control over
> the inlining of constants, similar to what is there for functions.
The programmer HAS control. LOAD-TIME-VALUE gives you the choice not
to do inlining. QUOTE gives you the ability to do inlining. There is
no such concept as "in between" (that I have seen articulated in a way
that is not intimately tied up with a particular processing mode and set
of tools).
> Another thing that might be useful would be control over the
> mutability of the object that is the value of a constant (similar
> to what is there for LOAD-TIME-VALUE).
I don't know what this is about. LOAD-TIME-VALUE seems entirely
powerful enough.
(Is this something I missed in your earlier table? It suffices to say
I should go back and look at a particular entry in that table if
that's all that's going on here. I don't have it handy, but I could
go look it up again if you thought it would help. It seemed to me
when I glanced at it earlier that each of the things in that table
were present or writable in user code.)
Kent M Pitman wrote:
> Lisp is about rising above implementation to saying something of lasting
> value.
After several years of 6 days a week, 14 hours a day development
and 200K of C++ code in 3DJam, it's crumbling under its own weight
and all I want is throw away that code and rewrite it in Lisp.
That would never happen in Lisp. Talk about diamonds.
When you write in Lisp, _that_ is forever.
--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1 email: matomira AT acm DOT org
CH-2007 Neuchatel tel: +41 (32) 720-5157
Switzerland FAX: +41 (32) 720-5720
I am aware that this has become rather long but please bear with me.
In article <sfwpv5q...@world.std.com>,
Kent M Pitman <pit...@world.std.com> wrote:
> Vassil Nikolov <vnik...@poboxes.com> writes:
>
> > In article <7dr23c$2re$1...@shell5.ba.best.com>,
> > chu...@best.com (Chuck Fry) wrote:
> > (...)
> > > Back to DEFCONSTANT: Is there a consensus that something about
> > > DEFCONSTANT is counter-intuitive? With the revival of X3J13, now might
> > > be the time to do something about that.
> >
> > I think it's pretty clear that IWBN to do something about it.
>
> FWIW, it's not clear to me.
>
> There might be a market for something else, but I don't see a problem
> with DEFCONSTANT as it stands and is implemented.
My reason to write that was not that there was a problem with DEFCONSTANT
as it is defined in ANSI Common Lisp (I, too, don't see a problem with
that). The reason was that people want (perhaps even need...)
something else sometimes, and moreover it is apparently easy to
incorrectly believe that DEFCONSTANT provides that something else.
(So I am not calling for changes to DEFCONSTANT but for making
its exact functionality a little more explicit perhaps, and
augmenting the language with some new defining forms and/or some new
declarations.)
I called that something else a constant reference (I don't insist that
this is the best terminological choice). I am not quite sure if it
is fully implementable by the user. By `fully implementable' I mean
that a variable that is a constant reference must not be bindable,
and I am not sure that my solution which does DEFCONSTANT of the
same symbol that is defined as a symbol macro is quite legal (I
refer here to my previous posting). And in any case implementing
a constant reference by a closure may mean some performance loss.
(Of course, there might be a better way to implement a constant
reference that I haven't thought of.)
A constant reference is (at least informally) like a notinline
constant variable. The object that is its value may or may not
be mutable, however.
With DEFCONSTANT, the value must be available no later than compile
time. With a constant reference, I see no reason why it shouldn't
be possible to supply the value even as late as run time. But even
if I am wrong, and the value must be made available earlier, then
perhaps programmer control over mutability can be done via the
second argument of LOAD-TIME-VALUE (as someone pointed out some
time ago).
> I'll say it again: a "constant" is an abstract, not an object with
> identity. It can't be an object with identity because it lives in two
> different address spaces: the pre-compile-file-compiler address space
> and the loaded addresss space. It's like "justice" or "pi", not like
> a particular pebble on a particular beach--it's something you
> recognize not by its object-pointer but by its description and use.
Once again, could it be that `constant' and `constant reference' are
sometimes confused, and this is the source of discontent about
DEFCONSTANT?
To me, the notion of a constant reference makes sense and is distinct
from the notion of a constant, but I am not absolutely sure I am not
missing something.
> A "literal" in code is the same. It has identity only to the extent that
> you arrange for it, and the reason coalescing is allowed isn't that
> object identity isn't important but that any object which can be named
> a compile time can't also live at runtime. There is no meaning to that
> other than similarity.
And a constant reference is definitely not the same as a literal object.
(...)
> Therefore, IMO, DEFCONSTANT
> has no problem period.
Indeed. IMO the problem is with expectations, arising from real or
perceived needs, that are directed at DEFCONSTANT and that need to
be `redirected.' (This sounds like social engineering... Big
Brother is watching your constants!)
> If you want to say there is a market for a registration thing, you are
> welcome to fill it with an operator and to lobby for its standardization
> but I will oppose it if it has the property described earlier, which
> is that the object's "outer identity" is registered but the "inner identity"
> is not. IMO, if I have identity, then my kidneys have identity, and you
> can't change one without the other. To say that if I want the same
> kidneys in two different environment, I have to register their name as well
> is nonsensical to me. It's fine for engineers to use this kind of
> practical accomodation to their cost/benefit trade-off, but it's not fine
> for languages or language designers to make such short-sighted trades.
I must honestly admit that I don't quite get the point here.
Does this mean that outer identity is the reference and inner identity
is the value?
(If so, I don't quite get the kidney example as I see a part-of, not
a reference-value, relationship between a human and the human's kidneys,
but I guess this is not very important.)
What is registration? Is e.g. interning a specific case of registration?
Can registration refer also to references, e.g. to what a linker does
in order to resolve references, i.e. when references become values
themselves?
`The act of externalization and internalization, and that requires
registration'---this means that to preserve eqlness of the value across
externalisation and subsequent internalisation, one needs some
registration mechanism?
I can see 4 different scenarios of interplay between inner and outer
identity (in the sense I understand) in which EQL identity can be
preserved or lost for different reasons.
For the examples,
(defun make-thing ()
"Return an uninterned object---a fresh vector of bignums."
(make-array 17 :initial-value (1+ most-positive-fixnum)))
(1) DEFCONSTANT where the value is not interned in any way, e.g.
(defconstant =foo= (make-thing) "some object")
The value returned by another evaluation of the same form that
initialises =FOO= would not be EQL to =FOO= and (EQL =FOO= =FOO=)
may or may not be true depending on whether =FOO= is inlined.
No registration of any kind here.
This is similar to
(define-symbol-macro =foo-as-symbol-macro= (make-thing))
*but* (MAKE-THING) is evaluated only once for _all_ files that
refer to =FOO=, and evaluation may take place as early as compile
time: with =FOO-AS-SYMBOL-MACRO=, evaluation will take place at
run time, and with
(define-symbol-macro =foo-ltv= (load-time-value (make-thing)))
evaluation will take place at load time (and not once for _all_
files but once for _each_ file).
(2) DEFCONSTANT where the value is interned in some way, e.g.
(defconstant =bar= 'baz "an interned symbol")
EQL identity is guaranteed not by virtue of what DEFCONSTANT does
but by virtue of the properties of the value given to the constant
variable: (EQL =BAR= =BAR=) and (EQL =BAR= 'BAZ) will both be true
because this is how interned symbols work, and it doesn't matter
if =BAR= is inlined or not. The reference is not registered,
the value is.
(By the way, this reminds me of the concept of otherwise accessible
parts.)
(3) A constant reference where the value is not interned in any way:
EQL identity is guaranteed only via the same reference but not between
the constant reference and the value produced by another evaluation
of the initialisation form. The reference is registered (terminology?),
the value isn't.
File 0:
(define-constant-reference +quux+ (make-thing) "something else")
(defconstant =foo= (make-thing) "something") ;just for comparison
File 1:
(defun quux1 () +quux+)
(defun foo1 () =foo=)
File 2:
(defun quux2 () +quux+)
(defun foo2 () =foo=)
After compiling and loading everything:
(eql (quux1) (quux2)) => t
(eql (foo1) (foo2)) => unspecified
(eql +quux+ (make-thing)) => nil
(eql =foo= (make-thing)) => nil
(4) A constant reference where the value is interned: EQL identity
is guaranteed in all cases, i.e. one can obtain the same (EQL) object
not only by another occurrence of the same constant reference but
also by another evaluation of the initialisation form (or its
equivalent, in the same sense that 'BAZ and (INTERN "BAZ") are
equivalent (ceteris paribus: same package, same read case)).
> > To sum
> > up, in the very least the programmer should be given control over
> > the inlining of constants, similar to what is there for functions.
>
> The programmer HAS control. LOAD-TIME-VALUE gives you the choice not
> to do inlining. QUOTE gives you the ability to do inlining. There is
> no such concept as "in between" (that I have seen articulated in a way
> that is not intimately tied up with a particular processing mode and set
> of tools).
Sorry, that was a mistake I made.
I wrote just `constants' where I should have written `constant
variables.' Correct me if I am wrong, but inlining may happen no matter
if the initialisation form in DEFCONSTANT is QUOTE or LOAD-TIME-VALUE.
Or should the compiler care how the value of a constant variable was
produced when it decides to inline it?
(By the way, the description of LOAD-TIME-VALUE says, `the result
of this evaluation then being treated as a literal object at run time'
(referring to the evaluation as arranged by the file compiler). Maybe
the words `literal object' are not fully appropriate here as this
form may produce, depending on circumstances, a modifiable object.
Would it have been better to say `immediate object'?)
> > Another thing that might be useful would be control over the
> > mutability of the object that is the value of a constant (similar
> > to what is there for LOAD-TIME-VALUE).
>
> I don't know what this is about. LOAD-TIME-VALUE seems entirely
> powerful enough.
Once again, please read `constant variable' for `constant' in my
sentence. A constant variable is always immutable. (There is
no `read-only-p' parameter to DEFCONSTANT.)
> (Is this something I missed in your earlier table? It suffices to say
> I should go back and look at a particular entry in that table if
> that's all that's going on here. I don't have it handy, but I could
> go look it up again if you thought it would help. It seemed to me
> when I glanced at it earlier that each of the things in that table
> were present or writable in user code.)
I can see now the problem is not with that table not being comprehensive
but with the need to clarify the concepts before or alongside with
providing the table.
And as I wrote above, I am not quite satisfied that DEFINE-xxx-REFERENCE
are adequately writable in user code because I am concerned about the
legality of
(progn
(define-symbol-macro foo ...) ;use FOO as if it is a variable
(defconstant foo ...)) ;but make it unbindable
I see no reason why it shouldn't be illegal but I don't have that
`warm, fuzzy feeling' (CLtL2, 11.9, p. 284, near bottom). My worry
comes from the fact that the name of a global special variable
cannot be used as a symbol macro. Constant variables are not global
special variables but still...
(And `adequately' here also means access in O(1); implementing
somehow a constant reference on top of a `private' global variable
does not ensure that because of the possibility of deep binding.)
Thanks for reading up to here, and for any further remarks,
Vassil.
just for the heck of it: I have tuned a _very_ CPU-heavy function I wrote
in Common Lisp over a year ago so it went from the unsatisfactory 623 µs
per call to the very pleasant 4.7 µs per call.
the strictly equivalent C function that people are entirely _satisfied_
with, performance-wise, takes 92 µs per call. very frequently, I find
that Common Lisp allows me to experiment with algorithms so much faster
than I can in C and the like, so I can change methodology and approach as
fast as they can do another optimization attempt. this means that a good
Common Lisp programmer can find the optimal algorithm _and_ the optimal
implementation in less time than the C programmer can find the optimal
implementation.
the C mind-set is that C is fast. this is even less true than their idea
that CL is slow. writing really fast C code is _incredibly_ hard, and
you might as well write it in assembly after you have seen what the
compiler is doing to the overall code. I have squeezed the last drop of
blood out of many a CPU in my time, but never has it been easier to do it
than with Allegro CL with its instruction-level profiler, hackable LAP
code (thanks, Duane!), and code transformation with compiler macros (a
standard CL facility). this stuff just isn't available to C programmers.
if you can't outperform C in CL, you're too good at C.
#:Erik
I think what you're looking for is a read-only variable, i.e. the only
thing special about it is that it may not be used as the target of an
assignment.
Unfortunately, CL doesn't provide this as a feature. DEFCONSTANT makes the
variable read-only, but it also allows inline substition.
However, if the value assigned with DEFCONSTANT is not something that can
be determined at compile time, you should get what you want. E.g.
(defun my-list (&rest args)
(apply #'list args))
(defconstant +foo+ (my-list 1 2 3))
should be guaranteed to bind +FOO+ to a mutable list (1 2 3). Since
MY-LIST isn't proclaimed INLINE, the compiler may not open-code the call.
--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
I don't know if this might be useful behaviour, but this is not how
DEFCONSTANT is defined to work. Portable programs must make the value
available at compile time as it is left to the discretion of the
implementation whether the initialisation value supplied to DEFCONSTANT
will be evaluated at compile time or at load time.
Besides, it doesn't matter if the initialisation value is provided by
an inline call or not. The point of this thread, as far as I see it,
is whether the value of the constant variable is inlined or not wherever
there is a reference to the constant variable.
(As an aside, for the sake of nit-picking, the compiler may not open-code
a call when a function is proclaimed NOTINLINE, not when it isn't
proclaimed INLINE. Of course, the compiler is not likely to open-code
a call when a function isn't proclaimed (or declared) INLINE.)
> (defun my-list (&rest args)
> (apply #'list args))
>
> (defconstant +foo+ (my-list 1 2 3))
>
> should be guaranteed to bind +FOO+ to a mutable list (1 2 3). Since
> MY-LIST isn't proclaimed INLINE, the compiler may not open-code the call.
A minor nit: Although the above is probably very safe (i.e. portable)
code, I disagree that the compiler is required not to open-code my-list.
On one hand, it would very space inefficient for an implementation
to provide inlining for user-defined functions that are not declaimed
as inline (unless it was doing some sort of block-compilation). But,
OTOH, I don't know of anything in the spec that prohibits some
overzealous implementation from doing so.
However, if my-list had been declaimed notinline, then the compiler
is definitely constrained by the spec from open-coding it, because the
notinline declaration/declamation is one of only 4 delaration specifiers
(besides declaration, (greater) safety, and special) that the compiler
cannot ignore.
Thus, to make this code completely portable, you could place a
(declaim (notinline mylist))
before the defun.
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)
> > [...kidneys...]
> I must honestly admit that I don't quite get the point here.
As with my equality paper, it is an intentional issue just how far down
identity goes. In a cons, identity is one cons deep. In a list, it's
one backbone deep. In an alist, it's two-ply deep.
When you write a user-defined datatype, the compiler can't just be free to
lose the identity of the sub-parts becuase it doesn't understand the
interrelation. If a compiler dumped me out into a fasl file and then read
me back in, it would not suffice to just conjure two new kidneys and say
"well, you should have declared them separate constants if you wanted the
human beingness of kent to work again after you loaded him". It's better
for the language to make it clear that you can't save a complex object
at all and expect to get it back unless you write the code to do it because
it would be lying to you if it declared otherwise.
I think the reason a lot of you think the language should do this is
that you're not used to thinking like a language designer. I don't
mean the technical part--there are a lot of sharp technical people. I
mean the guilt thing. Language design is a lot about seeing people
flail because you gave them enough rope to hang themselves and they
did. And I claim that what separates good design from bad design is
that good designers not only look for things that are hard in hopes of
making them easier, but also they look to where people get in trouble
a lot and are willing to take the personal guilt involved in making
life more difficult by saying "you won't be allowed to get X for free"
because they know that "for free" is a lie and they don't want you to
just end in a mire.
Just about anyone could design a more functional set of tools than a
language designer can design a language because tools only have to
work most of the time and you can kinda shrug and document the places
where they don't work. But the importance of being linguistic is that
you are the very concrete of the foundation on which all future things
will sit, and if you are not designed to be strong, even the tiniest
little cracks will add up after a while and their implications will
start to be felt in places that are far removed from the places that
people are focused on.
I'll give you an example that is off-topic but maybe illustrates what I
mean by things felt other places. Just the other day, I was talking
to someone about having systems tell you when you make certain dumb
little errors. An example might be calling a function with stupid arguments.
Not wrong, just stupid. For example:
(read-from-string stream :start 2)
is a classic. This is completely valid code and it doesn't start
reading from position 2. I would allege that it's appropriate to warn
about it because the probability is so high that even a mechanical program
wouldn't do it that it's worth the warning. But take the following case
that Maclisp used to warn about a lot:
(member x '(a))
I used to get warnings saying one argument to OR was probably a mistake
because it would open code this to
(or (eq x 'a))
and it would think it was dumb for me to use OR without at least two things.
This warning happened a lot and was well-appreciated for years until
macros came into common use, and then it was disastrous. I had code
that said:
(member x '(a $make-compiler-happy$))
because it was constructed by other code that did
`(member x '(,@things $make-compiler-happy$))
and I did not know how many things I would be given. The argument made
by the people who wanted the warning about the OR was that the warning was
"useful" but the argument made by the others was that "useful or not,
it doesn't scale". The language foundation has to be built on things
that scale. Similar problems came up about dead code. It's handy to
get warnings about code like
(cond (foo x) (t y) (bar z))
if no macros are involved, but if a macro is involved, and if the compiler
is deadset on issuing a warning, you're in a real problem and you have
to do weird things like:
(cond (foo x) (*lisp-wont-be-told-until-after-compilation-that-this-is-t* y) (bar z))
And even then, it's probably going to result in slower code.
> (defun make-thing ()
> "Return an uninterned object---a fresh vector of bignums."
> (make-array 17 :initial-value (1+ most-positive-fixnum)))
>
> (1) DEFCONSTANT where the value is not interned in any way, e.g.
>
> (defconstant =foo= (make-thing) "some object")
(Confess: This whole discussion was just an excuse to slip in some
subliminal references to =...= for constants, wasn't it? :-)
> The value returned by another evaluation of the same form that
> initialises =FOO= would not be EQL to =FOO= and (EQL =FOO= =FOO=)
> may or may not be true depending on whether =FOO= is inlined.
> No registration of any kind here.
>
> This is similar to
>
> (define-symbol-macro =foo-as-symbol-macro= (make-thing))
>
> *but* (MAKE-THING) is evaluated only once for _all_ files that
> refer to =FOO=, and evaluation may take place as early as compile
> time: with =FOO-AS-SYMBOL-MACRO=, evaluation will take place at
> run time, and with
>
> (define-symbol-macro =foo-ltv= (load-time-value (make-thing)))
>
> evaluation will take place at load time (and not once for _all_
> files but once for _each_ file).
I don't get the thing with the files. Files are an artificial concept
that you shouldn't introduce unless you absolutely have to. If you want
to talk about files with respect to externalization, there may be
no alternative. But "evaluation" is not about externalization, and indeed
evaluation may be insensitive to what file is involved (for
nested files, for cases where no file is involved, etc.).
> (2) DEFCONSTANT where the value is interned in some way, e.g.
>
> (defconstant =bar= 'baz "an interned symbol")
>
> EQL identity is guaranteed not by virtue of what DEFCONSTANT does
> but by virtue of the properties of the value given to the constant
> variable: (EQL =BAR= =BAR=) and (EQL =BAR= 'BAZ) will both be true
> because this is how interned symbols work, and it doesn't matter
> if =BAR= is inlined or not. The reference is not registered,
> the value is.
>
> (By the way, this reminds me of the concept of otherwise accessible
> parts.)
I just knew someone was going to raise that. It seems like it should
be related, but it's not. Indeed, just the opposite. The whole PROBLEM
is that the parts of a constant ARE accessible. Why? Because the whole
point of a constant is to use it in code. If you didn't use it, you
wouldn't need it. And having used it, you can't know what the system
has done with it--unless you forbid the system from doing useful things,
which seems silly.
> (3) A constant reference where the value is not interned in any way:
> EQL identity is guaranteed only via the same reference but not between
> the constant reference and the value produced by another evaluation
> of the initialisation form. The reference is registered (terminology?),
> the value isn't.
I'm not sure what it means to be the same reference.
In (let* ((x +quux+) (y x)) (eql x y)) is there one reference or two?
>
> File 0:
> (define-constant-reference +quux+ (make-thing) "something else")
> (defconstant =foo= (make-thing) "something") ;just for comparison
>
> File 1:
> (defun quux1 () +quux+)
> (defun foo1 () =foo=)
>
> File 2:
> (defun quux2 () +quux+)
> (defun foo2 () =foo=)
>
> After compiling and loading everything:
>
> (eql (quux1) (quux2)) => t
> (eql (foo1) (foo2)) => unspecified
> (eql +quux+ (make-thing)) => nil
> (eql =foo= (make-thing)) => nil
>
> (4) A constant reference where the value is interned: EQL identity
> is guaranteed in all cases, i.e. one can obtain the same (EQL) object
> not only by another occurrence of the same constant reference but
> also by another evaluation of the initialisation form (or its
> equivalent, in the same sense that 'BAZ and (INTERN "BAZ") are
> equivalent (ceteris paribus: same package, same read case)).
I guess what I'm saying is that similarity is about interning "as
needed", and this is closest to correct. For numbers, interning isn't
needed (though is often practiced for storage reasons) because there are
no comparison operators that can tell the difference between numbers
other than EQ which you are encouraged never to use, and there are no
mutators on numbers much though Drew McDermott used to lament the passing
of "bignums you could rplaca and rplacd" like you used to have in Maclisp.
For packages, interning is needed because there are mutators on packages.
For some structures, mutating is an issue and for some it's not, and
so appropriate MAKE-LOAD-FORM definitions have to be written.
> I wrote just `constants' where I should have written `constant
> variables.' Correct me if I am wrong, but inlining may happen no matter
> if the initialisation form in DEFCONSTANT is QUOTE or LOAD-TIME-VALUE.
I'm not positive I know the answer or that there is an answer.
I can see differing interpretations. It seems to me the whole point of
DEFCONSTANT is to say "the compiler may aggressively view the result of
this computation" and the whole point of saying LOAD-TIME-VALUE it to
say "the compiler may not aggressively view the result of this
computation". Do they cancel? Is the entire form evaluated as if by
EVAL and is "load-time" for the "EVAL" the same as compile-time? I don't
know if I know the answer. I know I've spent so long replying to this that
I'm not going to spend more time researching it today.
> > > Another thing that might be useful would be control over the
> > > mutability of the object that is the value of a constant (similar
> > > to what is there for LOAD-TIME-VALUE).
> >
> > I don't know what this is about. LOAD-TIME-VALUE seems entirely
> > powerful enough.
>
> Once again, please read `constant variable' for `constant' in my
> sentence. A constant variable is always immutable. (There is
> no `read-only-p' parameter to DEFCONSTANT.)
It's not obvious to me what it would mean. Suppose I have a non-read-only
constant and I write it to a file, then I modify it, then I write it to
another file. What is the world like when I load both files in the target
environment? The whole point of a thing you can "aggressively reason about"
is that "you know its contents". It doesn't make sense to say you can know
its identity in advance because it has no pointer identity until runtime.
Only its parts matter, so to say that you're going to sacrifice the parts
for the iddentity so that you can modify it but you can't trust the contents
of that modification seems very odd.
> > (Is this something I missed in your earlier table? It suffices to say
> > I should go back and look at a particular entry in that table if
> > that's all that's going on here. I don't have it handy, but I could
> > go look it up again if you thought it would help. It seemed to me
> > when I glanced at it earlier that each of the things in that table
> > were present or writable in user code.)
>
> I can see now the problem is not with that table not being comprehensive
> but with the need to clarify the concepts before or alongside with
> providing the table.
>
> And as I wrote above, I am not quite satisfied that DEFINE-xxx-REFERENCE
> are adequately writable in user code because I am concerned about the
> legality of
>
> (progn
> (define-symbol-macro foo ...) ;use FOO as if it is a variable
> (defconstant foo ...)) ;but make it unbindable
Heh, I wondered about this, too. I don't know the answer and I bet
implementations could differ on which they prefer to come first.
> I see no reason why it shouldn't be illegal but I don't have that
> `warm, fuzzy feeling' (CLtL2, 11.9, p. 284, near bottom). My worry
> comes from the fact that the name of a global special variable
> cannot be used as a symbol macro. Constant variables are not global
> special variables but still...
Yeah...
> (And `adequately' here also means access in O(1); implementing
> somehow a constant reference on top of a `private' global variable
> does not ensure that because of the possibility of deep binding.)
Oh, this is easy to do:
(defparameter +lexical-unbound+ (make-symbol "UNBOUND"))
(defun lexical-cell (x)
(or (get x 'lexical-cell)
(setf (get x 'lexical-cell) (cons x +lexical-unbound+))))
(defmacro lexical-ref (x)
`(cdr (load-time-value (lexical-cell 'x))))
(defmacro deflexical (var value)
`(progn (define-symbol-macro ,var (lexical-ref ,var))
(setq ,var ,value)
',var))
This is the reason that that define-symbol-macro is so useful.
It allows one to implement a variety of other constructs.
NOTE: I tried the above example in both Harlequin and Allegro.
It fails in LispWorks 4.1 Personal Edition in the interactive
read-eval-print loop (though it works in both interpreted and
compiled load scenarios). Allegro seems to do better. The
simplest test case is:
(progn (define-symbol-macro foo 'bar) foo)
which should work but doesn't in LWW4.1. I assume Hqn is reading
this so I'll leave it to them to fix it if they care. I'm not
on support.
But I think the code is right. You should be able to do:
(deflexical x 3)
(defun foo () x)
(defun bar (f x) (funcall f x))
(defun baz (x) (declare (special x)) (foo))
(bar #'baz 7) => 3
As opposed to what you'd get if you did:
(setq z 3) ;most implementations assume Z special here
(defun foo () z) ;and here
(defun bar (f z) (funcall f z))
(defun baz (z) (declare (special z)) (foo))
(bar #'baz 7) => 7
> Thanks for reading up to here, and for any further remarks,
> Vassil.
No problem.
Erik> * chu...@best.com (Chuck Fry)
Erik> | Modern Common Lisp compilers allow me to tune the critical parts of my
Erik> | code to within a few percent of C code, but without the pain of using a
Erik> | fragile, brittle quasi-portable assembly language.
Erik> just for the heck of it: I have tuned a _very_ CPU-heavy function I wrote
Erik> in Common Lisp over a year ago so it went from the unsatisfactory 623 µs
Erik> per call to the very pleasant 4.7 µs per call.
Would it be possible to provide the before and after versions of the
code?
I think it would be very instructive.
Ray
Actually, several steps in the evolution might be even more interesting
Hartmann Schaffer
schaffer at netcom dot ca
OK, I'll do the steps thing. at issue was hacking up and printing a time
representation. the naïve solution is to use a bignum of seconds since
some human-inspired epoch, use FORMAT with ~D to print it. however, I
also needed proper timezone support (CL's support is suitable only if you
deal with times in a single timezone), and milliseconds (since a second
is a really long time to modern computers, and even milliseconds are).
the zone was represented as the number of hours west of Greenwich.
the time representation is now split up into day, time, and milliseconds.
the zone is changed to seconds eas of Greenwich. day 0 is 2000-03-01,
because that's when a new 400-year leap year period starts, which means
that the leap day is at the end of the four years, the centennium, and
the quatercentennium, and thus no expensive computations are needed to
check for leap years. however, the main culprit was division. it is an
expensive operation, table lookup not, so I precomputed tables for the
146097 days of a quatercentennium and the 86400 seconds of a day, and did
a nifty data representation hack so the values fit in a single fixnum.
the full time representation was changed from a class to a structure, and
proper declarations meant that the access functions were inlined and the
type (fixnum) propagated properly.
time comparison functions now compared three fixnums instead of bignums
and a fixnum (milliseconds) and that meant the timezone tables (extracted
from the Unix timezone database that the C library uses (but also gives
you exactly one handle on)), were searched much faster. in addition, the
interval of the timezone was made available to the caller upon demand (it
affects the length of a local day, which can be 23, 24, or 25 hours!) and
also cached, since most times decoded were in the same timezone. the
time zone adjustment function also did simple addition and subtraction
instead of new divisions into day and time.
what really made a difference once I had figured out how to store the
values in a fixnum (I'll leave that as an exercise for the reader --
since it's so far from obvious how to do it, some people will get a real
kick out of how simple it is, and I don't want to take that kind of joy
away form anyone), was to pass a preallocated vector for the decoding
functions to store its values into instead of returning them as multiple
values.
finally, printing turned out to be very division-intensive, too, so I
precomputed a pair of tables of high and low digits for 100 numbers.
streams were also very expensive, so I deposited characters directly into
a preallocated string and returned a copy or wrote it out (depending on
whether the stream argument was NIL or a stream, like FORMAT). centuries
were easy to change into numbers in the 16 to 99 range instead of 1600 to
9900.
(granted, I have built myself a system with a serious Y10K problem, but
I'll print a warning in the year 9900 if it makes anyone less worried.
rumor has it that Y1K brought us the Dark Ages because it took 400 years
to update all the software to four instead of three digits in the year.)
the only real problem I had with tuning this stuff was that a 400 MHz CPU
is damn hard to get useful statistics from every 10 ms. it manages to do
a tremendous lot of work in 10 ms, so I had to let profiling run for many
minutes to collect useful data as I got closer to the optimum.
#:Erik
On 02 Apr 1999 11:52:50 +0000, Erik Naggum wrote:
> check for leap years. however, the main culprit was division. it is an
> expensive operation, table lookup not, so I precomputed tables for the
> 146097 days of a quatercentennium and the 86400 seconds of a day, and did
> a nifty data representation hack so the values fit in a single fixnum.
With "the values" you mean the human readable form of the (day,
millisec, timezone) triplet?
..
> what really made a difference once I had figured out how to store the
> values in a fixnum (I'll leave that as an exercise for the reader --
> since it's so far from obvious how to do it, some people will get a real
> kick out of how simple it is, and I don't want to take that kind of joy
So (year,month,day,hour,minute,second) in a fixnum, right?
..
> finally, printing turned out to be very division-intensive, too, so I
*nod*nod*
> the only real problem I had with tuning this stuff was that a 400 MHz CPU
> is damn hard to get useful statistics from every 10 ms. it manages to do
> a tremendous lot of work in 10 ms, so I had to let profiling run for many
> minutes to collect useful data as I got closer to the optimum.
Aha. Is it not possible to use the cycle-counter on modern x86's to
check this. You would have to insert get-the-count instructions here and
there, but it gives you a resolution of 1 clock tick. IIRC
Groetjes, Peter
--
It's logic Jim, but not as we know it. | pvan...@debian.org for pleasure,
"God, root, what is difference?",Pitr | pvan...@inthan.be for more pleasure!
no, the individual (day month year) and (second minute hour) triples.
| So (year,month,day,hour,minute,second) in a fixnum, right?
no. keep trying. :)
| Aha. Is it not possible to use the cycle-counter on modern x86's to
| check this. You would have to insert get-the-count instructions here and
| there, but it gives you a resolution of 1 clock tick. IIRC
yes, this is possible, but as far as I can see, Allegro CL does not
support this mode of operation. I think it's a little too much to ask,
too, but if I can get over the _disgusting_ machine language in the Intel
CPU's, I might play with this one day...
#:Erik
I knew that this issue is going to teach me more than just about
constants and constantness. I can see more clearly now that
my mindset about objects goes like this: the identity of an
object lasts only as long as it lives in core, i.e. there is
a `live' reference to it. Externalising and internalising
in the general case means losing its identity. There is no
_general_ way to prevent this: achieving this prevention can only
be considered on a case per case basis.
(I am not saying that my mindset is right or wrong, just want to
grasp it better. And it does need more refinement.)
(...) ;on language design
You are right that I don't think like a language designer, and I
should try to imagine myself in a language designer's shoes, as it
were, more often. I guess I am now rather irresponsibly testing
how far the cracks would open...
As an aside, isn't one of the good things about Lisp that with it
one plays with a _language_, while with C one plays with a _machine_
(and with C++ one---well, is engaged in non-playful activities with
a machine).
> I'll give you an example that is off-topic
(...) ;an instructive example so what if it might be off-topic
> I had code that said:
> (member x '(a $make-compiler-happy$))
> because it was constructed by other code that did
> `(member x '(,@things $make-compiler-happy$))
> and I did not know how many things I would be given.
(...)
I think that this is at least one of the reasons why it is important
to distinguish between program writing time and compile time.
(Or should it be `program write time'?)
(...)
> > (defconstant =foo= (make-thing) "some object")
>
> (Confess: This whole discussion was just an excuse to slip in some
> subliminal references to =...= for constants, wasn't it? :-)
:-)
You got me!
<annoying-Latin-mode-again>
Next I'll be putting into my signature `Ceterum, censeo, names of
constant variables should have equal signs around them.'
</annoying-Latin-mode-again>
(...) ;number of times a DEFCONSTANT initialisation form is evaluated
; vs. number of times a LOAD-TIME-VALUE form in a symbol macro
; expansion is evaluated
> I don't get the thing with the files. Files are an artificial concept
> that you shouldn't introduce unless you absolutely have to. If you want
> to talk about files with respect to externalization, there may be
> no alternative. But "evaluation" is not about externalization, and indeed
> evaluation may be insensitive to what file is involved (for
> nested files, for cases where no file is involved, etc.).
Trying to get this right: `artificial' in the sense that if a program
consists of several files, an equivalent program can be constructed
from combining all files into one? But doesn't such equivalence have
only theoretical value?
I may be missing the point again but the concept of files has a clear
presence via LOAD and COMPILE-FILE. My concern here is related not to
externalisation but to side effects (including functions such as RANDOM
that return different values on each invocation), and I wanted to
highlight differences I can see between DEFCONSTANT and
DEFINE-SYMBOL-MACRO (straighforward use of the latter in particular,
imitating #define).
As far as I can see,
(defconstant =foo= '(a b c) "a list")
and
(define-symbol-macro =foo= '(a b c))
do the same job; but consider these two scenarios (untested, sorry!):
(i) DEFCONSTANT:
;;; setting the stage:
(defparameter *counter* 0 "to be incremented by one")
(defun counter () (list 'counter (incf *counter*)))
(defconstant =foo-const= (load-time-value (counter) t)
"a list containing an integer")
File one:
(format t "~&one/1: ~S" =foo-const=)
(format t "~&one/2: ~S" =foo-const=)
File two:
(format t "~&two/1: ~S" =foo-const=)
(format t "~&two/2: ~S" =foo-const=)
Whether constant variables are inlined or not, loading these files
one after the other would print
one/1: (COUNTER 1)
one/2: (COUNTER 1)
two/1: (COUNTER 1)
two/2: (COUNTER 1)
(ii) DEFINE-SYMBOL-MACRO:
;;; setting the stage:
(defparameter *counter* 0 "to be incremented by one")
(defun counter () (list 'counter (incf *counter*)))
(define-symbol-macro =foo-macro= (load-time-value (counter) t))
File one:
(format t "~&one/1: ~S" =foo-macro=)
(format t "~&one/2: ~S" =foo-macro=)
File two:
(format t "~&two/1: ~S" =foo-macro=)
(format t "~&two/2: ~S" =foo-macro=)
And the printout would be:
one/1: (COUNTER 1)
one/2: (COUNTER 1) _or_ (COUNTER 2)
two/1: (COUNTER 2) _or_ (COUNTER 3)
two/2: (COUNTER 2) _or_ (COUNTER 4)
> > (2) DEFCONSTANT where the value is interned in some way, e.g.
> >
> > (defconstant =bar= 'baz "an interned symbol")
> >
> > EQL identity is guaranteed not by virtue of what DEFCONSTANT does
> > but by virtue of the properties of the value given to the constant
> > variable: (EQL =BAR= =BAR=) and (EQL =BAR= 'BAZ) will both be true
> > because this is how interned symbols work, and it doesn't matter
> > if =BAR= is inlined or not. The reference is not registered,
> > the value is.
> >
> > (By the way, this reminds me of the concept of otherwise accessible
> > parts.)
>
> I just knew someone was going to raise that. It seems like it should
> be related, but it's not. Indeed, just the opposite. The whole PROBLEM
Problem?
> is that the parts of a constant ARE accessible. Why? Because the whole
Yes, they are---this is what I wrote---`otherwise accessible,' not
`otherwise inaccessible' (was it too self-assured of me to coin the
opposite of the existing phrase?).
> point of a constant is to use it in code. If you didn't use it, you
> wouldn't need it. And having used it, you can't know what the system
> has done with it--unless you forbid the system from doing useful things,
> which seems silly.
God forbid that I should forbid anything or anyone from doing useful
things... (not that I don't do silly things from time to time).
> > (3) A constant reference where the value is not interned in any way:
> > EQL identity is guaranteed only via the same reference but not between
> > the constant reference and the value produced by another evaluation
> > of the initialisation form. The reference is registered (terminology?),
> > the value isn't.
>
> I'm not sure what it means to be the same reference.
> In (let* ((x +quux+) (y x)) (eql x y)) is there one reference or two?
^^^^^^ ^ ^ ^
Four occurrences of the same reference which has been propagated
from +QUUX+ to X and then via X to Y.
Yes, there's some ambiguity in `reference.' If we agree to visualise
references as arrows, sometimes `reference' means the start of the
arrow and sometimes the end of the arrow, I think. So in the above
LET* form there are three arrows (one for +QUUX+, one for X, one
for Y), with X's being used twice, and all of which end at the same
object.
(...)
> > Correct me if I am wrong, but inlining may happen no matter
> > if the initialisation form in DEFCONSTANT is QUOTE or LOAD-TIME-VALUE.
>
> I'm not positive I know the answer or that there is an answer.
> I can see differing interpretations.
(...)
> I know I've spent so long replying to this that
> I'm not going to spend more time researching it today.
Yes, perhaps in the spirit of lazy evaluation this question is better
left until someone really *must* know, and then they will have a
specific case in hand which might make arguing one way or the other
easier.
> > > > Another thing that might be useful would be control over the
> > > > mutability of the object that is the value of a constant (similar
> > > > to what is there for LOAD-TIME-VALUE).
> > >
> > > I don't know what this is about. LOAD-TIME-VALUE seems entirely
> > > powerful enough.
> >
> > Once again, please read `constant variable' for `constant' in my
> > sentence. A constant variable is always immutable. (There is
> > no `read-only-p' parameter to DEFCONSTANT.)
>
> It's not obvious to me what it would mean. Suppose I have a non-read-only
> constant and I write it to a file, then I modify it, then I write it to
> another file. What is the world like when I load both files in the target
> environment? The whole point of a thing you can "aggressively reason about"
> is that "you know its contents". It doesn't make sense to say you can know
> its identity in advance because it has no pointer identity until runtime.
> Only its parts matter, so to say that you're going to sacrifice the parts
> for the iddentity so that you can modify it but you can't trust the contents
> of that modification seems very odd.
I admit I wasn't thinking about externalising and then internalising
with modifications in between, just about in-core operations. It's
best for me to retract my idea at least until I know better what I am
talking about.
(...)
> > (And `adequately' here also means access in O(1); implementing
> > somehow a constant reference on top of a `private' global variable
> > does not ensure that because of the possibility of deep binding.)
>
> Oh, this is easy to do:
>
> (defparameter +lexical-unbound+ (make-symbol "UNBOUND"))
>
> (defun lexical-cell (x)
> (or (get x 'lexical-cell)
> (setf (get x 'lexical-cell) (cons x +lexical-unbound+))))
CONS in order to ensure that the first argument to OR is false iff
no value has been given yet, right? (And the choice of the first
argument to CONS is mainly for convenience of debugging and inspecting?)
> (defmacro lexical-ref (x)
> `(cdr (load-time-value (lexical-cell 'x))))
^^
Read `quote comma X,' right?
> (defmacro deflexical (var value)
> `(progn (define-symbol-macro ,var (lexical-ref ,var))
> (setq ,var ,value)
> ',var))
>
> This is the reason that that define-symbol-macro is so useful.
> It allows one to implement a variety of other constructs.
>
(...)
> But I think the code is right. You should be able to do:
>
> (deflexical x 3)
>
> (defun foo () x)
>
> (defun bar (f x) (funcall f x))
>
> (defun baz (x) (declare (special x)) (foo))
>
> (bar #'baz 7) => 3
I tried to follow this example in detiail; while trying to understand
what happens, I expanded the macros and obtained:
(defun lexical-cell (x)
(or (get x 'lexical-cell)
(setf (get x 'lexical-cell) (cons x #:UNBOUND))))
(setf (cdr (load-time-value (lexical-cell 'x))) 3)
(defun foo () (cdr (load-time-value (lexical-cell 'x))))
(defun bar (f x) (funcall f x))
(defun baz (x) (declare (special x)) (foo))
(bar #'baz 7) => 3
Are you using LOAD-TIME-VALUE to get a permanent direct handle on
the p-list entry? If so, then this solution is very instructive,
but I want to look at it again in the daytime after I have cught
up with some of my sleep...
(...)
Happy Easter!
---Vassil.
That may be true, but consider the following two examples:
(defconstant =foo= (list 1 2 3))
(eql =foo= =foo=) => true
vs.
(define-symbol-macro =foo= (list 1 2 3))
(eql =foo= =foo=) => false
> You are right that I don't think like a language designer, and I
> should try to imagine myself in a language designer's shoes, as it
> were, more often. I guess I am now rather irresponsibly testing
> how far the cracks would open...
Oh, I don't know if I'd say "irresponsibly". Thought experiments like
this are what this newsgroup is for. And, btw, I'm not saying I'm always
right. A lot of people are beating me up on this one, and for all I know
they might be right. But I'm not going to change my mind just because
I'm outvoted. I'm sticking with my opinion until I see a reason to change
my mind. As I hope others will.
> Yes, they are---this is what I wrote---`otherwise accessible,' not
> `otherwise inaccessible' (was it too self-assured of me to coin the
> opposite of the existing phrase?).
Uh, just too obscure for me to notice, I think. :-)
> Yes, there's some ambiguity in `reference.'
Incidentally, this ambiguity hit the Scheme community when there was
a question of whether:
(let ((x (+ y 3)))
x)
has (+ y 3) in a "tail recursive" position. I think that's the funny
case, anyway. The rule of present is, I think, that this is only
optionally seen as "tail recursive". In a future standard, the Scheme
authors plan to clarify this. But variables are funny things...
> I admit I wasn't thinking about externalising and then internalising
> with modifications in between, just about in-core operations. It's
> best for me to retract my idea at least until I know better what I am
> talking about.
Oh, darn. And here I just thought if I forced people to repeat this
to me another 20 or 30 times I'd eventually understand why they were
so excited about it...
> > (defun lexical-cell (x)
> > (or (get x 'lexical-cell)
> > (setf (get x 'lexical-cell) (cons x +lexical-unbound+))))
>
> CONS in order to ensure that the first argument to OR is false iff
> no value has been given yet, right?
Right. And because CONS is the cheapest way of making a "cell" in most
implementations. In fact, only a half-cons is needed, but most
implementations have no such datatype--certainly the standard doesn't.
> (And the choice of the first
> argument to CONS is mainly for convenience of debugging and inspecting?)
Yes.
> > (defmacro lexical-ref (x)
> > `(cdr (load-time-value (lexical-cell 'x))))
> ^^
> Read `quote comma X,' right?
Oh, weird. I thought I tested that code. Maybe I used x. Heh.
yes: ',x
> Are you using LOAD-TIME-VALUE to get a permanent direct handle on
> the p-list entry?
Yes.
> If so, then this solution is very instructive,
> but I want to look at it again in the daytime after I have cught
> up with some of my sleep...
I think it's a nice trick, so if it isn't obvious to you what it's
doing, it's worth a second look.
> Happy Easter!
You, too!
R5RS has this:
In the following example the only tail call is the call to f. None
of the calls to g or h are tail calls. The reference to x is in a
tail context, but it is not a call and thus is not a tail call.
(lambda ()
(if (g)
(let ((x (h)))
x)
(and (g) (f))))
Note: Implementations are allowed, but not required, to recognize
that some non-tail calls, such as the call to h above, can be
evaluated as though the were tail calls. (...)
See http://www.sonic.net/~bear/scheme/r5rs.html#SEC24 .
rthappe
or false if the compiler has inlined =FOO=!
I.e. the result of EQL on constant variables is unspecified except
for data objects such as symbols where EQUAL is the same as EQL.
> vs.
>
> (define-symbol-macro =foo= (list 1 2 3))
> (eql =foo= =foo=) => false
Always.
So EQL is unspecified in the former case and false in the latter;
the real difference I can see between these two examples is that
one can destructively modify the value coming from the symbol macro.
One of my high-school teachers in literature used to say, when there
was disagreement among the class e.g. whether an author really meant
this and this or that and that, `shall we vote on it?'
Not to mention that ill fated attempt in one of the states of the USA
to vote on a formula implying a rather poor approximation of pi.
> > Yes, they are---this is what I wrote---`otherwise accessible,' not
> > `otherwise inaccessible' (was it too self-assured of me to coin the
> > opposite of the existing phrase?).
>
> Uh, just too obscure for me to notice, I think. :-)
In any case, I should be more careful with unusual phrases that are
likely to be taken for their well-known opposite.
(...)
> But variables are funny things...
They make our lives more interesting. And I recall that old fortune
cookie, `Variables won't, constants aren't.'
> > I admit I wasn't thinking about externalising and then internalising
> > with modifications in between, just about in-core operations. It's
> > best for me to retract my idea at least until I know better what I am
> > talking about.
>
> Oh, darn. And here I just thought if I forced people to repeat this
> to me another 20 or 30 times I'd eventually understand why they were
> so excited about it...
I am excited about it essentially because I hope this can help me
understant what an object __really__ is. It's a distant goal, and I
want to talk again about the issue in public after I have made some
more progress in private.
(...)
> And because CONS is the cheapest way of making a "cell" in most
> implementations. In fact, only a half-cons is needed, but most
> implementations have no such datatype--certainly the standard doesn't.
I have mentioned this a couple of times already (sorry for repeating
myself): how about a zero-dimensional array as an idiom for a half-cons
(i.e. a singleton reference)? Of course, it is rather unlikely that
in any given implementation making and/or garbage collecting a
zero-dimensional array would be cheaper than the same operations on
a cons cell, but this would save the comment saying something like,
`only the car of this cons is really used.'
(...)
Thanks for the followup clarifications about the example.
(...) ;using LOAD-TIME-VALUE to get a `lock' on a reference
; (not in the usual sense of `lock' for mutual exclusion,
; but in the same sense as a radar locks on a target---
; sorry for the military metaphor)
> I think it's a nice trick, so if it isn't obvious to you what it's
> doing, it's worth a second look.
It is indeed. In fact, I believe it deserves to appear as a textbook
exercise so students have the satisfaction of discovering it themselves.
And it also show a not very obvious benefit from LOAD-TIME-VALUE.
> > Happy Easter!
>
> You, too!
Thanks! I'll save this for a week from now (this year Eastern Orthodoc
Easter is a week later than Easter of the Western Churches).
Where in the spec does it say that? I believe the compiler can inline
=FOO=, replacing it with its value, which should be the list that =FOO= is
bound to. Section 3.2.2.3 of the CLHS says, "A reference to a constant
variable in source code is equivalent to a reference to a literal object
that is the value of the constant variable." It doesn't say that it's
equivalent to a reference to a listeral object that's *similar* to the
value.
Yes. If you look earlier in the document you'll see I'm among the
"authors" (more like a design committee, with all the argumentation
and disarray that committeeness sometimes implies) of this document
you're referring me to. It was just late and I didn't want to bother
looking it up.
My point was not about what Scheme does now, but rather that we
discussed how it should be defined and it was, as I recall, decided
that without making it formal exactly what was to be detected, it
wasn't the right time to define that (h) in the above is a tail call.
But that might change in the future. Various people are getting
psyched to make the whole business of storage usage in these
situations more formal.
And the more general point was to underscore that this matter of
"variables" and "reference" and "code substitution" is a messy gray
area that affects not just this discussion of defconstant but other
things as well.
> how about a zero-dimensional array as an idiom for a half-cons
> (i.e. a singleton reference)?
As a rule, arrays are higher overhead to access than conses. And they
have to have header info that says what their dimension size is, so
that would use up the space you saved by going into array--maybe even
more. I don't know how much overhead arrays have on them in typical
implementations but they do use more storage than just the element
memory itself, I'm pretty sure.
It might help a little if you knew the 0d array was also simple.(I
never saw a displaced or adjustable 0d array, but I think it's
possible in principle. :-)
Maclisp had the HUNK datatype which was the right thing. It was a
generalized cons that had however many parts you wanted. A 1-element
hunk was still of type HUNK2 (represented identically to CONS, but in
a different allocation space) but had a filler in one of its slots that
said "nothing here". I was sad hunks went away, though they really
were a kind of hack ... I guess the main reason they went away was that
they had been co-opted as a low-level implementation substrate for
implementing class-like things in pre-class-system days, and when a
real class data structure came along, it wasn't as needed. I suppose
you could use a structure with only one slot, but since it would have
a header saying the type of the structure, it would take up the same
space as a CONS anyway. Just can't quite win.
Perhaps you are thinking of some other funny case? This one seems pretty
clear (even without resorting to R5RS). Since (even in R4RS) "let" is a
"derived expression", its defining semantics are given in terms of a
procedure call of its body, i.e., translating your example:
((lambda (x) x) (+ y 3))
Now that being the case, and given that Scheme is a call-by-value
variant of the lambda calculus, if I understand these things correctly
the "(+ y 3)" is *not* in a tail position, since it must be evaluated
*before* the call to the lambda.
The lambda itself is not in a tail position, either, for the
same reasons. But the *body* of the lambda ("x", in this case) is...
Hmmm... I see here that R5RS *allows* the beta-reduction to "(+ y 3)"
[which would convert it to a tail call of "+"], in section "3.5 Proper
Tail Recursion" [just after the example Darius(?) quoted which contained
a (let ((x (h))) x) as one branch of an "if"], where it says:
Note: Implementations are allowed, but not required, to recognize
that some non-tail calls, such as the call to "h" above can be
evaluated as though they were tail calls. In the example above,
the "let" expression could be compiled as a tail call to "h".
So it's only "in a tail position" if your compiler is smart enough to
*prove* that making it so won't break the semantics...
-Rob
-----
Rob Warnock, 8L-855 rp...@sgi.com
Applied Networking http://reality.sgi.com/rpw3/
Silicon Graphics, Inc. Phone: 650-933-1673
2011 N. Shoreline Blvd. FAX: 650-964-0811
Mountain View, CA 94043 PP-ASEL-IA
Several Scheme implementations have such a notion for a half-cons,
conventially called a "box":
> (define foo (box 37)) ; some call it "make-box"
> (box? foo)
#t
> (unbox foo)
37
> (set-box! foo 24)
> (unbox foo)
24
>
In Common Lisp, of course, I guess you could always do:
> (defstruct box (content))
BOX
> (setq foo (make-box :content 37))
#S(BOX :CONTENT 37)
> (box-content foo)
37
>
but I'm not sure how much overhead CL drags along with a structure instance
(in addition to the slots themselves)...
-Rob
p.s. FWIW, MzScheme provides the following
printer & reader syntax for boxes:
> foo
#&24
> (box? #&51)
#t
> (unbox #&51)
51
Oh, yes. Of course it is highly implementation dependent, but my
`generalised implementation-independent' estimate is 8 bytes (which
is useful only to maintain some _rough_ mental idea of storage costs).
I hope I mentioned it that I was aware of higher costs---the reason
why I still put forward 0d arrays has to do with expressing intent,
see below.
(In fact, assuming 64 bits of header and, in a 64-bit architecture,
64 bits per reference, a simple array with room for a single element
would occupy as much storage as a cons cell. But this is just
hypothetical.)
> It might help a little if you knew the 0d array was also simple.(I
> never saw a displaced or adjustable 0d array, but I think it's
> possible in principle. :-)
It ought to be possible to have a 0d displaced array. But adjustable?
Don't ask, don't tell...
> Maclisp had the HUNK datatype which was the right thing. It was a
> generalized cons that had however many parts you wanted. A 1-element
> hunk was still of type HUNK2 (represented identically to CONS, but in
> a different allocation space) but had a filler in one of its slots that
> said "nothing here". I was sad hunks went away, though they really
> were a kind of hack ... I guess the main reason they went away was that
I had known about Maclisp's hunks only from comp.lang.lisp. They do
look like what I want but I know I should not be holding my breath for
their re-introduction into the language.
> they had been co-opted as a low-level implementation substrate for
> implementing class-like things in pre-class-system days, and when a
> real class data structure came along, it wasn't as needed. I suppose
> you could use a structure with only one slot, but since it would have
> a header saying the type of the structure, it would take up the same
> space as a CONS anyway. Just can't quite win.
In theory at least, an implementation could have a specific tag for
a singleton reference so that the reference itself does take half the
space for a cons cell. But even though it could, I don't expect that
it would, it being too much trouble for something that's rarely needed.
But, before worrying about storage, I worry about a clear expression
of my intent to use something as a singleton reference.
(1) Using a cons is most efficient but is one of those `cancer of the
semicolon' cases:
(let ((ref (cons foo nil))) ;the cdr will not be used
...) ^^^^^^^^^^^^^^^^^^^^^^^^^
(2) Using a structure makes for a program where adding a second slot
is an easy change to the program: a structure with just one slot
still sort of implies that a second slot might appear some day,
and I don't want that. For the same reason, I want a _zero_
dimensional array, not a vector of length 1: (AREF REF) means
there is just this one, period, while (AREF REF 0) again sort of
implies that one day we might see (AREF REF 1) etc.
(3) Of course I could say
(defstruct ref obj) ;thou shalt have just this slot
but on the other hand sometimes I (perhaps others too) prefer
idioms to Doing the Right Thing by introducing abstractions
for each and every purpose and thus cluttering the program
with defining forms and the concept space with datatypes,
accessors, etc. (For the same reason I am not very keen on
wrapping calls to CONS and FIRST in calls to MAKE-REF and REF.)
Anyway, this exercise is not worth more than the proverbial two cents
since even if `efficient singleton references' makes it to a wish
list it would have very nice(1) priority...
I also remember the case of Algol-68, which among other things tried
to Get references Right, I believe.
Thanks, this is interesting to know. I am afraid I can't find the
time to follow Scheme developments, so it is good that I can catch
at least some pieces from comp.lang.lisp.
Are boxes as efficient as cons cells in terms of speed of access and
memory use? (By the latter I mean, does a box take up half the space
for a cons?)
> p.s. FWIW, MzScheme provides the following
> printer & reader syntax for boxes:
>
> > foo
> #&24
(...)
But I suspect that reading this back would produce a fresh box---
is that so? If yes, this may or may not be what I want, and we have
hit the externalisation issue again...
First, the relevant item from 3.2.2.3 Semantic Constraints reads in
full (with the first sentence included):
Constant variables defined in the compilation environment must have a
similar value at run time. A reference to a constant variable in
source code is equivalent to a reference to a literal object that is
the value of the constant variable.
so similarity is already mentioned here.
Second:
3.2.4 Literal Objects in Compiled Files
The functions eval and compile are required to ensure that literal
objects referenced within the resulting interpreted or compiled code
objects are the same as the corresponding objects in the source code.
compile-file, on the other hand, must produce a compiled file that,
when loaded with load, constructs the objects defined by the source
code and produces references to them.
In the case of compile-file, objects constructed by load of the
compiled file cannot be spoken of as being the same as the objects
constructed at compile time, because the compiled file may be loaded
into a different Lisp image than the one in which it was compiled.
This section defines the concept of similarity which relates objects
in the evaluation environment to the corresponding objects in the
run-time environment.
The constraints on literal objects described in this section apply
only to compile-file; eval and compile do not copy or coalesce
constants.
(By the way, it was easy to find these as the links to them still show
up as recently visited in my browser.)
I find the above to mean that the file compiler _may_ inline the value
of a constant variable: it treats it like it treats a literal object
(3.2.2.3), and for that only similarity, and not necessarily eqlness,
must be observed (3.2.4, 2nd paragraph).
To make sure there's no misunderstanding, for me
(EQL =FOO= =FOO=) may return T or NIL
is shorthand for:
if a file has, say, (DEFUN FOO1 () =FOO=), and another file has, say,
(DEFUN FOO2 () =FOO=), then (EQL (FOO1) (FOO2)) may return T or NIL.
(While (EQL =FOO= =FOO=) does not necessarily imply programmer
silliness if it is seen by the compiler (since one or both foo's may
come from macro expansion), I would find it hard to argue that the
latter form may return NIL as it can occur but within a single file.)
So if (EQL (FOO1) (FOO2)) => NIL, one might be right to say that the
implementation worked in a counter-intuitive, unexpected, or harmful
way (I would disagree but I wouldn't argue about that here), but one
would *not* be right to say the implementation broke the rules.
Once again, my summary of the issue (which maybe deserves attention
if and when the standard is reviewed) is: like DEFINE-SYMBOL-MACRO,
DEFCONSTANT does not provide a constant reference (non-inlinable)
to an object. (DEFCONSTANT differs from DEFINE-SYMBOL-MACRO with
respect to other things: when and how many times the value form is
evaluated, and also with respect to the possibility of the same
symbol appearing in a binding form.)
It all depends on the implementation, I'd guess. In MzScheme a box is an
instance of "Scheme_Small_Object", which is a type field [a short, but
most compilers will force that to a PSV] and a pointer-sized value (PSV).
A cons is an instance of "Scheme_Object", which is a type field and
two PSVs. So for MzScheme, a box is 2/3 (or maybe 3/5) the size of a cons.
In an implementation I've been working on [play project], the same is true:
a box is 2/3 the size of a cons.
Chez Scheme (see http://www.cs.indiana.edu/eip/workshop1/lab7/lab73.html)
also has boxes, but I don't know if they're 1/2, 2/3, or 1 times a cons...
-Rob
Aren't Lisp Machine locatives essentially half-conses?
> * pvan...@mail.inthan.be (Peter Van Eynde)
> | With "the values" you mean the human readable form of the (day, millisec,
> | timezone) triplet?
>
> no, the individual (day month year) and (second minute hour) triples.
>
> | So (year,month,day,hour,minute,second) in a fixnum, right?
>
> no. keep trying. :)
I'm stumped. Are you talking about doing:
(defun encode-time-values (days seconds milliseconds)
(dpb seconds (byte 17 10)
(dpb milliseconds (byte 10 0)
(ash days 27))))
? That runs out of fixnum range on 32-bit machines pretty quickly.
--David Gadbois
Unfortunately my knowledge of Lisp Machines is rather limited, so I
can't say. I hope someone else would answer.
Assuming Lisp Machine locatives are singleton references, I'm also
curious to know what were the reasons why they didn't make it into
Common Lisp (to avoid complexity? to avoid expensive implementation
on stock hardware? ...)
I admit I haven't had time to really think about the problem, so I
am not satisfied I know the optimal answer, but the above approach is
not the most economical. A more economical way would be to use
a variable-base positional system (I am not sure this is the correct
English term). There the weights of the different digit positions
are independent of one another (instead of all of them being powers
of one and the same base), e.g. 1 for the rightmost digit, 31 for
the middle one, and 31*12 for the leftmost one in a triplet that
represents (year, month, day). So producing the integer for the
triplet can be done e.g. by
;; untested---may contain typos
(reduce #'+ (map '(vector fixnum 3) #'*
(vector year month day)
(vector (* 12 31) 31 1)))
(of course, the weights vector would be precomputed, and consing
would be avoided, etc.---this is just the illustration). This is
more economical in terms of space than allocating 5 bits for the
day (32 different values, 1 unused) and 4 bits for the month (16
different values, 4 unused).
I'd never really thought of it that way, but I can see why someone might. I can't
remember all the details. A car of a dtp-locative returns the same as the car of a dtp-list, but I don't remember what
cdr returns. However, since a locative is
just a pointer into virtual memory, it can be used for pointing to anything, not
just a list.
I only recall them being used in two ways:
1) As the reference to disembodied property lists (I think the GET and PUT
functions automatically followed a locative, so they can be used
interchangeably with symbols). In Maclisp we used an extra cons for this.
Common Lisp provides GETF, and SETF of GETF updates the place that holds
the reference to the property list, so this isn't as necessary.
2) As pointers in low-level system programming. This is the most common
use.
Since low-level system programming was not a target use of Common Lisp,
there was no incentive to include them in CL.
No. They are pointers to any location. One doesn't allocate a locative. One
obtains a locative that points to an existing location. I don't recall the
exact primitives on the Lisp Machine, but using Scheme-like notation they were:
(CAR-LOCATION <pair>) procedure
(CDR-LOCATION <pair>) procedure
(VECTOR-LOCATION <vector> <k>) procedure
(STRING-LOCATION <string> <k>) procedure
(VARIABLE-LOCATION <variable>) syntax
Note that VARIABLE-LOCATION is syntax, i.e. a special form, while the others
are procedures.
On the Lisp Machine, one didn't use the equivalent of the above primitives
very often because there was the LOCF macro, which was analogous to SETF.
(LOCF (CAR <pair>)) ==> (CAR-LOCATION <pair>)
(LOCF (CDR <pair>)) ==> (CDR-LOCATION <pair>)
(LOCF (VECTOR-REF <vector> <k>)) ==> (VECTOR-LOCATION <vector> <k>)
(LOCF (STRING-REF <string> <k>)) ==> (STRING-LOCATION <string> <k>)
(LOCF <variable>) ==> (VARIABLE-LOCATION <variable>)
Given a locative, one could then access it with something like:
(LOCATION-REF <location>) procedure
and assign to it with soemthing like:
(LOCATION-SET! <location> <object>) procedure
Does anyone know of any good sources to information on the hardware
design of the Lisp Machines? OS information would be nice as well.
--Lars M.
> It all depends on the implementation, I'd guess. In MzScheme a box is an
> instance of "Scheme_Small_Object", which is a type field [a short, but
> most compilers will force that to a PSV] and a pointer-sized value (PSV).
> A cons is an instance of "Scheme_Object", which is a type field and
> two PSVs. So for MzScheme, a box is 2/3 (or maybe 3/5) the size of a cons.
>
I modified the Boehm GC and MzScheme to implement CDR-coding and represent
non-cdr-coded conses with 2 words on SGI.
But, like my (tiny)clos support, those things did not make it into the
distribution (Matt suggested that it was probably time for a `Fernando
Scheme').
I sent the GC mods to Hans also, but I doubt they are there either.
So in this case, a box and a cons are the same size.
--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1 email: matomira AT acm DOT org
CH-2007 Neuchatel tel: +41 (32) 720-5157
Switzerland FAX: +41 (32) 720-5720
no.
I'm hinting at encoding year-month-day and hour-minute-second as bytes in
a fixnum. little did I know this would be paper material in complexity.
#:Erik
>
> Does anyone know of any good sources to information on the hardware
> design of the Lisp Machines? OS information would be nice as well.
>
I have some manuals if you will pay postage. There are also some
papers which I think are available at MIT on the CAR/CADR machines,
and probably info on later commercial boxes was published -- there's
certainly a cool document on the LMI K machine (of which there may
have been one or two made) somewhere on the net.
--tim
Erik> I'm hinting at encoding year-month-day and hour-minute-second as bytes in
Erik> a fixnum. little did I know this would be paper material in complexity.
Do you mean a single byte for each? Or multiple bytes for each?
Since there are (* 24 60 60) = 86400 possible values for
hour-minute-second, and they're all equally likely, I don't see how
you can take much less than 17 (= log2(86400)) bits for this.
Similarly, for year-month-day, there are at most 366 month-day
combinations and 100 year values (I assume year is the year in a
century, with the century stored somewhere else). This is a total of
at most 366000 values but at least 36500 values (not every year is a
leap year). This takes another 16 bits.
The total is 33 bits which won't fit in a typical fixnum of a 32-bit
machine.
Obviously, there's some redundancy I'm overlooking. And of course,
with this type of encoding, you have to do all sorts of divisions to
get the individual pieces out, but that may not be so bad if the
machine has integer division built in.
Very interesting problem indeed!
Ray
Thanks, that was interesting and useful.
I wonder if the following terminology would work: a locative is
a _reference value_ and not a _reference object_ (and thus it is
not a half-cons, i.e. a singleton reference).
Thinking about this problem, I encountered the following question.
Imagine I obtain some value out of two (or more) fixnums, each within
a specific range (not too big). To do this fast, I can precompute
a table represented as a two-dimensional array indexed by the fixnums
in question. Accessing the array, however, will involve multiplication
which I find too slow,^1 so I want to replace the two-dimensional array
with a vector of vectors, roughly like this:
(defvar *table* (make-array max1 :element-type `(vector fixnum ,max2)
:initial-contents
;; MAX1 fresh (VECTOR FIXNUM max2)'s:
(map 'list #'(lambda (_)
(declare (ignore _))
(make-array max2 :element-type 'fixnum))
(make-array max1)))
"Precomputed values.")
__________________________
^1 if MAX2 is known to be a power of 2, multiplication can be replaced
by bitwise operations which are fast, but this is not the point here
(Using C notation seems to provide the shortest way to describe the
memory layout.) Normally, *TABLE* would be structured as table here
(modulo headers and tags):
typedef int ivector[max2];
ivector *table[max1];
However, is an aggressively optimising CL compiler allowed to do as in
int table[max1][max2];
i.e. `open-code' the nested vectors?
If the latter representation happens, then again slow multiplication
will have to be used.
(Note that in C the same expression will be used in both cases to
obtain a value: ``table[i][j]'' but in the former case this would
compile into two indexing operations plus one indirect reference
through a pointer while in the latter case this would compile into
a multiplication and an indexing operation (assuming the addition
can be integrated with the indexing operation).)
the whole idea is to avoid division by table lookups. therefore, we're
talking about a means to ensure that the bits of a fixnum can hold 24
hours, 60 minutes, and 60 seconds in separate bytes, or 400 years, 12
months, and 31 days in separate bytes. then define a simple vector of
precomputed values indexed by the undivided value.
| Obviously, there's some redundancy I'm overlooking. And of course,
| with this type of encoding, you have to do all sorts of divisions to
| get the individual pieces out, but that may not be so bad if the
| machine has integer division built in.
division was the _primary_ cost in the algorithm...
#:Erik
If you were not the author of this text, Eric, I would assume that the
author was lazy and didn't meant byte==piece of 8 bits, but now I'm
doubting :-).
Anyway, I can't see the problem:
(log 60 2) ~ 5.9 bits
(log 24 2) ~ 4.6 bits
and (+ 6 6 5) = 17 bits -> the first fixnum,
used in the table to convert between
seconds-since-midnight -> (hour, minute, second).
(log 31 2) ~ 4.9 bits
(log 12 2) ~ 3.6 bits
(log 400 2) ~ 8.6 bits
(+ 5 4 9) -> 18 bits -> the second fixnum,
in the table days-since-newyear -> (day, month, year).
But I would do
(days-since-newyear, year) -> (day, month), not?
And because all the fields are lower than 8 bits we can use pieces
of 8 bit to encode them (could be more efficient on come CPU's, mainly
Alpha's I think).
Groetjes, Peter
--
It's logic Jim, but not as we know it. | pvan...@debian.org for pleasure,
"God, root, what is difference?",Pitr | pvan...@inthan.be for more pleasure!
well, a "byte" is not "8 bits of a machine word at an 8-bit boundary",
but "8 bits of a machine word at an 8-bit boundary" _is_ a "byte". that
IBM usurped a perfectly general concept and used it for byte-addressable
machines with 8-bit bytes, only, is an historic accident that Common Lisp
does not accept. on the PDP-10, on which I for all practical purposes
grew up, byte operations were very powerful instructions to work on bytes
of user-specified sizes. LoaD Byte and DePosit Byte have survived into
Common Lisp. Increment Byte Pointer and ILDB and IDBP have not -- they
were used to read successive bytes out of a sequence of machine words,
and that is properly handled by better primitives, such as STRING
operations today. the byte pointer itself did not survive: it had a
memory address, too, but the byte specification did -- see the functions
BYTE, BYTE-SIZE, and BYTE-POSITION.
| Anyway, I can't see the problem:
| (log 60 2) ~ 5.9 bits
| (log 24 2) ~ 4.6 bits
um, how can I put this? INTEGER-LENGTH returns an integer, not a
floating point number. LOG is the wrong function to use when measuring
the required size of an integer, although CEILING of LOG yields the same
value as INTEGER-LENGTH for positive numbers.
| and (+ 6 6 5) = 17 bits -> the first fixnum, used in the table to convert
| between seconds-since-midnight -> (hour, minute, second).
yup, this is the idea. (it turned out not to be worth the hassle to use
only 16 bits by partition the day into two 12-hour halves. if I were
cramped for space, I'd revisit that space optimization.)
| (log 31 2) ~ 4.9 bits
| (log 12 2) ~ 3.6 bits
| (log 400 2) ~ 8.6 bits
|
| (+ 5 4 9) -> 18 bits -> the second fixnum,
| in the table days-since-newyear -> (day, month, year).
| But I would do
| (days-since-newyear, year) -> (day, month), not?
I use day-since-beginning-of-leap-day-period. as I explained earlier,
the current leap day period started at 1600-03-01, and ends 2000-02-29.
the next leap day period starts 2000-03-01. since we're going to face
that day pretty soon, I chose that day as day 0.
| And because all the fields are lower than 8 bits we can use pieces of 8
| bit to encode them (could be more efficient on come CPU's, mainly
| Alpha's I think).
hm. I haven't actually investigated the possibility of using a (vector
(unsigned-byte 8)) for the representation. I'll try that. (I guess I'm
still not quite used to 8-bit addressable memory, and from what I read in
the specifications of modern CPU's, the designers work really hard to
avoid forcing people to stop believing in that old myth, too. think
about it: if we scuttled the stupid 8-bit legacy and chose 32 bits as the
smallest addressable unit and reinstated byte pointers, we'd have four
times more addressable memory, and we wouldn't need 64-bit processors for
another, um, 18 months!)
#:Erik
Hmm. Yes, now I remember how confused I was when I read the ldb spec...
I should learn to say "octet"...
>| Anyway, I can't see the problem:
>| (log 60 2) ~ 5.9 bits
>| (log 24 2) ~ 4.6 bits
>
> um, how can I put this? INTEGER-LENGTH returns an integer, not a
> floating point number. LOG is the wrong function to use when measuring
> the required size of an integer, although CEILING of LOG yields the same
> value as INTEGER-LENGTH for positive numbers.
I just wanted to show how many bits are needed to represent the ranges.
I was expecting a trap, like a non-uniform encoding.
..
> I use day-since-beginning-of-leap-day-period. as I explained earlier,
> the current leap day period started at 1600-03-01, and ends 2000-02-29.
> the next leap day period starts 2000-03-01. since we're going to face
> that day pretty soon, I chose that day as day 0.
So now it's day -329?
>| And because all the fields are lower than 8 bits we can use pieces of 8
>| bit to encode them (could be more efficient on come CPU's, mainly
>| Alpha's I think).
>
> hm. I haven't actually investigated the possibility of using a (vector
> (unsigned-byte 8)) for the representation. I'll try that. (I guess I'm
> still not quite used to 8-bit addressable memory, and from what I read in
> the specifications of modern CPU's, the designers work really hard to
> avoid forcing people to stop believing in that old myth, too. think
> about it: if we scuttled the stupid 8-bit legacy and chose 32 bits as the
> smallest addressable unit and reinstated byte pointers, we'd have four
> times more addressable memory, and we wouldn't need 64-bit processors for
> another, um, 18 months!)
IIRC the Alpha only had 64 and 32 bit instructions for the first 2
generations, but with the 21264 they introduced 8 bit instructions
especially for emulation and multi-media. Anyway we _need_ those extra 4
bits, where else would we place the type info?
-328, actually. (floor (- (get-universal-time) (encode-universal-time 0
0 0 1 3 2000 0)) 86400)
| Anyway we _need_ those extra 4 bits, where else would we place the type
| info?
well, there's the "Big Bag of Pages" design that allocates objects of
individual types to large areas of virtual memory which makes it possible
to use the highest few bits of an address as an index into a type table,
instead of using the bits _explicitly_ to encode type. since very few
objects would require less than 64 bits, they could all be aligned on
even addresses, and fixnums could be a whopping 31 bits wide. whee!
anyway... the 8-bit byte-addressable memory is here to stay as long as C
is with us. are you happy now, Dennis Ritchie!? are you? </frustrated>
#:Erik
> used in the table to convert between
> seconds-since-midnight -> (hour, minute, second).
>
> (log 31 2) ~ 4.9 bits
> (log 12 2) ~ 3.6 bits
> (log 400 2) ~ 8.6 bits
>
> (+ 5 4 9) -> 18 bits -> the second fixnum,
> in the table days-since-newyear -> (day, month, year).
> But I would do
> (days-since-newyear, year) -> (day, month), not?
>
> And because all the fields are lower than 8 bits we can use pieces
> of 8 bit to encode them (could be more efficient on come CPU's, mainly
> Alpha's I think).
OK, this is where you lost me. 400 years takes 9 bits but you're going to
stick it in an 8 bit memory location? How are you doing that again?
Mike McDonald
mik...@mikemac.com
Mike, do me a favor, now. carefully, on your own, not posting anything
until you're done, go away from the computer, pick up a refence book on
Common Lisp, preferably the actual standard, but either CLtL or CLtL2
will do just fine, and _read_ about bytes operations in Common Lisp.
here's a summary of the situation from historic perspective: the 8-bit
byte at an 8-bit boundary is a special case of the more general byte
concept. the 1990's hardware "byte" is 8 bits with the same machine
address. the _software_ "byte" is any contiguous number of bits in an
_integer_. some processors can extract a byte out of a machine integer
in one operation, while others need two: a shift and a mask. because
Common Lisp is not a hardware-oriented language, it does not deal with
the machine concepts, but the historically correct _concept_.
and _no_ bullshit about how this will confuse other ignorants, OK?
#:Erik
Eric, do me a favor, pull your head out of your ass, you pompous twit! I am
well aware that CL allows arbitrary size bytes. However, Peter said:
> (log 400 2) ~ 8.6 bits
and 6 lines later he said:
> And because all the fields are lower than 8 bits we can use pieces
> of 8 bit to encode them
In the math they teach and use around here, 8.6 in not "lower" that 8.
Since I know Peter is a reasonable person, I don't think that's what he
really meant. So I'd still like Peter, not you, to clearify what he meant.
Mike McDonald
mik...@mikemac.com
thank you for contributing to my impression of you.
#:Erik
I meant that we can waste space on the fields that require less than
8 bits. Instead of packing them together as close as possible. On the
x86 at least (and on modern Alpha's I believe, but I'm no expert, I just
keep dreaming of buying one) the
extract-8bits-on-a-8bit-boundary-from-that-register[1]
commands are pretty common and fast[2], so this would give a speed bonus.
> In the math they teach and use around here, 8.6 in not "lower" that 8.
>Since I know Peter is a reasonable person, I don't think that's what he
>really meant. So I'd still like Peter, not you, to clearify what he meant.
Thanks for the complement, but as countless math teachers have told me:
I should learn to express myself a bit more correctly.
Groetjes, Peter
1: like (rusty x86 assembler, I want to _forget_ this)
mov day,al
mov month,ah
shr eax,16
mov year, ax
instead of masking bits and shifting 3 times.
2: except on a pentium pro, but Intel saw the errors of their ways...
this presupposes a number of hard-to-presuppose things. fixnums have to
live in the same 32-bit space as addresses. first, this frequently means
that some least or most significant bits are used for type information,
and this impacts the 8-bit boundaries. second, even if you could extract
a byte with a register operation, the same considerations as for the
whole fixnum would apply to the result. in practice, it's easier to
shift.
since we're dealing with Intels and I'm sure everybody has Pentium II by
now :), a shift immediate is only one ľop, and if you interleave the
shifts, masks, and the register transfers or memory writes, you can fill
the decoder and exploit the entire pipeline; you would save nothing by
using other sequences of ľops that also filled the pipeline, so there is
nothing to be gained by handling 8-bit "shifts" specially. I don't think
we want to get into the details, though...
#:Erik
You never forget how to ride a bicycle (hai voluto la bicicletta?!).
> mov day,al
> mov month,ah
> shr eax,16
> mov year, ax
Welcome back my friends, to the show that never ends,
Ladies and gentlemen, INTEL EX EIGHTY SIX...
> In article <7e9pec$64f$1...@nnrp1.dejanews.com>,
> Vassil Nikolov <vnik...@poboxes.com> wrote:
> >In theory at least, an implementation could have a specific tag for
> >a singleton reference so that the reference itself does take half the
> >space for a cons cell. But even though it could, I don't expect that
> >it would, it being too much trouble for something that's rarely needed.
>
> Aren't Lisp Machine locatives essentially half-conses?
Yes and no.
A locative is not a cons at all. In fact, its feature is that it is
an immediate pointer to a storage location in a way that is (mostly)
neutral as to what it's pointing into. (location-contents <locative>)
would read and its setf would write any location in any data structure
without regard to whether it was a pointer into a cons or whatever,
but the locative itself could be created without consing, and it was
in fact critical that this was so. In effect, a locative was a
pointer datatype and not dissimilar to the C style of pointer except
that because the Lisp Machine is very cool, locatives in the Lisp
Machine respect the data model and you can't (easily) get them to
arbitrary memory so they don't confuse the GC.
However, the underlying storage still requires a cons. Then you can
locf it and make it seem neater and prettier
HOWEVER, all is not lost because the lisp machine also has cdr-coding,
so it only takes 2 extra bits to represent an empty cdr, and conveniently
there are enough bits in a word that this doesn't take extra memory to
do, so (LIST x) takes only one Q (pointer-width) of storage because it's
a single cell with a 2-bit cdr-code of cdr-nil. And then since locf'ing
takes no additional storage, voila' you get one-word storage for free.
But then, you'd have gotten one word storage if you just used (LIST x)
[though for trivia buffs, not if you did (CONS x NIL) which gets a two-cell
cons with the first cell being CDR-NORMAL with a NIL pointer in it; the
LispM makes an assumption which it publishes that (CONS x NIL) is more
likely to be RPLACD'd than (LIST x), and LispM programmers know to
use (CONS x NIL) when they plan to rplacd so the RPLACD doesn't have to
cons. In the (LIST x) case, RPLACD has to replace the one cell with a
dtp-body-forward that the hardware knows means "find the real cons
elsewhere--there wasn't room here". It's an amazing piece of craftsmanship
which is invisible to users and which works statistically very well
with most people being wholly unaware, and even better if they are
aware...].
I'm doing this from memory and haven't recently used any of it, but I think
I got it right. If anyone thinks I've slipped, tell me and I'll check.
I have a lispm here to do that with--but I have other news to answer
after having been away a week, so I'm skimping on the details.
Also, Barry probably already knew all of the above and I bet he had it
compiled in at such a low level that he was remembering the LOCF=1 word
thing becuase he'd long ago collapsed out the fact that two serendipitous
facilities were conspiring to give this one nice pretty result.
I vaguely recall writing things like
(defun half-cons (x) (locf (car (list x))))
I also recall there is a thing about the cdr of the list where you don't
want to try locfing that because the complexity of locfing the cdr is so
bad that locf of cdr just returns the cons and location-contents knows
specially how to deal with conses as if they were locatives by doing cdr
because the cdr microcode (or ivory chip hardware) is so complicated
(because of cdr codes) that it was too expensive to effectively duplicate
in the locative-manipulating instructions and it was cheaper to just
pretend a cons was a locative in that one case. So
(let ((x (cons 1 2))) (cons x (locf (car x))))
((1 . 2) . #<LOCATIVE 123456>)
but
(let ((x (cons 1 2))) (cons x (locf (cdr x))))
=> ((1 . 2) 1 . 2)
> (1) Using a cons is most efficient but is one of those `cancer of the
> semicolon' cases:
>
> (let ((ref (cons foo nil))) ;the cdr will not be used
> ...) ^^^^^^^^^^^^^^^^^^^^^^^^^
How about:
(let ((ref (cons foo 'if-you-use-this-half-of-this-cons-you-deserve-to-lose)))
...)
That avoids the semicolon. Does that make it better? ;-)
I recall the Lisp Machine has some funny case where you find something
on the stack that is a throw tag with the name
If-you-throw-to-this-tag-you-deserve-to-lose.
My assumption has always been that it is the secret stack thing used to
implement lexical blocks or some such, but I never tracked it down.
If anyone happens to know...
I vaguely remember it had something to do with unwind-protect. But I
could be wrong.
-- Chuck
--
Chuck Fry -- Jack of all trades, master of none
chu...@chucko.com (text only please) chuc...@home.com (MIME enabled)
Lisp bigot, mountain biker, car nut, sometime guitarist and photographer
The addresses above are real. All spammers will be reported to their ISPs.