Paul Dietz <paul.f.di...@motorola.com> writes: > Erik Naggum wrote:
> > By > > the way, which operators return a `boolean´ as opposed to only a > > "generalized boolean"? The obvious choice is of course `not´.
> The ALWAYS and NEVER termination test clauses of LOOP return T on success. > Aside from NOT (and NULL), I don't know of any others.
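A quick way to see the LOOP behaviour mentioned in the quoted text is sketched below. The list contents are arbitrary illustration values; THEREIS is included only as a contrast, since it hands back whatever true value its form produced.

```lisp
;; ALWAYS and NEVER supply T as the default return value on success.
(defun all-even-p (list)
  (loop for x in list always (evenp x)))

(defun no-even-p (list)
  (loop for x in list never (evenp x)))

;; THEREIS returns the first non-NIL value itself, i.e. a generalized boolean.
(defun first-even (list)
  (loop for x in list thereis (and (evenp x) x)))

(all-even-p '(2 4 6))   ; => T
(no-even-p '(1 3 5))    ; => T
(first-even '(1 2 3))   ; => 2
```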
Implementationally, I believe that this is as it should be. It is implementationally easier to generate a generalized boolean than a boolean, but the specification of NOT/NULL turning a generalized boolean into a boolean takes care of the specialized situations where one needs T or NIL (which really aren't that often, since most true or false arguments to CL functions are required to be generalized booleans).
Also, implementing (not (not (<expression>))), or (not (null (<expression>))) is very inexpensive.
Consider:
(when <pred> <conseq>)
The generated code generates both the predicate and the consequent, and between them places a "jump if result is NIL" instruction whose target is the code just after the consequent. Now consider:
(when (not <pred>) <conseq>)
which is the same as
(unless <pred> <conseq>)
In this case, a "jump if result is not NIL" instruction is placed after the predicate. A decent CL compiler will simply reverse the sense of the conditional jump instruction for each level of negation, for as many NOTs or NULLs as are wrapped around the predicate.
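The source-level equivalences described above can be checked directly. This is only a sketch at the language level; the function names are made up for illustration, and it says nothing about what any particular compiler emits.

```lisp
;; Three hypothetical helpers; :consequent stands in for <conseq>.
(defun via-when-not (flag)
  (when (not flag) :consequent))

(defun via-unless (flag)
  (unless flag :consequent))

;; Two levels of negation cancel out, so this behaves like plain WHEN.
(defun via-double-negation (flag)
  (when (not (not flag)) :consequent))

(via-when-not nil)        ; => :CONSEQUENT
(via-unless nil)          ; => :CONSEQUENT
(via-unless 42)           ; => NIL
(via-double-negation 42)  ; => :CONSEQUENT
```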
Now, finally, consider a non-predicate situation:
(setq x (member y z))
The assignment of the result to x is direct; so what is returned from the function (MEMBER, in this case) is significant. Suppose we want T or NIL from this:
(setq x (null (member y z)))
In this case, the compilation of NULL must create two branches, one for the T result and one for the NIL result. It is equivalent to
(setq x (if (member y z) nil t))
But note that this result is negated, and we might want the positive logic as the result:
(setq x (not (null (member y z))))
Which is equivalent to
(setq x (if (member y z) t nil))
This compiles to no more code than the previous example, because the only real expense is in the final generation of the boolean values to return.
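For concreteness, here is what the four forms above return for one sample input. The list and item are arbitrary; the helper name is hypothetical.

```lisp
;; Collect the value of each variant for the same Y and Z.
(defun member-variants (y z)
  (list (member y z)                 ; generalized boolean: the tail of Z
        (null (member y z))          ; boolean, negated sense
        (not (null (member y z)))    ; boolean, positive sense
        (if (member y z) t nil)))    ; same value as the line above

(member-variants 'b '(a b c))   ; => ((B C) NIL T T)
(member-variants 'x '(a b c))   ; => (NIL T NIL NIL)
```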
Duane Rettig <du...@franz.com> writes: > The generated code generates both the predicate and the consequent, and > between them places a "jump if result is NIL" instruction whose target > is the code just after the consequent. Now consider:
This is poorly worded. I should have said "The compiler generates code for both the ..."
* Paul Dietz | It's not a very strong critique; many existing lisps are not ANSI | compliant anyway, so you have much more serious potential | portability problems.
If I read you right, the effect of specifying a boolean instead of generalized boolean value would only have been to create more non-conforming implementations...
I used to believe that people (both programmers and vendors) would value conformance to the standard and based much of my effort on this premise, but it appears that programmers are either clueless or know how to circumvent the shortcomings and few vendors give a damn, but this has, in turn, dampened my enthusiasm for both vendors, language, and community.
-- Erik Naggum, Oslo, Norway
Act from reason, and failure makes you rethink and study harder. Act from faith, and failure makes you blame someone and push harder.
* Kent M Pitman | There was a public review comment when standardizing that came from | Boyer (of Boyer/Moore) and John McCarthy, if I recall correctly. | The committee replied that they should do exactly as you suggest.
But `eq´ is not guaranteed to return a boolean. It may seem rather silly for it to return anything else, but the specification is very clear that it does not, in fact, return `t´ when it means true. So
will return `t´ when the two are the same truth value.
Now, the much more important question is why anyone would care what the specific "true" value is. Nobody tests for anything but `nil´, anyway, and writing code that tests for any /particular/ true value is broken.
Incidentally, the above function will effectively, since machine operations do not return `nil´ and `t´ but some CPU flag, be at least one conditional, so a version of `same-truth-value´ that returned a generalized boolean could also be written as
and if a true boolean is necessary, use (and v2 t) instead of v2. In natively compiled Common Lisp implementations, this is much more efficient, too, but when expressed this way, there is seldom a need for a function for it.
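The two definitions this post refers to are not shown here. One plausible shape, consistent with the surrounding text, is sketched below; the function bodies are an assumption and not necessarily the code as posted, and the starred and -strict names are invented for the sketch.

```lisp
;; Boolean-returning version: EQ only promises a generalized boolean,
;; so its result is normalized with a double NOT.
(defun same-truth-value (v1 v2)
  (not (not (eq (not v1) (not v2)))))

;; Generalized-boolean version: a single conditional.
(defun same-truth-value* (v1 v2)
  (if v1 v2 (not v2)))

;; Same idea with (and v2 t) in place of v2 when a true boolean is needed.
(defun same-truth-value-strict (v1 v2)
  (if v1 (and v2 t) (not v2)))

(same-truth-value 1 "x")         ; => T
(same-truth-value* 1 "x")        ; => "x"
(same-truth-value-strict 1 "x")  ; => T
(same-truth-value 1 nil)         ; => NIL
```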
Also note that the standard name for this boolean operator is `eqv´, cf `logeqv´ and `boole-eqv´.
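As a small illustration of the bitwise counterparts named above: LOGEQV sets a result bit wherever the two argument bits agree, and BOOLE with BOOLE-EQV computes the same thing. The wrapper name and the operands are arbitrary.

```lisp
;; Hypothetical helper returning both spellings of the same operation.
(defun eqv-both-ways (a b)
  (list (logeqv a b)
        (boole boole-eqv a b)))

(eqv-both-ways 5 3)   ; => (-7 -7)   ; 5 XOR 3 is 6, and (lognot 6) is -7
(eqv-both-ways 1 1)   ; => (-1 -1)   ; all bits agree
(eqv-both-ways 1 0)   ; => (-2 -2)   ; only the low bit differs
```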
-- Erik Naggum, Oslo, Norway
Duane Rettig wrote: > Implementationally, I believe that this is as it should be. It is > implementationally easier to generate a generalized boolean than a > boolean,
Ok. For example, it's often slightly easier to generate a small fixnum than it would be to load the address (+ mark bits) of the T symbol.
> > Implementationally, I believe that this is as it should be. It is > > implementationally easier to generate a generalized boolean than a > > boolean,
> Ok. For example, it's often slightly easier to generate a small > fixnum than it would be to load the address (+ mark bits) of the T symbol.
Not necessarily. T is only one memory reference, and that location is likely to be in cache. It also tends not to have any pipeline or functional unit conflicts, because the memory reference is coming just after a comparison operation and a possible jump. It is usually actually more consistent to just load T than to try to calculate some new value, unless that value is in fact the result of some calculation already known to be true. If that value is numeric, you then have to load NIL on a false result anyway.
> But then why does (eq 'x 'x) return T in ACL? :)
Um, er, ah, ... Because it can? :-)
I'm actually working on an answer to a question Erik asked earlier. I'll answer seriously there.
Erik Naggum <e...@naggum.no> writes: > Now, the much more important question is why anyone would care what > the specific "true" value is. Nobody tests for anything but `nil´, > anyway, and writing code that tests for any /particular/ true value > is broken.
In a GC which uses a write-barrier for new objects, T and NIL can usually be guaranteed to be old and thus the write-barrier can be elided. Thus, if a generalized boolean can be guaranteed by the implementation to only return T or NIL, even though it might be allowed to return something else for true, it can then lead to more efficient code.
Unfortunately, I lost track of this optimization, because of a fluke in the naming (we had implemented a comp::boolean type for the compiler, which didn't get fully translated when cl:boolean was exported as a late entry in the CL spec). I've now recovered the optimization, and once again my devel version implements (setf x (the boolean y)) efficiently.
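To make the distinction concrete, the sketch below shows which values belong to the standard BOOLEAN type and a form of the shape mentioned above. The helper names are hypothetical, and whether a compiler actually drops a write barrier for such a form is implementation-specific and not shown here.

```lisp
;; Normalize any generalized boolean to T or NIL.
(defun as-boolean (x)
  (if x t nil))

;; Store a value the caller promises is of type BOOLEAN.
(defun store-flag (cell y)
  (setf (car cell) (the boolean y)))

(typep (member 2 '(1 2 3)) 'boolean)               ; => NIL, the value is (2 3)
(typep (as-boolean (member 2 '(1 2 3))) 'boolean)  ; => T
(store-flag (list nil) (as-boolean 42))            ; => T
```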
In article <3DF608BE.A7831...@motorola.com>, Paul Dietz <paul.f.di...@motorola.com> wrote:
> There would be a slight efficiency advantage to having false == 0 in > the usual lisp implementations, since obtaining a nonzero NIL for comparisons > requires either extra instructions or consumes a register. I doubt this > is significant, or worth any extra difficulty it might cause the programmer, > however.
Depends on how the tags work in each implementation, but having NIL be at machine address zero is about equally as likely to be optimal.
Bruce Hoult wrote: > Depends on how the tags work in each implementation, but having NIL be > at machine address zero is about equally as likely to be optimal.
But that means symbols need to be the objects with zero tag bits, which slows down fixnum arithmetic.
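The cost being described can be modelled with ordinary integers standing in for machine words. This is a toy model under an assumed 2-bit tag layout, not a description of any real implementation; all names are invented for the sketch.

```lisp
;; Assumed layout: value shifted left by two bits, tag in the low bits.
(defun tag-fixnum (n tag)
  (logior (ash n 2) tag))

;; With fixnum tag 0, adding two tagged words yields the tagged sum directly.
(defun add-zero-tagged (a b)
  (+ a b))

;; With fixnum tag 1, both words carry the tag, so one copy must be removed.
(defun add-one-tagged (a b)
  (- (+ a b) 1))

(add-zero-tagged (tag-fixnum 3 0) (tag-fixnum 4 0))  ; => 28, i.e. (tag-fixnum 7 0)
(add-one-tagged (tag-fixnum 3 1) (tag-fixnum 4 1))   ; => 29, i.e. (tag-fixnum 7 1)
```

The extra subtraction in the second case is the kind of overhead at issue when fixnums cannot have the all-zero tag.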
Duane Rettig wrote: >>Ok. For example, it's often slightly easier to generate a small >>fixnum than it would be to load the address (+ mark bits) of the T symbol.
> Not necessarily. T is only one memory reference, and that location is > likely to be in cache. It also tends not to have any pipeline or > functional unit conflicts, because the memory reference is coming just > after a comparison operation and a possible jump. It is usually actually > more consistent to just load T than to try to calculate some new value, > unless that value is in fact the result of some calculation already known > to be true. If that value is numeric, you then have to load NIL on a false > result anyway.
Well, here's an example. Generating a 0 appears to be slightly more compact than loading T. (ACL 6.2 Linux x86 trial; optimization settings: (speed 3) (safety 0) (space 0) (debug 0))
CL-USER(22): (defun foo (x y) (eq x y))
FOO
CL-USER(23): (defun bar (x y) (if (eq x y) 0 nil))
BAR
CL-USER(24): (compile 'foo)
FOO
NIL
NIL
CL-USER(25): (compile 'bar)
BAR
NIL
NIL
CL-USER(26): (disassemble 'foo)
;; disassembly of #<Function FOO>
;; formals:
>>>>> "Paul" == Paul F Dietz <di...@dls.net> writes:
Paul> Duane Rettig wrote:
Paul> Well, here's an example. Generating a 0 appears to be slightly more Paul> compact than loading T. (ACL 6.2 Linux x86 trial; optimization settings: Paul> (speed 3) (safety 0) (space 0) (debug 0))
But would the answer have been different if you ran this test on a Sparc? At least with CMUCL foo is
The only difference is the instruction at L1. Instead of loading a zero to %A0, a small constant is added to the %NULL register (which is NIL) to create T.
> >>Ok. For example, it's often slightly easier to generate a small > >>fixnum than it would be to load the address (+ mark bits) of the T symbol. > > Not necessarily. T is only one memory reference, and that location > > is
> > likely to be in cache. It also tends not to have any pipeline or > > functional unit conflicts, because the memory reference is coming just > > after a comparison operation and a possible jump. It is usually actually > > more consistent to just load T than to try to calculate some new value, > > unless that value is in fact the result of some calculation already known > > to be true. If that value is numeric, you then have to load NIL on a false > > result anyway.
> Well, here's an example. Generating a 0 appears to be slightly more > compact than loading T. (ACL 6.2 Linux x86 trial; optimization settings: > (speed 3) (safety 0) (space 0) (debug 0))
Yes, it does generate a two-byte instruction instead of a three-byte instruction on the x86 (but, as Raymond pointed out, not on other architectures). However, the savings are small - less than 3% in this very trivial case, which means that real applications will see even less advantage. As I said earlier, and your example code shows, what could have been a disadvantage via a memory reference is mitigated by the lack of any neighboring pipeline-killing instructions - the memory reference completes when it completes, there is no immediate usage of the target eax register, and thus the instruction locus continues on with little delay.
Also, in a CL that provides a foreign-function interface, 0 is an especially bad value to use for <true>, because when a foreign argument is declared to be type boolean, the translation semantics become confusing for 0, which is true in lisp and false in C.
Duane Rettig <du...@franz.com> writes: > Also, in a CL that provides a foreign-function interface, 0 is an > especially bad value to use for <true>, because when a foreign argument > is declared to be type boolean, the translation semantics become > confusing for 0, which is true in lisp and false in C.
Interesting point.
Out of curiosity, is this an actual semantic ambiguity problem or just a trap waiting to nab people who don't keep contexts straight? It doesn't seem like there's ever any actual ambiguity, but I can easily see how human beings could be made to show their potential fallibilities in this context.
Could we go back and change things, I've advocated the "many false values" approach that MOO takes. It has a very good feel. I'd like to see "", 0, 0.0, NIL, #(), and all error objects be false.
But oh well... we have what we have and it's been "good enough" for a long time. I'm content. Every strategy has its advantages and disadvantages. Even if I got what I wanted, we'd be discussing the "good ole days" when there was only one truth value because of reasons we presently take for granted... You never get anything for free.
Kent M Pitman wrote: > Could we go back and change things, I've advocated the "many false values" > approach that MOO takes. It has a very good feel. I'd like to see > "", 0, 0.0, NIL, #(), and all error objects be false.
There's a blonde joke here somewhere.
:)
--
kenny tilton clinisys, inc http://www.tilton-technology.com/ --------------------------------------------------------------- "Cells let us walk, talk, think, make love and realize the bath water is cold." -- Lorraine Lee Cudmore
> > Also, in a CL that provides a foreign-function interface, 0 is an > > especially bad value to use for <true>, because when a foreign argument > > is declared to be type boolean, the translation semantics become > > confusing for 0, which is true in lisp and false in C.
> Interesting point.
> Of curiosity, is this an actual semantic ambiguity problem or just a > trap waiting to nab people who don't keep contexts straight? It doesn't > seem like there's every any actual ambiguity, but I can easily see how > human beings could be made to show their potential fallibilities in this > context.
Correct. There is no ambiguity. But pity the poor tired C/Lisp programmer who is spending a late night with a foreign interface: "I don't understand this at all; when I call propagate_truth_value() directly from C with a 0 argument, I get the right behavior, and the false value is propagated, but when I call it from Lisp with the same 0 argument, it is doing something different...". How many times have you stared at code over and over again, with the answer staring right back at you, and you just _can't_ see it?
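The trap can be shown without any real foreign-function interface. The translator below is a hypothetical stand-in for what an FFI might do with an argument declared as a boolean; actual FFIs differ in names and details.

```lisp
;; Hypothetical translation of a Lisp truth value to a C int.
(defun lisp-truth-to-c-int (value)
  (if value 1 0))

(lisp-truth-to-c-int nil)  ; => 0, false on both sides
(lisp-truth-to-c-int t)    ; => 1
(lisp-truth-to-c-int 0)    ; => 1, because 0 is true in Lisp
```

Passing 0 from Lisp therefore arrives as true, whereas the same 0 written in a direct C call means false.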
> Could we go back and change things, I've advocated the "many false values" > approach that MOO takes. It has a very good feel. I'd like to see > "", 0, 0.0, NIL, #(), and all error objects be false.
Ouch; that would be an implementor's nightmare, unless you also then provided for only one "true" value. Imagine how a test for truth or falsehood would have to be implemented!
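To give a feel for the extra work, here is a sketch of a falsehood test over the values listed in the quoted proposal. The function name is invented, and treating "error objects" as instances of the ERROR condition type is an assumption.

```lisp
;; Hypothetical test for the "many false values" rule.
(defun moo-false-p (x)
  (typecase x
    (null t)                        ; NIL
    (number (zerop x))              ; 0, 0.0
    (vector (zerop (length x)))     ; "" and #(), since strings are vectors
    (error t)                       ; assumed meaning of "error objects"
    (t nil)))

(moo-false-p "")     ; => T
(moo-false-p 0.0)    ; => T
(moo-false-p #())    ; => T
(moo-false-p "a")    ; => NIL
```

Compare this type dispatch with the single comparison against NIL that the current rule needs.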
> But oh well... we have what we have and it's been "good enough" for a > long time. I'm content. Every strategy has its advantages and > disadvantages. Even if I got what I wanted, we'd be discussing the > "good ole days" when there was only one truth value because of reasons > we presently take for granted... You never get anything for free.
The grass _looks_ greener on the other side... but is it, really?
On Wed, 11 Dec 2002 05:07:32 -0600, "Paul F. Dietz" <di...@dls.net> wrote:
>Bruce Hoult wrote:
>> Depends on how the tags work in each implementation, but having NIL be >> at machine address zero is about equally as likely to be optimal.
>But that means symbols need to be the objects with zero tag bits, >which slows down fixnum arithmetic.
> Paul
Not necessarily. It could be at address 0 but there could still be tag bits, i.e. NIL could be 0x1, 0x2 or 0x3. Loading small constants is a small instruction (on Intel processors anyway).
Corman Lisp just keeps NIL and T in cells indexed off a committed register: [esi] and [esi + 4]. The esi register is always pointing here when lisp code is executing. This strategy also works well for RISC processors--even better, because they usually have more registers. I don't think there is much faster than loading these, because they should almost always be in cache and the instructions are small and well-optimized. Given this, it doesn't matter too much what representation is used. There is a cost of course in committing a register (especially on Intel with few registers). However, in this case, that register is used for many, many more uses, so the specific cost for these particular cells is negligible. In other words, many other things are stored at offsets from that register, and it works in conjunction with the processor stack to maintain thread execution state. This would be needed in any case. For old-school mac programmers, it's like the old A5-world (where everything was hanging off the A5 register).
>Kent M Pitman wrote: >> Could we go back and change things, I've advocated the "many false values" >> approach that MOO takes. It has a very good feel. I'd like to see >> "", 0, 0.0, NIL, #(), and all error objects be false. >There's a blonde joke here somewhere.
Seems Kent has converted to some perl derivative recently *g*
Erik Naggum wrote: > I used to believe that people (both programmers and vendors) would > value conformance to the standard and based much of my effort on > this premise, but it appears that programmers are either clueless > or know how to circumvent the shortcomings and few vendors give a > damn, but this has, in turn, dampened my enthusiasm for both > vendors, language, and community.
The difference is certainly stark when comparing the C community and the Common Lisp community (as represented by their respective newsgroups comp.lang.c and comp.lang.lisp). C people are fanatic about their standard compared to Common Lisp people. I think the C people do the right thing.
One of the effects of this is that if I write some ISO C code I can be pretty sure it will compile and run on any C implementation on the planet. I don't feel that I have this guarantee when writing Common Lisp. Which is a shame since IMHO Common Lisp is the better language of the two.
>>>>> "Thomas" == Thomas Stegen <tste...@cis.strath.ac.uk> writes:
Thomas> One of the effects of this is that if I write some ISO C code Thomas> I can be pretty sure it will compile and run and any C implementation Thomas> on the planet. I don't feel that I have this guarantee when
But how many different C implementations have you really tried it on? And on how many different platforms?
Thomas Stegen <tste...@cis.strath.ac.uk> writes: > Erik Naggum wrote: > > I used to believe that people (both programmers and vendors) would > > value conformance to the standard and based much of my effort on > > this premise, but it appears that programmers are either clueless > > or know how to circumvent the shortcomings and few vendors give a > > damn, but this has, in turn, dampened my enthusiasm for both > > vendors, language, and community.
> The difference is certainly stark when comparing the C community > and the Common Lisp community (as represented by their respective > newsgroups comp.lang.c and comp.lang.lisp). C people are fanatic > about their standard compared to Common Lisp people. I think the > C people does the right thing.
> One of the effects of this is that if I write some ISO C code > I can be pretty sure it will compile and run and any C implementation > on the planet. I don't feel that I have this guarantee when > writing Common Lisp. Which is a shame since IMHO Common Lisp is > the better language of the two.
You're kidding right? Download cryptlib and look at the comments in the Makefile if you'd like to read some humorous commentary on the state of C portability between different architectures & compilers.
Highly portable "ISO" code is easy until you start doing complicated things or actually try to use operating system features; then you very quickly end up either drowning in homebrewed ifdefs or you sign your soul over to something like automake/autoconf/configure. And that's before you even start trying to support different compilers. Sure, the simple cases are pretty easy, but that's not the point; the important question is how difficult the various implementations on the various architectures make portability in the face of complex software.
Just try to find highly or even reasonably portable support in C for things as seemingly simple as managing pathnames or date/time without tracking down a library that happens to support your target architectures. Make it easy and only include the Unix family in that list; worry about the various Windows permutations later. Then you'll really enjoy how portable CL is.
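For comparison, pathname and date/time handling of the kind mentioned here is part of standard Common Lisp. The sketch below uses only standard functions; the helper names, file name and date are arbitrary examples.

```lisp
;; Build a pathname from components and take it apart again.
(defun name-and-type (name type)
  (let ((p (make-pathname :name name :type type)))
    (list (pathname-name p) (pathname-type p))))

;; Universal time counts seconds from 1900-01-01 00:00:00 GMT.
(defun unix-epoch-as-universal-time ()
  (encode-universal-time 0 0 0 1 1 1970 0))

(name-and-type "notes" "txt")     ; => ("notes" "txt")
(unix-epoch-as-universal-time)    ; => 2208988800
(decode-universal-time 2208988800 0)
;; => 0, 0, 0, 1, 1, 1970, 3, NIL, 0   ; day-of-week 3 is Thursday
```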
>>>>> On Wed, 11 Dec 2002 22:00:54 +0000, Thomas Stegen ("Thomas") writes:
Thomas> One of the effects of this is that if I write some ISO C code Thomas> I can be pretty sure it will compile and run and any C Thomas> implementation on the planet. I don't feel that I have this Thomas> guarantee when writing Common Lisp. Which is a shame since Thomas> IMHO Common Lisp is the better language of the two.
I'm not sure why you have that feeling, because if you write ANSI Common Lisp code, it will compile and run on any ANSI Common Lisp implementation on the planet.
And you'll be able to write portable code a zillion times more easily than in C. And you'll know very well when you're doing something non-standard in Lisp.
I'm under the impression that the Lisp spec is tighter than the C spec, and I am definitely positive that the Lisp spec includes a ton more functionality.
There are some implementations of Lisp that are not fully ANSI compliant, but they tell you that right up front. There are also C implementations that are not ANSI compliant, but I am not sure they are widely used any more for mainstream programming. An interesting question for you would be: why do you suppose it is possible for some people to get their work done, writing portable code, when using a non-compliant Lisp implementation, but that situation does not happen in C? (Part of the answer is surely that those Lisps are so close to being ANSI compliant that their users can't tell the difference.)
If you call third-party libraries, they are not ANSI standard. But you'll find that it's easy to write compatible interface libraries, so that even your calls to third-party code will be totally portable. In fact, some people have gone off and written such compatibility libraries and shared them.
But if you stick to ANSI implementations, your code will be portable. It does not matter what the word size is, or whether the machine has registers, or what the byte order is, or anything else. I don't know how you got an impression that is so backwards from reality.
Christopher C. Stacy wrote: >>>>>> On Wed, 11 Dec 2002 22:00:54 +0000, Thomas Stegen ("Thomas") writes: > Thomas> One of the effects of this is that if I write some ISO C code > Thomas> I can be pretty sure it will compile and run and any C > Thomas> implementation on the planet. I don't feel that I have this > Thomas> guarantee when writing Common Lisp. Which is a shame since > Thomas> IMHO Common Lisp is the better language of the two.
> I'm not sure why you have that feeling, because if you > write ANSI Common Lisp code, it will compile and run on > any ANSI Common Lisp implementation on the planet.
I agree on the point that ANSI CL code is more portable than C code. But I think there are things that make writing portable ANSI CL code unnecessarily difficult.
One thing is the LF (Unix) vs. CR (MacOS) vs. CRLF (Windows) line-ending issue. These differences make read-line and write-line useless for I/O that has to speak a protocol like HTTP, IRC or most other common internet text protocols. If I call read-line on such a source (say, one containing xxxxx CR LF abc), one of the following will happen:
[xxxxx]abc ; Windows
[xxxxxCR]abc ; Unix
[xxxxx]LFabc ; MacOS
The part in the brackets is what read-line would return and the part after it is what is left in the stream.
Some implementations of Common Lisp allow changing the line-ending style via external formats, but that is not portable ANSI CL code either.
So it seems the only thing one can do is to write one's own line reader function in portable ANSI CL (via read-char/unread-char/peek-char).
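A minimal sketch of such a line reader follows. The function name is invented, and it assumes the stream delivers CR and LF as the semi-standard characters #\Return and #\Linefeed without translating them first, which is implementation-dependent.

```lisp
;; Read up to CR, LF, or CR LF; return the line without its terminator,
;; or NIL if end of file is reached before anything was read.
(defun read-protocol-line (stream)
  (let ((out (make-string-output-stream))
        (saw-anything nil))
    (loop
      (let ((ch (read-char stream nil nil)))
        (cond ((null ch)
               (return))
              ((char= ch #\Linefeed)
               (setf saw-anything t)
               (return))
              ((char= ch #\Return)
               (setf saw-anything t)
               ;; Swallow the LF of a CR LF pair, if there is one.
               (when (eql (peek-char nil stream nil nil) #\Linefeed)
                 (read-char stream))
               (return))
              (t
               (setf saw-anything t)
               (write-char ch out)))))
    (if saw-anything
        (get-output-stream-string out)
        nil)))
```

One design caveat: the PEEK-CHAR after a bare CR waits for the next character, which on an interactive network connection may block until the peer sends more data.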
>>>>> On Thu, 12 Dec 2002 09:37:06 +0100, Jochen Schmidt ("Jochen") writes:
Jochen> I agree on the point that ANSI CL code is more portable than Jochen> C code. But I think there are things that make writing Jochen> portable ANSI CL code unnecessarily difficult.
Jochen> One thing is the LF(Unix) vs. CR(MacOS) vs. CRLF(Windows) Jochen> lineending issue. This differences make read-line and Jochen> write-line useless for I/O that has to speak a protocol like Jochen> HTTP, IRC or most other common internet text-protocols. Jochen> If I call read-line on such a source one of the following Jochen> will happen:
Jochen> The part in the brackets is what read-line would return and Jochen> the part after it is what is left in the stream.
Jochen> Some implementations of Common Lisp allow to change the lineending style via Jochen> external formats - but that is not portable ANSI CL code either.
Jochen> So it seems the only thing one can do is to write your own line reader Jochen> function in portable ANSI CL (via read-char/unread-char/peek-char).
What you are seeing is a result of the fact that READ-LINE does not include the line terminator character in the string it returns.
I think this is better than C, which does not have that degree of portable IO: C programs have to be coded to know what the line terminator character is. In Common Lisp, you either don't need to worry about it at all, or you use the portable character #\Newline. Lisp takes care of the line terminator for you.
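The READ-LINE behaviour described above can be seen with a string stream, where the line terminator is by definition #\Newline. The helper name and the sample text are arbitrary.

```lisp
;; Return both values of READ-LINE for each of the first two lines.
(defun first-two-lines (text)
  (with-input-from-string (s text)
    (list (multiple-value-list (read-line s))
          (multiple-value-list (read-line s)))))

(first-two-lines (format nil "first~%second"))
;; => (("first" NIL) ("second" T))
```

The second value is true when the line was ended by end of file instead of a newline; the returned strings never contain the terminator.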
The problem you are experiencing is because of the network protocol.
Neither ANSI Common Lisp (nor C) defines what a network stream is, or provides a way to open one. So you're already in "extension land" as soon as you're talking about network streams.
The problem you're having above is that each network protocol defines what the line terminator is, and there is no way for Lisp (or any other language) to know what choice was made. If the network protocol happens to use the same line terminator as your host operating system, then READ-LINE will happen to work.
Usually, for text command based control connections, the choice is CR LF. That's the same choice as DOS/Windows, which is why your first example above works with READ-LINE. But that's not portable, so you may have to program around that.
In C, you have to write your own function to read the lines and do the translation. The programs have to be ported.
In Lisp, you could write your own, also, using Gray streams if you like. But each vendor has also provided an extension that allows you to automatically control the external character set translation. This is done either by wrapping translation streams around the network stream, or just specifying some keywords to the stream opener function. At that point, you can write portable code and you're back to not caring about line terminators.
If you're okay with writing (MAKE-INSTANCE 'NET:SOCKET-STREAM...) then you ought to be okay with some sort of :EXTERNAL-FORMAT keyword thingie that is endorsed by the standard.
It's much more than C is doing for you, so it's hard to understand why you would claim that C is more portable than Lisp in this area. Or any area.
Jochen> What do others when faced with this issue?
The main server code is handed a stream that implements the correct #\Newline semantics (and probably buffering and maybe other features). A small portion of this code, and also the code that opens the network connection, needs to be ported for each operating system. The main server code is all portable ANSI Common Lisp, while the network code as a whole is not.
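The split described above might look roughly like this. The handler is a made-up example that just upcases each command line; the point is only that it sees ordinary streams with #\Newline semantics, so it can be exercised with string streams and never touches the non-portable socket code.

```lisp
;; Portable part: knows nothing about sockets or line-ending conventions.
(defun handle-commands (in out)
  (loop for line = (read-line in nil nil)
        while line
        do (write-line (string-upcase line) out)))

;; Stand-in for the non-portable part, which would open the connection
;; and wrap it in a stream with the right newline translation.
(defun run-with-string-streams (text)
  (with-output-to-string (out)
    (with-input-from-string (in text)
      (handle-commands in out))))

(run-with-string-streams (format nil "hello~%world~%"))
;; returns "HELLO", newline, "WORLD", newline
```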
Jochen Schmidt <j...@dataheaven.de> writes: > Christopher C. Stacy wrote:
> >>>>>> On Wed, 11 Dec 2002 22:00:54 +0000, Thomas Stegen ("Thomas") writes: > > Thomas> One of the effects of this is that if I write some ISO C code > > Thomas> I can be pretty sure it will compile and run and any C > > Thomas> implementation on the planet. I don't feel that I have this > > Thomas> guarantee when writing Common Lisp. Which is a shame since > > Thomas> IMHO Common Lisp is the better language of the two.
> > I'm not sure why you have that feeling, because if you > > write ANSI Common Lisp code, it will compile and run on > > any ANSI Common Lisp implementation on the planet.
> I agree on the point that ANSI CL code is more portable than C code. > But I think there are things that make writing portable ANSI CL code > unnecessarily difficult.
> One thing is the LF(Unix) vs. CR(MacOS) vs. CRLF(Windows) lineending issue.
This difference also exists in C, a la fgets().
> This differences make read-line and write-line useless for I/O that has to > speak a protocol like HTTP, IRC or most other common internet > text-protocols. If I call read-line on such a source one of the following > will happen:
> The part in the brackets is what read-line would return and the part after > it is what is left in the stream.
The read-line function is defined to leave out the #\newline that was read. In C, the '\n' is included by fgets() (a minor difference between the CL and C styles of reading a line). I did not try this on MacOS, but on Linux and Windows, I got similar results (a file with the characters abc\r\n was read into Windows as "abc\n" and into Linux as "abc\r\n"). Thus the same non-portability exists between Linux and Windows in C, even with the supposedly portable Cygwin C library for Windows.
> Some implementations of Common Lisp allow to change the lineending style via > external formats - but that is not portable ANSI CL code either.
Yes, the CL spec allows for the vendor to provide portability between line-ending styles via external formats. What does the C specification provide?
> So it seems the only thing one can do is to write your own line reader > function in portable ANSI CL (via read-char/unread-char/peek-char).
Or ask your vendor to provide a portable solution.
> What do others when faced with this issue?
In Allegro CL, we provide external formats which handle CR/LF styles.