(setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL
Newsgroups: comp.lang.lisp
From: Adam Warner <use...@consulting.net.nz>
Date: Mon, 27 May 2002 03:00:44 +1200
Local: Sun, May 26 2002 11:00 am
Subject: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL
Hi all,

After trying to find a way to preserve symbol case in CMUCL I have
discovered that (setf (readtable-case *readtable*) :invert) preserves
symbol case perfectly (also without breaking existing lowercase code):

$ lisp
CMU Common Lisp release x86-linux 3.0.12 18d+ 23 May 2002 build 3350,
...

Loaded subsystems:
    Python 1.0, target Intel x86
    CLOS based on PCL version:  September 16 92 PCL (f)
* (setf (readtable-case *readtable*) :invert)

:invert
* 'aa

aa
* 'AA

AA
* 'Aa

Aa
* :abc

:abc
* :Abc

:Abc
* :ABC

:ABC

This is a fantastic find because I wish to include the symbols in XHTML
output (which is case sensitive).

Can someone please comment on whether this behaviour accords with the
HyperSpec:

http://www.xanalys.com/software_tools/reference/HyperSpec/Body/23_ab.htm

   When the readtable case is :invert, then if all of the unescaped
   letters in the extended token are of the same case, those (unescaped)
   letters are converted to the opposite case.

I'm thankful that all lowercase symbols are not converted to uppercase and
vice versa. Does this mean the CMUCL behaviour is non-standard (or what
has been called "modern" in other threads?)

Regards,
Adam


 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Sun, 26 May 2002 15:33:22 GMT
Local: Sun, May 26 2002 11:33 am
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL
* Adam Warner
| After trying to find a way to preserve symbol case in CMUCL I have
| discovered that (setf (readtable-case *readtable*) :invert) preserves
| symbol case perfectly (also without breaking existing lowercase code):

  Try the following two forms and report your understanding of the
  interaction of the symbol reader and the printer:

(mapcar #'symbol-name '(UPPER UPPER-lower lower))
(mapcar #'intern '("UPPER" "UPPER-lower" "lower"))

  There is potential enlightenment here.
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.

  70 percent of American adults do not understand the scientific process.


 
Newsgroups: comp.lang.lisp
From: Kalle Olavi Niemitalo <k...@iki.fi>
Date: 26 May 2002 18:46:12 +0300
Local: Sun, May 26 2002 11:46 am
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL

Adam Warner <use...@consulting.net.nz> writes:
> I'm thankful that all lowercase symbols are not converted to uppercase and
> vice versa. Does this mean the CMUCL behaviour is non-standard (or what
> has been called "modern" in other threads?)

The reader converts all-lower-case symbols to upper case, but the
printer converts them back again.  See CLHS section 22.1.3.3.2.

 
Newsgroups: comp.lang.lisp
From: Adam Warner <use...@consulting.net.nz>
Date: Mon, 27 May 2002 04:15:59 +1200
Local: Sun, May 26 2002 12:15 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL

On Mon, 27 May 2002 03:33:22 +1200, Erik Naggum wrote:
> * Adam Warner
> | After trying to find a way to preserve symbol case in CMUCL I have
> | discovered that (setf (readtable-case *readtable*) :invert) preserves
> | symbol case perfectly (also without breaking existing lowercase code):

>   Try the following two forms and report your understanding of the
>   interaction of the symbol reader and the printer:

> (mapcar #'symbol-name '(UPPER UPPER-lower lower))
> (mapcar #'intern '("UPPER" "UPPER-lower" "lower"))

>   There is potential enlightenment here.

Thanks Kalle and Erik. If I had run the test-readtable-case-reading code
it would have been clear. Yes I have achieved enlightenment Erik.

READTABLE-CASE  Input   Symbol-name
-----------------------------------
:INVERT         ZEBRA   zebra
:INVERT         Zebra   Zebra
:INVERT         zebra   ZEBRA

So when I go to read the symbol it will be the wrong case and I will have
to invert it. But at least mixed case will be preserved (and since the
inverting is predictable no information is thrown away).

I can't use preserve because none of the built in functions can be called
using lower case. Perhaps a custom compiled CMUCL image would be a long
term solution.

I'll have to think about this in the (*cough*) morning. I'm really tired.

Thanks for all your help.

Regards,
Adam


 
Newsgroups: comp.lang.lisp
From: "Pierre R. Mai" <p...@acm.org>
Date: 26 May 2002 18:14:47 +0200
Local: Sun, May 26 2002 12:14 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL

* (symbol-name 'aa)

"AA"
* (symbol-name 'AA)

"aa"
* (symbol-name 'Aa)

"Aa"

> This is a fantastic find because I wish to include the symbols in XHTML
> output (which is case sensitive).

> Can someone please comment on whether this behaviour accords with the
> HyperSpec:

> http://www.xanalys.com/software_tools/reference/HyperSpec/Body/23_ab.htm

>    When the readtable case is :invert, then if all of the unescaped
>    letters in the extended token are of the same case, those (unescaped)
>    letters are converted to the opposite case.

CMUCL does exactly what the HyperSpec demands here.

> I'm thankful that all lowercase symbols are not converted to uppercase and
> vice versa. Does this mean the CMUCL behaviour is non-standard (or what
> has been called "modern" in other threads?)

They are converted as demanded by the HyperSpec (otherwise entering
(car (cons 1 2)) in that mode would fail), so this isn't modern mode.
The reason why you are confused is that both the reader and the
printer collude to give you the intended illusion that case is
completely preserved.  Quoting from section 22.1.3.3.2 "Effect of
Readtable Case on the Lisp Printer":

When printer escaping is disabled, or the characters under consideration are
not already quoted specifically by single escape or multiple escape syntax,
the readtable case of the current readtable affects the way the Lisp printer
writes symbols in the following ways:

:upcase
          When the readtable case is :upcase, uppercase characters are printed
     in the case specified by *print-case*, and lowercase characters are
     printed in their own case.

[...]

:invert
          When the readtable case is :invert, the case of all alphabetic
     characters in single case symbol names is inverted. Mixed-case symbol
     names are printed as is.

So as long as you always use the Lisp Printer (or do something
similar), you will get the illusion of having both case preservation,
and the ability to access CL-mandated symbols in lower-case.  However,
behind the scenes 'car is still the symbol CL:CAR, etc.
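
A minimal sketch of this reader/printer interplay, assuming only standard Common Lisp. The helper names name-as-typed and invert-round-trip are hypothetical, and the readtable is bound locally so the global one is left alone:

```lisp
;; Hypothetical helper: print SYMBOL the way it would be typed under :invert.
(defun name-as-typed (symbol)
  (let ((*readtable* (copy-readtable nil)))   ; fresh copy of the standard readtable
    (setf (readtable-case *readtable*) :invert)
    (princ-to-string symbol)))

;; Hypothetical helper: read each spelling under :invert and list the text,
;; the stored symbol name, and what the printer shows.
(defun invert-round-trip (texts)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :invert)
    (mapcar (lambda (text)
              (let ((symbol (read-from-string text)))
                (list text (symbol-name symbol) (name-as-typed symbol))))
            texts)))

(invert-round-trip '("zebra" "Zebra" "ZEBRA"))
;; => (("zebra" "ZEBRA" "zebra") ("Zebra" "Zebra" "Zebra") ("ZEBRA" "zebra" "ZEBRA"))
```

The middle column is what SYMBOL-NAME returns; the last column is what the printer produces, which is the text as typed.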

Regs, Pierre.

--
Pierre R. Mai <p...@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein


 
Newsgroups: comp.lang.lisp
From: Adam Warner <use...@consulting.net.nz>
Date: Mon, 27 May 2002 04:58:47 +1200
Local: Sun, May 26 2002 12:58 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL

On Mon, 27 May 2002 04:14:47 +1200, Pierre R. Mai wrote:
> So as long as you always use the Lisp Printer (or do something similar),
> you will get the illusion of having both case preservation, and the
> ability to access CL-mandated symbols in lower-case.  However, behind
> the scenes 'car is still the symbol CL:CAR, etc.

Thanks Pierre. I also find this is a clear example of what happens with (setf
(readtable-case *readtable*) :invert):

* (string :align)

"ALIGN"

Only (setf (readtable-case *readtable*) :preserve) actually preserves the
symbol case:

* (STRING :align)

"align"

But unfortunately (string :align) is undefined.
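
The same comparison can be sketched without touching the global readtable, assuming a fresh image with the default current package. The function name preserve-demo is hypothetical:

```lisp
;; Hypothetical demo: what the reader does with both spellings under :preserve.
(defun preserve-demo ()
  (let ((*readtable* (copy-readtable nil)))   ; fresh copy of the standard readtable
    (setf (readtable-case *readtable*) :preserve)
    (let ((upper (read-from-string "STRING"))
          (lower (read-from-string "string")))
      (list (eq upper 'string)                ; same symbol as the standard function name
            (symbol-name lower)               ; a different symbol, named in lower case
            (fboundp lower)                   ; which has no function definition
            (eval (read-from-string "(STRING :align)"))))))

(preserve-demo)
;; => (T "string" NIL "align")
```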

Regards,
Adam


 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Sun, 26 May 2002 17:42:04 GMT
Local: Sun, May 26 2002 1:42 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL
* Adam Warner <use...@consulting.net.nz>
| Yes I have achieved enlightenment Erik.

  You made my day!

| I can't use preserve because none of the built in functions can be called
| using lower case.  Perhaps a custom compiled CMUCL image would be a long
| term solution.

  Well, that way lies madness.  One Common Lisp vendor has decided to make
  a "custom" world in which symbols are in their preferred lower-case.
  While I also like to read and see lower-case, all I need to do to get
  that most of the time is with either :invert or :upcase and *print-case*
  to :downcase.  However, if you want to use lower-case names in your own
  code, you can shadow intern, find-symbol, and symbol-name to invert their
  argument.  Efficient inversion is not necessarily a trivial task, and
  your implementation may have optimized functions for it, but this is a
  shot, and intended to be an efficient one.  Just how efficient it is
  seems to vary a lot between implementations:

(defun invert-string (string)
  (declare (optimize (speed 3) (safety 0))
           (simple-string string))
  (check-type string 'string)
  (prog ((invert nil)
         (index 0)
         (length (length string)))
    (declare (simple-string invert)
             (type (integer 0 65536) index length))
   unknown-case
    (cond ((= index length)
           (return string))
          ((upper-case-p (schar string index))
           (when (and (/= (1+ index) length)
                      (lower-case-p (schar string (1+ index))))
             (return string))
           (setq invert (copy-seq string))
           (go upper-case))
          ((lower-case-p (schar string index))
           (setq invert (copy-seq string))
           (go lower-case))
          (t
           (incf index)
           (go unknown-case)))
   upper-case
    (setf (schar invert index) (char-downcase (schar invert index)))
    (incf index)
    (cond ((= index length)
           (return invert))
          ((lower-case-p (schar invert index))
           (return string))
          (t
           (go upper-case)))
   lower-case
    (setf (schar invert index) (char-upcase (schar invert index)))
    (incf index)
    (cond ((= index length)
           (return invert))
          ((upper-case-p (schar invert index))
           (return string))
          (t
           (go lower-case)))))
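
The shadowing idea mentioned before the function might look roughly like the sketch below. The package name inverted-names and the helper invert-case are hypothetical, and invert-case is a plain, unoptimized stand-in for the function above rather than a replacement for it:

```lisp
;; Hypothetical package; only INTERN, FIND-SYMBOL and SYMBOL-NAME are shadowed.
(defpackage #:inverted-names
  (:use #:common-lisp)
  (:shadow #:intern #:find-symbol #:symbol-name))

(in-package #:inverted-names)

(defun invert-case (name)
  "Swap the case of NAME if its letters are all one case; else return NAME as is."
  (check-type name string)
  (let ((upper (some #'upper-case-p name))
        (lower (some #'lower-case-p name)))
    (cond ((and upper lower) name)            ; mixed case: untouched
          (upper (string-downcase name))
          (lower (string-upcase name))
          (t name))))                         ; no cased characters at all

;; The shadowing versions invert the name on the way in or out
;; and delegate to the standard functions.
(defun symbol-name (symbol)
  (invert-case (cl:symbol-name symbol)))

(defun intern (name &optional (package *package*))
  (cl:intern (invert-case name) package))

(defun find-symbol (name &optional (package *package*))
  (cl:find-symbol (invert-case name) package))

(symbol-name 'car)   ; => "car"
(intern "car")       ; => CAR, :INHERITED
```

Code in this package then sees lower-case names for the standard symbols, while CL:CAR and friends remain upper case internally.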


 
Newsgroups: comp.lang.lisp
From: Adam Warner <use...@consulting.net.nz>
Date: Mon, 27 May 2002 11:34:37 +1200
Local: Sun, May 26 2002 7:34 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL

On Mon, 27 May 2002 05:42:04 +1200, Erik Naggum wrote:
> * Adam Warner <use...@consulting.net.nz>
> | Yes I have achieved enlightenment Erik.

>   You made my day!

And mine!

> | I can't use preserve because none of the built in functions can be called
> | using lower case.  Perhaps a custom compiled CMUCL image would be a long
> | term solution.

>   Well, that way lies madness.  One Common Lisp vendor has decided to
>   make a "custom" world in which symbols are in their preferred
>   lower-case. While I also like to read and see lower-case, all I need
>   to do to get that most of the time is with either :invert or :upcase
>   and *print-case* to :downcase.  However, if you want to use lower-case
>   names in your own code, you can shadow intern, find-symbol, and
>   symbol-name to invert their argument.  Efficient inversion is not
>   necessarily a trivial task, and your implementation may have optimized
>   functions for it, but this is a shot, and intended to be an efficient
>   one.  Just how efficient it is seems to vary a lot between
>   implementations:

Thanks for the code Erik. It appears to be slightly broken. Here's my
attempt to fix it:

> (defun invert-string (string)
>   (declare (optimize (speed 3) (safety 0))
>            (simple-string string))
>   (check-type string 'string)

The above line is invalid: the type specifier should not be quoted. Shouldn't
it be (check-type string string)? This checks the variable called string
against the type string. Luckily Lisp has multiple namespaces. We also have
string as a function.

>   (prog ((invert nil)
>          (index 0)
>          (length (length string)))
>     (declare (simple-string invert)
>              (type (integer 0 65536) index length))

And an optimisation question: Doesn't this declare index and length to be
greater than 16-bit unsigned integers? (when starting from 0 the maximum
permissible unsigned value is 2^16-1). This probably causes the compiler
to optimise using 32-bit integers. On my computer it seems to make no
speed difference, probably because 32-bit integers are the minimum size
used on 32-bit machines.

Thanks again Erik.

Regards,
Adam


 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Mon, 27 May 2002 01:10:18 GMT
Local: Sun, May 26 2002 9:10 pm
Subject: Re: (setf (readtable-case *readtable*) :invert) completely preserves symbol case in CMUCL
* Adam Warner
| Shouldn't it be (check-type string string)?

  Yes.  I stuffed that line in just prior to posting.  I keep making that
  mistake, yet I think it seems more correct to use the quoted type for the
  type, not just an unevaluated expression.

| And an optimisation question: Doesn't this declare index and length to be
| greater than 16-bit unsigned integers?

  Yes, but this is actually irrelevant, since the point was only to limit
  these things to less than array-dimension-limit, which it is annoyingly
  verbose to do.  I also keep misremembering that (integer 0 1) and
  (integer (-1) (2)) are equivalent.  I guess I believe upper limits should
  be exclusive because they are everywhere else in the language.  It is
  surprisingly hard to learn things you believe should be different from
  what they are.  Thanks for reminding me of these things.  Just goes to
  show what happens when I post code I had not visited for weeks and had
  just rattled off at the time -- it was just useful to me at the time.

| This probably causes the compiler to optimise using 32-bit integers.

  Well, we do not generally have 32-bit integers in Common Lisp systems on
  32-bit hardware, but at least this makes it use more than 16 bits.  It
  should have been only 65535, of course.  A better way to specify this is
  (unsigned-byte 16).

| On my computer it seems to make no speed difference, probably because
| 32-bit integers are the minimum size used on 32-bit machines.

  (integer-length (- most-positive-fixnum most-negative-fixnum)) is usually
  less than 32, and can be as low as 16.  A quick survey finds that Allegro
  CL and CMUCL have 30-bit signed fixnums, CLISP has 25-bit, and LispWorks
  24-bit, all on a 32-bit Linux system.
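
The bounds discussed here can be checked directly. This is a minimal sketch, assuming only standard Common Lisp; the type name array-index and the function bound-checks are hypothetical:

```lisp
;; Hypothetical type: a valid index into any array.  The list form
;; (N) in an INTEGER type specifier makes that bound exclusive.
(deftype array-index ()
  `(integer 0 (,array-dimension-limit)))

(defun bound-checks ()
  (list (typep 65536 '(integer 0 65536))            ; plain bounds are inclusive
        (typep 65536 '(unsigned-byte 16))           ; 16 bits stop at 65535
        (typep 65535 '(unsigned-byte 16))
        (typep array-dimension-limit 'array-index)  ; the limit itself is excluded
        (subtypep '(integer 0 1) '(integer (-1) (2)))))

(bound-checks)
;; => (T NIL T NIL T)

;; Width of the fixnum range in bits; the value is implementation-dependent.
(integer-length (- most-positive-fixnum most-negative-fixnum))
```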


 