case sensitivity

David Bakhash

May 28, 1998, 3:00:00 AM

is there an ANSI way to put package info at the top of a (package)
Lisp file so that it treats all symbols as being case-sensitive, as if
they were inside | |'s ?

I know that Allegro has some `set-case-mode' stuff, but I wanted
something portable.

dave

Erik Naggum

May 28, 1998, 3:00:00 AM

* David Bakhash

| is there an ANSI way to put package info at the top of a (package)
| Lisp file so that it treats all symbols as being case-sensitive, as if
| they were inside | |'s ?

case-sensitivity is not a property of the package, but of the Lisp
reader, controlled by the SETF'able function READTABLE-CASE. :UPCASE and
:DOWNCASE are obviously case-insensitive. :PRESERVE and :INVERT are
case-sensitive. with :PRESERVE, you need to type all standard symbols in
upper-case. with :INVERT, you must remember the algorithm: only symbol
names where all unescaped characters for which BOTH-CASE-P is true have
the same case have the case of those characters inverted.
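
A minimal sketch of the four modes, using only standard functions. NAME-AS-READ is a hypothetical helper introduced here for illustration; it reads the token as an uninterned symbol (#:) so that no package is modified.

```lisp
;; Hypothetical helper: show which name the reader produces for TOKEN
;; when READTABLE-CASE is MODE.  Works on a copy of the standard readtable.
(defun name-as-read (token mode)
  "Return the symbol name the reader gives TOKEN under READTABLE-CASE MODE."
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) mode)
    (symbol-name (read-from-string (concatenate 'string "#:" token)))))

(name-as-read "FooBar" :upcase)    ; "FOOBAR"
(name-as-read "FooBar" :downcase)  ; "foobar"
(name-as-read "FooBar" :preserve)  ; "FooBar"
(name-as-read "FooBar" :invert)    ; "FooBar" (mixed case is left alone)
(name-as-read "foobar" :invert)    ; "FOOBAR" (all lower case is inverted)
(name-as-read "FOOBAR" :invert)    ; "foobar" (all upper case is inverted)
```

The last three lines are the :INVERT algorithm in action: only a token whose letters all share one case gets inverted.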

I, too, thought case should have been a property of the package, but that
offers some rather messy semantic relationships with the way packages are
used by other packages and access to symbols from several packages makes
it complicated to decide which package should control the case-sensitivity
of a symbol name when interned.

instead of this very messy situation, I have written a new reader macro
that handles the case of the symbol the way the reader does, yet does not
cons a symbol. this makes it possible to use either :INVERT or Allegro
CL's non-standard "case-mode" stuff and still write in all lower-case.
it also stands out as noticeably different, much unlike using lower-case
symbol-name strings in Allegro's lower-case modes, which breaks stuff.

the principle is that if a symbol is written as `foobar', then the name
of that symbol should be written `#"foobar"', and this is semantically
identical to #.(symbol-name 'foobar), except that it should never have to
cons a symbol -- the reader already has to pass a fresh string from the
input stream to INTERN, and the intent is to capture that string before
it gets passed to INTERN.

here's the implementation for Allegro CL 4.3 and 5.0. caveat emptor.

;;; reader for symbol names that does case conversion according to the
;;; rest of the symbol reader. thanks to Sean Foderaro for the pointer
;;; to EXCL::READ-EXTENDED-TOKEN, which luckily does all the dirty work.

(defun symbol-namestring-reader (stream character prefix)
  (declare (ignore prefix))
  (prog1 (excl::read-extended-token stream)
    (unless (char= (read-char stream) character)
      (excl::.reader-error stream "invalid symbol-namestring syntax"))))

;; set it in all readtables. (yes, I know this is _really_ dirty.)
(eval-when (:compile-toplevel :load-toplevel)
  (loop with readtables = (excl::get-objects 11)
        for i from 1 to (aref readtables 0)
        for readtable = (aref readtables i)
        do (when (excl::readtable-dispatch-tables readtable)
             (set-dispatch-macro-character #\# #\"
                                           #'symbol-namestring-reader
                                           readtable))))

a portable implementation would UNREAD-CHAR the character it had just
read (it should therefore be bound to be #\"), call the reader to get the
string, and frob the case according to the value of READTABLE-CASE (and
make sure it got escaping right, which is a _pain_), but I'll do that
only when I actually need it. it is sufficient for me that it can be
done portably, too.
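
A rough sketch of what such a portable version might look like, under a loud assumption: it ignores the escaping problem entirely (a backslash or vertical bar inside the string is not treated as a symbol escape), so it is only right for plain names. The names CONVERT-CASE-LIKE-READER, PORTABLE-SYMBOL-NAMESTRING-READER and *SYMBOL-NAME-READTABLE* are hypothetical and appear nowhere in the original code.

```lisp
;; Hypothetical helper: apply the READTABLE-CASE conversion to NAME,
;; treating every character as unescaped (escapes are NOT handled).
(defun convert-case-like-reader (name &optional (readtable *readtable*))
  (ecase (readtable-case readtable)
    (:upcase (string-upcase name))
    (:downcase (string-downcase name))
    (:preserve name)
    (:invert
     (let ((cased (remove-if-not #'both-case-p name)))
       (cond ((every #'upper-case-p cased) (string-downcase name))
             ((every #'lower-case-p cased) (string-upcase name))
             (t name))))))

;; Dispatch function for #"...": put the double quote back, let the
;; ordinary string reader consume the string, then convert its case.
(defun portable-symbol-namestring-reader (stream character prefix)
  (declare (ignore prefix))
  (unread-char character stream)
  (convert-case-like-reader (read stream t nil t)))

;; Install it in a private readtable instead of in every readtable.
(defvar *symbol-name-readtable*
  (let ((readtable (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\"
                                  #'portable-symbol-namestring-reader
                                  readtable)
    (setf (readtable-case readtable) :invert)
    readtable))

(let ((*readtable* *symbol-name-readtable*))
  (read-from-string "#\"foobar\""))   ; "FOOBAR" under :INVERT
```

Unlike the Allegro version, this goes through the string reader and a separate conversion pass, so it conses more; it is meant only to show that the idea does not depend on implementation internals.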

the big advantage of this technique is that you can always refer to a
symbol name by a unique syntax that never gets confused with anything
else, doesn't wantonly create uninterned symbols, or worse: redundant
keywords, and _always_ gets the complexity of :INVERT right, so the
symbol that is named "FOOBAR" internally still has the reader syntax
`foobar' and the symbol-name syntax `#"foobar"'. this is important (and
convenient) when writing arguments to APROPOS and the package functions.

the above code modifies all existing readtables, which some might find
yucky beyond belief, but Allegro CL also offers named readtables that
might make this a little easier on the aesthesticles. to use a named
readtable for a given project:

(let ((readtable (copy-readtable nil))) ;copy from the standard
  (set-dispatch-macro-character #\# #\" #'symbol-namestring-reader readtable)
  (setf (readtable-case readtable) :invert)
  (setf (named-readtable :foo-project) readtable))

now you can say

(eval-when (:compile-toplevel :load-toplevel)
  (setf *readtable* (named-readtable :foo-project t)))

or you can use the IN-SYNTAX proposal from Kent Pitman, which I have
implemented as follows:

(defun in-syntax-ensure-readtable (evaled quoted)
  "Ensure that the argument to IN-SYNTAX is a readtable, or error."
  (if (readtablep evaled) evaled
      (error 'type-error
             :datum evaled
             :expected-type 'readtable
             :format-control
             "~@<IN-SYNTAX argument `~S' evaluates to a ~:@(~S~), ~
             not a (named) READTABLE.~:@>"
             :format-arguments (list quoted (type-of evaled)))))

(defmacro in-syntax (readtable)
  "Set *READTABLE* to READTABLE (evaluated) for the remainder of the file.
If READTABLE is a keyword, uses NAMED-READTABLE to retrieve the readtable."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (setq *readtable*
           ,(if (keywordp readtable)
                `(excl:named-readtable ,readtable t)
                `(in-syntax-ensure-readtable ,readtable ',readtable)))))

;; start off with one that we can always rely on.
(setf (excl:named-readtable :ansi-cl) (copy-readtable nil))

this implementation of IN-SYNTAX is of course just as happy with a
variable or any other form that yields a readtable object, which is the
fully portable version. (just remove the test for KEYWORDP.)
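
A sketch of that portable variant, with the KEYWORDP branch and the Allegro-specific NAMED-READTABLE call removed. The names ENSURE-READTABLE, PORTABLE-IN-SYNTAX and *FOO-PROJECT-READTABLE* are made up for this illustration, and CHECK-TYPE stands in for the custom error message above.

```lisp
;; Hypothetical helper: pass a readtable through, reject anything else.
(defun ensure-readtable (object)
  "Return OBJECT if it is a readtable; otherwise signal a TYPE-ERROR."
  (check-type object readtable)
  object)

;; Portable counterpart of IN-SYNTAX: the argument is any form that
;; evaluates to a readtable object.
(defmacro portable-in-syntax (readtable-form)
  "Set *READTABLE* to the value of READTABLE-FORM for the rest of the file."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (setq *readtable* (ensure-readtable ,readtable-form))))

;; An example readtable held in an ordinary special variable.
(defvar *foo-project-readtable*
  (let ((readtable (copy-readtable nil)))
    (setf (readtable-case readtable) :invert)
    readtable))

;; At the top of a project source file one would then write:
;;   (portable-in-syntax *foo-project-readtable*)
```

The variable has to be defined before the file that uses it is compiled, for example in a file loaded earlier, since the EVAL-WHEN evaluates the form at compile time.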

hope this helps and also helps people decide against using the invisibly
non-standard "case-mode" :case-sensitive-lower stuff (which breaks code
without letting you know it could do so) and encourages people to adopt
the standard READTABLE-CASE value :INVERT to get case-sensitive
lower-case symbols.

#:Erik
--
"Where do you want to go to jail today?"
-- U.S. Department of Justice Windows 98 slogan

Kent M Pitman

May 29, 1998, 3:00:00 AM

David Bakhash <ca...@bu.edu> writes:

> is there an ANSI way to put package info at the top of a (package)
> Lisp file so that it treats all symbols as being case-sensitive, as if
> they were inside | |'s ?

To be clear, Common Lisp is already case-sensitive. It just happens
to case-translate on input. That's different.

Your request, better put, would be:
Is there an ANSI way to put something at the top of a package
or Lisp file so that the reader does not case-translate [i.e.,
in the same way as it behaves between | |'s]?

The issue?
Once the object is a symbol, it's not syntax, it's an object.
The object |X| and the object |x| are already treated as
distinct because they don't match in case. The notations
X and x both select |X| because on input, the reader case-translates.
But, by contrast, (intern "x") and (intern "X") select different
symbols because INTERN does not case-translate. The reader is
actively doing (INTERN (STRING-UPCASE your-symbol-name)) and you
want it not to do the STRING-UPCASE prior to the intern. After
the INTERN is done is where symbols come into play. And symbols
whose names are not STRING= cannot be EQ.
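
A small sketch of the distinction, assuming a scratch package named CASE-DEMO and a helper READ-IN-CASE-DEMO, both invented for this example. The helper reads with a copy of the standard readtable, so the default :UPCASE translation applies.

```lisp
;; Scratch package that inherits nothing, so every symbol is new.
(defpackage "CASE-DEMO" (:use))

(defun read-in-case-demo (string)
  "Read STRING with standard syntax, interning symbols in CASE-DEMO."
  (let ((*package* (find-package "CASE-DEMO"))
        (*readtable* (copy-readtable nil)))
    (values (read-from-string string))))

;; The reader case-translates, so both notations name the same symbol.
(eq (read-in-case-demo "x") (read-in-case-demo "X"))      ; true
(symbol-name (read-in-case-demo "x"))                     ; "X"

;; INTERN does not translate, so these are two different symbols.
(eq (intern "x" "CASE-DEMO") (intern "X" "CASE-DEMO"))    ; false

;; Vertical bars switch the translation off for that token.
(eq (read-in-case-demo "|x|") (intern "x" "CASE-DEMO"))   ; true
```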

You can construct an appropriate readtable using the readtable construction
features and give it a READTABLE-CASE value that you want.

Put the newly constructed readtable in a special variable.

(defvar *case-preserving-readtable*
  (let ((r (copy-readtable)))
    (setf (readtable-case r) :preserve)
    r))

Then make a macro that does:

(defmacro in-case-preserving-syntax ()
  `(eval-when (:execute :compile-toplevel :load-toplevel)
     (setq *readtable* *case-preserving-readtable*)))

and then at the top of your file say:

(in-package "MY-PACKAGE")
(in-case-preserving-syntax)

All calls to COMPILE-FILE and LOAD will bind *READTABLE*, so you don't
have to undo anything at the end of the file. The SETQ done by the
EVAL-WHEN will be automatically lost when the COMPILE-FILE or LOAD
is finished.

You may have to fudge things a little for interactive editors, which
may not be able to compile isolated defuns because they may not know
you want a special environment; that's beyond the scope of the language
and you should talk to your vendor about that. They may have ways you
can work around it.

You could even combine the in-package with the syntax by making an explicit
reference at the top of the file. e.g., in package MY-PACKAGE, define
  (defmacro in-my-package-and-syntax ()
    `(progn (in-package "MY-PACKAGE") (in-case-preserving-syntax)))
and then do
  (my-package:in-my-package-and-syntax)
at the top of other files. [You'll have to export that symbol if you want
to use a single-colon.]

Sorry this isn't more automatic, but it's really not that bad.
The same technique can be used for files that require special readmacros,
btw, so that you don't side-effect the global environment in order to
read in your code. (Side-effecting the global environment has the
disadvantage that unrelated systems will tend to clobber each other if
they are loaded into the same environment.) The above technique will
not break files that are not intending to have case preserved.

NOTE WELL:
All Common Lisp symbols are uppercase, so even if you select
"case preserving mode" you have to use CAR, not car, unless
you explicitly (SETF (SYMBOL-FUNCTION '|car|) (SYMBOL-FUNCTION 'CAR)).
Some objects like NIL can be held in lowercase vars, like
(SETQ |nil| 'NIL) but (CDR '(x)) is always going to return NIL,
and never |nil|.
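
A short sketch of this caveat, assuming a scratch package PRESERVE-DEMO and a helper READ-PRESERVING that are invented for the example. The example source itself is assumed to be read with the default :UPCASE readtable, so 'car in it means the standard CAR.

```lisp
;; Scratch package that uses COMMON-LISP, like an ordinary user package.
(defpackage "PRESERVE-DEMO" (:use "COMMON-LISP"))

(defun read-preserving (string)
  "Read STRING in PRESERVE-DEMO with READTABLE-CASE set to :PRESERVE."
  (let ((*package* (find-package "PRESERVE-DEMO"))
        (*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (values (read-from-string string))))

(eq (read-preserving "CAR") 'car)             ; true: the standard symbol
(eq (read-preserving "car") 'car)             ; false: a new symbol |car|
(fboundp (read-preserving "car"))             ; false: |car| has no function

;; CDR still returns the one true NIL, which only "NIL" reads as.
(eq (cdr (list 'x)) (read-preserving "NIL"))  ; true
(eq (cdr (list 'x)) (read-preserving "nil"))  ; false
```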

Bruno Haible

Jun 1, 1998, 3:00:00 AM

David Bakhash <ca...@bu.edu> asked:

> is there an ANSI way to put package info at the top of a (package)
> Lisp file so that it treats all symbols as being case-sensitive, as if
> they were inside | |'s ?
>
> I know that Allegro has some `set-case-mode' stuff

The problem with both the Allegro approach and the readtable-case
approach is that you can be either in a case-sensitive world or in a
case-insensitive world. But they cannot mix.

Therefore CLISP has a `defpackage' option `:case-sensitive'.
For example, the Linux libc bindings will be in a case-sensitive
package, defined like this:

(defpackage "LINUX" (:case-sensitive t) ...)

I can use it from any other package:

#'linux:open ==> #<FOREIGN-FUNCTION LINUX:open>
(eq 'linux::open 'linux::Open) ==> NIL
(eq 'linux::open 'Linux::open) ==> T

Hint to other Lisp implementors: This was fairly easy to implement.
I would be pleased if other implementations adopted the same solution.

Bruno


Kent M Pitman

Jun 2, 1998, 3:00:00 AM

hai...@clisp.cons.org (Bruno Haible) writes:

> The problem with both the Allegro approach and the readtable-case
> approach is that you can be either in a case-sensitive world or in a
> case-insensitive world. But they cannot mix.
>
> Therefore CLISP has a `defpackage' option `:case-sensitive'.
> For example, the Linux libc bindings will be in a case-sensitive
> package, defined like this:
>
> (defpackage "LINUX" (:case-sensitive t) ...)
>
> I can use it from any other package:
>
> #'linux:open ==> #<FOREIGN-FUNCTION LINUX:open>
> (eq 'linux::open 'linux::Open) ==> NIL
> (eq 'linux::open 'Linux::open) ==> T
>
> Hint to other Lisp implementors: This was fairly easy to implement.
> I would be pleased if other implementations adopted the same solution.

I don't buy any of this personally. Here's (approximately) why:

First, it's important to note that the default CL world IS case-sensitive.
It is, however, also case-translating. These are different. Ask for
clarification if you don't understand that. It's important to understand
that what it would mean to be case-insensitive would be for
(intern "open" "LINUX") == (intern "Open" "LINUX") == (intern "OPEN" "LINUX")
But in CL, by default, these are necessarily-different. Hence CL *is*
case-sensitive.

Second, it seems to me that it's GOOD that you can't mix the worlds.
I would not want to have to know which packages are case-sensitive and
which are not. I LIKE a case-translating world. I *want* to write
'linux::|Open|
to know that I am specifying something where case matters. It would be
baffling to me, who routinely writes code in both all-uppercase and
all-lowercase depending on what his fingers do on any given day, to
find that (eq 'linux::open 'linux::Open) => NIL because the definition
of a case-translating world is that this doesn't happen.

Now, if you're not like me and you WANT a case-preserving world (not
unreasonable) it seems to me that you'll want to write ALL of your
code where case matters. But at any given time you need some cue
about which Universe you're in. I mind far less coming up to your
console and you saying "everything you see is in case-preserving mode"
because I can immediately understand that than having you say
"everything you see is randomly in case-preserving mode or not, so
if you ever see a package-qualified symbol, ask me". That's very
BAD.

Just one person's opinion.

Rainer Joswig

Jun 2, 1998, 3:00:00 AM

hai...@clisp.cons.org (Bruno Haible) writes:

> David Bakhash <ca...@bu.edu> asked:
> > is there an ANSI way to put package info at the top of a (package)
> > Lisp file so that it treats all symbols as being case-sensitive, as if
> > they were inside | |'s ?
> >
> > I know that Allegro has some `set-case-mode' stuff
>

> The problem with both the Allegro approach and the readtable-case
> approach is that you can be either in a case-sensitive world or in a
> case-insensitive world. But they cannot mix.
>
> Therefore CLISP has a `defpackage' option `:case-sensitive'.
> For example, the Linux libc bindings will be in a case-sensitive
> package, defined like this:
>
> (defpackage "LINUX" (:case-sensitive t) ...)

I would prefer not to have a case-sensitive Lisp. It means
unnecessary incompatibility. Is Bruno Haible's solution
adoptable for other vendors? Sounds interesting.


Scott L. Burson

Jun 2, 1998, 3:00:00 AM

Kent M Pitman wrote:

>
> hai...@clisp.cons.org (Bruno Haible) writes:
>
> First, it's important to note that the default CL world IS case-sensitive.
> It is, however, also case-translating.

That's one way to put it. One could also say that while INTERN is
case-sensitive, READ is not (precisely because it translates). From
this perspective, the truth of "CL (as a whole) is case-sensitive" is
not well-defined.

> I mind far less coming up to your
> console and you saying "everything you see is in case-preserving mode"
> because I can immediately understand that than having you say
> "everything you see is randomly in case-preserving mode or not, so
> if you ever see a package-qualified symbol, ask me". That's very
> BAD.

<Shrug> I don't think this would bother me terribly.

I guess that's because I've spent enough time writing in case-sensitive
languages that I tend to write Lisp as if the reader were
case-sensitive, even though it's not. It would probably be much easier
to deal with mixed case-sensitivity coming from a case-sensitive mindset
than from a case-insensitive mindset.

-- Scott

* * * * *

To use the email address, remove all occurrences of the letter "q".

Erik Naggum

Jun 3, 1998, 3:00:00 AM

* Scott L. Burson

| That's one way to put it. One could also say that while INTERN is
| case-sensitive, READ is not (precisely because it translates). From this
| perspective, the truth of "CL (as a whole) is case-sensitive" is not
| well-defined.

I quite agree with Kent Pitman that CL is case-sensitive. CL would be
case-insensitive if INTERN ignored case. the reader can ignore case or
preserve it or do some perverse magic through the READTABLE-CASE accessor on
readtables.

however, there is one place in the standard that puzzles me. 22.3.5.4
Tilde Slash: Call Function (in 22.3 Formatted Output) reads:

All of the characters in name are treated as if they were upper case.

I find this curious. was editing instructions for this clause neglected
when READTABLE-CASE was voted on?

#:Erik, who likes :INVERT

Kent M Pitman

Jun 4, 1998, 3:00:00 AM

Erik Naggum <c...@naggum.no> writes:

> however, there is one place in the standard that puzzles me. 22.3.5.4
> Tilde Slash: Call Function (in 22.3 Formatted Output) reads:
>
> All of the characters in name are treated as if they were upper case.
>
> I find this curious. was editing instructions for this clause neglected
> when READTABLE-CASE was voted on?
>
> #:Erik, who likes :INVERT

FORMAT is interpreted at runtime, not at readtime. Using whatever
random readtable is available at runtime would be very bad. It's for
the same reason that FORMAT's ~/.../ requires a package prefix and
doesn't look at *package*.
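
A small sketch of that behavior. The function CL-USER::SHOUT and the helper SHOUT-UNDER-PRESERVE are hypothetical names chosen for the example. The format string spells the name in lower case with an explicit package prefix, and FORMAT is called while a :PRESERVE readtable is current, to show that the readtable in effect at runtime plays no part in the lookup.

```lisp
;; Hypothetical ~/ function.  FORMAT calls it with the output stream,
;; the argument, the colon and at-sign flags, and any prefix parameters.
(defun cl-user::shout (stream argument colon-p at-sign-p &rest parameters)
  "Write ARGUMENT to STREAM in upper case."
  (declare (ignore colon-p at-sign-p parameters))
  (write-string (string-upcase (princ-to-string argument)) stream))

;; Call FORMAT with a case-preserving readtable in effect at runtime.
(defun shout-under-preserve (argument)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (format nil "~/cl-user::shout/" argument)))

(shout-under-preserve "hello")   ; "HELLO"
```

The lower-case name in the directive still finds the symbol named "SHOUT", because the characters of the name are treated as upper case regardless of READTABLE-CASE.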

Probably what should have been done was to use #"..." to always
preprocess a format string at readtime so that the readtable and
package that were intended would be known and then we could have done
better.
