Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

symbol name with case preserved

222 views
Skip to first unread message
Message has been deleted

Henrik Motakef

unread,
Jan 17, 2004, 1:20:00 PM1/17/04
to
"William J. Lamar" <w...@NOSPAMverizon.net> writes:

> Do any of the free-as-in-freedom Common Lisp implementations (CMUCL, SBCL,
> CLISP, GCL, etc.) support a way to get the name of a symbol with it's case
> preserved?

The question has to be: Do they support a way to /read/ the symbol
with it's case preserved, because that's when the case conversion is
done.

> Although Common Lisp is a case-insensative language, it nevertheless would
> be possible for an implementation to have an extension function that
> returns a string representation of a symbol the way the symbol appeared in
> the source code. To help illustrate what I am describing, here is part of a
> session with CMUCL:
>
> * (symbol-name 'Hello)
>
> "HELLO"
> *
>
> What I am looking for is a function foo where
>
> (foo 'Hello)
>
> would result in the string "Hello".

You can either write (symbol-name '|Hello|), or set *readtable-case*
to :preserve before the symbol is read. But after the reader is done
with the symbol, there is no way to reconstruct the original symbol
name as it appeared in the source, only the "real" symbol name (as in
cl:symbol-name) is kept.

Pascal Costanza

unread,
Jan 17, 2004, 1:17:54 PM1/17/04
to

William J. Lamar wrote:

> Hello all,


>
> Do any of the free-as-in-freedom Common Lisp implementations (CMUCL, SBCL,
> CLISP, GCL, etc.) support a way to get the name of a symbol with it's case
> preserved?
>

> Although Common Lisp is a case-insensative language, it nevertheless would
> be possible for an implementation to have an extension function that
> returns a string representation of a symbol the way the symbol appeared in
> the source code.

This sounds like readtable case :preserve or :invert would do what you
need. Check out the section on readtables in the HyperSpec, especially
23.1.2.

I am not 100% sure if that's what you need since I haven't used this
feature yet.


Pascal

--
Tyler: "How's that working out for you?"
Jack: "Great."
Tyler: "Keep it up, then."

Kenny Tilton

unread,
Jan 17, 2004, 1:43:01 PM1/17/04
to

Henrik Motakef wrote:
> "William J. Lamar" <w...@NOSPAMverizon.net> writes:
>
>
>>Do any of the free-as-in-freedom Common Lisp implementations (CMUCL, SBCL,
>>CLISP, GCL, etc.) support a way to get the name of a symbol with it's case
>>preserved?

Here's a macro I use when READing XML:

(defmacro with-case-sensitivity ( &body body)
`(let ((*readtable* (copy-readtable)))
(setf (readtable-case *readtable*) :preserve)
(progn
,@body)))

But...

>
>
> The question has to be: Do they support a way to /read/ the symbol
> with it's case preserved, because that's when the case conversion is
> done.
>
>
>>Although Common Lisp is a case-insensative language, it nevertheless would
>>be possible for an implementation to have an extension function that
>>returns a string representation of a symbol the way the symbol appeared in
>>the source code. To help illustrate what I am describing, here is part of a
>>session with CMUCL:
>>
>> * (symbol-name 'Hello)
>>
>> "HELLO"
>> *
>>
>>What I am looking for is a function foo where
>>
>> (foo 'Hello)

My macro would not work if wrapped around this, because the form 'hello
gets read before even the macroexpansion. But I could:

(setf (readtable-case *readtable*) :preserve)

and then (SYMBOL-NAME 'Hello) -> "Hello"

but I have to type symbol-name with caps, so that sucks.

What I do is use my macro to read XML, then write case sensitive code
with ||s:

(case tag-id (|BeginString|...)(|MsgType|....)

kt

--
http://tilton-technology.com

Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film

Your Project Here! http://alu.cliki.net/Industry%20Application

Thomas F. Burdick

unread,
Jan 17, 2004, 4:16:42 PM1/17/04
to
"William J. Lamar" <w...@NOSPAMverizon.net> writes:

> For those who are wondering what the use of this would be, here is one thing
> I plan to use it for:
>
> I belive that many people use Common Lisp for generating HTML code. Since
> the names of tags in HTML are not case-sensitive, case does not have to be
> dealt with when generating HTML. However, I am working on a Common Lisp
> library for generating C++ code. Since C++ indeed is case-sensitive, the
> case of C++ identifier names must be preserved. Currently, this means that
> all C++ identifier names in my Common Lisp code are quoted as strings.
> Since my Common Lisp code contains many C++ identifier names, I think that
> it would more readable if I could write the C++ identifier names as symbols
> (foo or 'foo instead of "foo").

You probably want to use a readtable-case of :invert. You can write
your CL code in lowercase, and write the names of C++ identifiers in
the normal way. Then, to get their name-strings, just uninvert the
symbol-name. Eg:

* (setf (readtable-case *readtable*) :invert)

:invert

* (defun uninvert (s)
(labels ((upperp (c)
(or (not (alpha-char-p c)) (upper-case-p c)))
(lowerp (c)
(or (not (alpha-char-p c)) (lower-case-p c))))
(cond
((every #'upperp s) (string-downcase s))
((every #'lowerp s) (string-upcase s))
(t s))))

uninvert

* (string 'lower)

"LOWER"
* (uninvert *)

"lower"
* (string 'UPPER)

"upper"
* (uninvert *)

"UPPER"
* (string 'camelCase)

"camelCase"
* (uninvert *)

"camelCase"

--
/|_ .-----------------------.
,' .\ / | No to Imperialist war |
,--' _,' | Wage class war! |
/ / `-----------------------'
( -. |
| ) |
(`-. '--.)
`. )----'

Pascal Bourguignon

unread,
Jan 17, 2004, 6:15:34 PM1/17/04
to
Kenny Tilton <kti...@nyc.rr.com> writes:
> My macro would not work if wrapped around this, because the form
> 'hello gets read before even the macroexpansion. But I could:
>
> (setf (readtable-case *readtable*) :preserve)
>
> and then (SYMBOL-NAME 'Hello) -> "Hello"
>
> but I have to type symbol-name with caps, so that sucks.

See following elisp. I'd write: (SYMBOL-NAME '|Hello|)

Otherwise, there's a nice key labeled caps-lock and I notice that, at
least with a QWERTY keyboard, I type exactly the same keys to type
symbols with caps. Perhaps, what would be missing is an emacs
function to exchange upper case and lower case in all key equivalents.


;; -*- mode: emacs-lisp -*-
;; Copyright Pascal J. Bourguignon 2003
;; GPL
;; ------------------------------------------------------------------------
;; upcase-lisp-region (start end)
;; upcase-lisp ()
;; downcase-lisp-region (start end)
;; downcase-lisp ()
;; ------------------------------------------------------------------------
;; Converting LISP symbols between COMMON-LISP and emacs
;; ie. converts to down-case or to up-case only the unescaped symbols.
;;

(defun case-lisp-region (start end transform)
"
DO: Applies transform on all subregions from start to end that are not
a quoted character, a quote symbol, a comment (;... or #|...|#),
or a string.
"
(save-excursion
(goto-char start)
(while (< (point) end)
(while (and (< (point) end) (looking-at "\\([^\"#|;\\\\]\\|#[^|]\\)+"))
(goto-char (match-end 0)))
(funcall transform start (point))
(cond
((looking-at "\\(\\\\.\\)") ;; \x quoted char (in symbol)
(goto-char (match-end 0)))
((looking-at "\\(;.*$\\)") ;; ;xxx comment
(goto-char (match-end 0)))
((looking-at "\\(|[^|]*|\\)") ;; |xxx| quoted symbol
(goto-char (match-end 0)))
((looking-at "\\(#|\\([^|]\\||[^#]\\)*|#\\)") ;; #|xxx|# comment
(goto-char (match-end 0)))
((looking-at "\"\\([^\\\\\"]\\|\\\\.\\|\\\\\n\\)*\"") ;; "xx\"x" strings.
(goto-char (match-end 0))))
(setq start (point))
);;while
) ;;save-excursion
);;case-lisp-region

(defun upcase-lisp-region (start end)
"
DO: From the start to end, converts to upcase all symbols.
Does not touch string literals, comments starting with ';' and
symbols quoted with '|' or with '\'.
"
(interactive "*r")
(case-lisp-region start end (function upcase-region))
(message "Upcase LISP Done.")
);;upcase-lisp-region


(defun upcase-lisp ()
"
DO: From the (point) to (point-max), converts to upcase all symbols.
Does not touch string literals, comments starting with ';' and
symbols quoted with '|' or with '\'.
"
(interactive "*")
(upcase-lisp-region (point) (point-max))
);;upcase-lisp


(defun downcase-lisp-region (start end)
"
DO: From the start to end, converts to low-case all symbols.
Does not touch string literals, comments starting with ';' and
symbols quoted with '|' or with '\'.
"
(interactive "*r")
(case-lisp-region start end (function downcase-region))
(message "Downcase LISP Done.")
);;downcase-lisp-region


(defun downcase-lisp ()
"
DO: From the (point) to (point-max), converts to lowcase all symbols.
Does not touch string literals, comments starting with ';' and
symbols quoted with '|' or with '\'.
"
(interactive "*")
(downcase-lisp-region (point) (point-max))
);;downcase-lisp


--
__Pascal_Bourguignon__ http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein
http://www.theadvocates.org/

Peter Seibel

unread,
Jan 17, 2004, 6:57:36 PM1/17/04
to
t...@famine.OCF.Berkeley.EDU (Thomas F. Burdick) writes:

> You probably want to use a readtable-case of :invert. You can write
> your CL code in lowercase, and write the names of C++ identifiers in
> the normal way. Then, to get their name-strings, just uninvert the
> symbol-name. Eg:
>
> * (setf (readtable-case *readtable*) :invert)
>
> :invert
>
> * (defun uninvert (s)
> (labels ((upperp (c)
> (or (not (alpha-char-p c)) (upper-case-p c)))
> (lowerp (c)
> (or (not (alpha-char-p c)) (lower-case-p c))))
> (cond
> ((every #'upperp s) (string-downcase s))
> ((every #'lowerp s) (string-upcase s))
> (t s))))


If one wanted to be pedantic about it, I think that ALPHA-CHAR-P
should be BOTH-CASE-P.

(defun my-uninvert (s)
(labels ((upperp (c)
(or (not (both-case-p c)) (upper-case-p c)))
(lowerp (c)
(or (not (both-case-p c)) (lower-case-p c))))


(cond
((every #'upperp s) (string-downcase s))
((every #'lowerp s) (string-upcase s))
(t s))))

At least in Allegro, :invert seems to cause the case of characters to
be inverted whenever all the characters *with-case* are the same case,
not just alpha chars. And in some character sets there are characters
that are alphabetic but not bi-case. E.g.:


CL-USER(193): (let* ((str (format nil "abc~def" #\%null))
(sym (prog2
(setf (readtable-case *readtable*) :invert)
(read-from-string str)
(setf (readtable-case *readtable*) :upcase))))
(format t "~a => ~a => ~a~%" str sym (uninvert (symbol-name sym))))
abc─ef => ABC─EF => ABC─EF ;; oops
NIL
CL-USER(194): (let* ((str (format nil "abc~def" #\%null))
(sym (prog2
(setf (readtable-case *readtable*) :invert)
(read-from-string str)
(setf (readtable-case *readtable*) :upcase))))
(format t "~a => ~a => ~a~%" str sym (my-uninvert (symbol-name sym))))
abc─ef => ABC─EF => abc─ef
NIL

--
Peter Seibel pe...@javamonkey.com

Lisp is the red pill. -- John Fraser, comp.lang.lisp

Erik Naggum

unread,
Jan 17, 2004, 8:22:22 PM1/17/04
to
* William J. Lamar

| Do any of the free-as-in-freedom Common Lisp implementations (CMUCL,
| SBCL, CLISP, GCL, etc.) support a way to get the name of a symbol with
| it's case preserved?

The route from source to symbol name involves case conversion, but the
route from symbol name to string does not. This is important if you
want to understand this admittedly complex issue, but it is complex
mostly because people do not pay attention to detail.

| Although Common Lisp is a case-insensitive language,

This is not true.

| it nevertheless would be possible for an implementation to have an
| extension function that returns a string representation of a symbol
| the way the symbol appeared in the source code.

No, this is not possible. Common Lisp does not retain the source from
which it builds its internal representation. This is actually crucial
to understand. Common Lisp is defined on the internal form, not on
the character sequence that is the source.

| To help illustrate what I am describing, here is part of a session
| with CMUCL:
|
| * (symbol-name 'Hello)
| "HELLO"
| *
|
| What I am looking for is a function foo where
|
| (foo 'Hello)
|

| would result in the string "Hello".

Now that you understand that the reader transforms a character source
into an internal representation and that you can never recover that
character source, you know that you need to ask for ways to retain the
case of the symbols that you read.

The first, which is the simplest and which requires minimal effort to
understand and use properly, is the multiple escape characters in the
reader. |Hello| is a symbol whose symbol-name is the exact character
sequence between the ||. When you use this notation, you communicate
intent and both human and machine readers of the source code will know
that case matters in this particular situation.

The second, as has been mentioned by others, is to use READTABLE-CASE
to modify the case conversion behavior of the Common Lisp reader.

| Currently, this means that all C++ identifier names in my Common
| Lisp code are quoted as strings.

That should make it easy use || instead of "", and you're on your way.

If you decide to investigate READTABLE-CASE, you wil run into a very
annoying problem: the case of Common Lisp symbols is upper-case, which
these days is useful mostly in news articles like this to make them
stand out from other words, so if you ask for :PRESERVE, you get to
shout a lot. It would have been easy to break with the past and
declare symbol names to be lower-case back when Common Lisp was
defined, but They chose not to, and breaking with the standard today
is not particularly smart. To accomodate those who wanted to write
their code in lower-case and still use case sensitive symbol names,
the :INVERT case mode was defined, and this works sufficiently well.
It may also be supported natively in an implementation that may choose
to represent symbol-names in lower-case internally to cut down on the
case conversion costs, which get more noticeable with larger character
sets.

The only central issue is what case COMMON-LISP:SYMBOL-NAME returns
and COMMON-LISP:INTERN (etc) takes, and the key to this issue is to
realize it is completely unrelated to the internal representation. An
implementation is free to offer its own |symbol-name| and |intern|
(etc) functions in a package that it might call |common-lisp|, which
reflects the internal representation, as long as it does the right
thing for COMMON-LISP:SYMBOL-NAME (etc). It might even decide to
offer its own |readtable-case| function that swaps the meaning of
:PRESERVE and :INVERT, and thus allow the external and internal case
to work smoothly and effortlessly together. It might even decide to
cache the result of applying a function INVERT-CASE to a symbol name
if it is requested or created via the standard functions so that the
performance penalty would be negligible. The problem is that :INVERT
makes a symbol in which all characters with case have the same case,
invert them all to the other case, while it leaves those that contain
one character with case in each case alone. I don't know the history
of this decision, but I know it was painful to several parties present
and I have no desire to re-open this wound, but let's look at what an
implementation that wants lower-case symbol names, a case sensitive
reader, and conformance to the standard would most intelligently do.
It would /not/ do the inversion trick except when user code asks for
the standard symbol name or creates a symbol through the standard
functions, which is actually a very rare thing. The important issue
to take away from this discussion is that Common Lisp standard does
not mandate that symbols are stored internally in any particular case;
the standard only mandates what various functions accept and return.

I think all modern Common Lisp implementations should optimize for the
lower-case, literal symbol and should treat the upper-case symbols as
a relic of the past that is supported via the standard functions but
bypassed when reading and writing source code and results. The switch
is easily accomplished by doing the :INVERT trick in the accessors to
symbol names first, and then gradually changing the calls to them all
through the source code. It will take a little while before the new
system outperforms the old system, but the end result will be vastly
less case conversion. Users can prepare and encourage the whole thing
by starting to do (setf (readtable-case ...) :invert) and the vendors
can prepare them for the future with a package |common-lisp| that has
variables and functions that invert the meaning of the case.

--
Erik Naggum | Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.

Rahul Jain

unread,
Jan 19, 2004, 12:00:36 AM1/19/04
to
"William J. Lamar" <w...@NOSPAMverizon.net> writes:

> However, I am working on a Common Lisp library for generating C++
> code. Since C++ indeed is case-sensitive, the case of C++ identifier
> names must be preserved.

I would do a translation of lisp symbol-names to C++ identifier names so
that the "native" naming conventions are kept in both
languages. For example, it would translate "FOO-BAR" to "FooBar".

--
Rahul Jain
rj...@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist

Hannah Schroeter

unread,
Jan 19, 2004, 8:18:42 AM1/19/04
to
Hello!

Kenny Tilton <kti...@nyc.rr.com> wrote:

>[...]

>My macro would not work if wrapped around this, because the form 'hello
>gets read before even the macroexpansion. But I could:

>(setf (readtable-case *readtable*) :preserve)

>and then (SYMBOL-NAME 'Hello) -> "Hello"

>but I have to type symbol-name with caps, so that sucks.

>What I do is use my macro to read XML, then write case sensitive code
>with ||s:

>(case tag-id (|BeginString|...)(|MsgType|....)

Or you want to use the readtable-case value :invert.

* (setf (readtable-case *readtable*) :invert)

:invert
* (symbol-name 'hello)

"HELLO"
* (symbol-name 'Hello)

"Hello"
* (symbol-name 'HELLO)

"hello"

Kind regards,

Hannah.

Kenny Tilton

unread,
Jan 19, 2004, 12:50:08 PM1/19/04
to

Hannah Schroeter wrote:

> Hello!
>
> Kenny Tilton <kti...@nyc.rr.com> wrote:
>
>
>>[...]
>
>
>>My macro would not work if wrapped around this, because the form 'hello
>>gets read before even the macroexpansion. But I could:
>
>
>>(setf (readtable-case *readtable*) :preserve)
>
>
>>and then (SYMBOL-NAME 'Hello) -> "Hello"
>
>
>>but I have to type symbol-name with caps, so that sucks.
>
>
>>What I do is use my macro to read XML, then write case sensitive code
>>with ||s:
>
>
>>(case tag-id (|BeginString|...)(|MsgType|....)
>
>
> Or you want to use the readtable-case value :invert.
>
> * (setf (readtable-case *readtable*) :invert)

Yes, thx, I picked that up along the way in this useful thread. I had
always wondered why on earth lipsniks grooved on :invert, and it is
great having the mystery resolved. I also find it funny to have a
language feature purely as a hack to let us type code in lower case.

:)

kenny

Pascal Bourguignon

unread,
Jan 19, 2004, 2:52:02 PM1/19/04
to
Erik Naggum <er...@naggum.no> writes:
> | Currently, this means that all C++ identifier names in my Common
> | Lisp code are quoted as strings.
>
> That should make it easy use || instead of "", and you're on your way.
>
> If you decide to investigate READTABLE-CASE, you wil run into a very
> annoying problem: the case of Common Lisp symbols is upper-case, which
> these days is useful mostly in news articles like this to make them
> stand out from other words, so if you ask for :PRESERVE, you get to
> shout a lot. It would have been easy to break with the past and
> declare symbol names to be lower-case back when Common Lisp was
> defined, but They chose not to, and breaking with the standard today
> is not particularly smart. To accomodate those who wanted to write
> their code in lower-case and still use case sensitive symbol names,
> the :INVERT case mode was defined, and this works sufficiently well.
> It may also be supported natively in an implementation that may choose
> to represent symbol-names in lower-case internally to cut down on the
> case conversion costs, which get more noticeable with larger character
> sets.

Silly suggestion:

(defpackage "common-lisp"
(:use)
(:export "car" "cdr" "cons" "nil" "t" ;; ...
))
(in-package "common-lisp")
(setf (symbol-function '|car|) (symbol-function 'COMMON-LISP:CAR))
(setf (symbol-function '|cdr|) (symbol-function 'COMMON-LISP:CDR))
(setf (symbol-function '|cons|) (symbol-function 'COMMON-LISP:CONS))
(define-symbol-macro |nil| COMMON-LISP:NIL)
(define-symbol-macro |t| COMMON-LISP:T)
;; ...

Then:

(defpackage "TEST"
(:use "common-lisp")
(:export "foo"))

(setf (readtable-case *readtable*) :preserve)
(defun foo (x) (symbol-name x))
(format t "~S~%" (foo 'Hello))


(The serrious suggestion was my previous message with my
lisp-upcase-region / lisp-downcase-region emacs commands).

Timothy Moore

unread,
Jan 19, 2004, 4:44:15 PM1/19/04
to
Pascal Bourguignon <sp...@thalassa.informatimago.com> writes:

> Erik Naggum <er...@naggum.no> writes:
> > If you decide to investigate READTABLE-CASE, you wil run into a very
> > annoying problem: the case of Common Lisp symbols is upper-case, which
> > these days is useful mostly in news articles like this to make them
> > stand out from other words, so if you ask for :PRESERVE, you get to
> > shout a lot. It would have been easy to break with the past and

> Silly suggestion:

>
> (defpackage "common-lisp"
> (:use)
> (:export "car" "cdr" "cons" "nil" "t" ;; ...
> ))
> (in-package "common-lisp")
> (setf (symbol-function '|car|) (symbol-function 'COMMON-LISP:CAR))
> (setf (symbol-function '|cdr|) (symbol-function 'COMMON-LISP:CDR))
> (setf (symbol-function '|cons|) (symbol-function 'COMMON-LISP:CONS))
> (define-symbol-macro |nil| COMMON-LISP:NIL)
> (define-symbol-macro |t| COMMON-LISP:T)
> ;; ...
>
>

> (defpackage "TEST"
> (:use "common-lisp")
> (:export "foo"))
>
> (setf (readtable-case *readtable*) :preserve)
> (defun foo (x) (symbol-name x))
> (format t "~S~%" (foo 'Hello))

This doesn't work for special forms. Too bad.

Tim

Pascal Bourguignon

unread,
Jan 19, 2004, 6:07:14 PM1/19/04
to
Timothy Moore <mo...@trousse.labri.fr> writes:

> Pascal Bourguignon <sp...@thalassa.informatimago.com> writes:
>
> > Erik Naggum <er...@naggum.no> writes:
> > > If you decide to investigate READTABLE-CASE, you wil run into a very
> > > annoying problem: the case of Common Lisp symbols is upper-case, which
> > > these days is useful mostly in news articles like this to make them
> > > stand out from other words, so if you ask for :PRESERVE, you get to
> > > shout a lot. It would have been easy to break with the past and
>
> > Silly suggestion:
> >
> > (defpackage "common-lisp"

> > [...]


> This doesn't work for special forms. Too bad.

Yes it does because you can always cover a special form with a macro.
All special forms can be implemented by a macro the standard says.

Erik Naggum

unread,
Jan 19, 2004, 9:47:41 PM1/19/04
to
* Pascal Bourguignon
| Silly suggestion:

Symbol identity is a really great thing. How a symbol reads and
prints is not part of this identity.

Pascal Bourguignon

unread,
Jan 20, 2004, 4:42:52 PM1/20/04
to
Erik Naggum <er...@naggum.no> writes:

> * Pascal Bourguignon
> | Silly suggestion:
>
> Symbol identity is a really great thing. How a symbol reads and
> prints is not part of this identity.

Yes, but I was hinting to another facility of COMMON-LISP and its
packages. Namely, the possibility to define another kind of lisp,
encapsulated into a package, and have client packages use that new
"lisp" package instead of "COMMON-LISP". (I note that genera had the
same facility, I saw some demo where the user selected the
"FUTURE-COMMON-LISP" package instead of some other lisp). So, the
silly "common-lisp" package I proposed is clearly NOT "COMMON-LISP".
It's a new lisp where the pre-defined symbols are low-case instead of
the COMMON-LISP standard of upper case. (Granted, there is the
difficulty of T, and other functions that would still return T instead
of t in a naive implementation, and some part of the printer/reader
would have to be modified too, but this could be taken care of. We
have the same problem when wanting to handle scheme #t/#f/nil mess).

0 new messages