
About ''"CooL": low-level macros considered useful


ol...@pobox.com

Mar 28, 2001, 7:32:29 PM

According to R5RS, symbols created by string->symbol, e.g.,
(string->symbol "ASymbol")
retain their case, while symbols 'read' or entered literally
(with-input-from-string "ASymbol" read)
'ASymbol
may get their case changed on many Scheme systems. Therefore,
(eq? (string->symbol "ASymbol") 'ASymbol)

is #f on many Scheme systems, e.g., on SCM (which downcases all
literal symbols) and Bigloo (which uppercases them). When developing a
test suite for a SSAX XML parser, I needed a portable and _concise_
way of entering _case-sensitive_ symbols. Again, I needed that way
only for validation self-tests, which are always enclosed within a
special form run-test:
(run-test (test1) (test2) ...)
If a user wants to run the self-tests, he declares run-test as
(define-macro run-test (lambda body `(begin (display "\n-->Test\n") ,@body)))

Otherwise, he defines run-test as
(define-macro run-test (lambda body '(begin #f)))

which effectively switches all the tests off. This fortuitous
circumstance suggested that run-test can do a bit more than just
expanding into a begin form. It can be used to enable truly portable
and truly concise case-sensitive symbols.

Therefore, we introduce a notation '"ASymbol" (a quoted string) that
stands for a case-_sensitive_ ASymbol -- on any R5RS Scheme system
with a low-level macro system. This notation is valid only within the
body of run-test.

The notation is implemented by scanning the run-test's body and
replacing every occurrence of (quote "str") with the result of
(string->symbol "str").

(define-macro run-test
  (lambda body
    (define (re-write body)
      (cond
        ((vector? body)
         (list->vector (re-write (vector->list body))))
        ((not (pair? body)) body)
        ((and (eq? 'quote (car body)) (pair? (cdr body))
              (string? (cadr body)))
         (string->symbol (cadr body)))
        (else (cons (re-write (car body)) (re-write (cdr body))))))
    (cons 'begin (re-write body))))
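
To see what the traversal produces without relying on define-macro, the same re-write can be run as an ordinary procedure over quoted data. This is only a sketch: the names expanded and inner are introduced here for illustration and are not part of SSAX.

```scheme
; The same traversal as in the macro, lifted out as a plain procedure
; so the rewritten form can be inspected directly.
(define (re-write body)
  (cond
    ((vector? body)
     (list->vector (re-write (vector->list body))))
    ((not (pair? body)) body)
    ((and (eq? 'quote (car body)) (pair? (cdr body))
          (string? (cadr body)))
     (string->symbol (cadr body)))
    (else (cons (re-write (car body)) (re-write (cdr body))))))

; '"ASymbol" reads as (quote "ASymbol") and is replaced by the bare
; symbol; ''"ASymbol" therefore keeps one level of quoting around it.
(define expanded (re-write '((symbol? ''"ASymbol"))))
(define inner (cadr (car expanded)))   ; the form (quote <symbol>)

; Print the symbol's name through symbol->string so the output does
; not depend on how a given system writes mixed-case symbols.
(display (symbol->string (cadr inner)))
(newline)
```

The printed name is ASymbol with its case intact, because the symbol was made by string->symbol and never passed through the reader.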


It must be stressed that '"ASymbol" behaves truly like a Scheme symbol
with its case preserved: the operation (string->symbol "ASymbol") is
performed at _macro-expand_ time rather than at run time. The running
code sees no quotes and applications where '"ASymbol" used to appear:
the running code sees a true symbol -- a literal. Thus '"ASymbol" can
be used within a case statement in positions where only literal values
are allowed (see below for an example). We must also stress that the
above source code transformation can only be effected by a low-level
macro facility. High-level (aka "portable" aka R5RS) macros _cannot_
express this transformation. By design, syntax-rules prohibits the
manufacturing of symbols and identifiers: otherwise, it would be
impossible to guarantee hygiene.

Examples

(run-test
  (and
    (symbol? ''"ASymbol")
    (symbol? (car '('"ASymbol")))
    (eq? (string->symbol "ASymbol") ''"ASymbol"))
  (case (string->symbol "ASymbol")
    (('"ASymbol") #t) (else #f))
)

returns #t on Gambit, SCM and Bigloo. Notice a curious notation:
''"ASymbol": a double-quote following double quotes.


SSAX.scm source code
http://pobox.com/~oleg/ftp/Scheme/SSAX.scm
shows many more examples, e.g.,

(run-test
  ; Definition of
  ; test:: XML-string * doctype-defn * expected-SXML-term -> void
  ; elided

  (test "<BR/>" dummy-doctype-fn '(('"BR")))

  (test "<!DOCTYPE T SYSTEM 'system1' ><!-- comment -->\n<T/>"
    (lambda (elem-gi seed) (assert (equal? elem-gi ''"T"))
      (values #f '() '() seed))
    '(('"T")))
)


This exercise gave me a new appreciation for an ability of Scheme to
manipulate its own code as data. Whereas compiled Scheme code is
most often anything but S-expressions, the code seen by a
macro-expander is certainly in the form of S-expressions.


--
Posted from cs.nps.navy.mil [131.120.10.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

brl...@my-deja.com

Mar 29, 2001, 9:19:13 AM
IMHO, the requirement that symbols be case-insensitive is not useful
today. Changing the spec to case sensitivity would be a somewhat risky
proposition, but I bet the amount of code that would break would be
quite small.

Rob Warnock

Mar 30, 2001, 7:34:23 PM

Biep (http://www.biep.org)

Apr 5, 2001, 3:53:38 AM

I consider case sensitivity evil, at least for variable names. :-)

It doesn't really gain you anything (apart from (1) interfacing to other
case sensitive systems and (2) more short names - but in the latter case,
switching to full Unicode will give you even more..).

On the other hand, it is the source of an enormous number of bugs and
misunderstandings, because things that superficially seem to be the same
aren't.

Here, I think, is a rare case where Microsoft did the right thing: case
retention. In e.g. Visual Basic, the declaration of a variable defines its
upper case/lower case pattern, and all other occurrences are automatically
coerced.

--
Biep
Reply via http://www.biep.org


Will Clinger - Sun Microsystems

Apr 5, 2001, 10:50:41 AM
Biep wrote:
> Here, I think, is a rare case where Microsoft did the right thing: case
> retention.

Yet another case in which Apple did the right thing and Microsoft
eventually copied it.

Will

John Tobey

Apr 6, 2001, 9:35:23 PM
biep writes:

> i consider case sensitivity evil, at least for variable names. :-)

I'm sorry you feel that way. In my experience, case INsensitivity
just always leads to headaches.

> it doesn't really gain you anything (apart from (1) interfacing to other


> case sensitive systems and (2) more short names

It makes languages easier to implement.

> - but in the latter case,

> switching to full unicode will give you even more..).

More headaches, that is, when you have to decide whether &aacute;
equals &Aacute;. If it is in Unicode or iso-latin-1, what about the
Macintosh encoding or EBCDIC? Suddenly, interoperability gains a
whole new dimension of difficulty.

> on the other hand, it is the source of an enormous number of bugs and


> misunderstandings, because things that superficially seem to be the same
> aren't.

Superficially the same? I don't have much trouble telling h from H.
At what age did you start learning the rules for capitalization?
Probably before you programmed a computer. On the other hand, your
computer has never understood the peculiar relationship between the
cases. Probably neither has anyone whose native alphabet lacks the
distinction.

The fact that there are case-insensitive and case-sensitive languages
is probably the real source of bugs. Let's ditch the former.

> here, i think, is a rare case where microsoft did the right thing: case
> retention. in e.g. visual basic, the declaration of a variable defines its


> upper case/lower case pattern, and all other occurrences are automatically
> coerced.

Good point. So let's write editors that default to case retention for
newbies, but please, please stop putting case insensitivity into
language specs. Thank goodness Scheme has string->symbol.

Cheers.
-John

--
John Tobey, late nite hacker <jto...@john-edwin-tobey.org>
\\\ ///
]]] With enough bugs, all eyes are shallow. [[[
/// \\\

Hartmann Schaffer

Apr 6, 2001, 10:41:53 PM
In article <m3g0flm...@feynman.localnet>, John Tobey wrote:
> ...

>The fact that there are case-insensitive and case-sensitive languages
>is probably the real source of bugs. Let's ditch the former.

with the first programming languages this wasn't an issue at all, because
pretty much all available equipment had only one case. when the first
terminals that handled both cases came out, the relative frequency of the
equipment probably made it a good idea to fold everything into upper case.

when the case distinction capable equipment became more commonplace, some
people were so used to everything being uppercase that they insisted on
casefolding to uppercase because that's what they were used to, while other
people remembered the beauty of mathematics where case and font distinction
is used quite heavily

> ...

hs

kl

Apr 7, 2001, 12:01:59 AM

In comp.lang.scheme Biep wrote:

> I consider case sensitivity evil, at least for variable names. :-)
>
> It doesn't really gain you anything (apart from (1) interfacing to other
> case sensitive systems and (2) more short names - but in the latter case,

IMHO, just (1) is enough.
Say, DSSSL is case sensitive, let alone XML.
This effectively prevents practical use of case-insensitive Schemes for a
wide range of real-life applications.

> misunderstandings, because things that superficially seem to be the same
> aren't.

In the opposite case, things that seem to be different are the same.
Is it any better?

Best regards,
Kirill Lisovsky.

-----
email: lisovsky at acm dot org

jmar...@alum.mit.edu

Apr 8, 2001, 3:48:03 PM
John Tobey <jto...@john-edwin-tobey.org> writes:

> biep writes:
>
> > i consider case sensitivity evil, at least for variable names. :-)
>
> I'm sorry you feel that way. In my experience, case INsensitivity
> just always leads to headaches.

I don't understand this sentence. Did you mean case Insensitivity, or
Case Insensitivity, or CASE INSENSITIVITY, or case insensitivity,
or what, exactly?

> > it doesn't really gain you anything (apart from (1) interfacing to other
> > case sensitive systems and (2) more short names
>
> It makes languages easier to implement.

The effort required to fold case when interning identifiers is
trivial.
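
As a rough illustration of how little is involved, here is a sketch of interning with case folding. The procedure name intern-folded is hypothetical, and a real reader would do the folding while scanning characters rather than on a finished string.

```scheme
; Hypothetical helper: intern an identifier name the way a
; case-folding reader might, by downcasing every character before
; handing the name to string->symbol.
(define (intern-folded name)
  (string->symbol
    (list->string (map char-downcase (string->list name)))))

; Both spellings fold to the same name, so they intern to the
; same symbol.
(define same?
  (eq? (intern-folded "FooBar") (intern-folded "FOOBAR")))
```

Here same? is true on any R5RS system, since string->symbol returns identical symbols for identical names.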

> Good point. So let's write editors that default to case retention for
> newbies, but please, please stop putting case insensitivity into
> language specs. Thank goodness Scheme has string->symbol.

I would ask that people make languages case insensitive (vis-a-vis
identifiers) so we don't have IdioticIdentifierNamesThatAreHardToRead
and identifiersThatAreEasilyConfused with
IdentifiersThatAreEasilyConfused.

Biep @ http://www.biep.org

Apr 9, 2001, 4:01:23 AM
I wrote:
> Here, I think, is a rare case where Microsoft did the right thing: case
> retention.

"Will Clinger - Sun Microsystems" <william...@east.sun.com> replied in
message news:3ACC8641...@east.sun.com...


> Yet another case in which Apple did the right thing and Microsoft
> eventually copied it.

Aha! I already felt uneasy about my conclusion.. :-)

Kirill Lisovsky

Apr 9, 2001, 12:33:03 AM
jmar...@alum.mit.edu wrote:

> I would ask that people make languages case insensitive (vis-a-vis
> identifiers) so we don't have IdioticIdentifierNamesThatAreHardToRead
> and identifiersThatAreEasilyConfused with
> IdentifiersThatAreEasilyConfused.

I (personally) agree that idiotic-identifier-names-that-are-hard-to-read
is better, but is it a good name?
IMHO, any identifier that is 30+ characters long is not suitable for
human reading.

Anyway, this is a matter of taste, whereas case-insensitive symbols may
be a cause of very material problems:

From R5RS:

Identifiers have two uses within Scheme programs:
* Any identifier may be used as a variable or as a syntactic keyword
(see sections 3.1 and 4.3).
* When an identifier appears as a literal or within a literal (see
section 4.1.2), it is being used to denote a symbol (see section 6.3.3).

Symbols are useful for many other applications; for instance, they
may be used the way enumerated values are used in Pascal.

It is necessary to take into account this second role of identifiers.

If you are using symbols to represent Unix file names, XML tags,
passwords and user names, command-line options, etc., and your Scheme
is case-insensitive, you are in trouble.

Just recall the reason for the initial posting in this thread: Oleg has a
very practical problem due to the case-insensitivity of some Schemes.
Ah, yes, his trick works fine for that particular problem, but
a lot of similar problems exist...

If we are interested in practical applications programmed in Scheme, we
have to provide a more natural solution for such a trivial problem.
I think that case-sensitive Schemes provide the most natural solution
possible.

Biep @ http://www.biep.org

Apr 9, 2001, 5:23:03 AM
"John Tobey" <jto...@john-edwin-tobey.org> wrote in message
news:m3g0flm...@feynman.localnet...
> [Case sensitivity] makes languages easier to implement.

Hardly.

> [If you want lots of different short identifiers], switching to full
> unicode will give you even more..).
>
> More headaches, that is, when you have to decide whether &aacute; equals
> &Aacute;.
> If it is in Unicode or iso-latin-1, what about the Macintosh encoding or
> EBCDIC?
> Suddenly, interoperability gains a whole new dimension of difficulty.

Thanks for elaborating my point.

> Superficially the same? I don't have much trouble telling h from H.

At the beginning of a sentence, for instance. Capitalisation is about
sentence structure, not about words. People are taught (indoctrinated if
you prefer) to see the two as equal, i.e. mapping a sentence to upper case
changes the prosodics ("yelling"), not the meaning of the words. (And yes,
I realise there is the exception of proper names..)

> At what age did you start learning the rules for capitalization?

There you are. If they weren't "the same" in a very fundamental sense,
there would be no such things as "rules of capitalisation". Did you ever
learn "rules of voicing"? No, because voiced and unvoiced consonants are
fundamentally different in English (Phoneticians: yes, I know there are a
few exceptions..)

> Probably before you programmed a computer. On the other hand, your
> computer has never understood the peculiar relationship between the cases.

So why should I adapt to the computer instead of the other way around?

> The fact that there are case-insensitive and case-sensitive languages
> is probably the real source of bugs.

No, the fact that the Latin alphabet has (say) 26 letters, which come in two
forms each, rather than 52 letters. That is a subtlety a computer needs to
be taught. Designers trying to avoid that effort are the main source; the
fact that only some designers do this obviously compounds it because of the
resulting confusion, but that is true for every rule, up to sticking to the
proper side of the road.

Would you want italic letters to be distinct, too? And bold-faced ones?
And underlined ones?

Rob Warnock

Apr 9, 2001, 5:22:32 AM
Kirill Lisovsky <liso...@acm.org> wrote:
+---------------

| I think that case-sensitive Schemes provide most natural solution
| possible.
+---------------

Actually, rather than be forced to choose one or the other exclusively,
I kinda like the way MzScheme handles it, with a "parameter" that can be
set or cleared as desired, e.g.:

> (begin
    (define foo (read))
    (define bar
      (parameterize ((read-case-sensitive #t)) ; sorta like fluid-let
        (read)))
    (define baz (read)))
FooFoo BarBar BazBaz ; hand-typed input
> (list foo bar baz)
(foofoo BarBar bazbaz)
>

MzScheme's parameters are thread-specific, so that if you're writing a
multi-threaded app (e.g., a web server or other multi-user interaction)
you can tailor case sensitivity without inter-thread interference.


-Rob

-----
Rob Warnock, 31-2-510 rp...@sgi.com
SGI Network Engineering <URL:http://reality.sgi.com/rpw3/>
1600 Amphitheatre Pkwy. Phone: 650-933-1673
Mountain View, CA 94043 PP-ASEL-IA

Kirill Lisovsky

Apr 9, 2001, 2:44:27 AM
Rob Warnock wrote:

> Actually, rather than be forced to choose one or the other exclusively,
> I kinda like the way MzScheme handles it, with a "parameter" that can be
> set or cleared as desired, e.g.:
>
> > (begin
> (define foo (read))
> (define bar
> (parameterize ((read-case-sensitive #t)) ; sorta like fluid-let
> (read)))
> (define baz (read)))
> FooFoo BarBar BazBaz ; hand-typed input
> > (list foo bar baz)
> (foofoo BarBar bazbaz)
> >
> MzScheme's parameters are thread-specific, so that if you're writing a
> multi-threaded app (e.g., a web server or other multi-user interaction)
> you can tailor case sensitivity without inter-thread interference.

This feature is nice, but this code is not very portable across the
different Scheme implementations.
Fortunately, MzScheme's "-g" switch provides another opportunity to make
your choice.
It is not so elegant, but it resolves many problems...

Best regards,
Kirill Lisovsky.

brl...@my-deja.com

Apr 9, 2001, 8:54:56 AM
John Tobey <jto...@john-edwin-tobey.org> writes:

> Superficially the same? I don't have much trouble telling h from H.
> At what age did you start learning the rules for capitalization?

Uh oh. When I suggested switching to case sensitivity might be
harmless, I did so on the assumption that people wouldn't radically
change existing practice, i.e. lower case just about everywhere.
Occasionally you see all-cap variables, but they stay consistent.

I would be against making heavy use of case sensitivity. My eyes can
tell h from H easily, but my memory cannot.

ol...@pobox.com

Apr 9, 2001, 7:25:06 PM

> ... Capitalisation is about

> sentence structure, not about words. People are taught (indoctrinated if
> you prefer) to see the two as equal, i.e. mapping a sentence to upper case
> changes the prosodics ("yelling"), not the meaning of the words. (And yes,
> I realise there is the exception of proper names..)

I take it you're not German. In German, case matters a great deal:
(das) Vorliegen is different from vorliegen, Entscheiden (decision
making) is different from entscheiden (to determine), Sein (being) is
different from sein (to be). Even in English capitalization can change
the meaning of a word dramatically: e.g., John vs. john, or, close to
the topic at hand, SOAP vs. soap, or even Scheme vs. scheme (the
verb).

Case of identifiers has semantic and even syntactic significance in
formal languages as well. In Haskell, capitalized identifiers name
types and type constructors whereas uncapitalized identifiers are
keywords or the names of values. Capitalization matters in OCaml too.
Call it a Germanic influence.

These linguistic excursions appear however beside the point. This
discussion didn't start with a goal to tell anyone how to make
confusing and hard-to-read identifier names. Everybody has his own
rules -- such as hyphenation, underscoring, or odd capitalization --
which he will continue to follow. The point is that there are
legitimate applications for which case-sensitive identifiers or
symbols are all but required. One such application is an
S-expression-based form of XML. PLT XML collections, SXML and all
other similar projects map tag names to identifiers. It is highly
appropriate as tag names aren't usually mutable but heavily used in
identity comparisons. According to DSSSL, upper- and lower-case forms
of a letter are always distinguished.

Personally, I'd like to see

(a) A gentleman's agreement on case-sensibility, which means:
(i) If I use Foo to name a value, I ought to refer to this value
by Foo rather than by foo, or by FOO or by fOO.
(ii) Avoid identifiers that are distinguished solely by
their case

(b) A command-line flag, a pragma declaration, a parameter or some
other switch to make every compiler or interpreter preserve the case
of identifiers, in lookups and i/o. Case-sensitivity does not have to
be the default behavior -- a mere option suffices. Given the
gentleman's agreement above, setting the case-sensitivity option will
not break the existing code. A command-line flag will be nearly
ideal. A pragma similar to MzScheme's 'parameter' will work too: I can
always isolate such platform-specific pragmas in cond-expand
blocks. The case-sensitivity switch does not have to be the same for
all compilers -- it merely ought to exist.

The gentleman's agreement can be reinforced by a Scheme system that
signals an error whenever
(and (string-ci=? (symbol->string x) (symbol->string y))
(not (string=? (symbol->string x) (symbol->string y))))
for any two identifiers x and y.
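
A minimal sketch of such a check, written as a library procedure over a list of symbols rather than as part of any actual Scheme system. The names case-clash?, find-case-clash and names are made up for this illustration, and the quadratic scan is chosen for brevity, not speed.

```scheme
; True when two symbols are spelled alike except for case: exactly
; the condition given above.
(define (case-clash? x y)
  (and (string-ci=? (symbol->string x) (symbol->string y))
       (not (string=? (symbol->string x) (symbol->string y)))))

; Return the first clashing pair (x . y) found in the list syms,
; or #f when no two symbols differ only by case.
(define (find-case-clash syms)
  (let outer ((rest syms))
    (if (null? rest)
        #f
        (let inner ((others (cdr rest)))
          (cond
            ((null? others) (outer (cdr rest)))
            ((case-clash? (car rest) (car others))
             (cons (car rest) (car others)))
            (else (inner (cdr others))))))))

; The symbols are built with string->symbol so that the example
; behaves the same on case-folding and case-preserving readers.
(define names
  (map string->symbol '("Foo" "bar" "foo" "Baz")))
(define clash (find-case-clash names))
```

With the sample list, clash holds the pair of symbols named Foo and foo; a system enforcing the agreement would signal an error at that point instead of returning the pair.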


--
Posted from www.cs.nps.navy.mil [131.120.10.2]

Hartmann Schaffer

Apr 9, 2001, 8:59:59 PM

if you use consistent conventions about what you capitalize and what not.
it's quite common in mathematics (even more so: fonts, script styles, etc)

hs

jmar...@alum.mit.edu

Apr 9, 2001, 10:07:05 PM
Kirill Lisovsky <liso...@acm.org> writes:

> Anyway, this is a matter of taste, whereas case-insensitive symbol may
> be a reason of very material problems:

[elided]

I agree that there are places where case sensitivity is the
appropriate solution to a particular problem. However, I believe that
most uses of case sensitivity are not to address these problems. I
believe that using case sensitivity as a sort of `microsyntax'
violates abstraction and leads to errors.

My preference is that identifiers in a program are, by default,
matched in a case-insensitive manner. I also prefer that a language
provide a mechanism by which I can create identifiers and symbols that
are *not* case folded, and I would prefer being able to choose this
rather than being forced into one or the other.

On the few occasions where I have wanted to use symbols or identifiers
where the case was important, it was easy to do: |FooBar|
You can even put whitespace in an identifier with this mechanism.

Kirill Lisovsky

Apr 10, 2001, 12:14:17 AM
jmar...@alum.mit.edu wrote:

Relying on the difference between Variable and VARIABLE is bad practice;
I completely agree here.

> On the few occasions where I have wanted to use symbols or identifiers
> where the case was important, it was easy to do: |FooBar|

1. This notation is neither standard nor portable.
By the different Schemes '|FooBar| will be evaluated to:
FooBar - bigloo, Gambit
foobar - mzscheme
|FooBar| - guile, Chez, kawa, chicken
Is it a victory in the war against confusing identifiers?

2. Say, XPath query
((sxpath '(Order Day @ Month)) data-tree)
has to be rewritten as
((sxpath '(|Order| |Day| @ |Month|)) data-tree)
Is it more readable?

So, in a vain attempt to prevent somebody from the use of ugly
identifiers we have code readability and portability sacrificed.

Best regards,
Kirill Lisovsky

David Rush

Apr 10, 2001, 4:22:45 AM
"Biep @ http://www.biep.org" <repl...@my.webpage.com> writes:
> So why should I adapt to the computer instead of the other way around?

Because you're smarter than your computer?

david rush
--
In a tight spot, you trust your ship or your rifle to get you through,
so you refer to her affectionately and with respect. Your computer? It
would just as soon reboot YOU if it could. Nasty, unreliable,
ungrateful wretches, they are. -- Mike Jackmin (on sci.crypt)

Biep @ http://www.biep.org

Apr 10, 2001, 5:23:21 AM
<ol...@pobox.com> wrote in message
news:2001040923...@adric.cs.nps.navy.mil...

> I take it you're not German. In German, case matters a great deal:
> (das) Vorliegen is different from vorliegen, Entscheiden (decision
> making) is different from entscheiden (to determine), Sein (being) is
> different from sein (to be). Even in English capitalization can change
> the meaning of a word dramatically: e.g., John vs. john, or, close to
> the topic at hand, SOAP vs. soap, or even Scheme vs. scheme (the
> verb).

I KNEW somebody would bring up German, acronyms and the lot. In English
there are even cases where italics change the meaning (by indicating the word
is not to be taken as a "native" English word, but as e.g. Latin).

BTW, in German the capitalisation you refer to indicates "take this as a
noun", which still is supralexical (if less so than sentence
capitalisation). And if a non-noun starts a sentence, it is still
capitalised. And a sentence all in uppercase is still a legal German
sentence..

There is a lot of interesting potential analysis here, but it is not about
Scheme, so I'll leave it.

BTW, I LIKE some ways in which capitalisation is used in certain Programming
languages, e. g. marking variables in Prolog. (Oh, and a German Prolog I
used to work with had turned the convention around: capitals for constants,
lower case for variables. Went less against their linguistic intuition!)

Rob Warnock

Apr 10, 2001, 7:05:36 AM
<ol...@pobox.com> wrote:
+---------------

| Personally, I'd like to see
...

| (b) A command-line flag, a pragma declaration, a parameter or some
| other switch to make every compiler or interpreter preserve the case
| of identifiers, in lookups and i/o. Case-sensitivity does not have to
| be the default behavior -- a mere option suffices. Given the
| gentleman's agreement above, setting the case-sensitivity option will
| not break the existing code. A command-line flag will be nearly
| ideal. A pragma similar to MzScheme's 'parameter' will work too...
+---------------

My apologies for not mentioning it in my previous article about
MzScheme's "read-case-sensitive" parameter, but MzScheme *also*
provides a command-line option "-g" [also "--case-sens"] to set
the default parameter value to #t. (Happy?)

Biep @ http://www.biep.org

Apr 10, 2001, 10:07:36 AM
"Biep @ http://www.biep.org" <repl...@my.webpage.com> wrote:
> So why should I adapt to the computer instead of the other way around?


"David Rush" <ku...@bellsouth.net> replied
news:okf7l0u...@bellsouth.net...


> Because you're smarter than your computer?

Yes, but as I am smart enough to make my computer smart enough to have it
adapt to me (in this respect)..
Or is there gain in keeping computing systems dumber than they need be?

Hartmann Schaffer

Apr 10, 2001, 8:00:18 PM
In article <9aujc0$6vt3b$1...@ID-63952.news.dfncis.de>,
Biep @ http://www.biep.org wrote:
> ...

>BTW, in German the capitalisation you refer to indicates "take this as a
>noun", which still is supralexical (if less so than sentence
>capitalisation). And if a non-noun starts a sentence, it is still
>capitalised. And a sentence all in uppercase is still a legal German
>sentence..

i guess german capitalization serves the same purpose as english spelling:
to let a few people be proud at being able to demonstrate how well
educated they are (this isn't completely meant as a joke: i have heard
this argument used in all seriousness for both languages)

hs

bitd...@hotmail.com

Apr 11, 2001, 1:55:20 AM
Kirill Lisovsky <liso...@acm.org> writes:

> jmar...@alum.mit.edu wrote:
>
> To use the difference between Variable and VARIABLE is a bad practice,
> I'm
> completely agree here.
>
> > On the few occasions where I have wanted to use symbols or identifiers
> > where the case was important, it was easy to do: |FooBar|
>
> 1. This notation is neither standard nor portable.

True, but it is a reasonably common convention.

> By the different Schemes '|FooBar| will be evaluated to:
> FooBar - bigloo, Gambit
> foobar - mzscheme
> |FooBar| - guile, Chez, kawa, chicken
> Is it a victory in the war against confusing identifiers?

It is unclear to me whether the latter 4 and the first 2 differ. The
latter 4 *appear* to be attempting to print readably, the first two may
not be. If this *is* the difference, then mzscheme is the odd man
out.

> 2. Say, XPath query
> ((sxpath '(Order Day @ Month)) data-tree)
> has to be rewritten as
> ((sxpath '(|Order| |Day| @ |Month|)) data-tree)
> Is it more readable?

It depends on the application. If this were the only use in a
program, I'd rather do '(|Order| ...) But if I had an application
that makes heavy use of case-sensitive identifiers, I would want to
turn on that mode.

> So, in a vain attempt to prevent somebody from the use of ugly
> identifiers we have code readability and portability sacrificed.

People can always write ugly identifiers: iamanuglyidentifier
whether case sensitive or not. But the trend these days is to use
case conventions to encode micro-syntax, and that is horrible.

Biep @ http://www.biep.org

Apr 11, 2001, 3:44:54 AM
"Hartmann Schaffer" <h...@paradise.nirvananet> wrote in message
news:slrn9d77...@paradise.nirvananet...

> i guess german capitalization serves the same purpose as english spelling:
> to let a few people be proud at being able to demonstrate how well
> educated they are (this isn't completely meant as a joke: i have heard
> this argument used in all seriousness for both languages)

I don't think so. Capitalising nouns is so standard (like capitalising
proper names in English, but it occurs a lot more), that I cannot believe
there are many Germans who WOULDN'T do it. In fact, it is not uncommon to
catch a German doing it while writing a foreign language.

BTW, German spelling has been simplified drastically a few years ago, from
hundreds of rules to just over fifty, I think, but the "look 'n' feel"
hasn't changed. But if your name is any indication you probably know much
more about that than I do.

David Rush

Apr 11, 2001, 4:31:43 AM
"Biep @ http://www.biep.org" <repl...@my.webpage.com> writes:
> "Biep @ http://www.biep.org" <repl...@my.webpage.com> wrote:
> > So why should I adapt to the computer instead of the other way around?

> "David Rush" <ku...@bellsouth.net> replied


> > Because you're smarter than your computer?

> is there gain in keeping computing systems dumber than they need be?

But in fact this (case-insensitivity) makes the computer (compiler
actually) less able to discriminate between different forms.

If you really want the computer to adapt to you, dive into natural
language processing. I hear that there's an ISO WG for DWIM
standardization ;)

david rush
--
"From the start, when it was the instrument of the wood-god Pan, the
flute has been associated with pure (some might say impure)
energy. Its sound releases something naturally untamed, as if a
squirrel were let loose in a church." --Seamus Heaney

Kirill Lisovsky

Apr 11, 2001, 11:34:14 AM

On Wed, 11 Apr 2001 05:55:20 GMT, bitd...@hotmail.com wrote:

>> By the different Schemes '|FooBar| will be evaluated to:
>> FooBar - bigloo, Gambit
>> foobar - mzscheme
>> |FooBar| - guile, Chez, kawa, chicken
>> Is it a victory in the war against confusing identifiers?
>
>It is unclear to me whether the latter 4 and the first 2 differ. The
>latter 4 *appear* to attempting to print readably, the first two may
>not be. If this *is* the difference, then mzscheme is the odd man
>out.

Not quite.

(display (symbol->string '|FooBar|)) evaluates to:
FooBar - Chez, chicken (I guess this is what you mean)
|FooBar| - guile, kawa, scsh

... let alone MIT Scheme, SXM and RScheme, where '|Foo| is an error!

So, '|Foo| notation is far away from a de-facto standard.

>> So, in a vain attempt to prevent somebody from the use of ugly
>> identifiers we have code readability and portability sacrificed.
>
>People can always write ugly identifiers: iamanuglyidentifier
>whether case sensitive or not. But the trend these days is to use
>case conventions to encode micro-syntax, and that is horrible.

I'm against this trend, but I believe that a restriction such as
case-insensitivity cannot enforce a good naming convention.

Just consider two different identifiers: O0 and OO.
IMHO, they are very confusing.
If we disallow numerals in identifiers, such confusion becomes
impossible, but is that an adequate solution?

I think that in all these cases the game is not worth the candle.

Best regards,
Kirill Lisovsky.

Marc Feeley

Apr 11, 2001, 2:52:07 PM
> > By the different Schemes '|FooBar| will be evaluated to:
> > FooBar - bigloo, Gambit
> > foobar - mzscheme
> > |FooBar| - guile, Chez, kawa, chicken
> > Is it a victory in the war against confusing identifiers?
>
> It is unclear to me whether the latter 4 and the first 2 differ. The
> latter 4 *appear* to attempting to print readably, the first two may
> not be. If this *is* the difference, then mzscheme is the odd man
> out.

You are correct, at least for Gambit. Gambit's reader can be put in
case sensitive and case insensitive modes. Symbols containing
upper-case letters are printed with the vertical bar escapes when in
the case insensitive mode and without the escape when in case sensitive
mode. This preserves write/read invariance. Here is an example:

Gambit Version 3.0

> 'FooBar
FooBar
> (string->symbol "FooBar")
FooBar
> (set-case-conversion! #t)
> 'FooBar
foobar
> (string->symbol "FooBar")
|FooBar|
> (set-case-conversion! 'upcase)
> 'FooBar
FOOBAR
> (|string->symbol| "FooBar")
|FooBar|
>

By the way, Gambit's syntax for vertical bar escaped symbols is the
same as for strings except the meaning of doublequote and vertical bar
is interchanged. So

|aB\|c"D|

is the symbol equal to (string->symbol "aB|c\"D").

I think other Scheme systems treat vertical bar escapes differently.

Marc

bitd...@hotmail.com

Apr 11, 2001, 8:14:01 PM

liso...@acm.org (Kirill Lisovsky) writes:

> On Wed, 11 Apr 2001 05:55:20 GMT, bitd...@hotmail.com wrote:
>
> >> By the different Schemes '|FooBar| will be evaluated to:
> >> FooBar - bigloo, Gambit
> >> foobar - mzscheme
> >> |FooBar| - guile, Chez, kawa, chicken
> >> Is it a victory in the war against confusing identifiers?
> >
> >It is unclear to me whether the latter 4 and the first 2 differ. The
> >latter 4 *appear* to be attempting to print readably, the first two may
> >not be. If this *is* the difference, then mzscheme is the odd man
> >out.
>
> Not quite.
>
> (display (symbol->string '|FooBar|)) evaluates to:
> FooBar - Chez, chicken (I guess this is what you mean)
> |FooBar| - guile, kawa, scsh

Actually it would be more interesting to see what
(write (string->symbol "Foo Bar")) returns.

> ... let alone MIT Scheme, SXM and RScheme, where '|Foo| is an error!
>
> So, '|Foo| notation is far away from a de-facto standard.

I must admit I was surprised to find that MIT Scheme didn't support
this notation. It seems to me that there ought to be a way to print
`non-standard' symbols such that they can be read back in.

> >> So, in a vain attempt to prevent somebody from the use of ugly
> >> identifiers we have code readability and portability sacrificed.
> >
> >People can always write ugly identifiers: iamanuglyidentifier
> >whether case sensitive or not. But the trend these days is to use
> >case conventions to encode micro-syntax, and that is horrible.
> I'm against this trend, but I believe that such restriction as
> case-insensitivity can not enforce good naming convention.
>
> Just consider two different identifiers: O0 and OO.
> IMHO, they are very confusing.

No argument here.

> If we will not allow numerals in identifiers, it will make such a
> confusion impossible, but is it an adequate solution?

No. There is much utility to be gained from having identifiers with
numbers in them. I have often used identifiers that are the same up
to a numeric suffix.

I haven't seen a compelling argument that identifiers that differ only
in case ought to be considered different identifiers. I *have* seen
numerous misuses of case sensitivity to encode micro-syntax.


> I think that in all these cases the game is not worth the candle.

That's an expression I haven't heard before.

David Rush

Apr 12, 2001, 4:06:57 AM

bitd...@hotmail.com writes:

> liso...@acm.org (Kirill Lisovsky) writes:
> > (display (symbol->string '|FooBar|)) evaluates to:
> > FooBar - Chez, chicken (I guess this is what you mean)
> > |FooBar| - guile, kawa, scsh
>
> Actually it would be more interesting to see what
> (write (string->symbol "Foo Bar")) returns.

Larceny: Foo Bar
Scheme48: Foo Bar
Bigloo (2.2a): |Foo Bar|
VSCM: Foo Bar
mzscheme (102/15): |Foo Bar|
Guile: #{Foo\ Bar}#

Was that interesting?

> > So, '|Foo| notation is far away from a de-facto standard.

> It seems to me that there ought to be a way to print
> `non-standard' symbols such that they can be read back in.

Err. then they would be `standard', now wouldn't they?
R5RS explicitly states that string->symbol can make symbols that will
*not* be read-able. This is a bug, IMNSHO.
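
A minimal sketch of what R5RS does guarantee for such a symbol, using only standard procedures (the name odd-sym is made up for illustration). How write prints the symbol is implementation-specific, as the list above shows, so no printed form is asserted here:

```scheme
;; odd-sym is a hypothetical name used only for this illustration.
(define odd-sym (string->symbol "Foo Bar"))

(symbol? odd-sym)                              ; => #t
(string=? (symbol->string odd-sym) "Foo Bar")  ; => #t, the name round-trips
(eq? odd-sym (string->symbol "Foo Bar"))       ; => #t, same name, same symbol
```

Whether (read) can reconstruct odd-sym from its written form is exactly the part the report leaves open.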

> > If we will not allow numerals in identifiers, it will make such a
> > confusion impossible, but is it an adequate solution?
>
> No. There is much utility to be gained from having identifiers with
> numbers in them.

Ditto case-sensitivity. In other languages, I find case-sensitivity
provides useful information to me. This is apart from the word
boundaries in identifiers issue (in which I prefer the Lispy use of
embedded hyphens).

At the end of the day, TEX and TeX *are* different (just ask Donald
Knuth ;) There is real information present in identifier case. In a cs
language you are under no obligation to use it (and you can berate
people all you want for violating your code of esthetics), but it *is*
useful to have it available.

> I haven't seen a compelling argument that identifiers that differ only
> in case ought to be considered different identifiers. I *have* seen
> numerous misuses of case sensitivity to encode micro-syntax.

And you've never seen someone use variables like temp1, temp2, temp3?
People can write crap under any set of conventions. I don't believe
it's good language design to restrict the construction of unique
identifiers (modulo operator syntax...but that's another issue
entirely)



> > I think that in all these cases the game is not worth the candle.
> That's an expression I haven't heard before.

Me neither. Do you mean it is not worth the expense of the candle used
to light the playing board? I think I like this expression.

david rush
--
With guns, we are citizens. Without them, we are subjects.
-- YZGuy, IPL

Biep @ http://www.biep.org

Apr 12, 2001, 4:44:15 AM

"David Rush" <ku...@bellsouth.net> wrote in message
news:okfeluy...@bellsouth.net...

> At the end of the day, TEX and TeX *are* different (just ask Donald
> Knuth ;)

And both are different from TeΧ (the last letter is the Greek capital Chi),
which, if I remember correctly, is the letter Knuth intended.
Actually, shouldn't the vowel be a small cap, rather than a standard upper
or lower case letter?
In both differences "there is real information present" - do you consider
that a reason to allow them as distinct in identifiers?


> With guns, we are citizens. Without them, we are subjects.

With guns, those with no qualms to use them are masters. Without them, we
are equals.
But that belongs in a different newsgroup.

David Rush

Apr 12, 2001, 8:50:48 AM

> "David Rush" <ku...@bellsouth.net> wrote in message
> news:okfeluy...@bellsouth.net...
> > At the end of the day, TEX and TeX *are* different (just ask Donald
> > Knuth ;)
>
> And both are different from TeΧ (the last letter is the Greek capital Chi),
> which, if I remember correctly, is the letter Knuth intended.

Actually, he explicitly indicated that the spelling `TeX' should be
used when the rendering system was unable to produce the correct
construction (almost as you noted: cap-T cap-E lowered by .5ex cap-X)

> In both differences "there is real information present" - do you consider
> that a reason to allow them as distinct in identifiers?

Yes.

> > With guns...
> But that belongs in a different newsgroup.

Oh yes. Blame my sigmonster. Lately he's been quite bold.

david rush
--
As I've gained more experience with Perl it strikes me that it resembles
Lisp in many ways, albeit Lisp as channeled by an awk script on acid.
-- Tim Moore (on comp.lang.lisp)

Biep @ http://www.biep.org

Apr 12, 2001, 9:44:29 AM

> In both [Chi vs. Iks, and lowered E] "there is real information present"
> - do you consider that a reason to allow them as distinct in identifiers?

"David Rush" <ku...@bellsouth.net> wrote in message
news:okf66ga...@bellsouth.net...
> Yes.

O.K., so you are really going for full Unicode + font make-up in
identifiers.

I am afraid I see two huge problems:
(1) Lots of visually indistinguishable identifiers (e.g. X vs. Χ - one being
Iks, the other Chi)
(2) Failure of other existing systems to generate code to be used by your
system.

Now, if you wanted to require STRINGS to preserve all these differences..

David Rush

Apr 12, 2001, 12:18:07 PM

Surrendering entirely to the dark side of OT pedantry, I am ;)

"Biep @ http://www.biep.org" <repl...@my.webpage.com> writes:
> "Biep @ http://www.biep.org" <repl...@my.webpage.com> writes:
> > In both [Chi vs. Iks, and lowered E] "there is real information
> > present"

The point, as you have conveniently cut out, was that Knuth specified
a case-sensitive construction as the name of his system *and*
specifically contrasted that case-sensitive construction with another
case sensitive name of a different system in the same (very general)
domain.

Therefore w/rt case-sensitivity:

> > - do you consider that a reason to allow them as distinct in identifiers?
>
> "David Rush" <ku...@bellsouth.net> wrote in message

> > Yes.
>
> O.K., so you are really going for full Unicode + font make-up in
> identifiers.

Full Unicode, yes. It already exists (in principle anyway, UTF-8 works
pretty well otherwise) in many editors and for many languages
(handwaving at various lexical specifications). OTOH, fonts are the
prerogative of the rendering system, so the same code point in two
different fonts should have the same underlying meaning. This is
slightly horrible (in that I can imagine some very pathological
straw-men), but in practice this is not merely good enough, it's as
good as you *can* do.

And there are some pretty ugly issues in Unicode, too (the German
sharp `s' springs immediately to mind), but none of that invalidates
the fact that identifiers are essentially a horribly inefficient
(although convenient) encoding of the integers. I see no reason why
preventing, or even worse, *conflating*, arbitrary (modulo other
syntactical issues) sequences of bits in the formation of an
identifier gains anything in terms of usability or readability.

> I am afraid I see two huge problems:
> (1) Lots of visually indistinguishable identifiers (e.g. X vs. Χ - one being
> Iks, the other Chi)

Use a better editor, then. I'd suggest Emacs+Mule.

> (2) Failure of other existing systems to generate code to be used by your
> system.

Fix 'em or fuck 'em. ASCII is the past. Welcome to the 3rd Millennium.

> Now, if you wanted to require STRINGS to preserve all these differences..

They already do. In fact, that is why Scheme48 and its ilk go out of
their way to make char->integer return non-ascii encodings for
characters (specifically S48 uses ASCII(x) + 1000).

david rush
--
And Visual Basic programmers should be paid minimum wage :)
-- Jeffrey Straszheim (on comp.lang.functional)

bitd...@hotmail.com

Apr 12, 2001, 9:05:27 PM

David Rush <ku...@bellsouth.net> writes:

> bitd...@hotmail.com writes:
> > liso...@acm.org (Kirill Lisovsky) writes:
> > > (display (symbol->string '|FooBar|)) evaluates to:
> > > FooBar - Chez, chicken (I guess this is what you mean)
> > > |FooBar| - guile, kawa, scsh
> >
> > Actually it would be more interesting to see what
> > (write (string->symbol "Foo Bar")) returns.
>
> Larceny: Foo Bar
> Scheme48: Foo Bar
> Bigloo (2.2a): |Foo Bar|
> VSCM: Foo Bar
> mzscheme (102/15): |Foo Bar|
> Guile: #{Foo\ Bar}#
>
> Was that interesting?

Indeed it was.

>
> > > So, '|Foo| notation is far away from a de-facto standard.
>
> > It seems to me that there ought to be a way to print
> > `non-standard' symbols such that they can be read back in.
>
> Err. then they would be `standard', now wouldn't they?

Symbols with spaces in them (for example) are not usually considered
`standard' in that you rarely find them in your average program.

RnRS allows (and suggests) using slashification of symbols to deal with
character strings that cannot normally be read as symbols (so there is
in fact a `standard' that is not universally implemented).

> R5RS explicitly states that string->symbol can make symbols that will
> *not* be read-able. This is a bug, IMNSHO.

It says that some implementations that do not support `slashification'
of symbols may not be able to read funny symbols back in. Apparently,
there was some disagreement among the RnRS authors about escape
conventions (perhaps having to do with the preferred case being
different among different schemes?)

> > > If we will not allow numerals in identifiers, it will make such a
> > > confusion impossible, but is it an adequate solution?
> >
> > No. There is much utility to be gained from having identifiers with
> > numbers in them.
>
> Ditto case-sensitivity. In other languages, I find case-sensitivity
> provides useful information to me. This is apart from the word
> boundaries in identifiers issue (in which I prefer the Lispy use of
> embedded hyphens).

Can you give an example? My assertion is that it is *generally* poor
style to have identifiers that mean different things to be spelled the
same (with only the difference being in the case). There are several
reasons for this:

1. Many people `fold case' when they read. i trust that the
capitalization of the first word in many of my sentences doesn't
confuse you. i would also assume that you had no problem
understanding these sentences despite the fact that i forgot to
capitalize the `i'.

Do you perceive a difference between something `on sale' and
something `On Sale'?

Even if *you* have no problem identifying the difference between
Foo and foo in a program, other people may.

2. It is difficult to talk about things that differ only in case:
It is easy to say `while i is less than limit', but far more
difficult to say `while lower-case i is less than upper-case
i'.

I also assert that identifiers that differ only by a numeric suffix
(like LPT1 and LPT2) are far less likely to be confused.

> At the end of the day, TEX and TeX *are* different (just ask Donald
> Knuth ;)

Oh? I've never heard of Knuth's TEX program. How different is it
from TeX?

> There is real information present in identifier case.

I don't deny this, I question
a) the *amount* of the information
b) the *kind* of the information, and
c) the *utility* of the information

The fact of the matter is that in *general* (in English), case
does not encode *identifying* information. Sure there is the
occasional acronym or funky trademark, but the vast majority of words
do not take on different meanings depending on case.

> > I haven't seen a compelling argument that identifiers that differ only
> > in case ought to be considered different identifiers. I *have* seen
> > numerous misuses of case sensitivity to encode micro-syntax.
>
> And you've never seen someone use variables like temp1, temp2, temp3?

Rick Greenblatt's favorite variable names!

But back to the question at hand, do you think there is any danger
that someone would confuse temp1 with temp2? What about fooBar and
FooBar?

Now don't get me wrong, I think that there should be the *ability* to
use case-sensitivity in exceptional circumstances, or perhaps
pervasively under some circumstances. However I think that folding
case is the correct thing to do under *most* circumstances.

bitd...@hotmail.com

Apr 12, 2001, 9:09:47 PM

David Rush <ku...@bellsouth.net> writes:

> "Biep @ http://www.biep.org" <repl...@my.webpage.com> writes:
> > "David Rush" <ku...@bellsouth.net> wrote in message
> > news:okfeluy...@bellsouth.net...
> > > At the end of the day, TEX and TeX *are* different (just ask Donald
> > > Knuth ;)
> >
> > And both are different from TeΧ (the last letter is the Greek capital Chi),
> > which, if I remember correctly, is the letter Knuth intended.
>
> Actually, he explicitly indicated that the spelling `TeX' should be
> used when the rendering system was unable to produce the correct
> construction (almost as you noted: cap-T cap-E lowered by .5ex cap-X)

Yes, but does this convey identifying information? (It certainly
conveys information about Knuth!) When running TeX, must one be very
careful to use the .TeX file extension rather than the .tex file
extension, or are they operationally equivalent?

Alpha-Decay Petrofsky

Apr 14, 2001, 5:44:34 AM

ol...@pobox.com writes:

> When developing a test suite for a SSAX XML parser, I needed a
> portable and _concise_ way of entering _case-sensitive_
> symbols. Again, I needed that way only for validation self-tests,
> which are always enclosed within a special form run-test:
> (run-test (test1) (test2) ...)
...
> This fortuitous circumstance suggested that run-test can do a bit
> more than just expanding into a begin form. It can be used to enable
> truly portable and truly concise case-sensitive symbols.
...
> Therefore, we introduce a notation '"ASymbol" (a quoted string) that
> stands for a case-_sensitive_ ASymbol -- on any R5RS Scheme system
> with a low-level macro system. This notation is valid only within the
> body of run-test.
...
> (define-macro run-test
...

> We must also stress that the above source code transformation can
> only be effected by a low-level macro facility. High-level (aka
> "portable" aka R5RS) macros _cannot_ express this transformation. By
> design, syntax-rules prohibit manufacturing of symbols and
> identifiers: otherwise, it will be impossible to guarantee hygiene.

That is indeed a CooL hack.

However, it occurs to me that getting something close to what you
describe while maintaining hygiene is not as impossible as you might
think. Although your implementation supports case-sensitive variable
names, it appears that you don't really desire them, you just want
case-sensitive literals. In r5rs, there are only three expression
types in which literals occur: quote, quasiquote, and case. What you
need is for the tests to be evaluated in a syntactic environment that
has modified versions of these syntaxes that understand the '"ASymbol"
notation. The only constraint hygiene imposes is that you must pass
in to the macro the names of the keywords that will be rebound (in
other words, because run-test is really a binding construction, the
identifiers being bound must be lexically visible from the expressions
that use them).
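
The constraint can be seen in miniature with two toy macros. This is only a sketch: bind-hidden and bind-named are hypothetical names that are not part of the run-test code, and a variable binding stands in for the keyword bindings that run-test creates. A name the macro introduces by itself is renamed by hygiene and stays invisible to the caller's expressions, while a name the caller passes in is bound where the caller can see it:

```scheme
;; Hypothetical helpers for illustration only.

;; The macro introduces `tmp' on its own; hygiene renames it, so the
;; body, which was written at the call site, cannot refer to it.
(define-syntax bind-hidden
  (syntax-rules ()
    ((_ value body ...) (let ((tmp value)) body ...))))

;; The caller supplies the identifier, so the binding is lexically
;; visible from the body.
(define-syntax bind-named
  (syntax-rules ()
    ((_ name value body ...) (let ((name value)) body ...))))

(define tmp 'outer)

(bind-hidden 'inner tmp)      ; => outer
(bind-named tmp 'inner tmp)   ; => inner
```

Passing the identifiers in, as bind-named does, is the same move as handing quote, quasiquote, and case to the macro as extra arguments.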

Below is an implementation of run-test that takes as extra arguments
the identifiers to be bound to the '"ASymbol"-aware versions of quote,
quasiquote, and case. It is called like so:

(run-test '`case
  (and
    (symbol? ''"ASymbol")
    (symbol? (car '('"ASymbol")))
    (eq? (string->symbol "ASymbol") ''"ASymbol")
    (case (string->symbol "ASymbol")
      (('"ASymbol") #t) (else #f))))
=> #t

-al


(define-syntax run-test
  (syntax-rules ()
    ((_ (q (qq case)) . tests)
     (letrec-syntax
         ((q (syntax-rules (q) ((_ x) (generic-quote q x))))
          (qq
           (syntax-rules (q qq unquote unquote-splicing)
             ((_ ,x) x)
             ((_ (,@x . y)) (append x (qq y)))
             ((_ (qq x) . depth) (list 'qq (qq x depth)))
             ((_ ,x depth) (list 'unquote (qq x . depth)))
             ((_ ,@x depth) (list 'unquote-splicing (qq x . depth)))
             ((_ x . depth) (generic-quote qq x . depth))))
          (generic-quote
           (syntax-rules (q)
             ((_ q/qq (q x) . rest) (cond ((string? 'x) (string->symbol 'x))
                                          (else (cons 'q (q/qq (x) . rest)))))
             ((_ q/qq (x . y) . rest) (cons (q/qq x . rest) (q/qq y . rest)))
             ((_ . rest) (run-test "dots" . rest))))
          (case
           (syntax-rules (else)
             ((_ (x . y) . clauses) (let ((key (x . y))) (case key . clauses)))
             ((_ key) 'unspecified)
             ((_ key (else . exps)) (begin . exps))
             ((_ key (atoms . exps) . clauses)
              (cond ((memv key (q atoms)) . exps)
                    (else (case key . clauses)))))))
       (begin . tests)))
    ;; These rules need to be out here because of r5rs ellipsis shortcomings.
    ((_ "dots" q/qq #(elt ...) . rest) (list->vector (q/qq (elt ...) . rest)))
    ((_ "dots" q/qq x . rest) 'x)))
