Wide character implementation
Erik Naggum  
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Erik Naggum <e...@naggum.net>
Date: Sat, 23 Mar 2002 03:03:52 GMT
Local: Fri, Mar 22 2002 10:03 pm
Subject: Re: Wide character implementation
* Sander Vesik
| Wake up, smell the coffee and learn about 'combiners'.  And then *think*
| just a little bit, including about things like collation, sort order and
| similar.

  Perhaps you are unaware of the character concept as used in Unicode?  It
  would seem prudent at this time for you to return to the sources and
  obtain the information you lack.  To wit, what you incompetently refer to
  as "combiners" are actually called "combining characters".  I suspect you
  knew that, too, since nobody _else_ calls them "combiners".  But it seems
  that you are fighting for your honor, now, not technical correctness, and
  I shall leave to you another pathetic attempt to feel good about yourself
  when you should acknowledge inferior knowledge and learn something.

  Oh, by the way, Unicode has three levels.  Study Unicode, and you will
  know what they mean and what they do.  Hint: "variable-length character"
  is an incompetent restatement.  A single _glyph_ may be made up of more
  than one _character_ and a given glyph may be specified using more than
  one character.  If you had known Unicode at all, you would know this.

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Sander Vesik
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Sander Vesik <san...@haldjas.folklore.ee>
Date: Sat, 23 Mar 2002 18:51:39 +0000 (UTC)
Local: Sat, Mar 23 2002 1:51 pm
Subject: Re: Wide character implementation
In comp.lang.scheme Erik Naggum <e...@naggum.net> wrote:

> * Sander Vesik
> | Wake up, smell the coffee and learn about 'combiners'.  And then *think*
> | just a little bit, including about things like collation, sort order and
> | similar.

>  Perhaps you are unaware of the character concept as used in Unicode?  It
>  would seem prudent at this time for you to return to the sources and
>  obtain the information you lack.  To wit, what you incompetently refer to
>  as "combiners" are actually called "combining characters".  I suspect you
>  knew that, too, since nobody _else_ calls them "combiners".  But it seems
>  that you are fighting for your honor, now, not technical correctness, and
>  I shall leave to you another pathetic attempt to feel good about yourself
>  when you should acknowledge inferior knowledge and learn something.

I don't subscribe to the concept of honour. I also couldn't care less what
you think of me.

>  Oh, by the way, Unicode has three levels.  Study Unicode, and you will
>  know what they mean and what they do.  Hint: "variable-length character"
>  is an incompetent restatement.  A single _glyph_ may be made up of more
>  than one _character_ and a given glyph may be specified using more than
>  one character.  If you had known Unicode at all, you would know this.

It is pointless to think of glyph in any other way than characters - it should
not make any difference whether a-diaeresis is represented by one code point
- the precombined one - or two. In fact, if there is a detectable difference
from anything dealing with text strings the implementation is demonstrably
broken.

> ///

--
        Sander

+++ Out of cheese error +++


 
Erik Naggum
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Erik Naggum <e...@naggum.net>
Date: Sun, 24 Mar 2002 01:46:30 GMT
Local: Sat, Mar 23 2002 8:46 pm
Subject: Re: Wide character implementation
* Sander Vesik
| I also couldn't care less what you think of me.

  You should realize that only people who care a lot, make this point.

| It is pointless to think of glyph in any other way than characters - it
| should not make any difference whether a-diaeresis is represented by one
| code point - the precombined one - or two.  In fact, if there is a
| detectable difference from anything dealing with text strings the
| implementation is demonstrably broken.

  It took the character set community many years to figure out the crucial
  conceptual and then practical difference between the "characteristic
  glyph" of a character and the character itself, namely that a character
  may have more than one glyph, and a glyph may represent more than one
  character.  If you work with characters as if they were glyphs, you
  _will_ lose, and you make just the kind of arguments that were made by
  people who did _not_ grasp this difference in the ISO committees back in
  1992 and who directly or indirectly caused Unicode to win over the
  original ISO 10646 design.  Unicode has many concessions to those who
  think character sets are also glyph sets, such as the presentation forms,
  but that only means that there are different times you would use
  different parts of the Unicode code space.  Some people who try to use
  Unicode completely miss this point.

  It also took some _companies_ a really long time to figure the difference
  between glyph sets and character sets.  (E.g., Apple and Xerox, and, of
  course, Microsoft has yet to reinvent the distinction badly in the name
  of "innovation", so their ISO 8859-1-like joke violates important rules
  for character sets.)  I see that you are still in the pre-enlightenment
  state of mind and have failed to grasp what Unicode does with its three
  levels.  I cannot help you, since you appear to stop thinking in order to
  protect or defend yourself or whatever (it sure looks like some Mideast
  "honor" codex to me), but if you just pick up the standard and read its
  excellent introductions or even Unicode: A Primer, by Tony Graham, you
  will understand a lot more.  It does an excellent job of explaining the
  distinction between glyph and character.  I think you need it much more
  than trying to defend yourself by insulting me with your ignorance.

  Now, if you want to use or not use combining characters, you make an
  effort to convert your input to your preferred form before you start
  processing.  This isolates the "problem" to a well-defined interface, and
  it is no longer a problem in properly designed systems.  If you plan to
  compare a string with combining characters with one without them, you are
  already so confused that there is no point in trying to tell you how
  useless this is.  This means that thinking in terms of "variable-length
  characters" is prima facie evidence of a serious lack of insight _and_ an
  attitude problem that something somebody else has done is wrong and that
  you know better than everybody else.  Neither are problems with Unicode.
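
The "convert at a well-defined interface" advice above can be sketched in Python with the standard `unicodedata` module. This is only an illustration under assumptions: the helper name `read_text` and the choice of NFC as the preferred internal form are hypothetical, not something the post prescribes.

```python
import unicodedata

def read_text(raw: str, form: str = "NFC") -> str:
    """Hypothetical input boundary: normalize once, before any processing."""
    return unicodedata.normalize(form, raw)

precomposed = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS
combining = "a\u0308"    # 'a' followed by COMBINING DIAERESIS

# Compared raw, the two spellings are different code point sequences.
raw_equal = precomposed == combining

# Once both have passed through the same boundary, they are the same string.
normalized_equal = read_text(precomposed) == read_text(combining)
```

Code behind the boundary never sees a mix of composed and decomposed spellings, so it does not have to compare one against the other.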

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 23 Mar 2002 20:25:49 -0800
Local: Sat, Mar 23 2002 11:25 pm
Subject: Re: Wide character implementation

So a secondary question; if one is designing a new Common Lisp or
Scheme system, and one is not encumbered by any requirements about
being consistent with existing code, existing operating systems, or
existing communications protocols and interchange formats: that is, if
one gets to design the world over again:

Should the Scheme/CL type "character" hold Unicode characters, or
Unicode glyphs?  (It seems clear to me that it should hold characters,
but I might be thinking about it poorly.)

And, whichever answer, why is that the right answer?

Thomas


 
cr88192
Newsgroups: comp.lang.lisp, comp.lang.scheme
Followup-To: comp.lang.lisp
From: cr88192 <cr88...@hotmail.com>
Date: Sat, 23 Mar 2002 21:02:30 -0500
Local: Sat, Mar 23 2002 9:02 pm
Subject: Re: Wide character implementation

> Should the Scheme/CL type "character" hold Unicode characters, or
> Unicode glyphs?  (It seems clear to me that it should hold characters,
> but I might be thinking about it poorly.)

> And, whichever answer, why is that the right answer?

one could use "the cheap man's unicode" or utf-8.
actually personally I don't care so much about unicode and have held it in
the "possibly later" respect. for now it is not terribly important as I can
just restrict myself to the lower 128 characters.
in any case it sounds simpler to implement than the "codepage" system, so I
will probably use it.

"ich bin einen Amerikaner, und ich tun nicht erweiterter Zeichen noetig"
(don't mind bad grammar, as I don't really know german...).

nevermind...


 
Erik Naggum
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Erik Naggum <e...@naggum.net>
Date: Sun, 24 Mar 2002 06:51:53 GMT
Local: Sun, Mar 24 2002 1:51 am
Subject: Re: Wide character implementation
* tb+use...@becket.net (Thomas Bushnell, BSG)
| Should the Scheme/CL type "character" hold Unicode characters, or
| Unicode glyphs?  (It seems clear to me that it should hold characters,
| but I might be thinking about it poorly.)

  There are no Unicode glyphs.  This properly refers to the equivalence of
  a sequence of characters starting with a base character and optionally
  followed by combining characters, and "precomposed" characters.  This is the
  canonical-equivalence of character sequences.  A processor of Unicode
  text is allowed to replace any character sequence with any of its
  canonically-equivalent character sequences.  It is in this regard that an
  application may want to request a particular composite character either
  as one character or a character sequence, and may decide to examine each
  coded character element individually or as an interpreted character.
  These constitute three different levels of interpretation that it must be
  possible to specify.  Since an application is explicitly permitted to
  choose any of the canonical-equivalent character sequences for a
  character, the only reasonable approach is to normalize characters into a
  known internal form.
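
A rough Python sketch of the three ways of looking at one composite character described above: as a single precomposed character, as a character sequence, and element by element. It uses the normalization forms NFC and NFD from the standard `unicodedata` module as stand-ins for "one character" and "character sequence"; the function names are hypothetical.

```python
import unicodedata

def interpretations(text: str) -> dict:
    """Return the composed form, the decomposed form, and the element names."""
    composed = unicodedata.normalize("NFC", text)
    decomposed = unicodedata.normalize("NFD", text)
    elements = [unicodedata.name(ch) for ch in decomposed]
    return {"composed": composed, "decomposed": decomposed, "elements": elements}

def interpreted_characters(text: str) -> list:
    """Group each base character with the combining characters that follow it."""
    groups = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and groups:
            groups[-1] += ch      # attach the combining character to its base
        else:
            groups.append(ch)     # start a new interpreted character
    return groups

views = interpretations("\u00e9")             # e with acute accent
clusters = interpreted_characters("\u00e9x")  # two interpreted characters
```

Whichever view an application picks, the point of the paragraph stands: pick one internal form and normalize into it.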

  There is one crucial restriction on the ability to use equivalent
  character sequences.  ISO 10646 defines implementation levels 1, 2 and 3
  that, respectively, prohibit all combining characters, allow most
  combining characters, and allow all combining characters.  This is a very
  important part of the whole Unicode effort, but Unicode has elected to
  refer to ISO 10646 for this, instead of adopting it.  From my personal
  communication with high-ranking officials in the Unicode consortium, this
  is a political decision, not a technical one, because it was feared that
  implementors that would be happy with trivial character-to-glyph--mapping
  software (such as a conflation of character and glyph concepts and fonts
  that support this conflation), especially in the Latin script cultures,
  would simply drop support for the more complex usage of the Latin script
  and would fail to implement e.g., Greek properly.  Far from being an
  enabling technology, it was feared that implementing the full set of
  equivalences would be omitted and thus not enable the international
  support that was so sought after.  ISO 10646, on the other hand, has
  realized that implementors will need time to get all this right, and may
  choose to defer implementation of Unicode entirely if they are not able
  to do it stepwise.  ISO 10646 Level 1 is intended to be workable for a
  large number of uses, while Level 3 is felt not to have an advantage qua
  requirement until languages that require far more than composition and
  decomposition to be fully supported.  I concur strongly with this.
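
A hedged sketch of what restricting text to something like Level 1 could look like in Python. ISO 10646 enumerates its own list of combining characters; as an assumption, this approximates that list with the Unicode general categories Mn, Mc and Me, so it is not a conformance test. The function names are hypothetical.

```python
import unicodedata

def has_combining_characters(text: str) -> bool:
    """Approximation: any character whose general category is a Mark (M*)."""
    return any(unicodedata.category(ch).startswith("M") for ch in text)

def to_level1_like(text: str) -> str:
    """Compose what can be composed; reject text that still needs combining."""
    composed = unicodedata.normalize("NFC", text)
    if has_combining_characters(composed):
        raise ValueError("no precomposed form for some character sequence")
    return composed

ok = to_level1_like("a\u0308")   # composes to the single character U+00E4

try:
    to_level1_like("q\u0308")    # no precomposed q with diaeresis exists
    rejected = False
except ValueError:
    rejected = True
```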

  The character-to-glyph mapping is fraught with problems.  One possible
  way to do this is actually to use the large private use areas to build
  glyphs and then internally use only non-combining characters.  The level
  of dynamism in the character coding and character-to-glyph mapping here
  is so much more difficult to get right that the canonical-equivalent sequences
  of characters (which is a fairly simple table-lookup process) pales in
  comparison.  That is, _if_ you allow combining characters, actually being
  able to display them and reason about them (such as computing widths or
  dealing with character properties of the implicit base character or
  converting their case) is far more difficult than decomposing and
  composing characters.

  As for the scary effect of "variable length" -- if you do not like it,
  canonicalize the input stream.  This really is an isolatable non-problem.

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Erik Naggum
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Erik Naggum <e...@naggum.net>
Date: Sun, 24 Mar 2002 07:00:47 GMT
Local: Sun, Mar 24 2002 2:00 am
Subject: Re: Wide character implementation
* Thomas Bushnell, BSG
| So a secondary question; if one is designing a new Common Lisp or Scheme
| system, and one is not encumbered by any requirements about being
| consistent with existing code, existing operating systems, or existing
| communications protocols and interchange formats: that is, if one gets to
| design the world over again:

  If we could design the world over again, the _first_ thing I would want to
  do is making "capital letter" a combining modifier instead of doubling
  the size of the code space required to handle it.  Not only would this be
  such a strong signal to people not to use case-sensitive identifiers in
  programming languages, we would have a far better time as programmers.
  E.g., considering the enormous amount of information Braille can squeeze
  into only 6 bits, with codes for many common words and codes to switch to
  and from digits and to capital letters, the limitations of their code
  space has effectively been very beneficial.
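
The thought experiment above can be played with in a few lines of Python. Everything here is hypothetical: the marker `CAP` stands in for a "capital letter" modifier code, it is placed before the letter in the manner of the Braille capital sign rather than after it like a Unicode combining character, only ASCII letters are handled, and the input is assumed not to contain the marker itself.

```python
import string

CAP = "^"  # hypothetical "capital letter" modifier code

def encode(text: str) -> str:
    """Store every letter as lower case, prefixing capitals with CAP."""
    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(CAP + ch.lower())
        else:
            out.append(ch)
    return "".join(out)

def decode(coded: str) -> str:
    """Restore the conventional spelling from the modifier encoding."""
    out = []
    capitalize_next = False
    for ch in coded:
        if ch == CAP:
            capitalize_next = True
        elif capitalize_next:
            out.append(ch.upper())
            capitalize_next = False
        else:
            out.append(ch)
    return "".join(out)

def same_identifier(a: str, b: str) -> bool:
    """Identifier comparison simply ignores the modifier; no folding table."""
    return a.replace(CAP, "") == b.replace(CAP, "")
```

With such a coding, comparing identifiers without regard to case means dropping one code rather than consulting a case-folding table, while the capitalization is still there for display.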

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Discussion subject changed to "case-sensitivity and identifiers (was Re: Wide character implementation)" by Ed L Cashin
Ed L Cashin  
Newsgroups: comp.lang.lisp
From: Ed L Cashin <ecas...@uga.edu>
Date: 24 Mar 2002 23:08:10 -0500
Local: Sun, Mar 24 2002 11:08 pm
Subject: case-sensitivity and identifiers (was Re: Wide character implementation)

Erik Naggum <e...@naggum.net> writes:

...

>   If we could design the world over again, the _first_ thing I would
>   want to do is making "capital letter" a combining modifier instead
>   of doubling the size of the code space required to handle it.  Not
>   only would this be such a strong signal to people not to use
>   case-sensitive identifiers in programming languages, we would have
>   a far better time as programmers.

Could you elaborate on that a bit?  I'm interested because it appears
that your position is that case-sensitivity in identifiers is a Bad
Thing for programming languages.

A general principle of mine is that if things are distinguishable,
they should not be collapsed but the distinction should be preserved
whenever possible.  Treating different characters as the same
character, or treating different character sequences as equivalent,
should be postponed as long as possible in order to preserve
information.

Are you suggesting that this principle is inappropriate to apply to
the character sequences that compose identifiers in source code?  That
would mean that "ABLE" is the same identifier as "able".  I must admit
that when I first found out that current lisps have case-insensitive
symbol names, I thought it reminiscent of BASIC -- kind of a throwback
to a time when memory was much more at a premium.  (I know that Lisp
predates BASIC.  I'm talking about my reaction.)  I'd be happy to hear
a good case for case-insensitive identifiers.

--
--Ed L Cashin            |   PGP public key:
  ecas...@uga.edu        |   http://noserose.net/e/pgp/


 
Kent M Pitman
Newsgroups: comp.lang.lisp
From: Kent M Pitman <pit...@world.std.com>
Date: Mon, 25 Mar 2002 04:45:10 GMT
Local: Sun, Mar 24 2002 11:45 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Ed L Cashin <ecas...@uga.edu> writes:

Psychology experiments have empirically shown that memory is auditory.
That is, when you misremember words, you misremember them by soundalike,
not by lookalike.  There is also ample linguistic evidence that the core of
human language is an auditory phenomenon.  When languages vary, they first
change in their spoken form and then later writing catches up, not much
vice versa.  Since the spoken form has no notation for case differentiation,
the pretty obvious conclusion is that conceptual information is not best
carried in case.  People don't remember whether they saw a word written in
uppercase or lowercase, they just remember the word.  It is very rare and
quite awkward for someone to say "Use Capitalized-Foo" or
"Use All-Uppercase-FOO" to someone out loud in areas other than computer
science where people have worked themselves into corners by being pedantic
on a "general principle" as in your previous paragraph rather than observing
well-researched truths about how people really think.  

Some of us believe that a proper harmonization/synchronization with the
way peoples' brains work is more important than catering to a theoretical
model that some people think would be a nice way for people to think.

I personally have made it a design goal in languages that I've worked on
to think hard about making even programming languages gracefully pronounceable
so that people can talk about programs aloud to each other over dinner, etc.
Modern Lisp has mostly moved away from obscure little names like "rplacd"
and such (a small number being retained mostly for history).  For new
concepts, make names like MOST-POSITIVE-FIXNUM not MAXINT.  

Even in cased languages, mostly people don't use case to distinguish, they
just use it for controlling the look of code.  It's not uncommon for people
to have some things named Foo and others named BAR, but it's rarer for things
to be both named foo and Foo in a context where simple namespacing can't
tell the difference.  So often again you don't hear people saying the case
out loud because it can be determined from other factors.  At that point,
you might as well let people write stuff in whatever case they want, for
ease of input, and just let code pretty-printers adjust the case to a pretty
look if it's really needed.

IMO, no ordinary code should ever be case-sensitive and it's a darned shame
that XML uses case-sensitive identifiers.  I think it does mainly so it
can service languages that have made a bad design decision ... so it's a
dependent bad decision, not an independent one.

> Are you suggesting that this principle is inappropriate to apply to
> the character sequences that compose identifiers in source code?  That
> would mean that "ABLE" is the same identifier as "able".

Yes.

> I must admit
> that when I first found out that current lisps have case-insensitive
> symbol names, I thought it reminiscent of BASIC -- kind of a throwback
> to a time when memory was much more at a premium.  (I know that Lisp
> predates BASIC.  I'm talking about my reaction.)  I'd be happy to hear
> a good case for case-insensitive identifiers.

Cased names are often a substitute in infix languages for having given up
hyphen in a way that got messy.  You can't call a variable MOST-POSITIVE-FIXNUM
in most languages, because it thinks you mean MOST - POSITIVE - FIXNUM, a
subtraction.  Dylan requires you to put spaces around minus so it can
have both hyphen and subtraction.  Doing MostPositiveFixnum is not very
natural and also forces case to be used in a way that supports separation,
taking away the ability to use case for what it was intended for: supporting
the underlying language.  So if I have a word like eBusiness in "English"
and I want to compose it into a function, do I make it be MakeeBusinessName
or MakeEbusinessName or .... personally, I prefer make-eBusiness-name.

It might even be better to use _'s, but it's a shifted character on most
keyboards, and people with weak fingers hate shifting that often, so hyphens
tend to be preferred.  make_eBusiness_name might otherwise be better, and
would save confusion with minus sign.

[CL uses uppercase as the canonical case for the case-normalized name,
and that's controversial with some people, but some of us like it.  In any
case, it's orthogonal to this other question about case translation.]

In any case, my real point is not to say there's a 100% clear answer here,
but merely to motivate that the choice of case-translation is not archaic
but definitely has support from people who think themselves to be living
in the present.


 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Mon, 25 Mar 2002 05:06:41 GMT
Local: Mon, Mar 25 2002 12:06 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
* Ed L Cashin <ecas...@uga.edu>
| Could you elaborate on that a bit?  I'm interested because it appears
| that your position is that case-sensitivity in identifiers is a Bad
| Thing for programming languages.

  I consider it a bad thing to believe that A is a different character from
  a just because it has a certain "presentation property".  I mean, we do
  not distinguish characters based on font or face, underlining or color,
  and most people realize that these are incidental properties.  However,
  capitalness of a letter is just as incidental: The fact that a letter is
  capitalized depending on such randomness as the position of the word in
  the sentence is a very strong indicator that "However" and "however" are
  not different words, which is effectively what case-sensitive people
  think they are.  I tried to publish text without this incidental property
  for a while, but it seemed to tick people off even more than calling an
  idiot an idiot.

| A general principle of mine is that if things are distinguishable, they
| should not be collapsed but the distinction should be preserved whenever
| possible.  Treating different characters as the same character, or
| treating different character sequences as equivalent, should be postponed
| as long as possible in order to preserve information.

  If you use colors to distinguish keywords from identifiers in your editor,
  can you use a keyword with a different color as an identifier?

| Are you suggesting that this principle is inappropriate to apply to the
| character sequences that compose identifiers in source code?  That would
| mean that "ABLE" is the same identifier as "able".

| I must admit that when I first found out that current lisps have
| case-insensitive symbol names, I thought it reminiscent of BASIC -- kind
| of a throwback to a time when memory was much more at a premium.

  But this is not the case.  The symbol names are case-sensitive, but the
  Common Lisp reader maps all unescaped characters to uppercase by default.
  You can change this.  Symbols are in this fashion just like normal words
  in your natural language.
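
A toy Python model of the reader behaviour described above, assuming the default readtable case (`:upcase`): unescaped characters are mapped to upper case, a backslash protects the next character, and vertical bars protect everything between them. The function `read_symbol_name` is a hypothetical name for this sketch and covers only case handling, not the rest of the Common Lisp reader algorithm.

```python
def read_symbol_name(token: str) -> str:
    """Return the symbol name a default-mode reader would produce for token."""
    name = []
    in_bars = False
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\":                    # single escape: keep next char as-is
            i += 1
            if i >= len(token):
                raise ValueError("dangling backslash")
            name.append(token[i])
        elif ch == "|":                   # multiple escape: toggle verbatim mode
            in_bars = not in_bars
        elif in_bars:
            name.append(ch)               # inside bars: case is preserved
        else:
            name.append(ch.upper())       # unescaped: mapped to upper case
        i += 1
    if in_bars:
        raise ValueError("unterminated |")
    return "".join(name)
```

Under this model `car`, `Car` and `CAR` all name the symbol "CAR", while `|car|` names a different symbol, "car". The names themselves stay case-sensitive; only the reader folds.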

| (I know that Lisp predates BASIC.  I'm talking about my reaction.)  I'd
| be happy to hear a good case for case-insensitive identifiers.

  I think case sensitivity is an abuse of an incidental property.  Thus, I
  want to hear a good case for case-sensitive identifiers.  Older languages
  did not have this property, but after Unix (which has a case-insensitive
  tty mode!), the norm became to distinguish case, largely because there
  was no other namespace functionality in early C.  Unix also chose to use
  lower-case commands whereas Multics had always supported case-folding.  I
  believe the reason that the Unix people wanted to distinguish case was
  that folding case would require an extra instruction and a lookup table
  that would waste a precious 128 bytes of memory in the kernel, while we
  currently waste an enormous amount of memory to keep case-folding tables
  several times over.  In my view, case-sensitive identifiers have become
  the norm in a community that has failed to think about proper solutions
  to its problems, but rather chooses to solve only the immediate problem,
  much like C strongly encourages irrelevant micro-optimization.  So instead
  of being nice to the user, they were nice to the programmer, who did not
  have to case-fold the incoming identifiers.  I consider moving this
  burden onto the user to be quite user-inimical and actually quite foreign
  to people who do not know the character coding standards.  I mean, do we
  have case-sensitive trademarks, even though we traditionally capitalize
  proper names?  Are Oracle and ORACLE different companies any more than
  ORACLE in red boldface 14 point Times Roman is a different company than
  ORACLE in blue italic 12 point Helvetica?

  There has definitely been a "paradigm shift" in computer people's view on
  case, but not in non-computer people.  Internet protocols like SMTP use
  case-insensitive commands.  The DNS is case-insensitive.  SGML is
  case-insensitive and so is HTML.  Because of the huge problems we face
  with case-folding Unicode (which must be done with a table of some kind),
  some people have figured that we should _not_ do case-folding.  That is
  the wrong solution to the problem.  The right solution to the problem is
  to get rid of case as a character property.
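
For comparison, this is roughly what table-driven case folding looks like today in Python, whose built-in `str.casefold` applies Unicode case-folding data. The `CaseInsensitiveNames` class is a hypothetical sketch of a case-insensitive identifier table, not anything proposed in the thread.

```python
class CaseInsensitiveNames:
    """Hypothetical name table that folds case on both store and lookup."""

    def __init__(self):
        self._table = {}

    def define(self, name: str, value) -> None:
        self._table[name.casefold()] = value

    def lookup(self, name: str):
        return self._table[name.casefold()]

names = CaseInsensitiveNames()
names.define("Oracle", "one company")
found = names.lookup("ORACLE")

# Folding is not a per-character toggle: one character can fold to two.
sharp_s_matches = "Straße".casefold() == "STRASSE".casefold()
```

The second comparison shows why a table is needed: folding can change the length of the string, which a single "case bit" per character code could not express.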

  Now, assume that we no longer have different character codes for lower-
  case and upper-case letters.  Would there be any difference in how we
  look at text on computer screens, in print, etc?  No, of course not.
  Therefore, people would still be able to distinguish identifiers visually
  based on case if they want to -- just like the Common Lisp reader allows
  you to write |car| to refer to the symbol named "car", and |CAR| to refer
  to the symbol named "CAR", and just like Unix can deal with upper- and
  lower-case letters even when iuclc and olcuc is in effect with the xcase
  option by backslashing the real uppercase characters in your input.  (In
  Common Lisp, you would backslash a lower-case character in the default
  reader mode, and the printer will escape those characters that should not
  be case-folded.)  However, being able to do something and actually doing
  it are two very different things.  E.g., on TOPS-20, you could use
  lower-case letters in filenames if you really wanted to, by prefixing
  them with ^V.  Very few people bothered to do this because typing it in
  was a hassle.  I do not propose any change to how we input upper and
  lower case, but with the anal-retentive approach to saving bits, which
  has even gone so far as to write FooBarZot instead of foo-bar-zot, the
  probability that the C freaks would have chosen case-sensitivity would be
  remarkably lower -- if we could go back and design the world over...

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Christopher Browne
Newsgroups: comp.lang.lisp
From: Christopher Browne <cbbro...@acm.org>
Date: Mon, 25 Mar 2002 00:28:15 -0500
Local: Mon, Mar 25 2002 12:28 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Centuries ago, Nostradamus foresaw when Kent M Pitman <pit...@world.std.com> would write:

> Psychology experiments have empirically shown that memory is
> auditory.  That is, when you misremember words, you misremember them
> by soundalike, not by lookalike.  There is also ample linguistic
> evidence that the core of human language is an auditory phenomenon.
> When languages vary, they first change in their spoken form and then
> later writing catches up, not much vice versa.

I agree in part.

The "western" languages certainly are representative of that; our
languages are largely a way of taking what we say and putting it on
paper.  (Computers being an insignificant "blip" thus far in the
history of it :-).)

My understanding of the Asian languages is that they are often _not_
such a representation; what is written is _not_ an account of what is
spoken.  Writing is, there, representative of a separate language.  In
more clearly "pictographic" languages, there may _not_ be an auditory
form except as constructed afterwards.

That caveat being given, words don't usually sound different when they
have different casing and aren't usually recognized as being
different.

"That" is not a different word from "that."
--
(reverse (concatenate 'string "ac.notelrac.teneerf@" "454aa"))
http://www.ntlug.org/~cbbrowne/linux.html
"Of  _course_ it's the murder weapon.   Who would frame someone with a
fake?"


 
Duane Rettig
Newsgroups: comp.lang.lisp
From: Duane Rettig <du...@franz.com>
Date: Mon, 25 Mar 2002 10:00:01 GMT
Local: Mon, Mar 25 2002 5:00 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Ed L Cashin <ecas...@uga.edu> writes:

> A general principle of mine is that if things are distinguishable,
> they should not be collapsed but the distinction should be preserved
> whenever possible.  Treating different characters as the same
> character, or treating different character sequences as equivalent,
> should be postponed as long as possible in order to preserve
> information.

This is your opinion, and many people agree with you, but many do not,
as well.  This is a very controversial subject.   And it's not just in
comp.lang.lisp that you'll find this same controversy; at about the same
time as our last discussion here there was a similar one raging on
comp.arch.  The difference was that here the case-insensitive style being
advocated was (of course) the case-folding style that the Common Lisp
reader standardizes, and in comp.arch the predominant case-insensitive
style being argued was the "case-preserving" style, which is the kind
of recognition style that both Mac and Windows filesystems support
(i.e. first reference gets internalized as originally specified, but
subsequent references are matched against the filename without regard
to case).  This case-preserving insensitive style was being pitted
against the Unix case-sensitive style.  Of course, neither side
changed the other's mind.
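The three matching styles named above can be told apart with a toy name table. This is only a sketch in Python: the class `NameTable` and its policy names are hypothetical, and real filesystems and Lisp readers are considerably more involved.

```python
class NameTable:
    """Toy name table illustrating three matching policies (hypothetical)."""

    def __init__(self, policy):
        # policy is one of "sensitive", "folding", "preserving"
        if policy not in ("sensitive", "folding", "preserving"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.names = {}  # lookup key -> name as stored

    def _key(self, name):
        # Case-sensitive tables compare names exactly; the other two
        # policies compare a case-folded key instead.
        return name if self.policy == "sensitive" else name.casefold()

    def intern(self, name):
        key = self._key(name)
        if key not in self.names:
            # "folding" stores a canonical upper-case spelling;
            # "preserving" and "sensitive" store the first spelling seen.
            self.names[key] = name.upper() if self.policy == "folding" else name
        return self.names[key]


preserving = NameTable("preserving")
first = preserving.intern("ReadMe")    # stored as written
second = preserving.intern("README")   # matches the existing entry
```

With the preserving policy, `second` comes back as `"ReadMe"`: the later reference matches without regard to case, but the original spelling survives. The folding policy would have stored `"README"` in the first place.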

Arguing case-sensitivity is very similar to arguing endianness; there
are good arguments for both big-endian and little-endian, and neither
side is fully right or fully wrong, though a decision must usually be
made, because it is generally hard to mix the two together in the same
machine.

> Are you suggesting that this principle is inappropriate to apply to
> the character sequences that compose identifiers in source code?  That
> would mean that "ABLE" is the same identifier as "able".  I must admit
> that when I first found out that current lisps have case-insensitive
> symbol names, I thought it reminiscent of BASIC -- kind of a throwback
> to a time when memory was much more at a premium.  (I know that Lisp
> predates BASIC.  I'm talking about my reaction.)  I'd be happy to hear
> a good case for case-insensitive identifiers.

First, I'll note (as others have) that Common Lisp does have
case-sensitive identifiers, and always has.  It is the reader that
is specified to fold to uppercase by default.  And even the
standard CL reader is highly configurable, to allow cases to be
specified by readtable options.
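To make the distinction between the symbol and the reader concrete, here is a rough Python model of the four `readtable-case` settings (:upcase, :downcase, :preserve, :invert). The function name is made up for illustration, and the model ignores escape syntax such as `|...|` and backslash, which a real reader honors.

```python
def read_symbol_name(token, readtable_case="upcase"):
    """Model how a reader maps the characters of a token to a symbol name."""
    if readtable_case == "upcase":
        return token.upper()
    if readtable_case == "downcase":
        return token.lower()
    if readtable_case == "preserve":
        return token
    if readtable_case == "invert":
        # Only characters that have case take part in the decision.
        cased = [c for c in token if c.isupper() or c.islower()]
        if cased and all(c.isupper() for c in cased):
            return token.lower()
        if cased and all(c.islower() for c in cased):
            return token.upper()
        return token  # mixed case is left alone
    raise ValueError("unknown readtable case: " + readtable_case)


# Under the default, both spellings name the same symbol.
default_same = read_symbol_name("able") == read_symbol_name("ABLE")
# With :preserve, the stored names differ, so the symbols differ.
preserve_same = (read_symbol_name("able", "preserve")
                 == read_symbol_name("ABLE", "preserve"))
```

The symbol names themselves are compared exactly in every mode; only the mapping from typed characters to names changes.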

Second, the choice of case-sensitivity or not is not bounded by
time.  Going back to the endianness question, some engineers 10
years ago said "the little-endian side has lost".  However, I
suspect that if you count all of the little-endian machines in
existence today, you find it hard to justify that claim.  In
fact, even many computers which are generally considered to be
big-endian are now architected to allow for either endianness.

Finally, I personally believe in choice.  Our own product has
always allowed one to choose whether to decide on the Common Lisp
specified case-insensitive reader, or whether to configure the reader
to be case-sensitive by default.  Our customer base has always taken
advantage of that choice, with anywhere from approximately 20% to 35%
choosing the case-sensitive mode, and the majority choosing the Common
Lisp (case-insensitive, folding to uppercase) mode.  And of course,
this does not account for people who use lisps of both modes for
different purposes.  Nowadays, there is a slight increase in
case-sensitive mode for the purpose of interfacing relatively directly
with some currently popular case-sensitive languages.  The point,
though, is that we have always provided a choice, and always intend
to provide a choice.

In fact, Kent Pitman recently sent us a proposal for unifying
the two major case-modes that Allegro CL provides, in such a
way that the two can exist in the same lisp simultaneously.
We have an rfe (request for enhancement document) which starts
with his proposal as a basis.  I would love to see us succeed
in making this or any similar unification, and I was excited to
see Kent's proposal when he sent it to us.

It's all about choice.  Calling the case-insensitive choice a
"throwback" is the same as calling it invalid (or no longer
valid).  And based on my own experience here and in comp.arch,
that is simply incorrect.  People still choose both styles,
and probably always will.

--
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   du...@Franz.COM (internet)


 
Matthias Blume
Newsgroups: comp.lang.lisp
From: Matthias Blume <matth...@shimizu-blume.com>
Date: Mon, 25 Mar 2002 13:17:16 GMT
Local: Mon, Mar 25 2002 8:17 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Erik Naggum <e...@naggum.net> writes:
> * Ed L Cashin <ecas...@uga.edu>
> | Could you elaborate on that a bit?  I'm interested because it appears
> | that your position is that case-sensitivity in identifiers is a Bad
> | Thing for programming languages.

>   I consider it a bad thing to believe that A is a different character from
>   a just because it has a certain "presentation property".  I mean, we do
>   not distinguish characters based on font or face, underlining or color,
>   and most people realize that these are incidental properties.  However,
>   capitalness of a letter is just as incidental: The fact that a letter is
>   capitalized depending on such randomness as the position of the word in
>   the sentence is a very strong indicator that "However" and "however" are
>   not different words, which is effectively what case-sensitive people
>   think they are.

This is not strictly true in all (natural) languages.

Example 1: German:
   - no 1-1 correspondence between upper-case and lower-case (there is one
     letter that only exists in the lower-case set)
   - some words change class, meaning, and pronunciation when going from
     one case to the other (example: Weg vs. weg)
   - case is used (or at least has been -- until it became non-pc in some
     circles) to put semantic fine points into print (e.g., capitalization of
     the second person in letters for politeness)
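The missing 1-1 correspondence is easy to observe with any case-mapping routine that follows Unicode's full case mappings. A small Python check, with `is_case_round_trip` as a hypothetical helper name:

```python
lower = "stra\u00dfe"            # "strasse" spelled with the sharp s, U+00DF
upper = lower.upper()            # the sharp s is mapped to the two letters SS
round_trip = upper.lower()       # lower-casing cannot restore the sharp s


def is_case_round_trip(word):
    """True if upper-casing and then lower-casing gives the word back."""
    return word.upper().lower() == word
```

Here `upper` is `"STRASSE"` and `round_trip` is `"strasse"`, so the original spelling is lost, whereas a word like "weg" survives the round trip unchanged.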

Example 2: Japanese
   - there is no distinction between upper-case and lower-case at all
   - HOWEVER: there are still two distinct sets of the phonetic characters
     called "hiragana" and "katakana".  Either one could spell the entire
     language, but usage of the two sets again depends on things like
     origin of the word in question, emphasis, style, etc.
     One could think of katakana as the upper-case version of hiragana.
     Usage is often analogous, for example one would sometimes find
     hiragana words spelled in katakana for EMPHASIS.
   - Written Japanese also uses kanji (Chinese characters), all of which could
     be spelled either in hiragana or katakana.  Unfortunately, the mapping
     between kanji and hiragana is many-to-many, which shows that the "is the
     same word" relationship is not an equivalence relation because it is
     not transitive:  "hashi" (chopsticks) and "hashi" (bridge) are spelled
     exactly the same in hiragana (but are pronounced slightly differently),
     but the kanji for the respective words are not the same.  OTOH, "kyou"
     and "konnichi" are clearly not the same words when spelled phonetically,
     but both correspond to the same kanji combination.  There are literally
     thousands of examples for this in Japanese (which does not make it particularly
     easy to learn :-).
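Both halves of this example can be sketched in Python. The kana conversion relies on the Unicode layout, where each katakana letter sits 0x60 code points above its hiragana counterpart. The tiny `READINGS` table is only an illustration built from the words mentioned above, not a dictionary.

```python
HIRAGANA_FIRST, HIRAGANA_LAST = 0x3041, 0x3096
KANA_OFFSET = 0x60  # distance from a hiragana letter to its katakana twin


def hiragana_to_katakana(text):
    """Shift hiragana letters into the katakana block; leave the rest alone."""
    return "".join(
        chr(ord(c) + KANA_OFFSET)
        if HIRAGANA_FIRST <= ord(c) <= HIRAGANA_LAST else c
        for c in text
    )


# Illustrative kanji -> kana readings (hashi, hashi, kyou / konnichi).
READINGS = {
    "箸": ["はし"],                # chopsticks
    "橋": ["はし"],                # bridge
    "今日": ["きょう", "こんにち"],
}


def share_reading(a, b):
    """True if two kanji spellings have at least one kana spelling in common."""
    return bool(set(READINGS[a]) & set(READINGS[b]))
```

The first function behaves like a case mapping, one character at a time. The table does not: two different kanji share one kana spelling and one kanji compound has two, which is why "spelled the same" cannot serve as an equivalence relation here.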

Example 3: English
   - Speaking of "him" and speaking of "Him" are clearly semantically very different.

Example 4: Mathematics  (well, this one is not "natural", after all...)
   - In the "language of mathematics" we frequently make semantic distinctions
     between typographically different versions of the "same" character.

Anyway, all I wanted to say was that the distinctions between different
versions of a character set are not completely incidental in many
(most?) natural languages.  I do not want to use this as an argument
for or against case-sensitive identifiers in programming languages,
since I do not think that programming languages should in any form or
manner be modelled after natural ones.  (However, I must admit that I
personally prefer being able to use mixed case when programming.)

Matthias


 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Mon, 25 Mar 2002 14:14:10 GMT
Local: Mon, Mar 25 2002 9:14 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
* Matthias Blume <matth...@shimizu-blume.com>
| This is not strictly true in all (natural) languages.

  All of these arguments indicate that using the capital letter for the
  sentence-initial word is a very bad design choice for a written language;
  it violates that strong sense of difference that those who want it to
  exist focus so strongly on.  However, I would argue that the sheer
  acceptability of destroying the importance of the capital letter in the
  sentence-initial word cannot be ignored.  When I tried to _preserve_ the
  case of the word despite its position in the sentence, this was regarded
  as Very Wrong by a bunch of hostile lunatics.  This indicated to me that
  case is _primarily_ incidental, since the intrinsic role can at any time
  be overridden by the incidental role -- specifically, you have no idea
  whatsoever what the capitalization of the sentence-initial word would be
  if it were moved, yet this causes absolutely no problem for anyone.

| Anyway, all I wanted to say was that the distinctions between different
| versions of a character set are not completely incidental in many (most?)
| natural languages.

  In real life, nothing is ever completely anything.  People use and abuse
  case "because it's there".  This would not change if capital letters were
  coded with a "flag" that communicated capitalness.  On the contrary, if
  we had such a flag, the natural development is to have _two_ flags: One
  for the incidental capital and one for the intrinsic capital.  In either
  case, the display and the coding properties of a character should be
  separated.  You provided an excellent example of this with hiragana and
  katakana.
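  One possible reading of the two-flag idea, sketched in Python.  The flag
  names and the rule for assigning them are assumptions made for the sake of
  illustration; nothing here is a worked-out coding proposal.

```python
PLAIN, INCIDENTAL, INTRINSIC = 0, 1, 2  # hypothetical flag values


def encode(word, sentence_initial=False):
    """Split a word into (lower-case letter, flag) pairs.

    The caller has to say whether the word starts a sentence; the
    written form alone cannot tell the two kinds of capital apart.
    """
    pairs = []
    for i, ch in enumerate(word):
        if ch.isupper():
            flag = INCIDENTAL if (sentence_initial and i == 0) else INTRINSIC
        else:
            flag = PLAIN
        pairs.append((ch.lower(), flag))
    return pairs


def same_word(a, b):
    """Compare two encoded words, ignoring incidental capitals only."""
    def strip(pairs):
        return [(c, f if f == INTRINSIC else PLAIN) for c, f in pairs]
    return strip(a) == strip(b)
```

  With this coding, `encode("However", sentence_initial=True)` and
  `encode("however")` compare as the same word, while `encode("God")` and
  `encode("god")` do not.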

| I do not want to use this as an argument for or against case-sensitive
| identifiers in programming languages, since I do not think that
| programming languages should in any form or manner be modelled after
| natural ones.

  That is not the argument.  Please try to understand this.  The point is
  that I have taken the liberty to design the world over again, backing up
  to _before_ computer geeks coded their character sets, and making a
  crucial change to the coding of upper-case vs lower-case characters.  The
  names "upper-case" and "lower-case" refer to typographic characteristics,
  not meaning.  Meaning may be coded separately from typography, just as we
  do in almost every other case.

| (However, I must admit that I personally prefer being able to use mixed
| case when programming.)

  If it had been more costly for you to achieve this, in terms of "knowing"
  that you would waste additional space to encode capital letters, would
  you still have preferred it?  I believe, from the reactions to the
  extended experiment with not randomly upcasing the sentence-initial word,
  that people would be inclined to accept a coding overhead for that role,
  as well as for proper nouns, but randomly and liberally sprinkling such
  overhead throughout identifiers in order to achieve an unnatural visual
  effect only because it could be done, would most likely not happen.  As
  Common Lisp uses the hyphen to separate words, which would have no higher
  overhead than embedded capital letters, other languages would have far
  less inclination to make this horrible mistake, and would therefore not
  _require_ case-sensitivity.

  Whether the programmers would prefer a case-folding or a case-preserving
  case-insensitivity is an open question, but at least designing languages
  and coding conventions to use case would not likely happen if case was
  regarded as just as incidental as color or typeface.

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Matthias Blume
Newsgroups: comp.lang.lisp
From: Matthias Blume <matth...@shimizu-blume.com>
Date: 25 Mar 2002 10:40:50 -0500
Local: Mon, Mar 25 2002 10:40 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Erik Naggum <e...@naggum.net> writes:
>   [ ... ] The point is
>   that I have taken the liberty to design the world over again [...]

Oh, how I'd *love* to live in a world where Erik Naggum is God... :-)

 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Mon, 25 Mar 2002 16:11:59 GMT
Local: Mon, Mar 25 2002 11:11 am
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
* Matthias Blume <matth...@shimizu-blume.com>
| Oh, how I'd *love* to live in a world where Erik Naggum is God... :-)

  Yeah, me too.  Then I could force you to pay attention to the premises
  that start a discussion instead of completely ignoring the context.
  Please see <3225942059872...@naggum.net>, and pay particular attention to
  what Thomas Bushnell wrote.

  Sheesh, some people.

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Matthias Blume
Newsgroups: comp.lang.lisp
From: Matthias Blume <matth...@shimizu-blume.com>
Date: 25 Mar 2002 12:35:46 -0500
Local: Mon, Mar 25 2002 12:35 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Erik Naggum <e...@naggum.net> writes:
> * Matthias Blume <matth...@shimizu-blume.com>
> | Oh, how I'd *love* to live in a world where Erik Naggum is God... :-)

>   Yeah, me too.

I was under the impression that you thought you already did. :-)

>   Then I could force you to pay attention to the premises
>   that start a discussion instead of completely ignoring the context.
>   Please see <3225942059872...@naggum.net>, and pay particular attention to
>   what Thomas Bushnell wrote.

To be frank, I do not care *one bit* about what this discussion was
originally about.  I was merely commenting on your claim about
capitalization being "incidental".  The debates over whether or not
case-sensitive identifiers in programming languages are Good or Evil,
or which character set designs use up more bits than others, etc., bore
me.

Matthias


 
Kent M Pitman
Newsgroups: comp.lang.lisp
From: Kent M Pitman <pit...@world.std.com>
Date: Mon, 25 Mar 2002 17:59:59 GMT
Local: Mon, Mar 25 2002 12:59 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Capitalization _is_ incidental.  It is ceremonially marked in written
text, but my impression based on a basic knowledge of linguistics and
a casual outside view of German [I don't purport to speak the
language] is that German people may claim that "weg" and "Weg" are
different words, but the capitalization is not pronounced audibly, so
there is generally enough contextual information to disambiguate in
speech.  Certainly this is the case for English situations like "God
loves you." and "The god loves you."  These are different words, God.
One is a proper name and one isn't.  But if it were miscapitalized
"god loves you" or "The God loves you".  It is possible for there to
be ambiguity in spite of this in some cases, but it's also possible to
have ambiguity in the case of correct case, too.  Human language is
not precise.  But normally where a confusion is common, some audible
notation arises to disambiguate.  And, incidentally, the audible
notation is [to my knowledge] never the addition of the word
"uppercase" or "lowercase" because that just isn't the issue in play.
It's usually the addition of a guide word, a case marking, a
determiner, etc.

 
Matthias Blume
Newsgroups: comp.lang.lisp
From: Matthias Blume <matth...@shimizu-blume.com>
Date: 25 Mar 2002 13:43:11 -0500
Local: Mon, Mar 25 2002 1:43 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Kent M Pitman <pit...@world.std.com> writes:

> [ ... ] outside view of German [I don't purport to speak the
> language] is that German people may claim that "weg" and "Weg" are
> different words, but the capitalization is not pronounced audibly,

The two words are pronounced very differently.

> so there is generally enough contextual information to disambiguate in
> speech.

Ok, so everything that can be inferred from context is "incidental"
then?  Most spelling mistakes can be inferred from context, so should
we make programming languages tolerate them?  (It has been tried, as
you know.)

Anyway, this whole debate is supremely silly, IMHO.  Fortunately
neither you nor Erik get to dictate the rules, at least not for those
languages that I speak or program in...

Matthias


 
Discussion subject changed to "Back to character set implementation thinking" by Thomas Bushnell, BSG
Thomas Bushnell, BSG  
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 25 Mar 2002 10:56:42 -0800
Local: Mon, Mar 25 2002 1:56 pm
Subject: Back to character set implementation thinking

Erik Naggum <e...@naggum.net> writes:
>   Yeah, me too.  Then I could force you to pay attention to the premises
>   that start a discussion instead of completely ignoring the context.
>   Please see <3225942059872...@naggum.net>, and pay particular attention to
>   what Thomas Bushnell wrote.

So, getting back to my original question about charset implementations
in Lisp/Scheme (though actually Smalltalk or any such
dynamically-typed language will have the same questions and probably
the same kinds of solutions), I've done some more study and thinking,
so let me try again.  My previous question was a tad innocent, it
appears, because I was unaware of the great changes that have taken
place in Unicode since the last time I read through it and grokked the
whole thing (which was back at version 1.2 or something).  

I haven't fully internalized the terminology yet, though I'm trying.
So please bear with any minor terminological gaffes (and correct them,
too).  

The GNU/Linux world is rapidly converging on using UTF-8 to hold
31-bit Unicode values.  Part of the reason it does this is so that
existing byte streams of Latin-1 characters can (pretty much) be used
without modification, and it allows "soft conversion" of existing
code, which is quite easy and thus helps everybody switch.
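The "(pretty much)" matters: strictly, only the ASCII subset is byte-for-byte identical in UTF-8, while Latin-1 characters above 0x7F become two bytes each. A small Python check, with illustrative sample strings and a hypothetical helper `is_valid_utf8`:

```python
ascii_text = "wide character"
accented = "na\u00efve"          # contains U+00EF, i with diaeresis

# Pure ASCII text encodes to the same bytes either way.
same_for_ascii = ascii_text.encode("utf-8") == ascii_text.encode("latin-1")

latin1_bytes = accented.encode("latin-1")   # one byte per character
utf8_bytes = accented.encode("utf-8")       # U+00EF takes two bytes


def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The Latin-1 bytes for the accented word are not valid UTF-8, so existing Latin-1 streams with accented letters do need conversion; ASCII-only streams do not.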

But I'm thinking about a "design the world over again" kind of
strategy.  Now Erik is certainly right that capitalization *should* be
a combining character kind of thing.  So let me stipulate that I want
to take Unicode as-is; I get to design *my computer system*, subject
to the a priori constraint that Unicode has done a *lot* of work, so I
will accept slight deficiencies if they help Unicode work right on the
system.  So I'll take the existing Unicode encodings, even if they
don't do capitals just like we'd want.

But I don't get to redesign existing communications protocols and
such; however, that's an externalization issue, and for internal use
on the system, such protocols don't matter.  Similar comments apply
for existing filesystems formats, file conventions, and the like.

Now, I *could* just use UTF-8 internally, but that seems rather
foolish.  I think it's obvious that characters should be "immediately"
represented in pointer values in the way that fixnums are.

Now the Universal Character Set is officially 31 bits, but only 16
bits are in use now, and it is expected that at most 21 bits will be
used.  So that means it's pretty easy to make sure the whole space of
UCS values fits in an immediate representation.  That's fine for
working with actively used data.
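As a rough illustration of the immediate representation, here is the arithmetic in Python. The tag width and tag value are invented for the example; every implementation picks its own layout, and this one only shows that 21 bits of character leave plenty of room in a 32-bit word.

```python
TAG_BITS = 8                      # hypothetical: low 8 bits hold a type tag
CHAR_TAG = 0x0A                   # hypothetical tag value meaning "character"
MAX_CODE = (1 << 21) - 1          # the 21-bit ceiling discussed above
TAG_MASK = (1 << TAG_BITS) - 1


def box_char(code):
    """Pack a character code and the character tag into one machine word."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError("code does not fit in 21 bits")
    return (code << TAG_BITS) | CHAR_TAG


def is_char(word):
    """True if the word carries the character tag."""
    return (word & TAG_MASK) == CHAR_TAG


def unbox_char(word):
    """Recover the character code from a tagged word."""
    return word >> TAG_BITS
```

Even the largest 21-bit code needs only 29 bits once tagged, so no heap allocation is required for any character, just as with fixnums.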

However, strings that are going to be kept around a long time should,
it seems to me, be stored more compactly.  Essentially all strings
will be in the Basic Multilingual Plane, so they can fit in 16 bits.
That means there would be two underlying string datatypes.  I don't
think this is a serious problem.  Is it worth having a third (for
8-bit characters) so that Latin-1 files don't have to be inflated by a
factor of two?  It seems to me that this would be important too.
Basically then we would have strings which are UCS-4, UCS-2 and
Latin-1 restricted (internally, not visibly to users).

So even if strings are "compressed" this way, they are not UTF-8.
That's Right Out.  They are just direct UCS values.  Procedures like
string-set! therefore might have to inflate (and thus copy) the entire
string if a value outside the range is stored.  But that's ok with me;
I don't think it's a serious lose.
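A minimal Python sketch of this scheme, assuming three internal widths of 1, 2 and 4 bytes per character and inflation on assignment. The class `CompactString` is hypothetical and models only indexing, length and the equivalent of string-set!.

```python
class CompactString:
    """Mutable string stored at 1, 2, or 4 bytes per character."""

    def __init__(self, text=""):
        codes = [ord(c) for c in text]
        self.width = self._width_for(max(codes, default=0))
        self.data = bytearray()
        for code in codes:
            self.data += code.to_bytes(self.width, "little")

    @staticmethod
    def _width_for(code):
        # Narrowest width that can hold the given code.
        if code <= 0xFF:
            return 1
        if code <= 0xFFFF:
            return 2
        return 4

    def __len__(self):
        return len(self.data) // self.width

    def _check(self, i):
        if not 0 <= i < len(self):
            raise IndexError("string index out of range")

    def __getitem__(self, i):
        self._check(i)
        w = self.width
        return chr(int.from_bytes(self.data[i * w:(i + 1) * w], "little"))

    def __setitem__(self, i, ch):
        self._check(i)
        needed = self._width_for(ord(ch))
        if needed > self.width:
            self._inflate(needed)   # copies the whole string
        w = self.width
        self.data[i * w:(i + 1) * w] = ord(ch).to_bytes(w, "little")

    def _inflate(self, new_width):
        # Read every code at the old width, then rewrite at the new one.
        codes = [ord(self[i]) for i in range(len(self))]
        self.width = new_width
        self.data = bytearray(
            b"".join(c.to_bytes(new_width, "little") for c in codes))
```

Storing a character above U+00FF into a 1-byte string triggers one full copy, after which the string stays wide. The user-visible contents are the same at every width; only the storage cost changes.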

So is this sane?

Ok, then the second question is about combining characters.  Level 1
support is really not appropriate here.  It would be nice to support
Level 3.  But perhaps Level 2 with Hangul Jamo characters [are those
required for Level 2?] would be good enough.

It seems to me that it's most appropriate to use Normalization Form
D.  Or is that crazy?  It has the advantage of holding all the Level 3
values in a consistent way.  (Since precombined characters do not
exist for all possibilities, Normalization Form C results in some
characters precombined and some not, right?)
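The difference between the two forms can be checked with Python's standard `unicodedata` module. The sample strings are illustrative: e with acute has a precombined code point, whereas q with acute does not.

```python
import unicodedata

composed = "\u00e9"                      # e with acute as a single code point
nfd = unicodedata.normalize("NFD", composed)

# e + combining acute followed by q + combining acute
mixed_input = "e\u0301q\u0301"
nfc = unicodedata.normalize("NFC", mixed_input)
nfd_of_mixed = unicodedata.normalize("NFD", mixed_input)
```

`nfd` is the two code points e and U+0301. `nfc` comes out as three code points: the first pair is precombined into U+00E9 while the second pair stays decomposed, which is the mixed outcome the question anticipates. In Form D both pairs keep the same shape.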

And finally, should the Lisp/Scheme "character" data type refer to a
single UCS code point, or should it refer to a base character together
with all the combining characters that are attached to it?
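To see what the second option would mean for indexing and length, here is a rough Python sketch that groups each base character with the combining characters that follow it. It treats a code point as combining when its canonical combining class is nonzero, which is a simplification: some marks have class zero and would be misclassified here. The function name is made up.

```python
import unicodedata


def combining_sequences(text):
    """Split text into base characters plus their trailing combining marks."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            groups[-1] += ch      # attach the mark to the preceding base
        else:
            groups.append(ch)     # start a new group
    return groups


sample = "e\u0301a"               # e, combining acute, a
by_code_point = len(sample)
by_sequence = len(combining_sequences(sample))
```

Counted by code point the sample has three elements; counted by combining sequence it has two. Whichever unit the "character" type names, string length, indexing and string-set! all inherit that choice.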

Thomas


 
Discussion subject changed to "case-sensitivity and identifiers (was Re: Wide character implementation)" by Kent M Pitman
Kent M Pitman  
Newsgroups: comp.lang.lisp
From: Kent M Pitman <pit...@world.std.com>
Date: Mon, 25 Mar 2002 19:30:34 GMT
Local: Mon, Mar 25 2002 2:30 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Please read Aristotle on Virtue Ethics.  The mean between unreasonable
extremes is not something with a fixed answer.  The fact that its precise
point in design space is not uniquely determined does not mean it should
not be something people strive for.  If anyone seriously wants to defend
spelling errors as a good design theory, we could have a discussion about
it.  Otherwise, it's a pointless red herring.  I do, however, contend a
theory behind the point of view CL has, and was merely describing that
point of view.

> Anyway, this whole debate is supremely silly, IMHO.  Fortunately
> neither you nor Erik get to dictate the rules, at least not for those
> languages that I speak or program in...

We aren't dictating rules, and I personally don't really appreciate this
attempt to recast my defense of an arbitrary but reasonable design choice
into some sort of attempt at an ignorant attempt to control the world.

All we have done is to try to explain the present state of affairs based
on an attempt for harmony with something people do with a great deal of
statistical regularity.  Probably there is no deed that everyone does with
any predictability other than, as they say, death and taxes, but it seems
inappropriate to base design on the idea that this implies no other
large scale regularities worth checking into...


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 25 Mar 2002 11:44:55 -0800
Local: Mon, Mar 25 2002 2:44 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Kent M Pitman <pit...@world.std.com> writes:

> Please read Aristotle on Virtue Ethics.  The mean between unreasonable
> extremes is not something with a fixed answer.  

It can also only be determined by the man with a particular virtue
known as "practical wisdom", as well.  And, with practical wisdom,
comes all the virtues, not just one or two.  Which means that only the
person with true virtue is even able to tell what the Right Thing to
do is.

Aristotle's talk of a "mean" is a metaphor, of course.  It's some kind
of balance, some kind of "just enough" notion.

Some medievals liked to poo poo this by taking it overliterally, with
a rather snide attack.  Thomas Aquinas, however, liked the "mean"
theory, and here's how he treats of the snide attackers (from the
"Quaestio disputata de virtutibus in communi", Article 13, Objection 7
and the response):

  Whether virtue lies in a mean.  It seems not....Boethius in "On
  arithmetic" speaks of a threefold mean, the arithmetical, as 6
  between 4 and 8 which is an equal distance from both, and the
  geometrical, as 6 between 9 and 4, which is proportionally the same
  distance from both, and the harmonic or musical mean, as 3 between 6
  and 2 because there is the same proportion of one extreme to the
  other, namely, 3 (which is the difference between 6 and 3) to 1 which
  is the difference between 2 and 3.  But none of these means is found
  in virtue, since the mean of virtue does not relate equally to
  extremes, nor in a quantitative way nor according to some proportion
  of the extremes and differences.  Therefore, virtue does not lie in
  the mean.

  [replies Thomas]: It should be said that the means spoken of by
  Boethius lie in things and thus are not relevant to the mean of
  virtue which is determined by reason.  Justice seems to be an
  exception since it involves both a mean in things and another
  according to reason: The arithmetical mean is relevant to exchange
  and the geometrical to distribution, as is clear from [Aristotle's
  Nicomachean] Ethics [book] 5.

Anyway, I'd recommend the Nicomachean Ethics of Aristotle to anyone
interested in thinking.  You'll find it aggravating; he's quite
unmodern and actually quite bogus in a lot of ways, but he is truly
important and it will change a great deal about how you think, if you
take it seriously.

Thomas


 
Michael Parker
Newsgroups: comp.lang.lisp
From: mparker...@hotmail.com (Michael Parker)
Date: 25 Mar 2002 12:13:25 -0800
Local: Mon, Mar 25 2002 3:13 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)

Erik Naggum <e...@naggum.net> wrote in message <news:3226054464281011@naggum.net>...
>   ... but at least designing languages
>   and coding conventions to use case would not likely happen if case was
>   regarded as just as incidental as color or typeface.

OTOH, if terminals had gotten color and typefaces earlier, maybe
programming languages would have evolved to use them.  Maybe give
each namespace its own color, so you would specify the value of a
name by putting it in blue, the function by using red, keywords in
italics, macros in green.  The mind boggles at the possibilities.
In fact, if you want to boggle your mind, see

http://www.sleepless-night.com/cgi-bin/twiki/view/Main/ColorForth

Which describes Chuck Moore's latest dialect of forth that does
this sort of thing.


 
Matthias Blume
Newsgroups: comp.lang.lisp
From: Matthias Blume <matth...@shimizu-blume.com>
Date: 25 Mar 2002 15:08:13 -0500
Local: Mon, Mar 25 2002 3:08 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Kent M Pitman <pit...@world.std.com> writes:

> We aren't dictating rules, and I personally don't really appreciate this
> attempt to recast my defense of an arbitrary but reasonable design choice
> into some sort of attempt at an ignorant attempt to control the world.

Sorry, I was unreasonably harsh on you, Kent.

> All we have done is to try to explain the present state of affairs based
> on an attempt for harmony with something people do with a great deal of
> statistical regularity.

As I have tried to point out, this sort of regularity isn't actually
quite as regular as some try to make it.  The Japanese language is a
great example (although there the distinction is not called "uppercase vs.
lowercase").

By the way, here is an example in a case-sensitive natural language
where the distinction between uppercase and lowercase gets
*pronounced*: "mit" vs. "MIT" in German.  The first means "with" and is
pronounced like "mitt", the second is the Massachusetts Institute of
Technology and is pronounced like speakers of English would pronounce
it: em-ay-tee.  I think that there are enough examples of this around
so that making a distinction between uppercase and lowercase is
warranted in the natural language case.  Again, I do not think that
this needs to be in any way correlated with the PL case.

Matthias


 
Andreas Eder  
 More options Mar 25 2002, 4:25 pm
Newsgroups: comp.lang.lisp
From: Andreas Eder <Andreas.E...@t-online.de>
Date: 25 Mar 2002 22:02:01 +0100
Local: Mon, Mar 25 2002 4:02 pm
Subject: Re: case-sensitivity and identifiers (was Re: Wide character implementation)
Kent M Pitman <pit...@world.std.com> writes:

> Capitalization _is_ incidental.  It is ceremonially marked in written
> text, but my impression based on a basic knowledge of linguistics and
> a casual outside view of German [I don't purport to speak the
> language] is that German people may claim that "weg" and "Weg" are
> different words, but the capitalization is not pronounced audibly, so
> there is generally enough contextual information to disambiguate in
> speech.

Well, in fact 'Weg' and 'weg' *are* pronounced differently, one with a
long 'e' and the other with a short one - that is because they are
different words. Should you incidentally start a sentence with 'weg',
thus writing it with capital 'W' it would still be pronounced like
'weg'. This might be difficult to understand, but that is how natural
languages are, I guess.

Andreas
--
Wherever I lay my .emacs, there´s my $HOME.


 