FS CaSe thIng

Kai Henningsen

May 29, 1999, 3:00:00 AM
I thought this rant might well go to a slightly wider audience than darwin-
development ...

yu...@aladdin.de (Yury Levin) wrote on 28.05.99 in <4125677F.0...@saturn.aladdin.de>:

> Cesar,
> The question should be "how to make this better", not "how to get
> used to system designed 30 years ago". The algorithm proposed here on
> the list seems to deal well with the 1st question (btw w/o compromising
> unix). Well, maybe case sensitivity is a good thing, but then it seems
> that going back to ones and zeros is even more strict and convenient
> for the computer...

I've been living with both case-sensitive (cs) and case-insensitive (ci)
systems for a long time now (some of those ci systems being case-
preserving (cicp), some not).

As long as the cs systems were isolated from the ci systems, all was fine.

But once they were no longer isolated, all hell broke loose. And it hasn't
stopped. And the *worst* offenders by far were cicp systems.

99.9% of name case problems occur because a cicp system confused a cs
system. I don't even remember ever being upset about a cs system confusing
a ci or cicp system.

As for a cs system actually having files differing only in case, I can
think of three possible cases here:

1. Some idiot actually using the makefile vs. Makefile search path built
into make since ancient times. (Never seen that.)
2. Machine-generated file names not meant for users to use (temp file
names, spool file names).
3. C++ header file names. (Don't ask me why. I think it's insane. But
then, I think C++ is an insane language.)

And from experience converting code between Pascal and C, my personal
conviction is that what we really want is case-sensitive with a twist:
don't allow names that only differ in case. Ok, not really, but my point
is that there's a difference between what a (file system | compiler) says
is different, and what a (user | programmer) *relies on* to describe
different things. And that is a good thing.
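
The "case-sensitive with a twist" idea can be illustrated with a small
check that flags names which differ only in case, while still treating
them as distinct names. This is a minimal sketch, not something from the
original discussion: the function name case_collisions is hypothetical,
and it assumes Python's str.casefold() is an acceptable notion of "differ
only in case", which is itself a simplification.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that are distinct but equal after case folding.

    Hypothetical helper: the names stay case-sensitive; this only reports
    the pairs a user or programmer should not rely on to tell things apart.
    """
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    # Keep only the groups that contain more than one distinct spelling.
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(case_collisions(["Makefile", "makefile", "README", "main.c"]))
```

A lint step or a pre-commit check could run something like this over a
directory listing or a symbol table and warn, without the file system or
compiler ever having to pretend the names are "the same".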

You don't usually name your (variables | files) "I1I1I1", "III111", and so
on, even though every compiler and file system I've seen would allow you
to get away with it. OTOH, you don't expect those to be recognized as "the
same", either. Except maybe for a spelling checker or a search engine,
which are *specialized* to solve this type of problem.

Or substitute "My income tax report for 1999" and "1999 income tax" for
that if you think about poor dumb users who save their file under two
different cases and then don't know which is which: eliminating case
differences *doesn't bloody solve the problem anyway*!

Case differences are one, *very* small, part of the problem. Getting a
file system to ignore case solves perhaps 0.01% of the problem. It's
ridiculous as a substitute for a solution. It's nothing more than an
illusion of a solution. It doesn't solve any of the important problems.

Oh, and it is nearly impossible to get right, anyway, because case
differences don't work the same in different languages. Tell me, what does
your favourite case-insensitive algorithm say about these German words:
"MASSE", "MASZE", "Masse", and "Maße"? (Hint: without context, it is not
clear if "MASSE" means "Masse" or "Maße", but "Masse" (mass, weight)
definitely isn't "Maße" (measures). And I think it may be slightly
different in Swiss German.)
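
To make the German example concrete, here is a short sketch of what two
common "ignore case" strategies do with those four words. It uses only
Python's built-in str.lower() and str.casefold(), which follow the Unicode
default case mappings rather than any language-specific rules; the helper
names are made up for the illustration.

```python
def equal_by_lower(a, b):
    """Compare after simple lowercasing (hypothetical helper)."""
    return a.lower() == b.lower()


def equal_by_casefold(a, b):
    """Compare after Unicode case folding (hypothetical helper)."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    words = ["MASSE", "MASZE", "Masse", "Maße"]
    for w in words:
        # Show each word next to its lowercased and case-folded forms.
        print(w, w.lower(), w.casefold())

    # Lowercasing never matches "MASSE" with "Maße", although "MASSE"
    # may well be the upper-case spelling of "Maße".
    print(equal_by_lower("MASSE", "Maße"))
    # Case folding matches "Masse" with "Maße", although they are
    # different words.
    print(equal_by_casefold("Masse", "Maße"))
```

Lowercasing keeps "Maße" apart from "MASSE", and case folding lumps
"Masse" and "Maße" together. Neither answer is right without context, and
"MASZE" matches nothing under either strategy.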

Case insensitivity as in "we only allow a very restricted subset of ASCII"
is one thing, and there are a number of places where it's appropriate.

Case insensitivity coupled with case preserving, however, is a solution
looking for a problem - or even a problem in itself. I suspect it was
invented by English speakers, who just had no idea how complex a problem
case really is.

MfG Kai