
Attn: Unicode Inc worker Philippe Verdy


Tulasi

Sep 18, 2011, 1:06:08 AM
to ver...@wanadoo.fr, v-ma...@microsoft.com
Attn: Philippe Verdy
C/o Magda Danish
Sr Administrative Director
Unicode Inc
<ver...@wanadoo.fr>, <v-ma...@microsoft.com>,

No one from the Assam Government or the Assam Literary Society has asked
Unicode Inc to encode Assamese stuff.

Can you reply back with detailed information on what prompted Unicode
Inc to encode Assamese stuff as "Bengali"?

Thank you in advance for providing this information,

Tulasi
PS: Your email thread appended as reference


From: Philippe Verdy <ver...@wanadoo.fr>
Date: Wed, Sep 14, 2011 at 2:03 AM
Subject: Re: VS: continue: Glaring Mistake in nomenclature
To: Erkki I Kolehmainen <e...@iki.fi>
Cc: delex r <del...@indiatimes.com>, uni...@unicode.org


I can excuse him if he is not aware of the technical justification for
why names are immutable. But what he really wants is to avoid being
exposed to these "Bengali" names. This is not a matter of technical
encoding, but rather a question of localisation (for example when
using a character picker application, or when searching character
collections by name).

Nothing prevents a localized application from using accurate names that
match a specific language's expectations about its alphabet.

But Mr Delex must understand that the UCS (by Unicode or ISO/IEC
10646) does NOT encode language-specific alphabets, but "unified"
scripts that share a lot of common letters and a common structure, in
such a way that those languages can be freely mixed and interchanged
without duplicating the letters.

Maybe the question could be forwarded to the CLDR project, about the
localisation of letter names for language-specific alphabets. For now
the CLDR project still has problems in just knowing which letters are
representative of the orthography used by a single language (for
example, is the letter "é" part of the English alphabet? Is the letter
"ā" part of the French alphabet, because it is used in official
toponymy? The same goes for "Å", for example in "Åland"...)

Just consider how we use alphabets today: we frequently borrow
foreign letters from foreign alphabets, very easily, because they are
in fact part of the same unified "script". Still, we do not
necessarily need to name those borrowed letters locally using the name
of our local alphabet for our local language.

But new characters won't generally be reencoded in the UCS (the UCS
still chose NOT to unify the Latin, Greek and Cyrillic alphabets, and
even chose later to disunify the Coptic alphabet from Greek;
conversely, it refused to disunify the Fraktur and Celtic alphabets
from Latin, because there were no frequent, clearly contrasting cases
where such disunification would be necessary; at the same time the UCS
still maintains the IPA symbol set as a full part of the Latin
alphabet, even if that required reencoding some Greek letters in the
Latin script).

There are technical tradeoffs in those decisions to unify or
disunify scripts. But it is important to understand that scripts in
the UCS are definitely not the same as alphabets; scripts still need
to be (arbitrarily) named after the most representative alphabet
encoded in them (or an alphabet already supported by a widely used
legacy standard), and there's a good reason to give all characters
encoded in that script technical names that reference this arbitrary
script name, independently of their use in language-specific
alphabets.

Notes:

- ignore above the subclassification of "alphabets" into "true
alphabets", "abjads", "abugidas", or even "syllabaries", even if this
classification plays a very important role in the decision to unify
or disunify them in the same "script" in the UCS.

- For Mr Delex: the "UCS" is the Universal Character Set, i.e. the
same **unified** repertoire standardized and encoded internationally
by both the Unicode standard and the ISO/IEC 10646 standard (and all
their annexes). Neither standard encodes language-specific
alphabets, and neither can give the distinctive names used in various
languages to reference the same unified characters.

-- Philippe.

2011/9/14 Erkki I Kolehmainen <e...@iki.fi>:
> Dear Mr. Delex,
>
> Please, please spare us from further details in support of your crusade. You should finally accept the fact that the official block name cannot be changed, the rest is effectively OT.
>
> Thank you!
>
> Erkki I. Kolehmainen


Tulasi

Sep 18, 2011, 10:45:12 AM
to ver...@wanadoo.fr, Magda Danish
On Sep 17, 10:06 pm, Tulasi <tulas...@gmail.com> wrote:
> Attn: Philippe Verdy
> C/o   Magda Danish
>         Sr Administrative Director
>         Unicode Inc
>         <verd...@wanadoo.fr>, <v-mag...@microsoft.com>,
>
> No one from the Assam Government or the Assam Literary Society has asked
> Unicode Inc to encode Assamese stuff.
>
> Can you reply back with detailed information on what prompted Unicode
> Inc to encode Assamese stuff as "Bengali"?
>
> Thank you in advance for providing this information,
>
> Tulasi
> PS: Your email thread appended as reference
>
> From: Philippe Verdy <verd...@wanadoo.fr>
> Date: Wed, Sep 14, 2011 at 2:03 AM
> Subject: Re: VS: continue: Glaring Mistake in nomenclature
> To: Erkki I Kolehmainen <e...@iki.fi>
> Cc: delex r <del...@indiatimes.com>, unic...@unicode.org
>
> http://groups.google.com/group/soc.culture.french/browse_thread/thread/31257e85b3088a56#


You did not answer the question.

> cruisade

History says that a "cruisade" is a war between Christians and Muslims,
initiated by the Christians.

I am neither Christian nor Muslim. Nor do/would I encourage a "cruisade".

Look forward to your answer, no irreverent stuff please.

Tulasi


From: Philippe Verdy <ver...@wanadoo.fr>
Date: Sun, Sep 18, 2011 at 3:49 AM
Subject: Re: Attn: Unicode Inc worker Philippe Verdy
To: Tulasi <tula...@gmail.com>
Cc: v-ma...@microsoft.com


Please don't include me in your cruisade. Everybody has explained to you
that no reencoding will occur. Even in the ISCII standard, where both
language names were used, there was no difference between the two
codepages that were identical. Under the UCS, there will be no
reencoding, and as such, the two alphabets will remain unified in the
same script.

It's impossible to have two normative names for the script. Remember
that they are used as technical references, not linguistic references.
The UCS does NOT encode separate alphabets, but only unified scripts.
And there's absolutely no reason to disunify the Bengali script, even
if it is known under another name in Assamese.
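
A minimal Python sketch of the unification described above, assuming only the published range of the Bengali block (U+0980..U+09FF). The helper name `in_bengali_block` and the two sample words are illustrative choices, not anything defined by the standard. It checks that an Assamese word and a Bengali word are both written with code points from that one block:

```python
# Bengali block as published in the Unicode code charts: U+0980..U+09FF.
BENGALI_BLOCK = range(0x0980, 0x0A00)


def in_bengali_block(text: str) -> bool:
    """Return True if every character of text lies in the Bengali block.

    Hypothetical helper, written only for this illustration.
    """
    return all(ord(ch) in BENGALI_BLOCK for ch in text)


# The name of the Assamese language, written in its own orthography.
assamese_word = "\u0985\u09B8\u09AE\u09C0\u09AF\u09BC\u09BE"
# The name of the Bengali language, written in its own orthography.
bengali_word = "\u09AC\u09BE\u0982\u09B2\u09BE"

for word in (assamese_word, bengali_word):
    print([f"U+{ord(ch):04X}" for ch in word], in_bengali_block(word))
```

Both words draw on the same encoded repertoire, which is what "unified in the same script" means in practice; no second set of code points exists for the Assamese alphabet.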

Once again, your request to change normative names will fail. It's
definitely not a matter of encoding, there's no "bug". It's only a
matter of localisation. But here again, script names and character
names are NOT localized in the UCS. Stop asking something that does
not interest the Unicode standard, or the ISO/IEC 10646 international
standard which is using exactly the same technical names.

What I proposed to you is to work on a localisation project for your
Assamese language. And it's perfectly possible to develop a character
picker in your language that will display other localized names than
those technical names. This can be developed separately from the
standard, just like it was done in the French language by Unipax.ca
for listing French names of scripts and characters in the UCS, or in
the Windows charmap tool for example.
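
As a rough sketch of such a picker, the Python snippet below looks up the normative character name with the standard `unicodedata` module and overlays a display label from a separate table. The table `LOCAL_LABELS`, its label strings, and the function `picker_label` are hypothetical, made up for illustration; a real project would draw its labels from reviewed localisation data.

```python
import unicodedata

# Hypothetical localized display labels, keyed by character.
# These strings are illustrative only and are not normative names.
LOCAL_LABELS = {
    "\u09F0": "Assamese letter RA",
    "\u09F1": "Assamese letter WA",
}


def picker_label(ch: str) -> tuple[str, str]:
    """Return (display label, normative name) for one character.

    The normative name comes from the Unicode Character Database and is
    never modified; the display label falls back to it when no localized
    label has been supplied.
    """
    normative = unicodedata.name(ch, "<unnamed>")
    return LOCAL_LABELS.get(ch, normative), normative


for ch in ("\u09F0", "\u09F1", "\u0995"):
    display, normative = picker_label(ch)
    print(f"U+{ord(ch):04X}  {display}  [{normative}]")
```

The normative names stay untouched as technical identifiers, while the user-facing label is whatever the localisation project decides to show.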

And anyway, I stopped replying to you on the Unicode mailing list long
ago, notably because you have been extremely harassing toward people.
I'm fed up with reading your hateful messages. You have abused the
etiquette of this mailing list, and even the general netiquette which
your ISP must have instructed you to respect in the contract for your
subscription.

Stop immediately, and learn to read all those responses you have had
from lots of people irritated by the tone of your messages. We do not
want to hear from you again for a long time (you'll first have to
demonstrate that you can do something useful for the community, on a
separate project or website of your own, or within your community, but
for now you're just going nowhere).

-- Philippe.

> http://groups.google.com/group/soc.culture.french/browse_thread/thread/31257e85b3088a56#