
Convert to UpperCase WITHOUT accent characters


Olivier Vit

Oct 12, 1998, 3:00:00 AM
Hi

I'd like to convert a String object to a String containing only 7-bit ASCII uppercase characters,
without accents.

myString.toUpperCase(Locale.US) returns "MÉTHODE" when the original string is "Méthode", whereas
I'd like to get "METHODE".

What should I use?
How can I avoid the Unicode richness?!

Best Regards
Thanks for any input

bspikol

Oct 12, 1998, 3:00:00 AM
There is a method in String that returns a BYTE representation of the
string. Seek it out!
Olivier Vit wrote in message <36223536...@anjou-telematique.fr>...
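
The method hinted at above is presumably String.getBytes. A minimal sketch of what it does with an accented letter, assuming a Java 7 or later runtime for StandardCharsets (the class and method names here are made up for illustration): encoding to US-ASCII does not approximate "É" as "E", it substitutes the charset's replacement byte, "?".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
final class GetBytesDemo {
    // Encodes the string as 7-bit US-ASCII bytes and decodes it again.
    // Characters that US-ASCII cannot represent become '?'.
    static String toAsciiViaBytes(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // U+00C9 is E with acute accent, written as an escape to stay encoding-neutral.
        System.out.println(toAsciiViaBytes("M\u00C9THODE")); // prints M?THODE
    }
}
```

So the bytes alone lose the letter rather than stripping the accent.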

Olivier Vit

Oct 13, 1998, 3:00:00 AM
to bspikol
The result is the same...

bspikol wrote:

Pauline Seymour

Oct 14, 1998, 3:00:00 AM
In article <36223536...@anjou-telematique.fr>, Olivier Vit
<URL:mailto:ov...@anjou-telematique.fr> wrote:
> Hi
>
> I'd like to convert a String object to a String containing only 7-bit ASCII uppercase
> characters, without accents.
>
> myString.toUpperCase(Locale.US) returns "MÉTHODE" when the original string is "Méthode",
> whereas I'd like to get "METHODE".
>
> What should I use?

Your real problem is that there is no mathematical connection between accented
characters and their non-accented equivalents. You will have to use some form
of hash table using the ASCII/Unicode value as key for those characters. Sorry,
my Java isn't up to actually producing code for it yet.

-------
Pauline
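
A minimal sketch of the table-lookup idea described above, for readers who want something concrete. The class name, the method name, and the choice of letters are all assumptions for illustration; the table only covers some Latin-1 uppercase letters and is deliberately incomplete.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not from the original thread.
final class AccentTable {
    // Maps an accented uppercase letter to its unaccented ASCII base letter.
    private static final Map<Character, Character> BASE = new HashMap<>();

    static {
        register("\u00C0\u00C1\u00C2\u00C3\u00C4\u00C5", 'A'); // A-grave .. A-ring
        register("\u00C7", 'C');                               // C-cedilla
        register("\u00C8\u00C9\u00CA\u00CB", 'E');             // E-grave .. E-diaeresis
        register("\u00CC\u00CD\u00CE\u00CF", 'I');             // I-grave .. I-diaeresis
        register("\u00D1", 'N');                               // N-tilde
        register("\u00D2\u00D3\u00D4\u00D5\u00D6", 'O');       // O-grave .. O-diaeresis
        register("\u00D9\u00DA\u00DB\u00DC", 'U');             // U-grave .. U-diaeresis
        register("\u00DD", 'Y');                               // Y-acute
    }

    private static void register(String accented, char base) {
        for (char c : accented.toCharArray()) {
            BASE.put(c, base);
        }
    }

    // Uppercases first, then replaces each accented letter via the table.
    // Non-ASCII characters missing from the table become '?'.
    static String toPlainUpper(String s) {
        String upper = s.toUpperCase(Locale.US);
        StringBuilder out = new StringBuilder(upper.length());
        for (int i = 0; i < upper.length(); i++) {
            char c = upper.charAt(i);
            Character mapped = BASE.get(c);
            if (mapped != null) {
                out.append(mapped.charValue());
            } else if (c < 128) {
                out.append(c);
            } else {
                out.append('?');
            }
        }
        return out.toString();
    }
}
```

With this table, AccentTable.toPlainUpper("Méthode") yields "METHODE". Uppercasing before the lookup keeps the table half the size, since only uppercase keys are needed.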


Olivier Vit

Oct 14, 1998, 3:00:00 AM
to Pauline Seymour
In a word, there is no method in the vast Java API to get 7-bit ASCII from Unicode with the
proper approximation...

Is that what you meant?
Best Regards
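
For later readers: java.text.Normalizer, which only became part of the public API in Java 6 and so was not available when this was posted, gets close. The sketch below (class and method names are hypothetical) decomposes each accented letter into a base letter plus combining marks, drops the marks, and uppercases the rest. It is an approximation, not a guarantee of pure ASCII: letters with no canonical decomposition, such as "Ø" or "Æ", pass through unchanged.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not from the original thread.
final class StripAccents {
    static String toUnaccentedUpper(String s) {
        // NFD splits e-acute (U+00E9) into 'e' followed by U+0301, the combining acute accent.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches Unicode combining marks; removing them leaves the base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toUpperCase(Locale.US);
    }

    public static void main(String[] args) {
        System.out.println(toUnaccentedUpper("M\u00E9thode")); // prints METHODE
    }
}
```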

Pauline Seymour wrote:

Matt Francomb

Oct 14, 1998, 3:00:00 AM
Pauline Seymour wrote:

> In article <36223536...@anjou-telematique.fr>, Olivier Vit wrote:
> > Hi
> > I'd like to convert a String object to a String containing only 7-bit ASCII uppercase
> > characters, without accents.
> >
> > myString.toUpperCase(Locale.US) returns "MÉTHODE" when the original string is "Méthode",
> > whereas I'd like to get "METHODE".
> >
> > What should I use?
>
> Your real problem is that there is no mathematical connection between accented
> characters and their non-accented equivalents. You will have to use some form
> of hash table using the ASCII/Unicode value as key for those characters. Sorry,
> my Java isn't up to actually producing code for it yet.

That may not be the case - I've just noticed (having read a few of the
posts on this subject, which I don't know a lot about, I must admit)
that Olivier is using Locale.US for his case conversion. I am told
that the Québécois prefer to have accents on their uppercase letters as
well, whereas most Europeans don't. Could it be that the US locale has
this rule as well? Why not try Locale.FRANCE and tell us how you get on?
You'll have to get an international version of the runtime for this sort
of thing to work, if you haven't already.

--
Matt Francomb
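
The locale suggestion above is easy to check. A small sketch, with a made-up class name, using the constants java.util.Locale actually defines (Locale.FRANCE and Locale.CANADA_FRENCH; there is no Locale.FR). As far as I know, String.toUpperCase applies the Unicode case mapping and has locale-specific rules only for a few languages such as Turkish, so the accent should survive in all three locales.

```java
import java.util.Locale;

// Hypothetical demo class, not part of the original thread.
final class LocaleUpperDemo {
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        String s = "M\u00E9thode"; // U+00E9 is e with acute accent
        Locale[] locales = { Locale.US, Locale.FRANCE, Locale.CANADA_FRENCH };
        for (Locale locale : locales) {
            // Each line should show the accented capital E (U+00C9), not a plain E.
            System.out.println(locale + " -> " + upper(s, locale));
        }
    }
}
```

If that holds, changing the locale does not remove the accent, and the stripping has to be done as a separate step.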
