Making GAWK work right even in the awful world of i18n...


Kenny McCormack

May 26, 2013, 11:20:09 AM
I think I asked this before, but I can't remember exactly which answer I
got. So, please forgive...

Anyway, I just noticed (again) that GAWK (and presumably other, i18n-aware
versions of AWK - that is to say, probably anything that is more or less
"current") behaves weirdly (in terms of string comparisons and reg exp
ranges) if the whacky L* variables are set (to other than the sane value of
"C"). I got bit by this recently and just did "unset LANG", after which
things started working normally again.

But, it seems to me that there is a way to get the normal behavior entirely
within the GAWK language - that there is some setting you set there to make
it work correctly. But I can't remember it at the moment. Please advise.

--
Modern Conservative: Someone who can take time out
from demanding more flag burning laws, more abortion
laws, more drug laws, more obscenity laws, and more
police authority to make warrantless arrests to remind
us that we need to "get the government off our backs".

Aharon Robbins

May 26, 2013, 1:41:41 PM
In article <knt979$s4s$1...@news.xmission.com>,
Kenny McCormack <gaz...@shell.xmission.com> wrote:
>I think I asked this before, but I can't remember exactly which answer I
>got. So, please forgive...
>
>Anyway, I just noticed (again) that GAWK (and presumably other, i18n-aware
>versions of AWK - that is to say, probably anything that is more or less
>"current") behaves weirdly (in terms of string comparisons and reg exp
>ranges) if the whacky L* variables are set (to other than the sane value of
>"C"). I got bit by this recently and just did "unset LANG", after which
>things started working normally again.
>
>But, it seems to me that there is a way to get the normal behavior entirely
>within the GAWK language - that there is some setting you set there to make
>it work correctly. But I can't remember it at the moment. Please advise.

I feel your pain.

You have a few choices.

1. Set LC_ALL=C in your environment. I do this from my .profile.

2. Use gawk with the -b option. This is from the command line, not from
the program itself.

3. You can try additionally using "configure --disable-nls" before compiling,
but I think that just disables the translation facilities and not
the attempts to deal with locales.

4. You can force it by editing mbsupport.h and forcing a #undef of MBS_SUPPORT
and then gawk will compile itself for a single byte environment.
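
As a rough illustration of choices 1 and 2, here is a minimal shell sketch. It is
not from the original post: the AWK and result variable names, the fallback to a
plain awk when gawk is not installed, and the sample strings are all assumptions
made for the example. It sets LC_ALL=C for one invocation only instead of for the
whole environment, and it tries -b only when gawk is actually present.

```shell
# Use gawk when it is installed; otherwise fall back to the system awk
# (an assumption made so the sketch still runs without gawk).
AWK=$(command -v gawk || command -v awk)

# Choice 1, scoped to one command: LC_ALL=C applies to this invocation only.
# In the C locale strings compare byte by byte, so "B" (66) sorts before "a" (97).
result=$(LC_ALL=C "$AWK" 'BEGIN { if ("B" < "a") print "byte order"; else print "locale order" }')
echo "$result"

# Choice 2: -b is a gawk option, so only try it when gawk is present.
# With -b, gawk treats input as single bytes, so length() should count bytes
# rather than multibyte characters.
if command -v gawk >/dev/null 2>&1; then
    printf 'é\n' | gawk -b '{ print length($0) }'
fi
```

Prefixing a single command with LC_ALL=C leaves the rest of the session in its
usual locale, which may be preferable when only one script misbehaves.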

Note that as of gawk 4.0.1, ranges should behave rationally in regular
expressions, no matter what the locale setting.

And of course, as usual, you are best off using the latest released version,
which is 4.1.0.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon
D.N. Shimshon 9978500 ISRAEL

Kenny McCormack

May 26, 2013, 4:25:55 PM
In article <knthgl$ukm$1...@dont-email.me>,
Aharon Robbins <arn...@skeeve.com> wrote:
...
>You have a few choices.
>
>1. Set LC_ALL=C in your environment. I do this from my .profile.

WAEF...

>2. Use gawk with the -b option. This is from the command line, not from
> the program itself.

OK - that sounds useful.

>3. You can try additionally using "configure --disable-nls" before compiling,
> but I think that just disables the translation facilities and not
> the attempts to deal with locales.
>
>4. You can force it by editing mbsupport.h and forcing a #undef of MBS_SUPPORT
> and then gawk will compile itself for a single byte environment.

Yes! Since I always compile GAWK myself, I am definitely interested in
ways to compile it so it isn't brain-dead. Thanks.

>Note that as of gawk 4.0.1, ranges should behave rationally in regular
>expressions, no matter what the locale setting.

Aha! Yes, I think that is basically the advice I got the last time:
my problem was that I was running an old version (my own, ancient
self-compile with some custom mods).

I haven't gotten around to compiling the latest version(s) yet. Must do so.

>And of course, as usual, you are best off using the latest released version,
>which is 4.1.0.

Righto.

--

Some of the more common characteristics of Asperger syndrome include:

* Inability to think in abstract ways (eg: puns, jokes, sarcasm, etc)
* Difficulties in empathising with others
* Problems with understanding another person's point of view
* Hampered conversational ability
* Problems with controlling feelings such as anger, depression
and anxiety
* Adherence to routines and schedules, and stress if expected routine
is disrupted
* Inability to manage appropriate social conduct
* Delayed understanding of sexual codes of conduct
* A narrow field of interests. For example a person with Asperger
syndrome may focus on learning all there is to know about
baseball statistics, politics or television shows.
* Anger and aggression when things do not happen as they want
* Sensitivity to criticism
* Eccentricity
* Behaviour varies from mildly unusual to quite aggressive
and difficult
