Locales in wx


Vadim Zeitlin

Mar 7, 2021, 7:23:09 AM
to wx-dev
Hello,

This is a continuation of the discussion that started at
https://github.com/wxWidgets/wxWidgets/pull/2124 where I previously wrote:

VZ> IMO the ideal behaviour should really be:
VZ>
VZ> 1. Use C (i.e. US) locale everywhere if wxLocale is not used at all.
VZ> 2. Use user locale if wxLocale(wxLANGUAGE_DEFAULT) is used.
VZ>
VZ> I think (2) should mostly work (except for the really bad problem
VZ> described in this ticket[*] that I'd very much appreciate your feedback on
VZ> too) and this tries to fix/improve things for (1).
VZ>
VZ> [*]: https://trac.wxwidgets.org/ticket/19023

and Stefan replied:

SC> I think we have to dive into these a little bit more to make sure the
SC> cure is not worse than the problem
SC>
SC> 1. Right now wx under macOS is using the current CFLocale which is
SC> determined by the OS, so date formats etc. are correct without
SC> setting a locale, either at the C level or at the wx level.
SC> Changing this now would suddenly change the behaviour of existing
SC> apps on macOS, e.g. dates would be displayed differently ...

Clearly, there is a disagreement about what our goal should be in the
first place. I always thought that an internationalized wx application
would have to perform wxLocale::Init(wxLANGUAGE_DEFAULT) during its
initialization and that applications not doing this should be isolated from
the current user locale as much as possible. IOW, if you do _not_ use
wxLocale, then the application should get numbers entered by the user using
the decimal point, whatever the current locale is. And I still believe that
it's not an unreasonable expectation and that changing this will (subtly)
break the existing code, e.g. using wxSpinCtrl::SetValue("1.234") would
stop working in German (or French, or ...) locale. Of course, this never
was the best thing to do, but AFAIK it does work right now and the string
typically is not hardcoded in the sources, but comes from a file, so it's
not that obvious to find and correct all the places where this happens.
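
The failure mode can be reproduced without wx. The following is a minimal
sketch in plain standard C++ (the helper name ParsesCompletely is made up for
illustration and is not wx API); it only assumes that the text is ultimately
parsed by a strtod()-like function that honours the current LC_NUMERIC:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns true only if strtod() consumed the entire string, i.e. the text is
// a valid number under the C library's *current* LC_NUMERIC setting.
// Whatever prefix could be parsed is stored in *value either way.
bool ParsesCompletely(const std::string& text, double* value)
{
    assert(value != nullptr);

    const char* const start = text.c_str();
    char* end = nullptr;
    *value = std::strtod(start, &end);

    // end == start means nothing was parsed; a non-NUL *end means parsing
    // stopped early, typically at an unexpected decimal separator.
    return end != start && *end == '\0';
}
```

In the default "C" locale, "1.234" is accepted while "1,234" stops at the
comma and yields 1. After a successful setlocale(LC_NUMERIC, "de_DE.UTF-8")
(assuming such a locale is installed) the two results would be expected to
swap, which is the breakage described above.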

<aside>
Floating point numbers are not used much in wx API, and so maybe we could
modify SetValue() and other places to accept numbers using either the point
or the comma, but this seems a wrong thing to do and wouldn't work in the
other direction anyhow, so I don't think we should consider this.
</aside>

So in my vision of how things should work, everything should really use
C/US locale by default and only by using wxLocale::Init() would you opt
into locale-specific behaviour. Of course, IMO most applications should use
wxLocale in any case, so the case of not using it is not even very
interesting, but I'd still prefer to avoid returning strings using decimal
comma to the applications that don't want to do anything with the locales.
Especially because if we do it like this, the only way to avoid it would be
to explicitly use wxLocale(wxLANGUAGE_ENGLISH_US) which is something we've
never recommended doing before, while fixing the problem with the wrong
(i.e. non-localized) display of dates would require just using
wxLANGUAGE_DEFAULT, which is something that has always been recommended.

Stefan, do my arguments seem convincing to you? What about the others? I'd
really like to understand what we want to achieve before trying to
actually do it, I think it would significantly increase our chances of
success :-)


On a more technical level, Stefan also wrote:

SC> 2. Setting a locale eg to de_CH is not working correctly at the
SC> c-level as I wrote, it only knows about de, the decimal and
SC> thousands separators are wrong.

This really looks like a bug in macOS :-( The same

$ LC_ALL=de_CH printf "%g\n" 1.234

command outputs "1,234" under macOS 11.1 while it correctly outputs "1.234"
under Linux. And /usr/share/locale/de_CH/LC_NUMERIC under macOS is just a
symlink to ../de_DE.UTF-8/LC_NUMERIC which, of course, explains why it
behaves like this but doesn't help.
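
The same check can be made from inside a program. This is a small sketch
using only setlocale() and localeconv() from the C library; DecimalPointFor
is a hypothetical helper name, and which locale names are installed depends
on the system:

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Returns the decimal separator the C library uses for the named locale, or
// an empty string if setlocale() rejects that name. The previous LC_NUMERIC
// setting is restored before returning.
std::string DecimalPointFor(const char* localeName)
{
    assert(localeName != nullptr);

    // Copy the current name: the pointer returned by setlocale() may be
    // invalidated by the next call.
    const std::string previous = std::setlocale(LC_NUMERIC, nullptr);

    std::string result;
    if (std::setlocale(LC_NUMERIC, localeName) != nullptr)
        result = std::localeconv()->decimal_point;

    std::setlocale(LC_NUMERIC, previous.c_str());
    return result;
}
```

Going by the printf output quoted above, DecimalPointFor("de_CH") would be
expected to give "," under macOS 11.1 and "." under Linux, provided the
locale is installed there.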

SC> So there I'd like to leave LC_NUMERIC at C and perform the inverse
SC> of what FromDouble/FromCDouble etc are doing at wxString level.
SC>
SC> So in summary IMHO we will never be able to depend on C-Level locale on
SC> macOS, we will have to use the CFLocale information and perform
SC> replacements like mentioned in 2)

The problem with this is that there is a lot of existing code using
[sf]printf() and other C functions (and whatever you may think of these
functions, even using C++ locale support wouldn't help with the problem
above). If we don't interoperate with this code, we're going to have
problems without any simple solution. E.g. just consider a typical GUI
application that gets strings from wx and then uses some non-GUI library for
doing something with them. How are you going to fix things to work in both
de_DE and de_CH locales if wx strings use a decimal point in the latter but
comma in the former while the non-GUI library simply has no way to use
different values in the 2 cases without using non-portable code because
either it assumes C locale (and then it doesn't work in de_DE) or it uses
the current locale (and then it doesn't work in de_CH)?
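
For the first of these two cases only (a library that assumes the C locale),
applications sometimes wrap the calls in a scoped locale switch. The sketch
below shows that workaround; ScopedCNumericLocale, LibraryFormat and
CallLibraryInCLocale are all hypothetical names, none of this is wx API, and
it does nothing for the de_DE/de_CH mismatch itself:

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdio>
#include <string>

// Forces LC_NUMERIC to "C" for the lifetime of the object and restores the
// previous setting afterwards. setlocale() changes process-wide state, so
// this is only safe while no other thread formats or parses numbers.
class ScopedCNumericLocale
{
public:
    ScopedCNumericLocale()
        : m_previous(std::setlocale(LC_NUMERIC, nullptr))
    {
        std::setlocale(LC_NUMERIC, "C");
    }

    ~ScopedCNumericLocale()
    {
        std::setlocale(LC_NUMERIC, m_previous.c_str());
    }

    ScopedCNumericLocale(const ScopedCNumericLocale&) = delete;
    ScopedCNumericLocale& operator=(const ScopedCNumericLocale&) = delete;

private:
    std::string m_previous;
};

// Stand-in for a non-GUI library function that relies on printf("%g").
std::string LibraryFormat(double value)
{
    char buf[64];
    const int len = std::snprintf(buf, sizeof(buf), "%g", value);
    assert(len > 0 && len < static_cast<int>(sizeof(buf)));
    return std::string(buf, static_cast<std::size_t>(len));
}

// Calls the library with LC_NUMERIC pinned to "C" for the duration.
std::string CallLibraryInCLocale(double value)
{
    ScopedCNumericLocale forceC;
    return LibraryFormat(value);
}
```

The obvious cost is that every call site has to know which convention the
library expects, which is exactly the kind of non-portable bookkeeping the
paragraph above is worried about.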

Also, if we leave LC_NUMERIC (and why not LC_TIME too?) at C, what is the
point of changing LC_ALL at all? Which locale facet do we really need to
change, in fact? We already have wxCmpNatural() that should be used rather
than relying on LC_COLLATE. Nothing that we do uses LC_MONETARY, AFAIK.
LC_CTYPE doesn't really make sense with Unicode anyhow. And we have
wxTranslations rather than relying on LC_MESSAGES. So AFAICS your proposal
is basically equivalent to not setting C locale at all under macOS.

And while I'm not saying that we shouldn't do this, I do have a feeling
that this is not the right thing to do, especially if we only decide to do
it because of the problem with de_CH (even if I know that it's a
subjectively important locale...). Is there really no chance of Apple
fixing their locale definition instead?


To summarize, there are 2 questions I'd like to answer before changing
anything:

1. What should applications not using wxLocale at all expect from wx?
   Here it seems obvious that the C locale must not be changed (as this
   would silently break tons of code), but what should happen with the
   UI, i.e. should it use locale-specific or the US formats for
   dates/numbers?

2. What should applications using wxLocale(wxLANGUAGE_DEFAULT) expect?
   Here it seems pretty clear that the dates/numbers should appear in the
   appropriate locale-specific format in the UI, but it's not clear how
   the standard C functions should work.

Thanks in advance!
VZ

Andy Robinson

Mar 7, 2021, 8:03:45 AM
to wx-...@googlegroups.com
On 07/03/2021 12:23, Vadim Zeitlin wrote:
> So in my vision of how things should work, everything should really use
> C/US locale by default and only by using wxLocale::Init() would you opt
> into locale-specific behaviour.

I'm in favour of this. Many programs save configuration files and
document files with numbers in, which are not intended to be read by the
user, so you can get a nasty surprise when a user whose machine is
configured for one locale sends a document file to a user with a
different locale. (I'm not talking about text that the user sees, I'm
talking about e.g. a number that specifies the dimensions of a document)

In fact I wish the standard versions of sprintf, scanf etc all used C/US
always, and we would call the locale-aware versions of those functions
(where we pass the locale as a parameter to the function) when we want
locale-specific behaviour. But unfortunately it's too late to wish for that.
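
For the document-file case, one way to get that behaviour in standard C++
today is to imbue the stream with the classic ("C") locale explicitly. A
minimal sketch, with FormatForFile and ParseFromFile as made-up helper names:

```cpp
#include <cassert>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Writes a double using the classic locale regardless of the global locale.
// max_digits10 significant digits are enough for an exact round trip.
std::string FormatForFile(double value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out.precision(std::numeric_limits<double>::max_digits10);
    out << value;
    return out.str();
}

// Reads a double written by FormatForFile(), again in the classic locale.
// Fails if no number could be read or if anything is left over.
bool ParseFromFile(const std::string& text, double* value)
{
    assert(value != nullptr);

    std::istringstream in(text);
    in.imbue(std::locale::classic());
    in >> *value;
    return !in.fail() && in.eof();
}
```

A file written this way on a machine configured for German can be read back
on one configured for US English, because neither side consults the user's
locale.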

Regards,
Andy Robinson, Seventh String Software, www.seventhstring.com

Vadim Zeitlin

Mar 7, 2021, 8:14:02 AM
to wx-...@googlegroups.com
On Sun, 7 Mar 2021 13:03:41 +0000 Andy Robinson wrote:

AR> On 07/03/2021 12:23, Vadim Zeitlin wrote:
AR> > So in my vision of how things should work, everything should really use
AR> > C/US locale by default and only by using wxLocale::Init() would you opt
AR> > into locale-specific behaviour.
AR>
AR> I'm in favour of this. Many programs save configuration files and
AR> document files with numbers in, which are not intended to be read by the
AR> user, so you can get a nasty surprise when a user whose machine is
AR> configured for one locale sends a document file to a user with a
AR> different locale. (I'm not talking about text that the user sees, I'm
AR> talking about e.g. a number that specifies the dimensions of a document)

Yes, this is exactly the motivation for the above.

AR> In fact I wish the standard versions of sprintf, scanf etc all used C/US
AR> always, and we would call the locale-aware versions of those functions
AR> (where we pass the locale as a parameter to the function) when we want
AR> locale-specific behaviour. But unfortunately it's too late to wish for that.

We do provide wxString::{To,From}CDouble() that always work like this. But
not all code can use them.

And, of course, if you use C++17 you can (and should) use
std::{to,from}_chars() that are also much faster and more precise. But,
again, not a lot of existing code uses those...
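
A short sketch of what this looks like, assuming a C++17 standard library
that already implements the floating-point overloads of std::to_chars() and
std::from_chars() (not every implementation does). DoubleToChars and
DoubleFromChars are hypothetical wrapper names:

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Shortest representation that round-trips, always with '.' as separator.
std::string DoubleToChars(double value)
{
    char buf[64];
    const std::to_chars_result res =
        std::to_chars(buf, buf + sizeof(buf), value);
    assert(res.ec == std::errc());
    return std::string(buf, res.ptr);
}

// Locale-independent parse; succeeds only if the whole string is a number.
bool DoubleFromChars(const std::string& text, double* value)
{
    assert(value != nullptr);

    const char* const first = text.data();
    const char* const last = first + text.size();
    const std::from_chars_result res = std::from_chars(first, last, *value);
    return res.ec == std::errc() && res.ptr == last;
}
```

Neither function looks at the C or C++ locale, so "1.234" is accepted and
"1,234" is rejected whatever setlocale() has been called with.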

Regards,
VZ

Stefan Csomor

Mar 7, 2021, 10:15:44 AM
to wx-...@googlegroups.com
Hello everyone
Definitely, macOS itself takes care of the internationalization of all OS-delivered text of an app anyway. So I also think it is correct that wxLocale::GetSystemLanguage is returning this language, even if no wxLocale has been inited.

I'm always using wxLocale in my apps. So I'm not dependent on the current behavior and if everybody agrees that returning the US-Root Localization information in wxLocale::GetInfo is ok as a change in the non-wxLocale situation, then I'm fine, I just want to be sure that everybody is fine with that change.
No, it definitely is a bug. It has been there for many years and was never fixed; there are also other language variants with a similar fate, but I don't recall which ...

and it is visible in other apps, sometimes even Apple's own, when suddenly the decimal separator is a comma ...

I'm not saying LC_NUMERIC has to be "C", but I'd like to make sure, we declare at which level things are properly localized within wx and make sure at the respective levels round trips are working correctly wxString::FromDouble, ToDouble, wxNumberFormatter::FromString, ToString etc. so that at least a wx app can be localized correctly ... without having to do fixes in my own app ...

Thanks,
Stefan

Stefan Csomor

Mar 7, 2021, 10:53:09 AM
to wx-...@googlegroups.com
Hi

As an example here are the two changes if the fixes were to be applied at the wxString level.

Best,

Stefan

bool wxString::ToDouble(double *pVal) const
{
#if defined(__WXOSX__)
    struct lconv* lc = localeconv();
    wxString sep = wxLocale::GetInfo(wxLOCALE_DECIMAL_POINT, wxLOCALE_CAT_NUMBER);

    if ( lc->decimal_point != sep )
    {
        wxString s(*this);
        s.Replace(sep, lc->decimal_point);
        WX_STRING_TO_X_TYPE_START
        start = s.wx_str();
        double val = wxStrtod(start, &end);
        WX_STRING_TO_X_TYPE_END
    }
#endif

    WX_STRING_TO_X_TYPE_START
    double val = wxStrtod(start, &end);
    WX_STRING_TO_X_TYPE_END
}

wxString wxString::FromDouble(double val, int precision)
{
    wxCHECK_MSG( precision >= -1, wxString(), "Invalid negative precision" );

    wxString format;
    if ( precision == -1 )
    {
        format = "%g";
    }
    else // Use fixed precision.
    {
        format.Printf("%%.%df", precision);
    }

    wxString s = wxString::Format(format, val);
#if defined(__WXOSX__)
    struct lconv* lc = localeconv();
    wxString sep = wxLocale::GetInfo(wxLOCALE_DECIMAL_POINT, wxLOCALE_CAT_NUMBER);

    if ( lc->decimal_point != sep )
        s.Replace(lc->decimal_point, sep);
#endif
    return s;
}

Vadim Zeitlin

Mar 7, 2021, 10:57:36 AM
to wx-...@googlegroups.com
On Sun, 7 Mar 2021 15:53:05 +0000 Stefan Csomor wrote:

SC> As an example here are the two changes if the fixes were to be applied
SC> at the wxString level.

I am afraid it would be very confusing if wxString::FromDouble() didn't
work in the same way as wxString::Printf(). Note that we do already have
wxNumberFormatter which is supposed to handle numbers in the "UI format"
and so IMO it would be better to ensure that it works according to the
current UI locale.

Regards,
VZ

Vadim Zeitlin

Mar 7, 2021, 11:10:25 AM
to wx-...@googlegroups.com
On Sun, 7 Mar 2021 15:15:41 +0000 Stefan Csomor wrote:

SC> Definitely, macOS itself takes care of the internationalization of all
SC> OS-delivered text of an app anyway. So I also think it is correct that
SC> wxLocale::GetSystemLanguage is returning this language, even if no
SC> wxLocale has been inited.

Yes, GetSystemLanguage() should return the current language used in the UI
(it's badly named and should have been called GetUserUILanguage() or
something similar, but this is a different issue). I don't question this.

SC> I'm always using wxLocale in my apps. So I'm not dependent on the
SC> current behavior and if everybody agrees that returning the US-Root
SC> Localization information in wxLocale::GetInfo is ok as a change in the
SC> non-wxLocale situation, then I'm fine, I just want to be sure that
SC> everybody is fine with that change.

I'm not even sure what exactly is going to change, I need to write a
comprehensive test and run it under all 3 platforms. Or do you already know
the answer to this question?

SC> and it is visible in other apps, sometimes even Apple's own, when
SC> suddenly the decimal separator is a comma ...
SC>
SC> I'm not saying LC_NUMERIC has to be "C",

The main attraction of this, or not changing locale at all under Mac, is
that it provides a solution for https://trac.wxwidgets.org/ticket/19023
that I absolutely don't know how to fix otherwise and which is, IMO, a
blocker for 3.1.5 release.

SC> but I'd like to make sure, we declare at which level things are
SC> properly localized within wx and make sure at the respective levels
SC> round trips are working correctly wxString::FromDouble, ToDouble,
SC> wxNumberFormatter::FromString, ToString etc. so that at least a wx app
SC> can be localized correctly ... without having to do fixes in my own app

wxNumberFormatter clearly should use the same conventions as the platform
UI, doing this is the only reason for its existence. wxString methods
should, IMO, follow the C locale because they always did, they're similar
to wxStrXXX() and other CRT-wrapper functions that also always did and will
continue to do it, and wxString can, and is, used in non-GUI code too.
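
This split between a "C format" layer and a "UI format" layer can be sketched
in standard C++ as follows. The names UINumberFormat, ToUIString and
FromUIString are invented for this example; in wx the UI role belongs to
wxNumberFormatter, and the separator would come from the UI locale rather
than being passed in by hand:

```cpp
#include <cassert>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical description of the UI conventions in effect.
struct UINumberFormat
{
    char decimalSeparator;
};

// Formats in the classic ("C") locale first, then swaps in the UI separator.
std::string ToUIString(double value, int precision, const UINumberFormat& fmt)
{
    assert(precision >= 0);

    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << std::fixed << std::setprecision(precision) << value;

    std::string s = out.str();
    const std::string::size_type pos = s.find('.');
    if (pos != std::string::npos)
        s[pos] = fmt.decimalSeparator;
    return s;
}

// Inverse: maps the UI separator back to '.' and parses in the classic
// locale. Text containing a '.' that is not the UI separator is rejected.
bool FromUIString(std::string text, const UINumberFormat& fmt, double* value)
{
    assert(value != nullptr);

    if (fmt.decimalSeparator != '.')
    {
        if (text.find('.') != std::string::npos)
            return false;

        const std::string::size_type pos = text.find(fmt.decimalSeparator);
        if (pos != std::string::npos)
            text[pos] = '.';
    }

    std::istringstream in(text);
    in.imbue(std::locale::classic());
    in >> *value;
    return !in.fail() && in.eof();
}
```

The point of the design is that the round trip is guaranteed within one
layer: text produced by ToUIString() is read back by FromUIString(), while
code that talks to files or non-GUI libraries never sees the UI separator.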

The real question to me is what should happen with C locale under macOS.
I'd very much like to keep changing it, but if we can't solve #19023, I
don't really know if it's a defensible position in practice :-(

Regards,
VZ

Stefan Csomor

Mar 7, 2021, 4:55:06 PM
to wx-...@googlegroups.com
Hi

SC> I'm always using wxLocale in my apps. So I'm not dependent on the
SC> current behavior and if everybody agrees that returning the US-Root
SC> Localization information in wxLocale::GetInfo is ok as a change in the
SC> non-wxLocale situation, then I'm fine, I just want to be sure that
SC> everybody is fine with that change.

VZ> I'm not even sure what exactly is going to change, I need to write a
VZ> comprehensive test and run it under all 3 platforms. Or do you already know
VZ> the answer to this question?

If you run the app on a non-English platform, e.g. on "de_DE", then

wxNumberFormatter::GetDecimalSeparator() returns ',' without the change and '.' with it,

or e.g.

wxLocale::GetInfo(wxLOCALE_SHORT_DATE_FMT, wxLOCALE_CAT_DATE)

returns "%d.%m.%Y" without the change and "%Y-%m-%d" with it.

So while these functions were previously returning the values for the current system locale, they now return the defaults.
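
Both format strings use strftime()-style specifiers, so the visible
difference can be shown with the standard library alone. A minimal sketch
(FormatDate is a hypothetical helper, not wx API, and it relies on these
particular specifiers being purely numeric and therefore locale-independent):

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Formats a calendar date with the given strftime() format string.
// Only the year, month and day fields are filled in, which is all that
// %Y, %m and %d need.
std::string FormatDate(const char* format, int year, int month, int day)
{
    assert(format != nullptr);

    std::tm tm = std::tm();
    tm.tm_year = year - 1900;   // years since 1900
    tm.tm_mon = month - 1;      // months are 0-based
    tm.tm_mday = day;

    char buf[64];
    const std::size_t len = std::strftime(buf, sizeof(buf), format, &tm);
    return std::string(buf, len);
}
```

For 7 March 2021, "%d.%m.%Y" gives 07.03.2021 and "%Y-%m-%d" gives
2021-03-07, which is the change an existing macOS application not using
wxLocale would see.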

SC> and it is visible in other apps, sometimes even Apple's own, when
SC> suddenly the decimal separator is a comma ...
SC>
SC> I'm not saying LC_NUMERIC has to be "C",

VZ> The main attraction of this, or not changing locale at all under Mac, is
VZ> that it provides a solution for https://trac.wxwidgets.org/ticket/19023
VZ> that I absolutely don't know how to fix otherwise and which is, IMO, a
VZ> blocker for 3.1.5 release.

SC> but I'd like to make sure, we declare at which level things are
SC> properly localized within wx and make sure at the respective levels
SC> round trips are working correctly wxString::FromDouble, ToDouble,
SC> wxNumberFormatter::FromString, ToString etc. so that at least a wx app
SC> can be localized correctly ... without having to do fixes in my own app

VZ> wxNumberFormatter clearly should use the same conventions as the platform
VZ> UI, doing this is the only reason for its existence. wxString methods
VZ> should, IMO, follow the C locale because they always did, they're similar
VZ> to wxStrXXX() and other CRT-wrapper functions that also always did and will
VZ> continue to do it, and wxString can, and is, used in non-GUI code too.
VZ>
VZ> The real question to me is what should happen with C locale under macOS.
VZ> I'd very much like to keep changing it, but if we can't solve #19023, I
VZ> don't really know if it's a defensible position in practice :-(

I think we should determine at which level we want the localization to happen. Native Cocoa apps are recommended to use the higher-level APIs, and IIRC setlocale does not work on native iOS devices, so having "C" at the C level and the correct locale in wxLocale and wxNumberFormatter would be one way. I only fear that in many places Printf methods are used with translated messages, and no wxNumberFormatter is used there ...

Best,

Stefan

Vadim Zeitlin

Mar 7, 2021, 5:20:32 PM
to wx-...@googlegroups.com
On Sun, 7 Mar 2021 21:55:02 +0000 Stefan Csomor wrote:

SC> if you run the app on a non-english platform, eg on a "de_DE", then
SC>
SC> wxNumberFormatter::GetDecimalSeparator() returns ',' without the
SC> change, and '.' with it,
SC>
SC> or eg
SC>
SC> wxLocale::GetInfo(wxLOCALE_SHORT_DATE_FMT, wxLOCALE_CAT_DATE)
SC>
SC> "%d.%m.%Y" without the change, "%Y-%m-%d" with it
SC>
SC> So while before these functions were returning the current system
SC> locale, now they return the default

But now they behave the same as under the other platforms, whereas
previously they did not... At least from looking at GetInfo()
implementation in the MSW case it seems clear that it returns the default,
hardcoded values if no locale had been set. And under Unix it uses
localeconv() which definitely uses LC_NUMERIC for the current locale.

So my change was based on my understanding of how these functions worked
under the other platforms, but this is indeed not how they work under Mac
currently. I think Mac is in the wrong here, however, because we already
have a separate GetOSInfo() added back in 9fc78c8167 (Add
wxLocale::GetOSInfo() and use it in MSW wxDateTimePickerCtrl., 2014-12-05)
and we should just implement it, and use it where appropriate, under Mac
too. This seems preferable to me to changing GetInfo() under the other
platforms.

What do you think?


SC> I think we should determine on which level we want the localization to
SC> happen,

The problem is that we don't fully control this. Application code uses
both [sf]printf() and wxString::Format() and wxString::ToDouble() etc.
So just saying that we don't do localization at this level won't work.

SC> native Cocoa apps are recommended to use the higher level APIs,

Sure, let's use them where it makes sense. For example the commit above
refers to wxDateTimePickerCtrl which always uses the same format as all the
other date pickers on the same system, even if we don't set the locale.
Maybe we need to do the same kind of things under Mac.

Does Cocoa use the user locale by default for the UI elements, as MSW
does, or do we need to set it explicitly?

SC> IIRC setlocale does not work on native iOS devices, so having "C" on
SC> the C-Level and the correct locale on wxLocale and wxNumberFormatters
SC> would be one way, I only fear that in many places Printf methods are
SC> used with translated messages, and no wxNumberFormatters are used there

We have to accept that we can't work miracles, so no solution will be
perfect under macOS or iOS if they just don't support C locales. The thing
that really bothers me is that changing the locale is not just useless, but
actively harmful because it makes the check marks disappear from the menus
and we just need to do something about this. Not setting the C locale at
all there seems like the only thing to do, but just imagining explaining in
the documentation that wxPrintf("%g") uses locale-specific decimal
separator on all systems including macOS < 11 but always uses the period
under macOS 11 makes me feel bad. How can anybody justify it working like
this? I still hope you can find some magic solution to make the menus work
in non-US locale. Do you think it could be not completely useless to try
reporting this to Apple? I don't exactly have high hopes for this, but if
it's possible to reproduce this by adding a setlocale() call to a "Hello
world" Xcode Cocoa project, surely even they must realize that it's a bug?
Or am I being too optimistic?

Thanks again,
VZ

Stefan Csomor

Mar 8, 2021, 5:24:20 AM
to wx-...@googlegroups.com
Hi
I never wanted to change the other platforms, but I was worried about code running under macOS depending on the current behaviour. But this is perfect: wxLocale::GetOSInfo is exactly what GetInfo did up to now. So we have a documented API for this functionality if someone depends on it, and a consistent cross-platform experience.

SC> I think we should determine on which level we want the localization to
SC> happen,

VZ> The problem is that we don't fully control this. Application code uses
VZ> both [sf]printf() and wxString::Format() and wxString::ToDouble() etc
VZ> So just saying that we don't do localization at this level won't work.

SC> native Cocoa apps are recommended to use the higher level APIs,

VZ> Sure, let's use them where it makes sense. For example the commit above
VZ> refers to wxDateTimePickerCtrl which always uses the same format as all the
VZ> other date pickers on the same system, even if we don't set the locale.
VZ> Maybe we need to do the same kind of things under Mac.
VZ>
VZ> Does Cocoa use the user locale by default for the UI elements, as MSW
VZ> does, or do we need to set it explicitly?

It's done for you: CFLocale and NSLocale are set to a current locale that is the join between the language preferences you set at the macOS level and the localizations your application bundle is advertising. So when I set my languages in the system preferences to the order 'de', 'en', 'fr' and my app is not offering 'de' in the available localizations, the app is started in the 'en_CH' CFLocale.

So this isn't really the same localization my system runs in for all cases ...

SC> IIRC setlocale does not work on native iOS devices, so having "C" on
SC> the C-Level and the correct locale on wxLocale and wxNumberFormatters
SC> would be one way, I only fear that in many places Printf methods are
SC> used with translated messages, and no wxNumberFormatters are used there

VZ> We have to accept that we can't work miracles, so no solution will be
VZ> perfect under macOS or iOS if they just don't support C locales. The thing
VZ> that really bothers me is that changing the locale is not just useless, but
VZ> actively harmful because it makes the check marks disappear from the menus
VZ> and we just need to do something about this. Not setting the C locale at
VZ> all there seems like the only thing to do, but just imagining explaining in
VZ> the documentation that wxPrintf("%g") uses locale-specific decimal
VZ> separator on all systems including macOS < 11 but always uses the period
VZ> under macOS 11 makes me feel bad. How can anybody justify it working like
VZ> this?

Yes, and since SVG is used for all scaled graphics, this may well hit us in other places as well.

VZ> I still hope you can find some magic solution to make the menus work
VZ> in non-US locale. Do you think it could be not completely useless to try
VZ> reporting this to Apple? I don't exactly have high hopes for this, but if
VZ> it's possible to reproduce this by adding a setlocale() call to a "Hello
VZ> world" Xcode Cocoa project, surely even they must realize that it's a bug?
VZ> Or am I being too optimistic?

I'll file a bug report and see what happens, but we will have to solve it anyway on our side I guess ...

Thanks,

Stefan