
Like, who's sabotaging who, or is buggy, wrt. locales?


Alf P. Steinbach

Jul 19, 2018, 6:23:49 PM
This can cause an exception or possibly just a crash or UB, in Windows:

std::locale{ setlocale( LC_ALL, nullptr ) }

Full example program:


-------------------------------------------------------------------------
#include <iostream>
#include <locale>
#include <locale.h>
#include <stdexcept>
#include <stdlib.h>

auto main()
    -> int
{
    setlocale( LC_ALL, "" );        // Default national locale (UTF-8 in *nix)
    setlocale( LC_NUMERIC, "C" );   // Periods as fraction separators, please.

    std::cout << "C level locale = \"" << setlocale( LC_ALL, nullptr )
        << "\"\n";

    try
    {
        const auto& loc = std::locale{ setlocale( LC_ALL, nullptr ) };
        std::cout << "C++ level locale = " << loc.name() << "\n";
        return EXIT_SUCCESS;
    }
    catch( std::exception const& x )
    {
        std::cerr << "!" << x.what() << "\n";
    }
    catch( ... )
    {
        std::cerr << "!!! non-standard exception\n";
    }
    return EXIT_FAILURE;
}
-------------------------------------------------------------------------


Results with respectively Visual C++ and MinGW g++, in that order:


-------------------------------------------------------------------------
[P:\temp]
> cl locale-sabotage.cpp /Feb
locale-sabotage.cpp

[P:\temp]
> b
C level locale = "LC_COLLATE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_CTYPE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_MONETARY=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_NUMERIC=C;LC_TIME=Norwegian Bokmål_Svalbard and Jan
Mayen.1252"
C++ level locale = LC_COLLATE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_CTYPE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_MONETARY=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_NUMERIC=C;LC_TIME=Norwegian Bokmål_Svalbard and Jan Mayen.1252

[P:\temp]
> g++ locale-sabotage.cpp

[P:\temp]
> a
C level locale = "LC_COLLATE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_CTYPE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_MONETARY=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_NUMERIC=C;LC_TIME=Norwegian Bokmål_Svalbard and Jan
Mayen.1252"
!locale::facet::_S_create_c_locale name not valid
-------------------------------------------------------------------------
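
The long C level name above is a composite: a `;`-separated list of `CATEGORY=value` pairs, one per locale category. A minimal sketch of splitting such a string for inspection follows. The function name `split_locale_name` is hypothetical, and the composite format itself is implementation-specific, so this only assumes the shape seen in the output above.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: split a composite locale name such as
// "LC_COLLATE=X;LC_NUMERIC=C" into a category -> value map.
// A plain name without '=' is stored under the key "LC_ALL".
inline auto split_locale_name( const std::string& name )
    -> std::map<std::string, std::string>
{
    std::map<std::string, std::string> result;
    if( name.find( '=' ) == std::string::npos )
    {
        result["LC_ALL"] = name;
        return result;
    }

    std::istringstream parts{ name };
    std::string item;
    while( std::getline( parts, item, ';' ) )
    {
        const auto eq = item.find( '=' );
        if( eq != std::string::npos )
        {
            result[item.substr( 0, eq )] = item.substr( eq + 1 );
        }
    }
    return result;
}
```

Called as `split_locale_name( setlocale( LC_ALL, nullptr ) )`, this makes it easy to see that only `LC_NUMERIC` differs from the other categories.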


This is in Windows 10. I seem to remember that locale names were like
"no_NB.1252", and not the long descriptive text above. Has something
changed?

Because I also remember g++'s C++ level locale handling to be missing or
screwed up, so that /could/ be the problem?
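
Since `std::locale`'s constructor throws `std::runtime_error` for a name it does not accept, one defensive option is to fall back to the classic locale instead of letting the exception escape. This is only a sketch; the helper name `locale_or_classic` is hypothetical, and which names are accepted depends on the standard library implementation.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: construct a named C++ locale, or return the
// classic "C" locale if the implementation rejects the name.
inline auto locale_or_classic( const std::string& name )
    -> std::locale
{
    try
    {
        return std::locale{ name };
    }
    catch( const std::runtime_error& )
    {
        // Invalid or unsupported locale name.
        return std::locale::classic();
    }
}
```

For example, `locale_or_classic( setlocale( LC_ALL, nullptr ) )` would yield the classic locale rather than an exception when the C level name is not usable at the C++ level.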


Cheers!,

- Alf

Alf P. Steinbach

Jul 20, 2018, 12:14:19 AM
On 20.07.2018 00:23, Alf P. Steinbach wrote:
> [snip]
> Because I also remember g++'s C++ level locale handling to be missing or
> screwed up, so that /could/ be the problem?

That was the problem.

Or rather /is/ the problem.

Not so nice to have all that inefficient locale stuff in the iostreams,
when actual use of it isn't portable to one of the main compilers. :(


-----------------------------------------------------------------------------
// Source encoding: utf-8 (π should be a greek "pi" char)
#include <iostream>
#include <locale>
#include <locale.h>
#include <stdexcept>
#include <stdio.h>      // printf
#include <stdlib.h>

auto main()
    -> int
{
    setlocale( LC_ALL, "" );        // Default national locale, for UTF-8 in Unix-land.
#ifdef PERIOD
    setlocale( LC_NUMERIC, "C" );   // But periods as fraction separators, please.
#endif

    std::cout << "C level locale = \"" << setlocale( LC_ALL, nullptr )
        << "\"\n";

    try
    {
        const auto& loc =
            std::locale{ "" }
#ifdef PERIOD
                .combine<std::numpunct<char>>( std::locale::classic() )
#endif
            ;
        std::cout << "C++ level locale = \"" << loc.name() << "\"\n";
        std::cout.imbue( loc );
        printf( "printf blåbærsyltetøy, %g.\n", 3.14 );
        std::cout << "cout blåbærsyltetøy, " << 3.14 << ".\n";
        return EXIT_SUCCESS;
    }
    catch( std::exception const& x )
    {
        std::cerr << "!" << x.what() << "\n";
    }
    catch( ... )
    {
        std::cerr << "!!! non-standard exception\n";
    }
    return EXIT_FAILURE;
}
-----------------------------------------------------------------------------


Results with MinGW g++ 7.3.0 in Windows 10:


-----------------------------------------------------------------------------
[P:\temp]
> chcp
Active code page: 1252

[P:\temp]
> g++ locale-fix.cpp && a
C level locale = "Norwegian Bokmål_Svalbard and Jan Mayen.1252"
C++ level locale = "C"
printf blåbærsyltetøy, 3,14.
cout blåbærsyltetøy, 3.14.

[P:\temp]
> g++ locale-fix.cpp -D PERIOD && a
C level locale = "LC_COLLATE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_CTYPE=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_MONETARY=Norwegian Bokmål_Svalbard and Jan
Mayen.1252;LC_NUMERIC=C;LC_TIME=Norwegian Bokmål_Svalbard and Jan
Mayen.1252"
C++ level locale = "C"
printf blåbærsyltetøy, 3.14.
cout blåbærsyltetøy, 3.14.
-----------------------------------------------------------------------------


The C++ level locale is unaffected by anything; it just remains "C". :(

The code works with Visual C++.


Cheers & hope this info can be useful to someone, maybe,

- Alf

Manfred

Jul 20, 2018, 7:06:18 AM
Looks like a problem in MinGW: the following is the result on a real
Linux VM (gcc 8.1):

-------------------------------
[user@localhost]$ c++ locale.cc && ./a.out
C level locale = "de_DE.UTF-8"
C++ level locale = "de_DE.UTF-8"
printf blåbærsyltetøy, 3,14.
cout blåbærsyltetøy, 3,14.

[user@localhost]$ c++ -DPERIOD locale.cc && ./a.out
C level locale =
"LC_CTYPE=de_DE.UTF-8;LC_NUMERIC=C;LC_TIME=de_DE.UTF-8;LC_COLLATE=de_DE.UTF-8;LC_MONETARY=de_DE.UTF-8;LC_MESSAGES=de_DE.UTF-8;LC_PAPER=de_DE.UTF-8;LC_NAME=de_DE.UTF-8;LC_ADDRESS=de_DE.UTF-8;LC_TELEPHONE=de_DE.UTF-8;LC_MEASUREMENT=de_DE.UTF-8;LC_IDENTIFICATION=de_DE.UTF-8"
C++ level locale = "de_DE.UTF-8"
printf blåbærsyltetøy, 3.14.
cout blåbærsyltetøy, 3.14.

-------------------------------