_() macro in wxNO_IMPLICIT_WXSTRING_ENCODING situations partially broken (Issue #24713)


Duncan

Jul 19, 2024, 6:14:28 AM
to wx-...@googlegroups.com, Subscribed

Description

When wxNO_IMPLICIT_WXSTRING_ENCODING is defined, the _() macro expands to wxGetTranslation(wxASCII_STR(s)), which breaks for any string that is not ASCII or that is a wide char literal. This means you cannot use anything non-ASCII in your base language.

Expected vs observed behaviour:

The classic example of this in our codebase is _("Angle (\u00b0)"). This triggers the assert "Non-ASCII value passed to FromAscii()".
And if you were to use _(L"Angle (\u00b0)"), that is now a compile error.
It would be more appropriate for the _() macro to be disabled entirely under wxNO_IMPLICIT_WXSTRING_ENCODING, or at least for the documentation to state that it cannot be used with non-ASCII strings, i.e. that it implies ASCII.

Personally I would prefer it be changed to use wxString::FromUTF8 instead of wxASCII_STR, as practically all code these days is UTF-8 encoded (yes, even on Windows if you use the /utf-8 flag), but that wouldn't solve the wide char issue.

Patch or snippet allowing to reproduce the problem:

#define wxNO_IMPLICIT_WXSTRING_ENCODING
#include <wx/wx.h>

wxString str = _("\u00b0");  // asserts: Non-ASCII value passed to FromAscii()
// wxString w = _(L"a");     // compile error

A potential fix would be to add an explicit macro for the encoding:

#define _u8(s) wxGetTranslation(wxString::FromUTF8((s)))

wxString str = _u8("\u00b0");

Although that might be too close to the u8 string literal.

Platform and version information

  • wxWidgets version you use: 3.2.5
  • wxWidgets port you use: wxMSW
  • OS and its version: Windows 11


Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
Message ID: <wxWidgets/wxWidgets/issues/24713@github.com>

VZ

Jul 20, 2024, 11:12:12 AM
to wx-...@googlegroups.com, Subscribed

Thanks, this is indeed not great.

I agree that it would make sense to use FromUTF8() rather than FromAscii() in _(), as there doesn't seem to be any danger of breaking anything: either the string was in ASCII before and it would still work, or it wasn't and then it would either start to work or remain broken.

To deal with the _(L"foo") case (does this really occur in the wild? I don't think I've ever seen it), we could add a wxMsgIdString() function with overloads for const char*, for which it would return wxString::FromUTF8(), and for const wchar_t*, for which it would just return wxString(). Such a function could also check whether FromUTF8() returns an empty string, indicating a conversion error, and assert in this case.
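
A minimal sketch of that overload idea follows, written against the standard library only so that it builds without wxWidgets. Everything in it is a hypothetical stand-in: std::wstring plays the role of wxString, DecodeUtf8() imitates the "empty string on conversion error" behaviour of wxString::FromUTF8(), and MsgIdString() is an illustrative name, not an existing wx function. The decoder is deliberately simplified and does not reject overlong encodings.

```cpp
#include <cassert>
#include <string>

// Stand-in for wxString::FromUTF8(): returns an empty string on malformed input.
// Simplified: overlong forms and encoded surrogates are not rejected.
inline std::wstring DecodeUtf8(const char* s)
{
    std::wstring out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    while (*p) {
        unsigned long cp;
        int extra;
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return std::wstring();   // stray continuation or invalid lead byte
        ++p;
        for (int i = 0; i < extra; ++i, ++p) {
            // Also catches a sequence cut short by the terminating NUL.
            if ((*p & 0xC0) != 0x80) return std::wstring();
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp > 0x10FFFF) return std::wstring();
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t (e.g. Windows): emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}

// Narrow literals are interpreted as UTF-8.
inline std::wstring MsgIdString(const char* s)
{
    std::wstring result = DecodeUtf8(s);
    // A non-empty input that decodes to nothing was not valid UTF-8.
    assert((*s == '\0' || !result.empty()) && "msgid is not valid UTF-8");
    return result;
}

// Wide literals are passed through unchanged.
inline std::wstring MsgIdString(const wchar_t* s)
{
    return std::wstring(s);
}
```

With these two overloads, MsgIdString("Angle (\xC2\xB0)") and MsgIdString(L"Angle (\u00b0)") both compile and yield the same string, which is what a _() built on such a helper would need. Plain ASCII input goes through the single-byte branch unchanged, matching the point above that switching from FromAscii() to FromUTF8() should not break existing ASCII msgids.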

Thanks in advance for any thoughts/objections/PRs!


