Fix data corruption in wxPrintfFormatConverterUtf8 (PR #26236)


Randalphwa

Feb 23, 2026, 5:48:58 PM
to wx-...@googlegroups.com, Subscribed

Bug Fix: wxDoVsnprintf treats %s as wchar_t* in UTF-8 builds

Problem

When wxUSE_UNICODE_UTF8=1 and wx's own wxDoVsnprintf<char> is used (i.e., the system does not provide _vsprintf_p or a POSIX vsnprintf with positional parameter support), %s arguments are misinterpreted as wchar_t* instead of char*. This causes string corruption — UTF-8 byte pairs are reinterpreted as wide characters (e.g., "https:" becomes "瑨灴㩳").
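To make the corruption concrete, here is a minimal sketch that pairs up the bytes of a narrow string the way a misread 16-bit wchar_t* would see them. It assumes a 16-bit wchar_t and little-endian byte order (as on Windows); reinterpretAsUtf16Units is a hypothetical helper for illustration, not part of wxWidgets.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: view a byte string as 16-bit code units, the way a
// char* misread as a 16-bit wchar_t* appears on a little-endian machine.
std::vector<std::uint16_t> reinterpretAsUtf16Units(const std::string& bytes)
{
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2)
    {
        // Assemble each unit explicitly as little-endian so the result does
        // not depend on the byte order of the machine running this sketch.
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

For "https:" this yields the code units 0x7468, 0x7074 and 0x3A73, which are the three characters shown in the example above.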

Conditions That Trigger the Bug

All of these must be true simultaneously:

  1. wxUSE_UNICODE_UTF8=1
  2. wxUSE_PRINTF_POS_PARAMS=1
  3. HAVE_UNIX98_PRINTF is not defined
  4. _MSC_FULL_VER is not defined (i.e., not compiled with MSVC)

This configuration leaves wxCRT_VsnprintfA undefined in include/wx/wxcrtvararg.h, so wx falls back to its own implementation in src/common/wxprintf.cpp. In practice, this affects builds with Clang (or GCC) under MinGW on Windows in UTF-8 mode.

MSVC avoids the bug because _MSC_FULL_VER is defined, routing wxCRT_VsnprintfA to _vsprintf_p (native CRT). Linux/macOS typically define HAVE_UNIX98_PRINTF, which also avoids the bug.
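The four conditions can be summarized as a single predicate. This is only a sketch of the list above: BuildConfig and hitsBug are hypothetical names, and the code models the conditions as described here rather than reproducing the actual preprocessor logic in include/wx/wxcrtvararg.h.

```cpp
// Hypothetical model of the four trigger conditions; each field stands for
// one build-time setting named in the list above.
struct BuildConfig
{
    bool unicodeUtf8;      // wxUSE_UNICODE_UTF8 == 1
    bool printfPosParams;  // wxUSE_PRINTF_POS_PARAMS == 1
    bool haveUnix98Printf; // HAVE_UNIX98_PRINTF is defined
    bool isMsvc;           // _MSC_FULL_VER is defined
};

// True when all four conditions hold, i.e. no native vsnprintf with
// positional-parameter support is selected and wx's own implementation
// formats UTF-8 strings.
constexpr bool hitsBug(const BuildConfig& c)
{
    return c.unicodeUtf8 && c.printfPosParams
        && !c.haveUnix98Printf && !c.isMsvc;
}
```

Under this model, a MinGW UTF-8 build ({true, true, false, false}) hits the bug, while an MSVC build or a build with HAVE_UNIX98_PRINTF defined does not.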

Root Cause

File: include/wx/private/wxprintf.h

In wxPrintfConvSpec<CharType>::Parse() (~line 436), plain %s (no h or l prefix, so ilen == 0) is classified as wxPAT_PWCHAR:

case wxT('s'):
    if (ilen == -1)
    {
        // wx extension: we'll let %hs mean non-Unicode strings
        m_type = wxPAT_PCHAR;
    }
    else
    {
        // %ls == %s == Unicode string
        m_type = wxPAT_PWCHAR;
    }

Then in Process() (~line 670), wxPAT_PWCHAR casts the void* argument to const wchar_t*:

else // m_type == wxPAT_PWCHAR
    s.assign(static_cast<const wchar_t *>(p->pad_str));

But the actual argument is a char* (UTF-8 string from wxString::wx_str()). The cast reinterprets pairs of UTF-8 bytes as wide characters, which corrupts the string.

File: src/common/strvararg.cpp

The format converter that should prevent this is wxPrintfFormatConverterUtf8::HandleString() (~line 476):

class wxPrintfFormatConverterUtf8 : public wxFormatConverterBase<char>
{
    virtual void HandleString(CharType WXUNUSED(conv),
                              SizeModifier WXUNUSED(size),
                              CharType& outConv, SizeModifier& outSize) override
    {
        outConv = 's';
        outSize = Size_Default;  // ← BUG: produces plain %s
    }

It outputs Size_Default (no prefix), so %s passes through unchanged. For wxDoVsnprintf<char> to treat the argument as char*, the format would need %hs (Size_Short → ilen = -1 → wxPAT_PCHAR).
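The interaction between the converter and the parser can be modelled in a few lines. The sketch below uses hypothetical stand-ins (SizeModifier, ArgType, prefixFor, classifyString) for the wx names, and it assumes that the parser decrements ilen for each 'h' and increments it for each 'l'; only the final comparison is taken from the Parse() excerpt above.

```cpp
#include <string>

// Hypothetical, simplified stand-ins for the wx types named in the text.
enum class SizeModifier { Default, Short, Long };
enum class ArgType { PChar, PWChar };

// Length-modifier prefix that a format converter would emit before 's'.
std::string prefixFor(SizeModifier size)
{
    switch (size)
    {
        case SizeModifier::Short:   return "h";
        case SizeModifier::Long:    return "l";
        case SizeModifier::Default: break;
    }
    return "";
}

// Models the Parse() excerpt: only ilen == -1 (a single 'h') selects char*.
// Assumption: 'h' decrements ilen and 'l' increments it.
ArgType classifyString(const std::string& prefix)
{
    int ilen = 0;
    for (char c : prefix)
    {
        if (c == 'h')
            --ilen;
        else if (c == 'l')
            ++ilen;
    }
    return ilen == -1 ? ArgType::PChar : ArgType::PWChar;
}
```

In this model, classifyString(prefixFor(SizeModifier::Default)) is ArgType::PWChar, which is the bug, while SizeModifier::Short gives ArgType::PChar.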

Fix

In src/common/strvararg.cpp, change wxPrintfFormatConverterUtf8::HandleString() to output Size_Short so that %s becomes %hs, which wxDoVsnprintf<char> correctly maps to wxPAT_PCHAR (char*):

class wxPrintfFormatConverterUtf8 : public wxFormatConverterBase<char>
{
    virtual void HandleString(CharType WXUNUSED(conv),
                              SizeModifier WXUNUSED(size),
                              CharType& outConv, SizeModifier& outSize) override
    {
        outConv = 's';
#if wxUSE_WXVSNPRINTFA
        // When using wx's own vsnprintf implementation, plain %s is
        // interpreted as wchar_t*. We need %hs to indicate char* args,
        // which is what we actually pass in UTF-8 builds.
        outSize = Size_Short;
#else
        outSize = Size_Default;
#endif
    }

The guard wxUSE_WXVSNPRINTFA is 1 only when wx's own implementation is active (the exact condition that triggers the bug). When a native CRT function handles formatting, Size_Default remains correct.
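As a rough illustration of the effect on a format string when wxUSE_WXVSNPRINTFA is 1, the following sketch rewrites every plain %s to %hs. addShortPrefixToPlainS is a hypothetical, heavily simplified helper: it skips flags, width and precision, but it does not handle positional specifiers such as %1$s or any other conversion, all of which the real converter in src/common/strvararg.cpp has to deal with.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified rewriter: inserts 'h' before the 's' of every
// conversion that has no length modifier. Not the wxWidgets implementation.
std::string addShortPrefixToPlainS(const std::string& fmt)
{
    const std::string flagsWidthPrecision = "-+ #0123456789.*";
    std::string out;
    std::size_t i = 0;
    while (i < fmt.size())
    {
        const char c = fmt[i++];
        out += c;
        if (c != '%')
            continue;

        if (i < fmt.size() && fmt[i] == '%')
        {
            out += fmt[i++]; // "%%" is a literal percent sign
            continue;
        }

        // Copy flags, width and precision through unchanged.
        while (i < fmt.size() &&
               flagsWidthPrecision.find(fmt[i]) != std::string::npos)
            out += fmt[i++];

        // An 's' reached without a length modifier becomes "hs"; the 's'
        // itself is copied by the next loop iteration.
        if (i < fmt.size() && fmt[i] == 's')
            out += 'h';
    }
    return out;
}
```

With this sketch, "<A>%s</A>" becomes "<A>%hs</A>", while "%ls", "%hs" and "%d" pass through unchanged.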

How to Verify

The corruption is visible in any wxString::Format("%s", ...) call when compiled under the triggering conditions. A simple test:

wxString url("https://example.com");
wxString result = wxString::Format("<A>%s</A>", url);
// Without fix: result contains garbled Chinese characters
// With fix: result == "<A>https://example.com</A>"

The specific symptom observed was in wxHyperlinkCtrl — the URL label passed to MSWCreateControl via GetLabelForSysLink() (in src/msw/hyperlink.cpp) was corrupted because it uses wxString::Format.

The corruption also showed up in menus whose labels include the accelerator, which broke wxUiEditor's menus and About box. Applying the fix resolved all of the corruption I saw after switching compilers on Windows from MSVC to MinGW-clang.


You can view, comment on, or merge this pull request online at:

  https://github.com/wxWidgets/wxWidgets/pull/26236

Commit Summary

  • 6f26779 Fix data corruption in wxPrintfFormatConverterUtf8

File Changes

(1 file)


Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
Message ID: <wxWidgets/wxWidgets/pull/26236@github.com>

VZ

Feb 24, 2026, 10:44:24 AM
to wx-...@googlegroups.com, Subscribed
vadz left a comment (wxWidgets/wxWidgets#26236)

Thanks, I'm not entirely sure this is the right place to fix this (could it be done in wxVsnprintf() instead?), but I'll push your fix to at least fix the problem, and it can always be improved later.



VZ

Feb 24, 2026, 10:53:45 AM
to wx-...@googlegroups.com, Subscribed

Closed #26236 via c90aaa3.


