Improve UTF-8 builds coverage in the CI jobs (PR #23313)

VZ

Mar 5, 2023, 10:36:44 AM
to wx-...@googlegroups.com, Subscribed

You can view, comment on, or merge this pull request online at:

  https://github.com/wxWidgets/wxWidgets/pull/23313

Commit Summary

  • 915072b Test UTF-8 build variant with ASAN too
  • b49b07f Don't claim that wxUSE_UNICODE_UTF8 is Unix-specific
  • 120e0f5 Add wxUSE_UNICODE_UTF8 to wx/setup.h
  • 447b609 Add a build using wxUSE_UNICODE_UTF8 to Appveyor CI
  • a1af6e4 Change one of the MSW CI builds to use UTF-8

File Changes

(15 files)

Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
Message ID: <wxWidgets/wxWidgets/pull/23313@github.com>

VZ

Mar 5, 2023, 12:55:43 PM
to wx-...@googlegroups.com, Push

@vadz pushed 2 commits.

  • 77bafdc Change one of the MSW CI builds to use UTF-8
  • 145aed1 Fix dereferencing invalid iterator in wxString in UTF-8 build


VZ

Mar 26, 2023, 12:58:06 PM
to wx-...@googlegroups.com, Push

@vadz pushed 10 commits.

  • 5cc2983 Test UTF-8 build variant with ASAN too
  • 0f0ac39 Don't claim that wxUSE_UNICODE_UTF8 is Unix-specific
  • 0677d49 Add wxUSE_UNICODE_UTF8 to wx/setup.h
  • e733e28 Add a build using wxUSE_UNICODE_UTF8 to Appveyor CI
  • ad29bdc Change one of the MSW CI builds to use UTF-8
  • 73ad17d Fix dereferencing invalid iterator in wxString in UTF-8 build
  • eb4e75f Document that range for can be used for wxString iteration
  • 1869f9e Use range for instead of iterators in unit test
  • 52e5561 Fix wxRegKey compilation in UTF-8 build
  • 133fd7f Fix wxString::GetCache() compilation in UTF-8 DLL build with MSVS


VZ

Mar 27, 2023, 12:55:37 PM
to wx-...@googlegroups.com, Push

@vadz pushed 5 commits.

  • 7fba58c Fix a comment mentioning the now removed ANSI build
  • d8cf6d0 Use the length of the buffer instead of recomputing it again
  • aea4519 Fix using wxStringOutputStream with surrogates in UTF-8 build
  • fab541a Fix another surrogate-related bug in UTF-8 build in wxString
  • b1a30e9 Fix using dangling pointer in iterator dtor in UTF-8 builds


VZ

Mar 28, 2023, 12:17:26 PM
to wx-...@googlegroups.com, Push

@vadz pushed 11 commits.

  • f9c1099 Don't assert when using "%s" with invalid char in UTF-8 build
  • 2b5dbd1 Get rid of CppUnit boilerplate in wxString unit test
  • 92649ca Fix formatting string_view in UTF-8 build
  • 378f386 Allow setting locale for the tests
  • 1ac5794 Use range-for loop over wxString in Catch::StringMaker<wxString>
  • a098fb1 Make Catch::StringMaker<wxString> more robust
  • e4e3e7e Put quotes around strings in Catch::StringMaker<wxString>
  • e48c101 Fix recognizing locales using UTF-8 charset
  • 9c9a6d4 Implement wxString::Shrink() in terms of shrink_to_fit()
  • df69397 Fix handling non-ASCII format strings in UTF-8 build
  • a441168 Reimplement wxUTF8StringBuffer correctly and more efficiently


VZ

Mar 28, 2023, 12:54:10 PM
to wx-...@googlegroups.com, Push

@vadz pushed 4 commits.

  • 3cb17c9 Fix recognizing locales using UTF-8 charset
  • 79df62f Implement wxString::Shrink() in terms of shrink_to_fit()
  • 172bedf Fix handling non-ASCII format strings in UTF-8 build
  • 66c2f1f Reimplement wxUTF8StringBuffer correctly and more efficiently


VZ

Mar 28, 2023, 1:00:59 PM
to wx-...@googlegroups.com, Push

@vadz pushed 1 commit.

  • aba4c60 Use correct path for the test on AppVeyor CI in debug builds


VZ

Mar 28, 2023, 5:06:46 PM
to wx-...@googlegroups.com, Push

@vadz pushed 5 commits.

  • a1d289f Fix recognizing locales using UTF-8 charset
  • 488950f Implement wxString::Shrink() in terms of shrink_to_fit()
  • 5a9e433 Fix handling non-ASCII format strings in UTF-8 build
  • ae13c25 Reimplement wxUTF8StringBuffer correctly and more efficiently
  • f1f612e Use correct path for the test on AppVeyor CI in debug builds


VZ

Mar 28, 2023, 5:07:42 PM
to wx-...@googlegroups.com, Subscribed

There turned out to be many more problems than I thought, but I think this should finally solve all of them, and using the UTF-8 build should now work fine under MSW too.

Any testing would be welcome, of course!


VZ

Mar 28, 2023, 7:10:50 PM
to wx-...@googlegroups.com, Subscribed

Just a note: the remaining CI failure will be fixed in master by #23393.


Randalph

Mar 30, 2023, 9:39:47 AM
to wx-...@googlegroups.com
I created an internal build of wxUiEditor using your ci-utf8 branch. All of this app's internal strings for UI and code generation are utf8, as are the xml project files it reads/writes and, of course, the XRC files it reads/writes. Everything appeared to work normally, including some Japanese utf8 test strings. I didn't observe a difference in performance since all of these utf8 strings have to be converted to UTF16 to display anyway. XRC would be theoretically faster since it's not converting ANSI xml tag names to UTF16 to do string comparisons, but that's a small part of its XRC parsing.

Note that while wxUiEditor does read/write XML files, it doesn't use wxXmlDocument, so while I assume performance of that class is vastly improved, I don't have anything that uses it to compare performance. The same is true of wxTextFile -- reading and processing utf8 files should both be faster and use significantly less memory, but I've never used it because of its automatic conversion of all files to UTF16. Anyway, if I get some extra time, I'll do some performance comparison of both of these just out of curiosity as to how much a utf8-build improves their performance.

--
You received this message because you are subscribed to the Google Groups "wx-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to wx-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/wx-dev/wxWidgets/wxWidgets/pull/23313/c1487592830%40github.com.

Vadim Zeitlin

Mar 30, 2023, 1:36:57 PM
to wx-...@googlegroups.com
On Thu, 30 Mar 2023 06:39:33 -0700 Randalph wrote:

R> I created an internal build of wxUiEditor using your ci-utf8 branch.

Thanks a lot for testing it!

R> All of this app's internal strings for UI and code generation are utf8
R> as are the xml project files it reads/writes and of course the XRC files
R> it reads/writes. Everything appeared to work normally, including some
R> Japanese utf8 test strings.

Great!

R> I didn't observe a difference in performance since all of these utf8
R> strings have to be converted to UTF16 to display anyway. XRC would be
R> theoretically faster since it's not converting ANSI xml tag names to
R> UTF16 to do string comparisons, but that's a small part of its XRC
R> parsing.

Under Windows the only efficiency gains I can see would be coming from
using wxString for storing contents not shown in the GUI -- which is, of
course, not something it's really supposed to be used for. As soon as it's
used in the GUI it's converted to UTF-16 anyhow, at least until we add
support for using UTF-8 code page with "A" variants of Windows functions,
which we don't currently support. But even then Windows would still perform
the conversion internally as it still uses UTF-16 itself, so the only gain
would be due to using their, presumably more efficient, conversion code
instead of ours.
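
For illustration, the UTF-8 to UTF-16 step that any string shown in the GUI
goes through under Windows can be sketched in standard C++ as below. This is
not the wxWidgets conversion code: DecodeOne() and Utf8ToUtf16() are
hypothetical helper names, and the sketch assumes well-formed UTF-8 input, so
it does no error handling beyond debug assertions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode one code point from well-formed UTF-8 starting
// at s[i] and advance i past it. Assumes valid input (no overlong forms,
// truncated sequences or encoded surrogates).
inline char32_t DecodeOne(const std::string& s, std::size_t& i)
{
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    std::size_t len;
    char32_t cp;
    if (lead < 0x80)              { len = 1; cp = lead; }        // ASCII
    else if ((lead >> 5) == 0x06) { len = 2; cp = lead & 0x1F; } // 110xxxxx
    else if ((lead >> 4) == 0x0E) { len = 3; cp = lead & 0x0F; } // 1110xxxx
    else
    {
        assert((lead >> 3) == 0x1E);                             // 11110xxx
        len = 4;
        cp = lead & 0x07;
    }

    assert(i + len <= s.size());

    // Each continuation byte contributes its low 6 bits.
    for (std::size_t k = 1; k < len; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);

    i += len;
    return cp;
}

// Hypothetical helper: convert a whole UTF-8 string to UTF-16 code units.
inline std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::u16string out;
    out.reserve(utf8.size()); // UTF-16 never needs more units than UTF-8 bytes

    for (std::size_t i = 0; i < utf8.size(); )
    {
        char32_t cp = DecodeOne(utf8, i);
        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points outside the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Every code point has to be decoded and re-encoded, so the cost grows with the
length of the string, whichever side ends up doing the conversion.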

Under Linux/GTK there should be normally no need to convert to UTF-32 (and
definitely not UTF-16) at all, however, so I'd expect to see some
performance advantage to using the UTF-8 build there for programs working
with long strings, as wxUiEditor probably does.

R> Anyway, if I get some extra time, I'll do some performance comparison of
R> both of these just out of curiosity as to how much a utf8-build improves
R> their performance.

This would be definitely interesting to know, thanks!
VZ

VZ

Mar 30, 2023, 1:58:06 PM
to wx-...@googlegroups.com, Subscribed

Merged #23313 into master.

