convert wxString to char*


Zhou Leo

Aug 25, 2015, 6:16:18 AM
to wx-users
Hi,
I've built a Unicode app on OS X 10.10 using wxWidgets 3.1.0. For some reason, I need to convert a Chinese wxString to a multi-byte char* directly, using the current system locale's encoding instead of UTF-8.

Here are some facts:

wxLocale::GetSystemLanguage() returns wxLANGUAGE_CHINESE_SINGAPORE.
wxLocale::GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP.

wxCSConv cs(wxLocale::GetSystemEncoding()); // IsOk() returns false
wxCSConv cs(wxFONTENCODING_GB2312); // IsOk() returns true

wxString string = wxT("some chinese characters");
printf("%s\n", (const char*) string.mb_str(cs));

Whether IsOk() returns true or false, mb_str() always returns mojibake.

Can anyone help me? Thanks!

Leo

Vadim Zeitlin

Aug 25, 2015, 6:49:49 AM
to wx-u...@googlegroups.com
On Tue, 25 Aug 2015 03:16:17 -0700 (PDT) Zhou Leo wrote:

ZL> I've built a Unicode app on OS X 10.10 using wxWidgets 3.1.0. For some
ZL> reason, I need to convert a Chinese wxString to multi-byte char* directly
ZL> using current system locale instead of UTF-8.
ZL>
ZL> Here are some facts:
ZL>
ZL> wxLocale::GetSystemLanguage() returns wxLANGUAGE_CHINESE_SINGAPORE.
ZL> wxLocale::GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP.
ZL>
ZL> wxCSConv cs(wxLocale::GetSystemEncoding()); // IsOk() returns false

This already looks like a bug. Can you check why the initialization
of wxMBConv_cf (which should be used for this encoding under OS X, if I'm
reading the code correctly) fails?

ZL> wxCSConv cs(wxFONTENCODING_GB2312); // IsOk() returns true
ZL>
ZL> wxString string = wxT("some chinese characters");
ZL> printf("%s\n", (const char*) string.mb_str(cs));
ZL>
ZL> Whether IsOk() returns true or false, mb_str() always returns mojibake.

Are you sure the original string is correct? I.e. could the "Chinese
characters" be mangled from the beginning, e.g. because your source file
doesn't use the correct encoding? If not, I don't see why this wouldn't
work. Please make the shortest possible self-contained example showing the
problem and open a Trac ticket for it if the problem is not with your
source file encoding.
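
One way to rule out a mangled source string is to look at its raw bytes instead of relying on how the terminal renders them. The following is a minimal sketch in standard C++, not wxWidgets code. `HexDump` is a hypothetical helper name, and it assumes the bytes have already been copied into a `std::string` (for example from the buffer returned by `mb_str()`).

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format every byte of `bytes` as two lowercase hex
// digits, separated by single spaces, so the encoding can be inspected
// independently of what the terminal is able to display.
std::string HexDump(const std::string& bytes)
{
    std::string out;
    char buf[3];  // two hex digits plus the terminating NUL
    for (unsigned char c : bytes)
    {
        if (!out.empty())
            out += ' ';
        std::snprintf(buf, sizeof buf, "%02x", c);
        out += buf;
    }
    return out;
}
```

Comparing the dump of the original string with the dump of the converted one shows whether the problem lies in the input, in the conversion, or only in the display.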

Regards,
VZ

--
TT-Solutions: wxWidgets consultancy and technical support
http://www.tt-solutions.com/

Zhou Leo

Aug 25, 2015, 9:11:41 AM
to wx-users
On Tuesday, August 25, 2015 at 6:49:49 PM UTC+8, Vadim Zeitlin wrote:
Thanks for your quick reply. The original string came from a wxTextCtrl by calling GetValue(). I'm sure the wxString value is correct.
I just modified the dialogs sample as follows:

void MyFrame::LineEntry(wxCommandEvent& WXUNUSED(event))
{
    wxTextEntryDialog dialog(this,
                             wxT("This is a small sample\n")
                             wxT("A long, long string to test out the text entrybox"),
                             wxT("Please enter a string"),
                             wxT("Default value"),
                             wxOK | wxCANCEL);

    if (dialog.ShowModal() == wxID_OK)
    {
        wxString str = dialog.GetValue();

        // The message box shows the entered string correctly.
        wxMessageBox(str, wxT("Got string"), wxOK | wxICON_INFORMATION, this);

        // Convert to GB2312 and print the resulting bytes to stdout.
        wxCSConv cs(wxT("GB2312"));
        printf("Got string = %s\n", (const char*)str.mb_str(cs));
    }
}


wxTextEntryDialog has the same issue. When I type some Chinese characters into the text control, the string printed out is just mojibake. I think I will open a Trac ticket. Thanks a lot!

Leo

Zhou Leo

Aug 25, 2015, 9:20:35 AM
to wx-users
On Tuesday, August 25, 2015 at 6:49:49 PM UTC+8, Vadim Zeitlin wrote:

I also found that when I start a wx executable directly on OS X instead of through the .app bundle, I cannot type any characters other than English ones into the text control, even though the input language is changed successfully.

Vadim Zeitlin

Aug 25, 2015, 10:12:05 AM
to wx-u...@googlegroups.com
On Tue, 25 Aug 2015 06:20:35 -0700 (PDT) Zhou Leo wrote:

ZL> I also found that when I start a wx executable directly on OS X instead
ZL> of through the .app bundle

This is not supported (by OS X, not wx) anyhow, so we're not responsible
for whatever happens when you don't use a bundle.

Vadim Zeitlin

Aug 25, 2015, 10:13:42 AM
to wx-u...@googlegroups.com
On Tue, 25 Aug 2015 06:11:41 -0700 (PDT) Zhou Leo wrote:

ZL> wxTextEntryDialog also has the same issue. When I type in some Chinese
ZL> characters in the text control, the character string printed out is just
ZL> mojibake. I think I will open a Trac ticket. Thanks a lot!

The trouble is that I would have difficulty typing Chinese characters, as I
don't read or write Chinese. So a simple self-contained console
application showing the problem would be preferred. Of course, if the
problem doesn't occur in console applications but only in GUI ones, then
that wouldn't help, but at least we'd have narrowed it down.

Zhou Leo

Aug 25, 2015, 11:18:10 AM
to wx-users
On Tuesday, August 25, 2015 at 10:13:42 PM UTC+8, Vadim Zeitlin wrote:
The problem exists in both console and GUI apps. I've modified the console sample. I use printf() to print two Chinese character strings. One is converted from a wxString using the default system encoding, and it cannot be displayed. The other is a C string literal, and it is displayed correctly. Thanks!
console.cpp

Vadim Zeitlin

Aug 28, 2015, 7:48:38 PM
to wx-u...@googlegroups.com
On Tue, 25 Aug 2015 08:18:10 -0700 (PDT) Zhou Leo wrote:

ZL> The problem exists in both console and GUI apps. I've modified the console
ZL> sample. I use printf() to print two Chinese character strings. One is
ZL> converted from wxString using default system encoding, which cannot be
ZL> shown. The other is a C string literal, which is shown correctly. Thanks!

I've finally tried the program under OS X, but I don't understand what
problem I'm supposed to be seeing... Obviously, I could be missing
something as I don't read Chinese, but let's take the first character of
your string, "欢" (U+6B22, "e6 ac a2" in UTF-8). Converted to simplified
Chinese (GB2312) it becomes "bb b6", which is exactly what I see in the
output of your program. Of course, it does appear as "??" in the terminal,
but that's just because the terminal uses UTF-8...
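
To make the byte values above concrete, here is a minimal sketch in standard C++ (no wxWidgets involved) that encodes U+6B22 as UTF-8 and checks whether a byte sequence is structurally valid UTF-8. `EncodeUtf8` and `LooksLikeUtf8` are hypothetical helper names, and the check is deliberately simplified: it does not reject overlong forms or surrogates. The GB2312 bytes "bb b6" fail the check because 0xbb can only be a continuation byte in UTF-8, which is consistent with a UTF-8 terminal showing "??".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8. The caller is assumed to pass a
// valid scalar value (at most U+10FFFF, not a surrogate).
std::string EncodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Structural check only: each lead byte must be followed by the expected
// number of continuation bytes of the form 10xxxxxx.
bool LooksLikeUtf8(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;
        if (lead < 0x80)                extra = 0;
        else if ((lead & 0xE0) == 0xC0) extra = 1;
        else if ((lead & 0xF0) == 0xE0) extra = 2;
        else if ((lead & 0xF8) == 0xF0) extra = 3;
        else return false;  // stray continuation byte or invalid lead byte

        if (extra > bytes.size() - i - 1)
            return false;   // sequence is cut off at the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With these helpers, `EncodeUtf8(0x6B22)` yields the three bytes e6 ac a2, while `LooksLikeUtf8("\xbb\xb6")` is false.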

So what is the problem exactly, once again? And could you please
demonstrate it with a single character to make things simpler for me?

Thanks,

Zhou Leo

Aug 31, 2015, 5:52:37 AM
to wx-users
On Saturday, August 29, 2015 at 7:48:38 AM UTC+8, Vadim Zeitlin wrote:
Hi, I'd like to convert a wxString to a multi-byte char* using the current system locale, not by specifying "GB2312" directly. I'm not sure how to do it.
For now, I construct a wxCSConv object as follows:

wxCSConv cs(wxLocale::GetSystemEncoding());

On my system, wxLocale::GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP, but mb_str(cs) returns an empty string. If my method is wrong, could you tell me how I can get the appropriate charset to construct wxLocale based on the current OS, instead of specifying it directly? Thanks!


Leo

Zhou Leo

Aug 31, 2015, 5:55:15 AM
to wx-users
On Saturday, August 29, 2015 at 7:48:38 AM UTC+8, Vadim Zeitlin wrote:
Hi, I'd like to convert a wxString to a multi-byte char* using the current system locale, not by specifying "GB2312" directly. I'm not sure how to do it.
For now, I construct a wxCSConv object as follows:

wxCSConv cs(wxLocale::GetSystemEncoding());

On my system, wxLocale::GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP, but mb_str(cs) returns an empty string. If my method is wrong, could you tell me how I can get the appropriate charset to construct wxCSConv based on the current OS, instead of specifying it directly? Thanks!


Leo 

Vadim Zeitlin

Aug 31, 2015, 3:50:53 PM
to wx-u...@googlegroups.com
On Mon, 31 Aug 2015 02:55:15 -0700 (PDT) Zhou Leo wrote:

ZL> Hi, I'd like to convert a wxString to multi-byte char* using current system
ZL> locale, not using "GB2312" directly. I'm not sure how to do it.
ZL> For now, I construct a wxCSConv object as follows:
ZL>
ZL> wxCSConv cs(wxLocale::GetSystemEncoding());
ZL>
ZL> In my system, wxLocale::GetSystemEncoding()
ZL> returns wxFONTENCODING_MACCHINESESIMP, but the mb_str(cs) returns an empty
ZL> string.

Then I can't reproduce this: it definitely returns a non-empty string for
me. I've tested under 10.8. What is your system, and can you trace inside
the conversion function to check what is going wrong?

ZL> If my method is wrong, could you tell me how I can get appropriate
ZL> charset to construct wxCSConv based on current OS, instead of specifying it
ZL> directly.

Using GetSystemEncoding() directly maps to the corresponding CF function.
I'm not sure what exactly this returns though, so maybe it's not the
right one to use.

Anyhow, I'm still confused about what the problem is. Is it that
GetSystemEncoding() returns a wrong value for you? Or is it that wxCSConv
doesn't work? Or something else?

Regards,

Zhou Leo

Aug 31, 2015, 9:58:08 PM
to wx-users
On Tuesday, September 1, 2015 at 3:50:53 AM UTC+8, Vadim Zeitlin wrote:
Sorry for the unclear description. My system is OS X 10.10 in Simplified Chinese. I just want to get the appropriate charset to create a wxCSConv object, without using the constant string wxT("GB2312") directly. My clients' systems may not be in Chinese. I have no idea how to map the current system language to the appropriate charset. Or can I get it some other way? GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP on my system, not wxFONTENCODING_GB2312, which is what I expected. Is there any API I can use to get the correct charset? Thanks!

Leo

Vadim Zeitlin

Sep 1, 2015, 9:04:24 AM
to wx-u...@googlegroups.com
On Mon, 31 Aug 2015 18:58:07 -0700 (PDT) Zhou Leo wrote:

ZL> Sorry for the unclear description. My system is OS X 10.10 in Simplified
ZL> Chinese. I just want to get the appropriate charset to create wxCSConv
ZL> object, not using the constant string wxT("GB2312") directly. My clients'
ZL> systems may be not in Chinese. I have no idea how to map current system
ZL> language to the appropriate charset. Or can I get it using some other
ZL> methods? GetSystemEncoding() returns wxFONTENCODING_MACCHINESESIMP on my
ZL> system, not wxFONTENCODING_GB2312 which I expect.

Why do you think wxFONTENCODING_MACCHINESESIMP is incorrect in the first
place? I believe it's the same as GB18030 which is just a later/updated
version of GB2312.

Sorry but I still completely fail to see the problem...

Zhou Leo

Sep 1, 2015, 9:21:51 AM
to wx-users
On Tuesday, September 1, 2015 at 9:04:24 PM UTC+8, Vadim Zeitlin wrote:
On my system, wxFONTENCODING_MACCHINESESIMP doesn't work. I don't know why. I've implemented a wrapper for GetSystemEncoding() that just returns wxFONTENCODING_GB2312 when the system encoding is wxFONTENCODING_MACCHINESESIMP. At least this trick works for me. Anyway, thanks for your reply!
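
The wrapper idea can be generalized to "try the candidates in order and take the first one that works". The following is only a sketch of that pattern in standard C++, not the actual wrapper and not a wxWidgets API. `FirstUsableEncoding` is a hypothetical name. With wxWidgets, `Enc` would presumably be wxFontEncoding, the candidates would be the value of wxLocale::GetSystemEncoding() followed by wxFONTENCODING_GB2312, and the predicate would be an assumption of your own, for example one that constructs a wxCSConv for the candidate and checks that it is usable.

```cpp
#include <vector>

// Hypothetical helper: return the first candidate accepted by `isUsable`,
// or `fallback` when no candidate passes the check.
//
// Enc  - the encoding identifier type (an enum or int in this sketch)
// Pred - any callable taking an Enc and returning bool
template <typename Enc, typename Pred>
Enc FirstUsableEncoding(const std::vector<Enc>& candidates,
                        Pred isUsable,
                        Enc fallback)
{
    for (const Enc& enc : candidates)
    {
        if (isUsable(enc))
            return enc;
    }
    return fallback;
}
```

Keeping the usability check separate from the candidate list means the same function can serve systems where the reported system encoding works as is, since it is then simply the first candidate accepted.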

Leo

Vadim Zeitlin

Sep 1, 2015, 9:28:20 AM
to wx-u...@googlegroups.com
On Tue, 1 Sep 2015 06:21:50 -0700 (PDT) Zhou Leo wrote:

ZL> On my system, wxFONTENCODING_MACCHINESESIMP doesn't work. I don't know
ZL> why.

You really should try debugging it...