Hi guys, I'm slightly confused about something. Does std::string not support Unicode? I have STL enabled in my wxWidgets build. I did that because I have some backend classes that use std::string and I didn't want to keep converting between std::string and wxString. Turning on STL meant my strings could be interchangeable.
However, I want to work with locales, so I switched to the Unicode build of wxWidgets. With that build the strings are no longer interchangeable.
So if I want to use Unicode, will I have to use wxString and perform a conversion?
I've been looking through the internationalization section of the book, but I've ended up confused :)
std::string is simply a typedef for a specialisation of the class template std::basic_string, instantiated for the char type. Similarly, std::wstring is the same class template instantiated for wchar_t.
The wide char type would, I guess, help with some aspects of Unicode, but it is not designed to support Unicode as such.
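To make the typedef point concrete, here is a minimal sketch in plain standard C++ (no wxWidgets involved). The helper names code_units, e_acute_utf8 and e_acute_wide are made up for this illustration. It only shows that both string types come from the same template and that size() counts storage elements, not characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Both standard string types are typedefs of the same class template.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is basic_string<wchar_t>");

// Hypothetical helper: number of storage elements (not characters) held.
template <typename CharT>
std::size_t code_units(const std::basic_string<CharT>& s) {
    return s.size();
}

// U+00E9 (e with acute accent) spelled out as its two UTF-8 bytes.
inline std::string e_acute_utf8() { return "\xC3\xA9"; }

// The same character stored as one wide element.
inline std::wstring e_acute_wide() {
    return std::wstring(1, static_cast<wchar_t>(0xE9));
}

// Usage: one character, but a different element count per string type.
inline void usage_example() {
    assert(code_units(e_acute_utf8()) == 2);  // two char elements
    assert(code_units(e_acute_wide()) == 1);  // one wchar_t element
}
```

Neither type knows which encoding its elements are in; that knowledge has to live in the surrounding code.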
HTH
Alec
Yes, it's correct that std::string doesn't support Unicode. Unicode requires more than one byte per character (except for ASCII characters in UTF-8), so wxChar becomes either 16-bit or 32-bit (depending on the platform). You'll have to transcode everything that you need to pass between std::string and wxString, and you'll have to wrap all of your literals in the wxT() macro. See the topic overviews on Unicode support and on conversion between Unicode and multibyte strings.
There's an alternative, but it requires that you switch to the development branch. The Unicode support for 2.9/3.0 has been rewritten to use UTF-8, which uses 8-bit wxChar and is therefore somewhat interoperable with std::string if you're careful about it.
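As a rough illustration of "more than one byte per character", the sketch below encodes a single code point as UTF-8 using only the standard library. encode_utf8 is a hypothetical helper written for this example, not a wxWidgets function; real code would normally rely on the library's converters.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // ASCII: 1 byte
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return out;                    // surrogates are not characters
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Usage: ASCII stays one byte, everything else needs two to four.
inline void usage_example() {
    assert(encode_utf8('A').size() == 1);
    assert(encode_utf8(0x00E9) == "\xC3\xA9");          // e with acute accent
    assert(encode_utf8(0x20AC) == "\xE2\x82\xAC");      // euro sign
    assert(encode_utf8(0x1F600) == "\xF0\x9F\x98\x80"); // an emoji
}
```

This is why a std::string can carry UTF-8 text as bytes even though its size() and indexing work on bytes rather than characters.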
Regards, John Ralls
On Apr 8, 2008, at 6:44 AM, Declan McMullen wrote:
> Hi Guys,
> I'm slightly confused about something. Does std::string not support
> unicode? I have stl enabled in my wxwidgets compilation. The reason I did
> it is because I have some backend classes that use std::string and I didnt
> want to be converting to and from std::string to wxString. Turning on stl
> meant my strings could be interchangeable.
>
> However I want to work with locales so I went to use the unicode build of
> wx widgets, this however fails to allow me to have interchangeable strings.
>
> So if I want to use unicode will I have to use wxString and perform a
> conversion?
>
> I've been looking through the internationalization section of the book but
> i've ended up confused :)
I think you could make it even more complicated but I must admit that at first glance I don't see how. I'd also strongly recommend deciding upon your std::string encoding instead of blindly trying UTF-8 first and falling back to ASCII later (and what happens if it's in Latin-1? or KOI8-R?) -- and it also would be better if conversion in both directions were symmetric, which is not the case here. And I won't even mention nice pessimizations like not passing strings by reference and especially creating a temporary object on the heap instead of just using wxConvUTF8...
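One way to read the "decide upon your encoding" advice: if the std::string side is declared to be UTF-8, input that is not valid UTF-8 should be rejected explicitly rather than retried under a different guess. The following is only a sketch in standard C++. is_valid_utf8 is a hypothetical helper, not part of wxWidgets, and it takes its argument by const reference as suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `s` is well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;   // continuation bytes expected after b
        unsigned long cp = 0;    // code point being assembled
        unsigned long min = 0;   // smallest value allowed for this length
        if (b < 0x80)                { extra = 0; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;           // stray continuation or invalid lead byte
        if (n - i - 1 < extra) return false;  // sequence is cut short
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                       // overlong encoding
        if (cp > 0x10FFFF) return false;                  // outside Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        i += extra + 1;
    }
    return true;
}

// Usage: the same word in UTF-8 passes, in Latin-1 it is rejected.
inline void usage_example() {
    assert(is_valid_utf8("caf\xC3\xA9"));   // UTF-8 bytes
    assert(!is_valid_utf8("caf\xE9"));      // Latin-1 byte for the same letter
}
```

A check like this makes the failure visible at the boundary; what to do with rejected input (report an error, or convert from a known legacy encoding) is a separate decision for the application.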
Declan McMullen wrote:
> Any tips on how I should do it?
We are using the following in FlameRobin:
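//-----------------------------------------------------------------------------
std::string wx2std(const wxString& input, wxMBConv* conv = 0)
{
    if (input.empty())
        return "";
    if (!conv)
        conv = wxConvCurrent;
    const wxWX2MBbuf buf(input.mb_str(*conv));
    // conversion may fail and return 0,
    // which isn't a safe value to pass
    // to std::string constructor
    if (!buf)
        return "";
    return std::string(buf);
}
//-----------------------------------------------------------------------------
wxString std2wx(const std::string& input, wxMBConv* conv = 0)
{
    if (input.empty())
        return wxEmptyString;
    if (!conv)
        conv = wxConvCurrent;
    return wxString(input.c_str(), *conv);
}
//-----------------------------------------------------------------------------

Any comments are appreciated.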
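The null check in wx2std matters because constructing a std::string from a null const char* is undefined behaviour. A minimal standard C++ sketch of the same guard, with a hypothetical helper name from_c_str that is not part of FlameRobin or wxWidgets:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a std::string from a C string that may be null.
// A null pointer yields an empty string instead of undefined behaviour.
std::string from_c_str(const char* p) {
    return p ? std::string(p) : std::string();
}

// Usage: a failed conversion (null result) becomes an empty string.
inline void usage_example() {
    assert(from_c_str(nullptr).empty());
    assert(from_c_str("abc") == "abc");
}
```

Note that returning an empty string makes a failed conversion indistinguishable from an empty input; whether that is acceptable depends on the caller.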
wxString::Length() returns the number of characters in the unconverted string, not the number of bytes, so this will only work for ASCII characters. Why don't you simply use:
On Wed, 09 Apr 2008 12:12:28 +0200 Milan Babuskov <mil...@panonnet.net> wrote:
MB> Declan McMullen wrote:
MB> > Any tips on how I should do it ?
MB>
MB> We are using the following in FlameRobin:
MB>
MB> //-----------------------------------------------------------------------------
MB> std::string wx2std(const wxString& input, wxMBConv* conv = 0)
MB> {
MB>     if (input.empty())
MB>         return "";
MB>     if (!conv)
MB>         conv = wxConvCurrent;
MB>     const wxWX2MBbuf buf(input.mb_str(*conv));
MB>     // conversion may fail and return 0,
MB>     // which isn't a safe value to pass
MB>     // to std:string constructor
MB>     if (!buf)
MB>         return "";
MB>     return std::string(buf);
MB> }
MB> //-----------------------------------------------------------------------------
MB> wxString std2wx(const std::string& input, wxMBConv* conv = 0)
MB> {
MB>     if (input.empty())
MB>         return wxEmptyString;
MB>     if (!conv)
MB>         conv = wxConvCurrent;
MB>     return wxString(input.c_str(), *conv);
MB> }
MB> //-----------------------------------------------------------------------------
MB>
MB> Any comments are appreciated.
This code is correct, although I'm not sure why you think it's necessary to test for an empty string explicitly. Normally you should be able to remove the checks for input.empty() without any ill effects.
Vadim Zeitlin wrote:
> This code is correct although I'm not sure why do you think it's necessary
> to test for empty string explicitly, normally you should be able to remove
> the checks for input.empty() without any ill effects.
We used to have some more complex code that allocated buffers and such, so if the string was empty it seemed efficient to skip all that. I guess the checks could be removed now.