std::string and unicode

Declan McMullen

Apr 8, 2008, 9:44:38 AM
Hi guys,
I'm slightly confused about something. Does std::string not support Unicode?
I have STL enabled in my wxWidgets build. I did that because
I have some backend classes that use std::string and I didn't want to keep converting
between std::string and wxString. Turning on STL meant my strings could be interchangeable.

However, I want to work with locales, so I switched to the Unicode build of wxWidgets, and
with that build the strings are no longer interchangeable.

So if I want to use Unicode, will I have to use wxString and perform a conversion?

I've been looking through the internationalization section of the book but I've ended up confused :)

Any guidance much appreciated.



--
http://www.computing.dcu.ie/~dmcmullen
declan....@computing.dcu.ie
School of Computing
Postgrad Bay A

Alec Ross

Apr 8, 2008, 10:31:27 AM
Hi,

std::string is simply a typedef for std::basic_string<char>, i.e. the class
template std::basic_string instantiated for the char type. Similarly,
std::wstring is the same template instantiated for wchar_t.

The wide char type would, I guess, help with some aspects of Unicode,
but it is not designed to support Unicode as such.
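
A minimal standard C++ sketch of the point above, independent of wxWidgets: the two string types are the same template over different character types, and neither one counts Unicode characters. The helper names e_acute_utf8 and e_acute_wide are made up for this illustration.

```cpp
#include <string>
#include <type_traits>

// Both string types are instantiations of the same class template.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is basic_string<wchar_t>");

// Hypothetical helper: U+00E9 (e with acute accent) as hand-written UTF-8
// bytes, so the result does not depend on the source file encoding.
inline std::string e_acute_utf8() { return "\xC3\xA9"; }

// Hypothetical helper: the same character as a wide string.
inline std::wstring e_acute_wide() { return L"\u00E9"; }
```

Here e_acute_utf8().size() is 2 (two UTF-8 bytes) and e_acute_wide().size() is 1 (one wchar_t): size() counts storage units of the underlying character type, not characters.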

HTH

Alec

--
Alec Ross

John Ralls

Apr 8, 2008, 12:02:25 PM
Yes, it's correct that std::string doesn't support Unicode. Unicode requires more than one byte per character (except for ASCII characters in UTF-8), so in a Unicode build wxChar becomes either 16-bit or 32-bit (depending on the platform). You'll have to transcode everything that you need to pass between std::string and wxString, and you'll have to wrap all of your literals in the wxT() macro. See the topic overviews "Unicode support" and "Conversion between Unicode and multibyte strings".

There's an alternative, but it requires that you switch to the development branch. The Unicode support for 2.9/3.0 has been rewritten to use UTF-8, which uses 8-bit wxChar and is therefore somewhat interoperable with std::string if you're careful about it.

Regards,
John Ralls




Declan McMullen

Apr 8, 2008, 12:31:59 PM
Cheers guys.

Declan McMullen

Apr 8, 2008, 1:20:50 PM
This guy actually wrote two nice little functions

Vadim Zeitlin

Apr 8, 2008, 1:31:16 PM
On Tue, 8 Apr 2008 18:20:50 +0100 Declan McMullen <declan....@gmail.com> wrote:

DM> This guy actually wrote two nice little functions
DM> http://www.kangmaman.com/node/131

I think you could make it even more complicated but I must admit that at
first glance I don't see how. I'd also strongly recommend deciding upon
your std::string encoding instead of blindly trying UTF-8 first and falling
back to ASCII later (and what happens if it's in Latin-1? or koi8-r?) --
and it also would be better if conversion in both directions were symmetric
which is not the case here. And I won't even mention nice pessimizations
like not passing strings via references and especially creating a temporary
object on a heap instead of just using wxConvUTF8...

In short, don't use the code above.
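
One way to read this advice, as a rough sketch only: pick a single encoding for the backend strings and write a pair of conversions that round-trip, taking their arguments by const reference. The sketch assumes, purely for illustration, that the backend strings are Latin-1 (ISO-8859-1) and uses plain standard C++; latin1_to_utf8 and utf8_to_latin1 are hypothetical names, and in a wxWidgets program a wxMBConv object such as wxConvUTF8 would do this job.

```cpp
#include <cstddef>
#include <string>

// Latin-1 -> UTF-8. Every Latin-1 byte is the code point of the same value,
// so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// UTF-8 -> Latin-1, the inverse of the function above. Returns false
// (and clears out) if the input is malformed or contains a character
// outside U+0000..U+00FF, instead of silently guessing.
bool utf8_to_latin1(const std::string& in, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            const unsigned char next = static_cast<unsigned char>(in[++i]);
            out.push_back(static_cast<char>(((c & 0x03) << 6) | (next & 0x3F)));
        } else {
            out.clear();
            return false;
        }
    }
    return true;
}
```

Because both directions agree on the encoding, utf8_to_latin1(latin1_to_utf8(s), t) gives back t == s for any Latin-1 input, and text that cannot be represented is reported as a failure rather than passed through in some fallback encoding.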

Regards,
VZ

--
TT-Solutions: wxWidgets consultancy and technical support
http://www.tt-solutions.com/

Declan McMullen

Apr 9, 2008, 4:37:12 AM
Any tips on how I should do it?

Are these up to date? http://wiki.wxwidgets.org/WxString



Declan McMullen

Apr 9, 2008, 5:00:29 AM
This seems to work, going from the wiki entries?

wxString Utility::std2wx(const string s)
{
    return wxString (s.c_str(), wxConvUTF8);
}

string Utility::wx2std(wxString s)
{
    char * newString = NULL;
    int length = s.Length();
    newString = new char[length];
    return strcpy( newString, (const char*)s.mb_str(wxConvUTF8) );

}

Marcin 'Malcom' Malich

Apr 9, 2008, 5:27:44 AM
On 9 Apr, 11:00, "Declan McMullen" <declan.mcmul...@gmail.com> wrote:

> string Utility::wx2std(wxString s)
> {
>     char * newString = NULL;
>     int length = s.Length();
>     newString = new char[length];
>     return strcpy( newString, (const char*)s.mb_str(wxConvUTF8) );
>
> }
>

memory leak

std::string Utility::wx2std(wxString s)
{
    return std::string(s.mb_str(wxConvUTF8), s.Length());
}

--
malcom
http://malcom.pl

Milan Babuskov

Apr 9, 2008, 6:12:28 AM
Declan McMullen wrote:
> Any tips on how I should do it ?

We are using the following in FlameRobin:

//-----------------------------------------------------------------------------
std::string wx2std(const wxString& input, wxMBConv* conv = 0)
{
    if (input.empty())
        return "";
    if (!conv)
        conv = wxConvCurrent;
    const wxWX2MBbuf buf(input.mb_str(*conv));
    // conversion may fail and return 0,
    // which isn't a safe value to pass
    // to std::string constructor
    if (!buf)
        return "";
    return std::string(buf);
}
//-----------------------------------------------------------------------------
wxString std2wx(const std::string& input, wxMBConv* conv = 0)
{
    if (input.empty())
        return wxEmptyString;
    if (!conv)
        conv = wxConvCurrent;
    return wxString(input.c_str(), *conv);
}
//-----------------------------------------------------------------------------

Any comments are appreciated.

--
Milan Babuskov
http://www.flamerobin.org

Robert Roebling

Apr 9, 2008, 7:07:20 AM

Marcin 'Malcom' Malich wrote:

>
> Declan McMullen wrote:
>
> > string Utility::wx2std(wxString s)
> > {
> > char * newString = NULL;
> > int length = s.Length();
> > newString = new char[length];
> > return strcpy( newString, (const char*)s.mb_str(wxConvUTF8) );
> >
> > }
> >
>
> memory leak
>
> std::string Utility::wx2std(wxString s)
> {
> return std::string(s.mb_str(wxConvUTF8), s.Length());
> }

wxString::Length() returns the number of characters in the
unconverted string, not the number of bytes, so this will
only work for ASCII characters. Why don't you simply use:

wxString s1 = "test";
std::string s2 = s1.utf8_str();
// or
std::wstring s2 = s1.wc_str();

or is there something wrong with this?
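
The character-count versus byte-count mismatch can be reproduced in plain standard C++, without wxWidgets. This is only a sketch under the assumption that the input is valid UTF-8; utf8_code_points and truncate_to_char_count are hypothetical names, and the second function deliberately imitates the bug of passing a character count where a byte count is expected.

```cpp
#include <cstddef>
#include <string>

// Counts code points in a UTF-8 string by skipping continuation bytes
// (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Imitates the bug: builds a std::string from the UTF-8 bytes but uses the
// number of characters as the length, so trailing bytes are cut off.
std::string truncate_to_char_count(const std::string& utf8) {
    return std::string(utf8.data(), utf8_code_points(utf8));
}
```

For the five-byte input "caf\xC3\xA9" (four characters) the second function returns only the first four bytes, "caf\xC3", which cuts the last character in half. Pure ASCII input such as "test" comes through unchanged, which is why the problem is easy to miss in testing.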

Robert


Vadim Zeitlin

Apr 9, 2008, 10:56:37 AM
On Wed, 09 Apr 2008 12:12:28 +0200 Milan Babuskov <mil...@panonnet.net> wrote:

MB> Declan McMullen wrote:
MB> > Any tips on how I should do it ?
MB>
MB> We are using the following in FlameRobin:
MB>
MB> //-----------------------------------------------------------------------------
MB> std::string wx2std(const wxString& input, wxMBConv* conv = 0)
MB> {
MB> if (input.empty())
MB> return "";
MB> if (!conv)
MB> conv = wxConvCurrent;
MB> const wxWX2MBbuf buf(input.mb_str(*conv));
MB> // conversion may fail and return 0,
MB> // which isn't a safe value to pass
MB> // to std:string constructor
MB> if (!buf)
MB> return "";
MB> return std::string(buf);
MB> }
MB> //-----------------------------------------------------------------------------
MB> wxString std2wx(const std::string& input, wxMBConv* conv = 0)
MB> {
MB> if (input.empty())
MB> return wxEmptyString;
MB> if (!conv)
MB> conv = wxConvCurrent;
MB> return wxString(input.c_str(), *conv);
MB> }
MB> //-----------------------------------------------------------------------------
MB>
MB> Any comments are appreciated.

This code is correct, although I'm not sure why you think it's necessary
to test for an empty string explicitly; normally you should be able to remove
the checks for input.empty() without any ill effects.

Milan Babuskov

Apr 10, 2008, 5:21:58 AM

We used to have some more complex code that allocated buffers and such,
so if string is empty it seemed efficient to not do all that. I guess it
could be removed now.

Thanks,

Declan McMullen

Apr 11, 2008, 4:20:24 AM
Cheers for that.

Could you tell me what the second parameter is doing? And should I be passing a value to it or leave it defaulting to 0?


Michael Hieke

Apr 11, 2008, 4:32:21 AM
Declan McMullen wrote:

> Could you tell me what the second parameter is doing?

For calls where std::strings in the database character set need to be
converted to the system character set we pass the proper conversion object.

> And should I be passing a value to it or leave it defaulting to 0 ?

You can omit it if you only ever go from wxString to std::string and
back in the current system encoding.

HTH

--
Michael Hieke

Declan McMullen

Apr 11, 2008, 4:40:59 AM
cheers

