On Mon, Apr 29, 2013 at 11:47 AM, Chris Barker - NOAA Federal
<chris....@noaa.gov> wrote:
> On Mon, Apr 29, 2013 at 2:55 AM, Sturla Molden <stu...@molden.no> wrote:
>>
>> If you have a C++11 compiler (e.g. GCC), the easiest solution is just to
>> use STL containers. You will find STL containers that correspond to most
>> Python types. For example:
>
> <snip>
>
>>
>> unicode -> std::wstring
>
>
> Is this at all the case anymore? My understanding is that a wstring is
> simply an array of 2-byte "char"s -- and encoding, interpretation, etc. is all
> up to you (or libraries that you're using).
>
> whereas a Python unicode object really is unicode, and if you use the Python
> API you don't need to care about encoding, etc. -- it does all that for you.
> Internally:
I agree, wstring is almost certainly not what you want.
> Py2: unicode objects are stored internally in buffers of either two or four
> bytes per character (UCS-2 or UCS-4), determined at compile time.
>
> Py3: The latest version supports multiple internal encodings depending on
> what data you actually have in the object -- very cool, but a bit hard to
> directly pass to any other library.
>
> So: what you need to do is determine what encoding you want to use in your C
> or C++ code, choose an appropriate container (maybe std::string or
> std::wstring, or maybe something from a proper unicode library). Then in
> your Cython, encode the unicode object to a bytes object, and pass that
> bytes object off to your C/C++ datatype.
>
> I sure wish it were easier, but it's just not.
It's better with this last release:
http://docs.cython.org/src/tutorial/strings.html#auto-encoding-and-decoding
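
To make the "pick an encoding, then encode to bytes" step concrete, here
is a rough sketch in plain Python (Python 3 syntax, not Cython; the sample
string and the variable names are made up for illustration). It only shows
how the same text turns into a different number of code units depending on
the encoding you choose for the C/C++ side, which is why there is no single
obvious wstring mapping:

```python
# Sketch only: plain Python 3, not Cython. The sample text is invented.
text = "na\u00efve \U0001F600"  # "naive" with a diaeresis, a space, one non-BMP emoji

# Encode explicitly; the resulting bytes object is what you would pass on.
utf8 = text.encode("utf-8")       # byte-oriented, fits a std::string
utf16 = text.encode("utf-16-le")  # 2-byte code units (wchar_t on Windows)
utf32 = text.encode("utf-32-le")  # 4-byte code units (wchar_t on Linux/macOS)

n_code_points = len(text)         # what Python counts
n_utf8_bytes = len(utf8)
n_utf16_units = len(utf16) // 2   # the emoji needs a surrogate pair here
n_utf32_units = len(utf32) // 4   # one unit per code point

# Decoding with the same encoding gets the original text back.
round_trip = utf8.decode("utf-8")
```

Only the UTF-32 count matches the number of code points, so whatever
container you pick on the C++ side, the encoding has to be agreed on
separately.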
It should be noted that if you're trying to avoid the GIL you may end
up having to do your own synchronization of the (not completely
thread-safe) STL containers.
- Robert