Hi Stefan,
On Mon, 2013-06-03 at 19:57 +0200, Stefan Behnel wrote:
Brilliant! Too bad that I've already re-invented much of the same wheel,
but I will use your code for inspiration from now on.
> > elif isinstance(obj, str):
>
> Note that "str" is "bytes" in Py2 and "unicode" in Py3. May or may not be
> what you actually want. More likely, you'd want to handle both separately
> and explicitly.
That's a really good catch! Actually, I'm not sure what the best
practice is here :-( So, sorry, more questions to follow:
The C++ application that I'm wrapping (and I believe this is also valid
for the Lua runtime) uses C++ strings as bytes internally and in theory
it is not supposed to do anything 'smart' about them (as in depending on
the locale/encoding), so all I want is for the users to be able to push
strings back and forth with minimum hassle...
I guess this can be expressed in the form of the following rules:
- For Py2, I'd like them to be able to push both str() and unicode()
- For Py3, I guess I can also accept both bytes and unicode strings
- Probably in both cases, it's fine to always return unicode back
Are these the same semantics as what you have implemented for Lua or
not? Your code is as follows:
    elif isinstance(o, bytes):  # Python -> Lua
        lua.lua_pushlstring(L, <char*>(<bytes>o), len(<bytes>o))
        pushed_values_count = 1
    elif isinstance(o, unicode) and runtime._encoding is not None:
        pushed_values_count = push_encoded_unicode_string(runtime, L, <unicode>o)

    elif lua_type == lua.LUA_TSTRING:  # Lua -> Python
        s = lua.lua_tolstring(L, n, &size)
        if runtime._encoding is not None:
            return s[:size].decode(runtime._encoding)
        else:
            return s[:size]
And how do I handle bytes in my code? Would the following make sense? My
C++ constructor takes std::string.
    cdef object ret
    cdef string obj_str

    elif isinstance(obj, bytes):  # Python -> NEST
        obj_str = obj
        ret = <Datum*> new StringDatum(obj_str)
    elif isinstance(obj, str):
        obj_str = obj.encode()
        ret = <Datum*> new StringDatum(obj_str)

    elif datum_type.compare("stringtype") == 0:  # NEST -> Python
        ret = (<string> deref_str(<StringDatum*> dat)).decode()
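
To make the three rules I listed earlier concrete, here is a minimal
pure-Python sketch of the semantics I have in mind. It assumes UTF-8 as
the encoding (my assumption, the C++ side does not care), and
to_native_bytes / from_native_bytes are hypothetical helper names, not
anything from NEST or Lupa. Note that it tests against type(u"") rather
than str, so the text branch should also catch unicode on Py2:

```python
def to_native_bytes(obj, encoding="utf-8"):
    """Python -> C++: accept bytes or text, always hand over bytes."""
    if isinstance(obj, bytes):
        # Already raw bytes: pass through untouched.
        return obj
    if isinstance(obj, type(u"")):
        # type(u"") is unicode on Py2 and str on Py3.
        return obj.encode(encoding)
    raise TypeError("expected bytes or text, got %s" % type(obj).__name__)


def from_native_bytes(raw, encoding="utf-8"):
    """C++ -> Python: always return text, whatever was pushed in."""
    return raw.decode(encoding)
```

With an explicit encoding the result no longer depends on the
interpreter's default, which is what a bare encode() / decode() would
use.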
> Sounds reasonable, although there are countless iterable types in Python.
> Which ones of them are worth special casing depends entirely on your code
> and the use cases you anticipate. Maybe you should just test for the buffer
> interface earlier and make the iteration case a generic fallback.
>
> > I can totally restrict myself to 1-D vectors of longs and doubles, sorry
> > if this was not clear from my original post; I really don't need
> > anything more complicated than that.
>
> In that case, memory views should work just fine for you.
>
> http://docs.cython.org/src/userguide/memoryviews.html
Right, I have read this page many times, and it becomes a bit clearer
every time, but I still don't understand how to check whether the
supplied Python object exposes a buffer interface and, if it does,
what the type and the dimensions of this buffer are.
Do I understand correctly that you are implying that there is no nice
way to do this (like isinstance) and I should create functions like
    cdef Datum* long_vector_to_datum(long [:] obj):
    cdef Datum* double_vector_to_datum(double [:] obj):

and then

    if isinstance(obj, bool):
        ...
    else:
        try:
            long_vector_to_datum(obj)
        except:
            # doesn't provide this interface
            try:
                double_vector_to_datum(obj)
            except:
                # doesn't provide this interface
etc. ?
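
Or would something along these lines be acceptable instead of the
nested try/except? This is only a sketch in plain Python 3, assuming
that the built-in memoryview() is a legitimate way to probe for the
buffer interface; describe_buffer is a hypothetical helper name of
mine, and I use the stdlib array module just to have something that
exposes a buffer:

```python
import array


def describe_buffer(obj):
    """Return (format, ndim, shape) if obj exposes a buffer, else None."""
    try:
        view = memoryview(obj)
    except TypeError:
        # obj does not support the buffer protocol
        return None
    # format uses struct-module codes, e.g. "l" for long, "d" for double
    return view.format, view.ndim, view.shape


longs = array.array("l", [1, 2, 3])
doubles = array.array("d", [1.0, 2.0])

print(describe_buffer(longs))      # ('l', 1, (3,))
print(describe_buffer(doubles))    # ('d', 1, (2,))
print(describe_buffer([1, 2, 3]))  # None, a list has no buffer interface
```

If that is sound, I could dispatch on the format code and ndim to pick
long_vector_to_datum or double_vector_to_datum, and fall back to
generic iteration when the result is None.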
I've seen it, but I'll get back to it again, as soon as the previous
question is cleared up.
Many thanks!