Putting a Python object in a C struct

Jeroen Demeyer

Nov 27, 2015, 10:10:37 AM
to cython-users
Is there a fundamental reason why this is disallowed?

    cdef extern from "Python.h":
        ctypedef struct PyDictEntry:
            Py_ssize_t me_hash
            object me_key
            object me_value

It gives me an error message
C struct/union member cannot be a Python object

Of course I can write PyObject*, but then I need several <object> casts
in other places.

Alternatively, a way of casting <PyObject*> to <object> without changing
refcounts would also be very useful.

Robert Bradshaw

Nov 27, 2015, 12:57:11 PM
to cython...@googlegroups.com
On Fri, Nov 27, 2015 at 7:10 AM, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> Is there a fundamental reason why this is disallowed?
>
>     cdef extern from "Python.h":
>         ctypedef struct PyDictEntry:
>             Py_ssize_t me_hash
>             object me_key
>             object me_value
>
> It gives me an error message
> C struct/union member cannot be a Python object

This is because we can't do proper reference counting of an object
that's part of a struct (e.g. it can be copied and passed around
without incrementing the reference count, possibly by third party
libraries).

> Of course I can write PyObject*, but then I need several <object> casts in
> other places.
>
> Alternatively, a way of casting <PyObject*> to <object> without changing
> refcounts would also be very useful.

Is there a reason you can't change the refcounts? Note that the act of
casting a PyObject* to an object does not itself change the refcount,
but often the context it's used in requires creating a new reference.

- Robert

Jeroen Demeyer

Dec 3, 2015, 2:45:34 AM
to cython...@googlegroups.com
On 2015-11-27 18:56, Robert Bradshaw wrote:
>> It gives me an error message
>> C struct/union member cannot be a Python object
>
> This is because we can't do proper reference counting of an object
> that's part of a struct (e.g. it can be copied and passed around
> without incrementing the reference count, possibly by third party
> libraries.
Sure, it might be accessed by third party libraries. But so what? You
just have to make sure that those third-party libraries work correctly.

My feeling is that if you would just remove the error message and handle
object-in-struct the same as regular objects in Cython, it would "just
work".

>> Of course I can write PyObject*, but then I need several <object> casts in
>> other places.
>>
>> Alternatively, a way of casting <PyObject*> to <object> without changing
>> refcounts would also be very useful.
>
> Is there a reason you can't change the refcounts?
OK, there is no real good reason. Just the perceived inefficiency of the
superfluous Py_INCREF and Py_DECREF and the fact that this doesn't work
with NULL pointers (which are sometimes valid arguments to functions
taking PyObject*).

Sturla Molden

Dec 3, 2015, 5:05:25 AM
to cython...@googlegroups.com
On 27/11/15 16:10, Jeroen Demeyer wrote:

> Alternatively, a way of casting <PyObject*> to <object> without changing
> refcounts would also be very useful.

I have often missed a borrowed object type, that would behave like
object except it has no refcounting.

Sturla

Robert Bradshaw

Dec 3, 2015, 12:56:25 PM
to cython...@googlegroups.com
On Wed, Dec 2, 2015 at 11:45 PM, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> On 2015-11-27 18:56, Robert Bradshaw wrote:
>>>
>>> It gives me an error message
>>> C struct/union member cannot be a Python object
>>
>>
>> This is because we can't do proper reference counting of an object
>> that's part of a struct (e.g. it can be copied and passed around
>> without incrementing the reference count, possibly by third party
>> libraries.
>
> Sure, it might be accessed by third party libraries. But so what? You just
> have to make sure that those third-party libraries work correctly.
>
> My feeling is that if you would just remove the error message and handle
> object-in-struct the same as regular objects in Cython, it would "just
> work".

Structs are often passed by value, but when copying a struct
containing an object one must increment its refcount. (And recursively
increment the objects of all structs that it contains.) Consider the
struct

    cdef struct A:
        int i
        object o

Now if we have a function

    cdef A GimmeA():
        ...

It would have to increment the refcount of A.o before returning the
result (to ensure it's at least 1). This means that the caller (which
may not be Cython) cannot simply ignore the return value--it must
decrement A.o before discarding it (so code like GimmeA().i is right
out). This is especially bad for generic libraries. For example,
consider

    cdef A a1, a2, a3
    cdef vector[A] va
    va.push_back(a1)
    va[0] = a2
    cdef vector[A] vaa = va
    vaa[0] = a3
    ...

Clearly the refcounts are getting all messed up here...
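
A minimal C++ sketch of the copying problem described above. It does not use the CPython API: FakeObject is a hypothetical stand-in for PyObject that models only the reference count, and Tally and copy_without_refcounting are names invented for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for PyObject: only the reference count is modelled.
struct FakeObject {
    long refcnt = 1;  // one reference, held by whoever created the object
};

// The struct from the message, with the object held as a raw pointer.
struct A {
    int i;
    FakeObject* o;
};

// Illustration-only result type.
struct Tally {
    long holders;  // struct copies currently pointing at the object
    long refcnt;   // what the object itself believes
};

Tally copy_without_refcounting() {
    FakeObject obj;
    A a1{1, &obj};

    std::vector<A> va;
    va.push_back(a1);         // member-wise copy: pointer duplicated, count untouched
    std::vector<A> vaa = va;  // yet another copy of the same pointer

    long holders = 1;  // a1 itself
    for (const A& a : va)  if (a.o == &obj) ++holders;
    for (const A& a : vaa) if (a.o == &obj) ++holders;

    // Three structs point at the object, but its count never moved.
    assert(obj.refcnt == 1);
    return Tally{holders, obj.refcnt};
}
```

Calling copy_without_refcounting() yields three holders against a count of one, which is the mismatch a generic container such as std::vector cannot know how to repair.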


Now in C++ we could create a builtin "smart pointer" to object that
manages the references correctly, and A would become

    struct {
        int i;
        ref_counted_ptr<PyObject> a;
    };

and all would work as expected (taking care that the GIL is held
whenever handling such objects).
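
A rough sketch of what such a smart pointer might look like, under stated assumptions: FakeObject is again a hypothetical stand-in for PyObject, RefCountedPtr and refcount_while_copied are illustrative names, and a real version would call Py_INCREF and Py_DECREF with the GIL held instead of touching a counter directly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for PyObject: only the reference count is modelled.
struct FakeObject {
    long refcnt = 1;  // the creator's reference
};

// Every copy takes a reference; every destruction releases one.
class RefCountedPtr {
public:
    explicit RefCountedPtr(FakeObject* p) : p_(p) { incref(); }
    RefCountedPtr(const RefCountedPtr& other) : p_(other.p_) { incref(); }
    RefCountedPtr& operator=(const RefCountedPtr& other) {
        if (other.p_) ++other.p_->refcnt;  // take the new reference first
        decref();                          // then release the old one
        p_ = other.p_;
        return *this;
    }
    ~RefCountedPtr() { decref(); }
    FakeObject* get() const { return p_; }

private:
    void incref() { if (p_) ++p_->refcnt; }
    void decref() { if (p_) --p_->refcnt; }
    FakeObject* p_;
};

struct A {
    int i;
    RefCountedPtr a;
};

// Returns the count observed while three copies of the struct are alive.
long refcount_while_copied() {
    FakeObject obj;
    long inside;
    {
        A a1{1, RefCountedPtr(&obj)};
        std::vector<A> va;
        va.push_back(a1);         // copy constructor takes a reference
        std::vector<A> vaa = va;  // and so does this copy
        inside = obj.refcnt;
    }
    // All copies are gone; only the creator's reference remains.
    assert(obj.refcnt == 1);
    return inside;
}
```

Because the bookkeeping lives in the copy constructor, assignment operator, and destructor, containers that copy A by value keep the count in step without knowing anything about it.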

An alternative is to provide better support for borrowed references.
PyObject* is essentially a borrowed object, though one must manually
cast back and forth between it and object. I've toyed with the idea of
adding

    cdef object x
    cdef object* x_ptr = &x

perhaps even automatically dereferencing based on context (e.g.
my_py_function(x_ptr)). This syntax would extend to cdef classes as
well, e.g. "cdef sage.rings.integer.Integer* integer_ptr." OTOH, this
makes it really easy to shoot oneself in the foot...

>>> Of course I can write PyObject*, but then I need several <object> casts
>>> in
>>> other places.
>>>
>>> Alternatively, a way of casting <PyObject*> to <object> without changing
>>> refcounts would also be very useful.
>>
>> Is there a reason you can't change the refcounts?
>
> OK, there is no real good reason. Just the perceived inefficiency of the
> superfluous Py_INCREF and Py_DECREF

We avoid it when we can. Of course, it's often impossible for the
compiler to make inferences like "this object is always held onto
elsewhere" using only local information.

> and the fact that this doesn't work with
> NULL pointers (which are sometimes valid arguments to functions taking
> PyObject*).

NULL is never a valid object, but there's always Py_XINCREF and
Py_XDECREF if it may be null.