Re: [cython-users] Failing GC assertion when defining extension types with weakref support

118 views
Skip to first unread message

Stefan Behnel

unread,
Oct 11, 2012, 2:44:05 AM10/11/12
to cython...@googlegroups.com
Jean-Paul Calderone, 11.10.2012 03:19:
> First off, kudos on Cython. I've been using it on and off for some time
> and I've found it to be quite a handy tool.

Thanks!


> Now to the bad news. ;)
>
> I'm trying to wrap the Bullet Physics C++ library using Cython and just
> tried making some of the wrapper types able to have weak references point
> to them. This seems to have introduced some code that interacts poorly
> with the garbage collector, though. When one of these objects is visited
> during garbage collection, it triggers an assertion failure in the Python
> garbage collector:
>
> python-dbg: ../Modules/gcmodule.c:326: visit_decref: Assertion
> `gc->gc.gc_refs != 0' failed.
>
> All I've really been able to figure out is that the object being visited is
> a weakref instance and the traverser doing the visitor is the
> Cython-generated traverser function for one of my Cython-defined extension
> types (__pyx_tp_traverse_6bullet_6bullet_CollisionObject).
>
> The type definition is quite boring, basically just including a "cdef
> object __weakref__", as described the documentation. A minimal example is
> available at
> http://bazaar.launchpad.net/~exarkun/pybullet/shortened-weakref-crash-example/view/head:/bullet/bullet.pyx with
> a test case which reproduces the failed assertion at
> http://bazaar.launchpad.net/~exarkun/pybullet/shortened-weakref-crash-example/view/head:/bullet/test_bullet.py<http://bazaar.launchpad.net/~exarkun/pybullet/shortened-weakref-crash-example/view/head:/bullet/test_bullet.py#L17>.
> The code is highly ridiculous, but that is a result of producing the
> minimal example that does not depend on Bullet Physics. The weird things
> being done here mimic the more sensible things which need to be done to
> wrap Bullet Physics APIs. The actual code is online as well, on Launchpad,
> at http://code.launchpad.net/pybullet in case anyone would prefer to look
> at that instead.

This code isn't safe:

def __dealloc__(self):
print 'dealloc', self.objects
while self.objects:
obj = self.objects.pop()
Py_DECREF(<object>obj) # BTW, redundant cast here

You are accessing a Python object attribute in __dealloc__(). Python's
garbage collector is free to clear Python object attributes at any moment
between the time the last reachable reference to an object dies and the
time the object is cleaned up and __dealloc__() is called. This definitely
happens when a reference cycle is involved, because in order to clean it
up, the garbage collector must start by killing a reference somewhere in
order to break the cycle, so it takes an arbitrary object and sets its
object attributes to None. The code above will stop working and do nothing
in that case because self.objects is None, i.e. False.

I'm not sure why it crashes, though.

Stefan

Stefan Behnel

unread,
Oct 11, 2012, 12:58:16 PM10/11/12
to cython...@googlegroups.com
Jean-Paul Calderone, 11.10.2012 15:21:
> On Thursday, October 11, 2012 2:44:15 AM UTC-4, Stefan Behnel wrote:
>>
>> You are accessing a Python object attribute in __dealloc__(). Python's
>> garbage collector is free to clear Python object attributes at any moment
>> between the time the last reachable reference to an object dies and the
>> time the object is cleaned up and __dealloc__() is called. This definitely
>> happens when a reference cycle is involved, because in order to clean it
>> up, the garbage collector must start by killing a reference somewhere in
>> order to break the cycle, so it takes an arbitrary object and sets its
>> object attributes to None. The code above will stop working and do nothing
>> in that case because self.objects is None, i.e. False.
>>
>>
> Indeed. I forgot about that while minimizing the example. This doesn't
> affect my actual code, I think, as the "list of objects" is actually held
> inside the c++ object pointed to by self.thisptr - and I think that
> self.thisptr is safe in __dealloc__?

Yes.


> I dug a little more and came across something I don't understand in the
> handling of __weakref__ in the generated tp_traverse for this type.
>
> e = (*v)(o->__weakref__, a); if (e) return e;
>
> Based on inspection of some types defined by CPython itself (eg the ones in
> Modules/_io/fileio.c), this looks fishy. The CPython documentation for
> what to do with the __weakref__ list is rather absent unfortunately, so I'm
> not *sure* this is wrong. However, changing the Cython code generator to
> omit this particular traversal gets rid of the failing gc assertion
> (replacing it with another failing assertion about a negative reference
> count - which doesn't do anything for my confidence about the *correctness*
> of this change). I also talked to Yhg1s on #python and his understanding
> is that it is not the type's responsibility to traverse this list - it is
> owned by the weak reference system, not the instance. The resulting double
> traversal seems consistent with the failing assertion in my initial post.
>
> I've also filed this bug against CPython about the poor documentation:
> http://bugs.python.org/issue16195
>
> Perhaps it will result in some more clarity on the issue.

Antoine mentioned PyObject_ClearWeakRefs() as the way to do it. It's quite
possible that Cython misbehaves here, given the lack of explicit documentation.


> As I mentioned, the example still doesn't work. The next failure after
> changing the code generator is:
>
> Fatal Python error: bullet/bullet.cpp:724 object at 0x1aa9a58 has negative
> ref count -1
>
> I wonder if this is due to the casts in the example doing the wrong thing.
> Does `<object>integer` actually give me the PyObject* with the address
> `integer`?

No, it would give you an integer object. And given that the right side of
the cast is already a Python object (not a C integer), it's actually a no-op.

What you might have been thinking of is "<object>some_voidptr". That casts
a pointer to an object to an owned object reference, thus increasing (and
further managing) the reference count.


> Or is it a no-op, since Python integers are indeed already
> objects (resulting in the Py_DECREF applying to the wrong object). If so,
> I wonder how to spell the correct cast.

Could you explain why you use these complicated indirections at all? Why
doesn't a normal Python reference work?

Stefan

Stefan Behnel

unread,
Oct 12, 2012, 1:24:26 AM10/12/12
to cython...@googlegroups.com
Jean-Paul Calderone, 11.10.2012 20:50:
> On Thursday, October 11, 2012 12:58:19 PM UTC-4, Stefan Behnel wrote:
>> Jean-Paul Calderone, 11.10.2012 15:21:
>>>
>>> I've also filed this bug against CPython about the poor documentation:
>>> http://bugs.python.org/issue16195
>>>
>>> Perhaps it will result in some more clarity on the issue.
>>
>> Antoine mentioned PyObject_ClearWeakRefs() as the way to do it. It's quite
>> possible that Cython misbehaves here, given the lack of explicit
>> documentation.
>
> Cython seems to use that API in the dealloc it generates, and it *appears*
> to be working properly now that I've gotten rid of the traversal for
> __weakref__.

Ok, guess we should give it a try then.


>>> As I mentioned, the example still doesn't work. The next failure after
>>> changing the code generator is:
>>>
>>> Fatal Python error: bullet/bullet.cpp:724 object at 0x1aa9a58 has
>> negative
>>> ref count -1
>>>
>>> I wonder if this is due to the casts in the example doing the wrong
>> thing.
>>> Does `<object>integer` actually give me the PyObject* with the address
>>> `integer`?
>>
>> No, it would give you an integer object. And given that the right side of
>> the cast is already a Python object (not a C integer), it's actually a
>> no-op.
>>
>> What you might have been thinking of is "<object>some_voidptr". That casts
>> a pointer to an object to an owned object reference, thus increasing (and
>> further managing) the reference count.
>
> Makes sense. I ended up doing this to force the desired behavior:
>
> cdef long ptr
> ....
> ptr = someinteger
> Py_DECREF(<object><void*>ptr)
>
> Sorry if that's unpleasant to read (it wasn't very pleasant to write).

What I usually do in this case is to use Py_XDECREF() instead, simply
because it's naturally defined in Cython's CPython .pxd files on a plain
"PyObject*" rather than a managed "object" reference. So:

cdef size_t ptr # instead of long to guarantee portability
ptr = someinteger
Py_XDECREF(<PyObject*>ptr)

You can also just redefine your own Py_DECREF() to accept a PyObject*, but
I can usually afford (and am often happy about) the NULL check in Py_XDECREF().

The main difference is that your code does redundant ref-counting for the
intermediate managed reference, whereas mine only does the requested decref.


>>> Or is it a no-op, since Python integers are indeed already
>>> objects (resulting in the Py_DECREF applying to the wrong object). If
>> so,
>>> I wonder how to spell the correct cast.
>>
>> Could you explain why you use these complicated indirections at all? Why
>> doesn't a normal Python reference work?
>>
>
> I expect you'll be happy to hear that this is *mostly* just to make the
> minimal example work. The real API I'm interacting with is:
>
> void *getUserPointer()
> void setUserPointer(void *userPointer)
>
> And the real __dealloc__ I've got is:
>
> def __dealloc__(self):
> cdef btCollisionObject *obj
>
> while self.thisptr.getCollisionObjectArray().size():
> obj = self.thisptr.getCollisionObjectArray().at(0)
> self.thisptr.removeCollisionObject(obj)
> if NULL != obj.getUserPointer():
> Py_DECREF(<object>obj.getUserPointer())

See? Perfect place for a Py_XDECREF() :)


> This still doesn't use normal Python references, but hopefully you can see
> why - the lifetime of `obj` is managed by a Cython-defined class,
> CollisionObject, and I'm keeping the pointer to `obj` in the void* the
> underlying API gives me.

Ok, makes total sense. It's exactly the same thing in lxml.

Stefan

Charanpal Dhanjal

unread,
Oct 12, 2012, 5:00:34 AM10/12/12
to cython...@googlegroups.com
I downloaded and compiled Cython 0.17.1 and ran cython in the
Cython-0.17.1/build/scripts-2.7 directory but if I type:

cython -V

I get version 0.15.1. Is this just an incorrect value, or is the binary
really an older version of Cython? I am using Python 2.7 on Ubuntu 12.04
if that helps.

Thanks,
Charanpal


Talanor

unread,
Oct 12, 2012, 5:03:42 AM10/12/12
to cython...@googlegroups.com
Hello,

you must have done something wrong. I'm on ubuntu and am working with
python 2.7 too and I get "Cython version 0.18-pre".

(and I assumed you did ./cython -V and not cython -V)

Check you have downloaded the latest cython version.

Charanpal Dhanjal

unread,
Oct 12, 2012, 5:17:35 AM10/12/12
to cython...@googlegroups.com
I just downloaded the archive again (Cython 0.17.1), extracted and then
compiled using:

python setup.py install --user

then in Cython-0.17.1/build/scripts-2.7 I call ./cython -V and get
version 0.15.1. I also tried the code in git repository and get exactly
the same problem. Any ideas - why does 0.18-pre work?

smallfish

unread,
Oct 12, 2012, 5:10:06 AM10/12/12
to cython...@googlegroups.com
yes, like: "which cython", check the cython path
--

Stefan Behnel

unread,
Oct 12, 2012, 6:25:41 AM10/12/12
to cython...@googlegroups.com
Hi,

please don't top-post and please don't hijack threads.

Charanpal Dhanjal, 12.10.2012 11:17:
> On 12/10/2012 11:03, Talanor wrote:
>> On 10/12/2012 11:00 AM, Charanpal Dhanjal wrote:
>>> I downloaded and compiled Cython 0.17.1 and ran cython in the
>>> Cython-0.17.1/build/scripts-2.7 directory but if I type:
>>>
>>> cython -V
>>>
>>> I get version 0.15.1. Is this just an incorrect value, or is the binary
>>> really an older version of Cython? I am using Python 2.7 on Ubuntu 12.04
>>> if that helps.
>>>
>> you must have done something wrong. I'm on ubuntu and am working with
>> python 2.7 too and I get "Cython version 0.18-pre".
>>
>> (and I assumed you did ./cython -V and not cython -V)
>>
>> Check you have downloaded the latest cython version.
>
> I just downloaded the archive again (Cython 0.17.1), extracted and then
> compiled using:
>
> python setup.py install --user
>
> then in Cython-0.17.1/build/scripts-2.7 I call ./cython -V and get version
> 0.15.1. I also tried the code in git repository and get exactly the same
> problem. Any ideas

My guess is that you have installed the older Cython version either with
your package distribution or with setuptools. Just uninstall the old
version and try again. Make sure "which cython" gives the right answer.
Also check your site-packages directory in your Python installation for
older versions.


> why does 0.18-pre work?

There's no difference. You're simply starting the wrong Cython installation.

Stefan

Charanpal Dhanjal

unread,
Oct 12, 2012, 6:35:58 AM10/12/12
to cython...@googlegroups.com
On 12/10/2012 12:25, Stefan Behnel wrote:
> Hi,
>
> please don't top-post and please don't hijack threads.
Sorry about that, it was an accident. I replied to an existing thread
and changed the subject thinking it would start a new one.

Lawrence Mitchell

unread,
Oct 12, 2012, 4:51:26 PM10/12/12
to cython...@googlegroups.com
Stefan Behnel wrote:

[...]

> cdef size_t ptr # instead of long to guarantee portability

FWIW, that's not giving portability (size_t need not be big
enough to hold a <void *>). You want

from libc.stdint cimport uintptr_t
...
cdef uintptr_t ptr

instead

> ptr = someinteger
> Py_XDECREF(<PyObject*>ptr)

[...]

--
Lawrence Mitchell <we...@gmx.li>

Reply all
Reply to author
Forward
0 new messages