object and PyObject*


Darren Dale

Apr 10, 2011, 6:54:18 PM
to cython...@googlegroups.com
This afternoon, I have tried using some PyString_* definitions in some
code with the ones provided by Cython, and have been trying to wrap my
head around why cython was complaining: "Cannot convert 'PyObject *'
to Python object". I ran across a rather amusing post apologizing for
not having read the documentation buried in
http://www.cython.org/release/Cython-0.13/Cython/Includes/cpython/__init__.pxd
. That documentation states:

# REFERENCE COUNTING:
#
# JUST TO SCARE YOU:
# If you are going to use any of the Python/C API in your Cython
# program, you might be responsible for doing reference counting.
# Read http://docs.python.org/api/refcounts.html which is so
# important I've copied it below.
#
# For all the declarations below, whenever the Py_ function returns
# a *new reference* to a PyObject*, the return type is "object".
# When the function returns a borrowed reference, the return
# type is PyObject*. When Cython sees "object" as a return type
# it doesn't increment the reference count. When it sees PyObject*
# in order to use the result you must explicitly cast to <object>,
# and when you do that Cython increments the reference count whether
# you want it to or not, forcing you to an explicit DECREF (or leak memory).
# To avoid this we make the above convention.
[...]
## With borrowed
## references if you do an explicit typecast to <object>, Pyrex generates an
## INCREF and DECREF so you have to be careful. However, you got a
## borrowed reference in this case, so there's got to be another reference
## to your object, so you're OK, as long as you realize this
## and use the result of an explicit cast to <object> as a borrowed
## reference (and you can call Py_INCREF if you want to turn it
## into another reference for some reason).

I find this second stanza very confusing, so I went to
http://docs.python.org/api/refcounts.html to read more, but I don't
see any discussion differentiating object vs PyObject, or of how
<object> casts automatically increment the reference count. That
seems important enough to warrant some discussion at refcounts.html,
rather than just in __init__.pxd, since the behavior affects more than
just consumers of Cython's cpython definitions.

What is meant by "If you do an explicit typecast to <object>, Pyrex
generates an INCREF and DECREF so you have to be careful"? Why does it
say "and DECREF" there? Isn't that misleading?

Darren

Darren Dale

Apr 10, 2011, 9:20:59 PM
to cython...@googlegroups.com
On Sun, Apr 10, 2011 at 6:54 PM, Darren Dale <dsda...@gmail.com> wrote:
> This afternoon, I have tried using some PyString_* definitions in some
> code with the ones provided by Cython,

That should have read "I tried replacing some of my PyString_*
definitions with the ones provided by Cython"...

> and have been trying to wrap my
> head around why cython was complaining: "Cannot convert 'PyObject *'
> to Python object". I ran across a rather amusing post apologizing for
> not having read the documentation buried in
> http://www.cython.org/release/Cython-0.13/Cython/Includes/cpython/__init__.pxd

... and I apologize for being a bit peevish.

Sturla Molden

Apr 10, 2011, 9:49:10 PM
to cython...@googlegroups.com
On 11.04.2011 00:54, Darren Dale wrote:
> What is meant by "If you do an explicit typecast to <object>, Pyrex
> generates an INCREF and DECREF so you have to be careful"? Why does it
> say "and DECREF" there, isn't that misleading?
>


If you do

a = <object> b

you get an incref and a decref on a.

Why? Because if an object is incref'ed, it will also be decref'ed,
eventually. It depends on what you do with it and how long it stays in
scope. When a is reassigned or falls out of scope you get a decref.


Sturla
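
A minimal Cython sketch of the lifetime described above, assuming the `PyObject` declaration cimported from `cpython.ref`. The function names `keep` and `measure` are hypothetical and exist only to show where the reference taken by the cast is released or handed on:

```cython
from cpython.ref cimport PyObject

cdef object keep(PyObject* b):
    # The cast makes `a` an owned reference: Cython emits Py_INCREF.
    cdef object a = <object>b
    # Net effect of returning `a`: the caller receives an owned reference,
    # so the object is not released when this function exits.
    return a

cdef Py_ssize_t measure(PyObject* b) except -1:
    # Same cast, same Py_INCREF.
    cdef object a = <object>b
    # `a` is only used locally; when the function exits, Cython emits
    # the matching Py_DECREF for it.
    return len(a)
```

In both functions `b` must point to a live object. The caller's borrowed pointer is unaffected either way: the cast adds one reference, and Cython removes it again once nothing refers to it through `a`.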

Stefan Behnel

Apr 11, 2011, 2:39:22 AM
to cython...@googlegroups.com
Darren Dale, 11.04.2011 00:54:

> This afternoon, I have tried using some PyString_* definitions in some
> code with the ones provided by Cython, and have been trying to wrap my
> head around why cython was complaining: "Cannot convert 'PyObject *'
> to Python object".

It would be best to show an example of your code here. In general, you
shouldn't be using PyObject* unless it's truly required *and* you know what
you are doing.


> I ran across a rather amusing post apologizing for
> not having read the documentation buried in
> http://www.cython.org/release/Cython-0.13/Cython/Includes/cpython/__init__.pxd
> . That documentation states:
>
> # REFERENCE COUNTING:
> #
> # JUST TO SCARE YOU:
> # If you are going to use any of the Python/C API in your Cython
> # program, you might be responsible for doing reference counting.

This is at least half-serious. Unless you understand reference counting,
you're better off not dealing with borrowed references.


> # For all the declarations below, whenever the Py_ function returns
> # a *new reference* to a PyObject*, the return type is "object".
> # When the function returns a borrowed reference, the return
> # type is PyObject*. When Cython sees "object" as a return type
> # it doesn't increment the reference count. When it sees PyObject*
> # in order to use the result you must explicitly cast to <object>,
> # and when you do that Cython increments the reference count whether
> # you want it to or not, forcing you to an explicit DECREF (or leak memory).
> # To avoid this we make the above convention.
> [...]
> ## With borrowed
> ## references if you do an explicit typecast to <object>, Pyrex generates an

"Pyrex" reference fixed now.

> ## INCREF and DECREF so you have to be careful. However, you got a
> ## borrowed reference in this case, so there's got to be another reference
> ## to your object, so you're OK, as long as you realize this
> ## and use the result of an explicit cast to <object> as a borrowed
> ## reference (and you can call Py_INCREF if you want to turn it
> ## into another reference for some reason).
>
> I find this second stanza very confusing, so I went to
> http://docs.python.org/api/refcounts.html to read more, but I don't
> see any discussion differentiating object vs PyObject, or of how
> <object> casts automatically increment the reference count.

That's because Python doesn't have a notion of an "<object> cast". That's a
Cython language thing.

Basically, when you cast a PyObject* to "object", Cython will give you a
new reference. When you cast an object reference to a PyObject*, Cython
will give you a borrowed reference.

It's a long-standing feature request to improve the language support for
borrowed references in order to keep users from having to deal with
PyObject* themselves. But no one has implemented that yet.

Stefan
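
The two cast directions described above can be sketched as follows. This is an illustration only, assuming `PyObject` is cimported from `cpython.ref`; the function name `roundtrip` is hypothetical:

```cython
from cpython.ref cimport PyObject

def roundtrip(obj):
    # object -> PyObject*: a borrowed pointer. No reference count change,
    # so `ptr` is only valid while something else (here `obj`) keeps the
    # object alive.
    cdef PyObject* ptr = <PyObject*>obj

    # PyObject* -> object: Cython takes a new reference for `again` and
    # releases it automatically when `again` goes away.
    again = <object>ptr

    # Both names refer to the same underlying object.
    return again is obj
```

The pointer itself never owns anything. Only the `object`-typed names take part in Cython's automatic reference counting.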

Darren Dale

Apr 11, 2011, 8:06:56 AM
to cython...@googlegroups.com, Stefan Behnel
On Mon, Apr 11, 2011 at 2:39 AM, Stefan Behnel <stef...@behnel.de> wrote:
> Darren Dale, 11.04.2011 00:54:
>>
>> This afternoon, I have tried using some PyString_* definitions in some
>> code with the ones provided by Cython, and have been trying to wrap my
>> head around why cython was complaining: "Cannot convert 'PyObject *'
>> to Python object".
>
> It would be best to show an example of your code here. In general, you
> shouldn't be using PyObject* unless it's truly required *and* you know what
> you are doing.

I'm working on py3 support for h5py, which currently makes a few calls
to PyString_*. For example, the following appears in a pyx file:

    cdef extern from "Python.h":

        # From Cython declarations
        ctypedef void PyTypeObject
        ctypedef struct PyObject:
            Py_ssize_t ob_refcnt
            PyTypeObject *ob_type

        int PyString_CheckExact(PyObject* str) except *

I tried to replace that with:

from cpython cimport PyTypeObject, PyObject, PyString_CheckExact

and that is when I saw "Cannot convert 'PyObject *' to Python object"
errors, because PyString_CheckExact as provided by cython accepts an
object, not a PyObject*.

>> I ran across a rather amusing post apologizing for
>> not having read the documentation buried in
>>
>> http://www.cython.org/release/Cython-0.13/Cython/Includes/cpython/__init__.pxd
>> . That documentation states:
>>
>> # REFERENCE COUNTING:
>> #
>> #   JUST TO SCARE YOU:
>> #   If you are going to use any of the Python/C API in your Cython
>> #   program, you might be responsible for doing reference counting.
>
> This is at least half-serious. Unless you understand reference counting,
> you're better off not dealing with borrowed references.

I guess I only understand reference counting in principle, but I am
contributing to a project that uses the Python C API in a few places
that matter for py3 support.

Sorry, I looked right over the fact that refcounts.html was posted at
python.org. I thought I was loading a cython webpage. Maybe the
discussion in __init__.pxd could appear somewhere on the cython
documentation website?

> Basically, when you cast a PyObject* to "object", Cython will give you a new
> reference. When you cast an object reference to a PyObject*, Cython will
> give you a borrowed reference.
>
> It's a long-standing feature request to improve the language support for
> borrowed references in order to keep users from having to deal with
> PyObject* themselves. But no one has implemented that yet.

Thank you for the clarifications.

Darren
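
A hedged sketch of how the "Cannot convert 'PyObject *' to Python object" error can be avoided at the call site: because the functions shipped in Cython's `cpython` package are declared as taking `object`, a raw pointer has to be cast before it is passed. The example uses `PyBytes_CheckExact` from `cpython.bytes` as a stand-in for `PyString_CheckExact` (an assumption made with Python 3 in mind, where the `PyString_*` names no longer exist); the helper name `is_exact_bytes` is hypothetical:

```cython
from cpython.ref cimport PyObject
from cpython.bytes cimport PyBytes_CheckExact

cdef bint is_exact_bytes(PyObject* ptr):
    # Passing `ptr` directly fails to compile, because the declaration in
    # cpython.bytes expects `object`. The cast creates a temporary owned
    # reference that Cython releases after the call.
    # Precondition: `ptr` must not be NULL.
    return PyBytes_CheckExact(<object>ptr)
```

The cast costs one INCREF/DECREF pair per call, which is the price for letting Cython manage the reference instead of doing it by hand.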

Darren Dale

Apr 11, 2011, 8:10:52 AM
to cython...@googlegroups.com, Sturla Molden

The documentation says: "you can call Py_INCREF if you want to turn it
into another reference for some reason". Why would this be necessary
if it was automatically incref'ed? Couldn't I return "a", so that it
doesn't go out of scope and therefore wouldn't be decref'ed?

Thanks,
Darren

Stefan Behnel

Apr 11, 2011, 8:44:53 AM
to Darren Dale, Cython-users
Darren Dale, 11.04.2011 14:06:

> On Mon, Apr 11, 2011 at 2:39 AM, Stefan Behnel wrote:
>> Darren Dale, 11.04.2011 00:54:
>>>
>>> This afternoon, I have tried using some PyString_* definitions in some
>>> code with the ones provided by Cython, and have been trying to wrap my
>>> head around why cython was complaining: "Cannot convert 'PyObject *'
>>> to Python object".
>>
>> It would be best to show an example of your code here. In general, you
>> shouldn't be using PyObject* unless it's truly required *and* you know what
>> you are doing.
>
> I'm working on py3 support for h5py, which currently makes a few calls
> to PyString_*. For example, the following appears in a pyx file:
>
>     cdef extern from "Python.h":
>
>         # From Cython declarations
>         ctypedef void PyTypeObject
>         ctypedef struct PyObject:
>             Py_ssize_t ob_refcnt
>             PyTypeObject *ob_type
>
>         int PyString_CheckExact(PyObject* str) except *
>
> I tried to replace that with:
>
> from cpython cimport PyTypeObject, PyObject, PyString_CheckExact
>
> and that is when I saw "Cannot convert 'PyObject *' to Python object"
> errors, because PyString_CheckExact as provided by cython accepts an
> object, not a PyObject*.

Ok. I actually meant to ask for a code snippet that *calls* the functions,
because that's where the problem is. Generally speaking, there should be
little code that needs to call such functions on PyObject*, because Cython
will always give you an object (and therefore also defines the external
C-API functions accordingly). Only when you receive a borrowed reference
from a raw call to the C-API you'd have to start caring what you do.

I could imagine that the code in question contains a micro optimisation
that avoids ref-counting before knowing the type, but that's just a wild
guess out of the blue.

Stefan

Lisandro Dalcin

Apr 11, 2011, 9:38:22 AM
to cython...@googlegroups.com
On 11 April 2011 03:39, Stefan Behnel <stef...@behnel.de> wrote:
>
> It's a long-standing feature request to improve the language support for
> borrowed references in order to keep users from having to deal with
> PyObject* themselves. But no one has implemented that yet.
>

Do you have some syntax in mind? For example, we could write:

object* somefunc(object* obj)

That would mean that the "obj" reference is stolen and the return value is borrowed.

--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

Darren Dale

Apr 11, 2011, 9:40:58 AM
to Stefan Behnel, Cython-users

h5py is a python wrapper for the HDF5 library. Some of these calls
occur in a function to create a null-terminated HDF5 vlen ASCII string
from a python string. I think I should consult with the primary author
of h5py, and ask him if the code could be changed. It doesn't look to
me like this is a case of optimization; the function is not called
very frequently:

    cdef int conv_str2vlen(void* ipt, void* opt, void* bkg, void* priv) except -1:

        cdef PyObject** buf_obj = <PyObject**>ipt
        cdef char** buf_cstring = <char**>opt

        cdef PyObject* temp_object = NULL
        cdef char* temp_string = NULL
        cdef size_t temp_string_len = 0  # Not including null term

        try:
            if buf_obj[0] == NULL or buf_obj[0] == Py_None:
                temp_string = ""
                temp_string_len = 0
            else:
                if PyString_CheckExact(buf_obj[0]):
                    temp_object = buf_obj[0]
                    Py_INCREF(temp_object)
                    temp_string = PyString_AsString(temp_object)
                    temp_string_len = PyString_Size(temp_object)
                else:
                    temp_object = PyObject_Str(buf_obj[0])
                    temp_string = PyString_AsString(temp_object)
                    temp_string_len = PyString_Size(temp_object)

            if strlen(temp_string) != temp_string_len:
                raise ValueError("VLEN strings do not support embedded NULLs")

            buf_cstring[0] = <char*>malloc(temp_string_len+1)
            memcpy(buf_cstring[0], temp_string, temp_string_len+1)

            return 0
        finally:
            Py_XDECREF(temp_object)

Stefan Behnel

Apr 11, 2011, 2:58:28 PM
to cython...@googlegroups.com
Darren Dale, 11.04.2011 15:40:

Ok, I think the reason this was written that way is because it's unclear at
the beginning if ipt points to an object or is NULL (or None). If you say
that this code isn't performance critical, I'd say it's overdesigned. It's
also somewhat inconsistent in the way it optimises for special cases.

When you rewrite it, just make the None/NULL case a completely separate
code section, then, after that, cast the pointer to object and handle the
other cases normally.

Stefan
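
One possible shape for the suggested rewrite, as a sketch only and not h5py's actual code. It assumes Python 3 semantics, so `bytes` takes the place of the `PyString_*` calls, and it assumes that non-bytes input should be stringified and encoded as ASCII; both choices are illustrative. The NULL check and the `MemoryError` on a failed `malloc` are additions that the quoted function does not have:

```cython
from libc.stdlib cimport malloc
from libc.string cimport memcpy, strlen
from cpython.ref cimport PyObject

cdef int conv_str2vlen(void* ipt, void* opt, void* bkg, void* priv) except -1:
    cdef PyObject** buf_obj = <PyObject**>ipt
    cdef char** buf_cstring = <char**>opt
    cdef object value
    cdef bytes data
    cdef char* c_data
    cdef size_t n

    if buf_obj[0] == NULL:
        # Separate section for the NULL case: treat it as an empty string
        # without touching any Python object.
        data = b""
    else:
        # Casting to object gives an owned reference that Cython releases
        # automatically, so no explicit Py_INCREF/Py_XDECREF is needed.
        value = <object>buf_obj[0]
        if value is None:
            data = b""
        elif isinstance(value, bytes):
            data = value
        else:
            # Assumption: anything else is stringified and must be ASCII.
            data = str(value).encode("ascii")

    c_data = data      # points into `data`; valid while `data` is alive
    n = len(data)      # length not including the null terminator
    if strlen(c_data) != n:
        raise ValueError("VLEN strings do not support embedded NULLs")

    buf_cstring[0] = <char*>malloc(n + 1)
    if buf_cstring[0] == NULL:
        raise MemoryError()
    memcpy(buf_cstring[0], c_data, n + 1)
    return 0
```

With `value` and `data` typed as Python objects, the `try`/`finally` block and the manual reference counting disappear, because Cython emits the DECREFs on every exit path, including the one taken when `ValueError` is raised.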
