Passing C++ reference types and copy constructors

566 views
Skip to first unread message

Barry Warsaw

unread,
Jul 12, 2012, 3:28:42 PM7/12/12
to cython...@googlegroups.com
I'm well underway with an experimental branch to wrap the Xapian C++ API for
Python 3 using Cython. I'm encouraged enough, despite the occasional hiccup,
to put this work into a package that we can use in Ubuntu, where we're on a
mission to convert our desktop applications to Python 3. The code is
available[*] on Launchpad in Bazaar.

I've been able to work around most of the roadblocks I've encountered so far,
but I'm hitting one now that's got me stumped.

There is a Xapian::Document class with these two methods:

TermIterator termlist_begin() const
TermIterator termlist_end() const

which I expose in my .pxd file like so:

cdef cppclass Document:
# ...
TermIterator termlist_begin()
TermIterator termlist_end()

cdef cppclass TermIterator:
TermIterator()
TermIterator(TermIterator&)
string operator*()
TermIterator& operator++()
string get_description()

I'm having trouble implementing the TermIterator wrapper class in my .pyx
file. Here's what I have so far:

cdef class Document:
cdef xapianlib.Document * _this

def __cinit__(self):
self._this = new xapianlib.Document()

def __iter__(self):
return TermIterator(self)

cdef class TermIterator:
cdef xapianlib.TermIterator * _here
cdef xapianlib.TermIterator * _end

def __cinit__(self, document):
cdef xapianlib.TermIterator& here = <xapianlib.TermIterator&>document._this.termlist_begin()
cdef xapianlib.TermIterator& end = <xapianlib.TermIterator&>document._this.termlist_end()
self._here = new xapianlib.TermIterator(here)
self._end = new xapianlib.TermIterator(end)

This was the best I could do to get Cython to generate its .cpp file, but C++
compilation of that fails:

xapian.cpp: In function ‘int __pyx_pf_6xapian_12TermIterator___cinit__(__pyx_obj_6xapian_TermIterator*, PyObject*)’:
xapian.cpp:3540:25: error: ‘__pyx_v_here’ declared as reference but not initialized
xapian.cpp:3541:25: error: ‘__pyx_v_end’ declared as reference but not initialized

Line 3540 looks like this:

static int __pyx_pf_6xapian_12TermIterator___cinit__(struct __pyx_obj_6xapian_TermIterator *__pyx_v_self, CYTHON_UNUSED PyObject *__pyx_v_document) {
Xapian::TermIterator &__pyx_v_here;
Xapian::TermIterator &__pyx_v_end;
int __pyx_r;
__Pyx_RefNannyDeclarations
__Pyx_RefNannySetupContext("__cinit__", 0);

If I remove the `&` from the cdef declarations in TermIterator.__cinit__(),
then Cython is unhappy:

Error compiling Cython file:
------------------------------------------------------------
...
cdef class TermIterator:
cdef xapianlib.TermIterator * _here
cdef xapianlib.TermIterator * _end

def __cinit__(self, document):
cdef xapianlib.TermIterator here = <xapianlib.TermIterator&>document._this.termlist_begin()
^
------------------------------------------------------------

xapian.pyx:236:43: Casting temporary Python object to non-numeric non-Python type

Then I tried creating xapianlib.TermIterator* objects in the
Document.__iter__() method, passing those to the TermIterator constructor, but
this leads to Cython errors complaining it can't turn
xapianlib.TermIterator*'s into Python objects. My guess is that __cinit__()
is a Python-level method and can't be used to pass C++ objects into it.

I've settled on adding a cdef to the TermIterator class to "pivot" the
internal pointers, e.g.:

cdef class Document:
cdef xapianlib.Document * _this

def __iter__(self):
cdef xapianlib.TermIterator * begin = new xapianlib.TermIterator(self._this.termlist_begin())
cdef xapianlib.TermIterator * end = new xapianlib.TermIterator(self._this.termlist_end())
term_iter = TermIterator()
term_iter._pivot(begin, end)
return term_iter

cdef class TermIterator:
cdef xapianlib.TermIterator * _here
cdef xapianlib.TermIterator * _end

def __cinit__(self):
self._here = NULL
self._end = NULL

def __dealloc__(self):
if self._here != NULL:
del self._here
if self._end != NULL:
del self._end

cdef _pivot(self,
xapianlib.TermIterator * begin, xapianlib.TermIterator * end):
self._here = begin
self._end = end


As a general approach, I don't much like this too much because I'm not sure
I'll always have a reliable copy constructors to turn the reference types into
pointer types. I'll note that Cython does not accept _here and _end as
xapianlib.TermIterator& types.

Any other suggestions?

Cheers,
-Barry

[*] `bzr branch lp:pyxap` FLOSS license TBD.
signature.asc

Bradley Froehle

unread,
Jul 13, 2012, 1:43:58 PM7/13/12
to cython...@googlegroups.com
Hi Barry,

Let me address only one error message, which I excerpted below:

In the __cinit__, Cython doesn't know that `document` is of type `Document`, so the `document._this` is likely translating into a pythonic attribute lookup.  I think you can fix it by doing:
    def __cinit__(self, Document document):
        ... 

Robert Bradshaw

unread,
Jul 13, 2012, 2:05:47 PM7/13/12
to cython...@googlegroups.com
To make a long story short, supporting this would require imposing
complicated C++ scoping rules onto Cython code. (In C++ for instance,
each {} block creates a new scope, and these blocks aren't even
explicit in the Cython code.) The issue with references and
stack-allocated C++ objects with non-trivial constructors are the
same. The solution for both is to pointers instead.

> If I remove the `&` from the cdef declarations in TermIterator.__cinit__(),
> then Cython is unhappy:
>
> Error compiling Cython file:
> ------------------------------------------------------------
> ...
> cdef class TermIterator:
> cdef xapianlib.TermIterator * _here
> cdef xapianlib.TermIterator * _end
>
> def __cinit__(self, document):
> cdef xapianlib.TermIterator here = <xapianlib.TermIterator&>document._this.termlist_begin()
> ^
> ------------------------------------------------------------
>
> xapian.pyx:236:43: Casting temporary Python object to non-numeric non-Python type
>
> Then I tried creating xapianlib.TermIterator* objects in the
> Document.__iter__() method, passing those to the TermIterator constructor, but
> this leads to Cython errors complaining it can't turn
> xapianlib.TermIterator*'s into Python objects. My guess is that __cinit__()
> is a Python-level method and can't be used to pass C++ objects into it.

True, though as Bradley correctly mentioned, you can pass a cdef class
(if you declare it as such to access it's C-level members).
You should be able to take the address of the returned reference
itself, and pass that around.

- Robert

Stefan Behnel

unread,
Jul 15, 2012, 6:08:47 AM7/15/12
to cython...@googlegroups.com
Barry Warsaw, 12.07.2012 21:28:
> cdef xapianlib.TermIterator& here = <xapianlib.TermIterator&>document._this.termlist_begin()
> cdef xapianlib.TermIterator& end = <xapianlib.TermIterator&>document._this.termlist_end()

Just a quick additional comment on this kind of code. Cython has enough
type inference in place to not require the type declarations on the left,
which makes this a bit less verbose and more readable.

Stefan

Barry Warsaw

unread,
Jul 18, 2012, 3:39:59 PM7/18/12
to cython...@googlegroups.com
Hi Bradley.

On Jul 13, 2012, at 10:43 AM, Bradley Froehle wrote:

>Let me address only one error message, which I excerpted below:
>
>In the __cinit__, Cython doesn't know that `document` is of type
>`Document`, so the `document._this` is likely translating into a pythonic
>attribute lookup. I think you can fix it by doing:
> def __cinit__(self, Document document):
> ...

Cython doesn't seem to like this.

cdef class Document:
cdef xapianlib.Document * _this

def __iter__(self):
term_iter = TermIterator(self._this)
return term_iter


cdef class TermIterator:
cdef xapianlib.Document * _doc
cdef xapianlib.TermIterator * _current

def __cinit__(self, xapianlib.Document * doc):
self._doc = doc
self._current = NULL


Error compiling Cython file:
------------------------------------------------------------
...

def __len__(self):
return self._this.termlist_count()

def __iter__(self):
term_iter = TermIterator(self._this)
^
------------------------------------------------------------

xapian.pyx:96:37: Cannot convert 'Document *' to Python object

Cheers,
-Barry
signature.asc

Robert Bradshaw

unread,
Jul 18, 2012, 3:50:08 PM7/18/12
to cython...@googlegroups.com
Pass your wrapper Document class in, e.g.

cdef class Document:
cdef xapianlib.Document * _this

def __iter__(self):
term_iter = TermIterator(self._this)
return term_iter


cdef class TermIterator:
cdef xapianlib.Document * _doc
cdef xapianlib.TermIterator * _current

def __cinit__(self, Document doc):
self._doc = doc._this
self._current = NULL

(You may want to even hold onto the cpdef Document itself instead, if
it handles the memory management of its underlying _this pointer.)

- Robert

Barry Warsaw

unread,
Jul 18, 2012, 3:53:48 PM7/18/12
to cython...@googlegroups.com
Indeed, it does seem to work without the lhs type declaration, but I've hit
another show stopper. I have modified my code somewhat since the original
message, so let me include the relevant parts.

cdef class TermIterator:
cdef xapianlib.Document * _doc
cdef xapianlib.TermIterator * _current

cdef _advance(self):
if self._doc == NULL:
raise StopIteration
if self._current == NULL:
# Turn reference into pointer.
self._current = new xapianlib.TermIterator(
self._doc.termlist_begin())
end = self._doc.termlist_end()
if deref(self._current) != end:
postinc(self._current)
else:
raise StopIteration

def __next__(self):
self._advance()

Error compiling Cython file:
------------------------------------------------------------
...
raise StopIteration
if self._current == NULL:
self._current = new xapianlib.TermIterator(
self._doc.termlist_begin())
end = self._doc.termlist_end()
if deref(self._current) != end:
^
------------------------------------------------------------

xapian.pyx:264:32: Invalid types for '!=' (TermIterator, TermIterator)


The problem is that equivalent C++ code looks something like this:

serialise_document(const Xapian::Document &doc)
{
// ...
Xapian::TermIterator term;
for (term = doc.termlist_begin(); term != doc.termlist_end(); ++term) {
// ...
}


You can see that the conditional in the for-loop is comparing two
Xapian::TermIterator objects. doc.termlist.end() returns the same type as
doc.termlist_begin(), i.e. a Xapian::TermIterator[1]. I don't know how to
translate that into the equivalent Cython code. I thought there might be a
cython.operator.equality operator or somesuch, but there's not.

Is this possible?

Cheers,
-Barry

[1] http://xapian.org/docs/apidoc/html/classXapian_1_1Document.html
signature.asc

Barry Warsaw

unread,
Jul 18, 2012, 3:59:05 PM7/18/12
to cython...@googlegroups.com
On Jul 18, 2012, at 12:50 PM, Robert Bradshaw wrote:

>Pass your wrapper Document class in, e.g.
>
>cdef class Document:
> cdef xapianlib.Document * _this
>
> def __iter__(self):
> term_iter = TermIterator(self._this)

s/self._this/self/ and it works, thanks!

> return term_iter
>
>
>cdef class TermIterator:
> cdef xapianlib.Document * _doc
> cdef xapianlib.TermIterator * _current
>
> def __cinit__(self, Document doc):
> self._doc = doc._this
> self._current = NULL
>
>(You may want to even hold onto the cpdef Document itself instead, if
>it handles the memory management of its underlying _this pointer.)

/me nods. Thanks.
-Barry
signature.asc

Robert Bradshaw

unread,
Jul 18, 2012, 4:06:59 PM7/18/12
to cython...@googlegroups.com
On Wed, Jul 18, 2012 at 12:53 PM, Barry Warsaw <ba...@python.org> wrote:
> On Jul 15, 2012, at 12:08 PM, Stefan Behnel wrote:
>
>>Barry Warsaw, 12.07.2012 21:28:
>>> cdef xapianlib.TermIterator& here = <xapianlib.TermIterator&>document._this.termlist_begin()
>>> cdef xapianlib.TermIterator& end = <xapianlib.TermIterator&>document._this.termlist_end()
>>
>>Just a quick additional comment on this kind of code. Cython has enough
>>type inference in place to not require the type declarations on the left,
>>which makes this a bit less verbose and more readable.
>
> Indeed, it does seem to work without the lhs type declaration, but I've hit
> another show stopper. I have modified my code somewhat since the original
> message, so let me include the relevant parts.
>
> cdef class TermIterator:
> cdef xapianlib.Document * _doc
> cdef xapianlib.TermIterator * _current
>
> cdef _advance(self):
> if self._doc == NULL:
> raise StopIteration
> if self._current == NULL:
> # Turn reference into pointer.
> self._current = new xapianlib.TermIterator(
> self._doc.termlist_begin())

You should be able to do

self._current = &self._doc.termlist_begin()

(maybe requires a recent Cython).

> end = self._doc.termlist_end()
> if deref(self._current) != end:
> postinc(self._current)
> else:
> raise StopIteration
>
> def __next__(self):
> self._advance()
>
> Error compiling Cython file:
> ------------------------------------------------------------
> ...
> raise StopIteration
> if self._current == NULL:
> self._current = new xapianlib.TermIterator(
> self._doc.termlist_begin())
> end = self._doc.termlist_end()
> if deref(self._current) != end:
> ^
> ------------------------------------------------------------
>
> xapian.pyx:264:32: Invalid types for '!=' (TermIterator, TermIterator)

Did you declare an operatior!= method in your declaration of
TermIterator? BTW, thous would probably be more easily written using
the yield keyword in __iter__ rather than making a whole separate
class. (And even better, had the methods been named "begin" and "end"
as per the stl convention, you could write

def __iter__(self):
for x in self._this:
return x
.)

Barry Warsaw

unread,
Jul 18, 2012, 6:56:39 PM7/18/12
to cython...@googlegroups.com
On Jul 18, 2012, at 01:06 PM, Robert Bradshaw wrote:

>> cdef _advance(self):
>> if self._doc == NULL:
>> raise StopIteration
>> if self._current == NULL:
>> # Turn reference into pointer.
>> self._current = new xapianlib.TermIterator(
>> self._doc.termlist_begin())
>
>You should be able to do
>
> self._current = &self._doc.termlist_begin()
>
>(maybe requires a recent Cython).

More recent than 0.16? I'm using that version from PyPI in a virtualenv.

>> Error compiling Cython file:
>> ------------------------------------------------------------
>> ...
>> raise StopIteration
>> if self._current == NULL:
>> self._current = new xapianlib.TermIterator(
>> self._doc.termlist_begin())
>> end = self._doc.termlist_end()
>> if deref(self._current) != end:
>> ^
>> ------------------------------------------------------------
>>
>> xapian.pyx:264:32: Invalid types for '!=' (TermIterator, TermIterator)
>
>Did you declare an operatior!= method in your declaration of
>TermIterator?

No, because here's the weird part: the Xapian::TermIterator class reference
doesn't include operator!= or operator==

http://xapian.org/docs/apidoc/html/classXapian_1_1TermIterator.html

and yet, this seems to be how it's used in all the C++ code samples I've been
able to find. Those methods are also not defined in the Xapian source code
afaict. My C++-fu is rusty to understand how that actually works.

For kicks though, I added that definition to my TermIterator class

cdef cppclass TermIterator:
int operator!=(TermIterator&)

(note that Cython complained about using bool instead of int for the return
type)

With this, it compiles. (The test segfaults, but I think that's unrelated;
I'm still investigating.)

>BTW, thous would probably be more easily written using
>the yield keyword in __iter__ rather than making a whole separate
>class. (And even better, had the methods been named "begin" and "end"
>as per the stl convention, you could write
>
> def __iter__(self):
> for x in self._this:
> return x
>.)

Cool, thanks. I'll try those suggestions too.

-Barry
signature.asc

Bradley Froehle

unread,
Jul 18, 2012, 8:33:55 PM7/18/12
to cython...@googlegroups.com
For kicks though, I added that definition to my TermIterator class

    cdef cppclass TermIterator:
        int operator!=(TermIterator&)

(note that Cython complained about using bool instead of int for the return
type)

Maybe use `bint` instead of `int` as the declared return type?  Otherwise you can dig out a definition of the c++ bool:
`from libcpp cimport bool as cbool`.

With this, it compiles.  (The test segfaults, but I think that's unrelated;
I'm still investigating.)

Just for the record, the operator!= is defined outside of the class but in the same header file: http://xapian.org/docs/apidoc/html/termiterator_8h.html
 
Glad to see that you've made some progress.
Reply all
Reply to author
Forward
0 new messages