List-like container for Extension types and getting an item with nogil

Matěj Laitl

Aug 25, 2012, 8:01:24 PM
to cython...@googlegroups.com
Hi,
I've been struggling lately to get a reference to an Extension class instance inside a function declared nogil, to be used in prange. I want to do something like this (in parallel):

cdef Nuclide[:] nuclides  # pre-populated earlier

cdef bint compute_all_nuclides(double param) nogil:
    cdef Nuclide nuclide
    for i in range(nuclides.shape[0]):
        nuclide = nuclides[i]
        compute_nuclide(nuclide, param)

cdef bint compute_nuclide(Nuclide nuclide, double param) nogil:
    # C-only computationally expensive work.

The above doesn't work; Cython reports "Function declared nogil has Python locals or temporaries", "Assignment of Python object not allowed without gil" and "Cannot access buffer with object dtype without gil". The last two messages seem reasonable (or is accessing a view of extension types without the GIL just not implemented, so that it falls back to Python object access?), so I've tried:

cdef compute_all_nuclides(double param) nogil:
    cdef Nuclide nuclide
    for i in range(nuclides.shape[0]):
        with gil:
            nuclide = nuclides[i]
        compute_nuclide(nuclide, param)

...but it still fails with "Function declared nogil has Python locals or temporaries." Is this supposed to apply to extension types, too? Extension types are supported in nogil function arguments, but unsupported as locals (even if just referenced)? Does Python object refcounting need GIL? (what about atomic integers then?) Nuclide cannot be easily converted to C/C++ type - contains methods intended to be callable from Python space.

I've also tried to use libcpp.list as a container, but this probably isn't supposed to work with extension types, and it crashes the Cython compiler. (It would also solve only part of my problem.)

Regards, and sorry for so many questions,
            Matěj

Feng Yu

Aug 26, 2012, 1:21:13 AM
to cython...@googlegroups.com
Here is my very ugly way of doing it:

import numpy
cimport numpy

cdef class Nuclide:
    cdef readonly int i
    def __init__(self, i):
        self.i = i

# Object array: each slot stores an object pointer and owns a reference to it.
cdef numpy.ndarray nuclides = numpy.empty(10, dtype='object')

for i in range(10):
    nuclides[i] = Nuclide(i)

def func():
    # View the array buffer as raw pointers so that no refcounting is generated.
    cdef void ** ptr = <void **> (nuclides.data)
    cdef int a = 0
    cdef int i

    with nogil:
        for i in range(10):
            a = a + (<Nuclide>ptr[i]).i
    print(a)

func()

I checked the generated C code, and the loop is indeed GIL-safe: ptr[i] is
cast to the Nuclide C struct, with no refcounting.

Nevertheless, you can't cast nuclides.data to <Nuclide **>, because
"Pointer base type cannot be a Python object".

- Yu

Robert Bradshaw

Aug 26, 2012, 1:55:10 AM
to cython...@googlegroups.com
Yes, refcounting is the issue.

> (what about atomic integers then?)

Should CPython choose to adopt them, that would solve this issue.

> Nuclide cannot be easily converted to C/C++
> type - contains methods intended to be callable from Python space.
>
> I've also tried to use libcpp.list as a container, but this isn't probably
> supposed to work with extension types and crashes the Cython compiler. (and
> would solve just the part of my problem)

Yeah, you'll have refcounting issues trying to do that too.

What you really want is a borrowed reference, but we haven't yet
figured out the best way to add that to Cython. (It's both a question
of syntax and making it hard for the user to shoot themselves in the
foot.) You could do things with NumPy arrays, but even easier is to do

from cpython.list cimport PyList_GET_ITEM

cdef list nuclides = [...]
with nogil:
    for i in range(10):
        compute_nuclide(<Nuclide>nuclides[i], param)

You can also do (again, untested)

cdef compute_all_nuclides(double param) with gil:
    cdef Nuclide nuclide
    with nogil:
        for i in range(nuclides.shape[0]):
            with gil:
                nuclide = nuclides[i]
            compute_nuclide(nuclide, param)

- Robert

Stefan Behnel

Aug 26, 2012, 5:20:58 AM
to cython...@googlegroups.com
Feng Yu, 26.08.2012 07:21:
> cdef class Nuclide:
>     cdef readonly int i
> [...]
>     with nogil:
>         for i in range(10):
>             a = a + (<Nuclide>ptr[i]).i

Interesting. I didn't even know that that worked. Makes sense.

Methods are still a different kind of beast, though, because someone must
own the object they are being called on, and taking ownership requires
holding the GIL to increase the refcount.

Stefan

Matěj Laitl

Aug 26, 2012, 7:08:47 AM
to cython...@googlegroups.com
On 25. 8. 2012 Robert Bradshaw wrote:
> On Sat, Aug 25, 2012 at 5:01 PM, Matěj Laitl <ma...@laitl.cz> wrote:
> > Hi,
> > I've been struggling lately to get a reference to an Extension class
> > instance inside a method declared nogil to be used in prange. I want to do
> > something like (in parallel):
> > (...)
> > Does Python object refcounting need GIL?
>
> Yes, refcounting is the issue.
>
> > (what about atomic integers then?)
>
> Should CPython choose to adopt them, that would solve this issue.

I see. There must be some road-block I'm not getting, otherwise I don't see
why they're not adopted.

> What you really want is a borrowed reference, but we haven't yet
> figured out the best way to add that to Cython. (It's both a question
> of syntax and making it hard for the user to shoot themselves in the
> foot.) You could do things with NumPy arrays, but even easier is to do
>
> from cpython.list cimport PyList_GET_ITEM
>
> cdef list nuclides = [...]
> with nogil:
>     for i in range(10):
>         compute_nuclide(<Nuclide>nuclides[i], param)

Interesting, this means that list indexing is GIL-free after cimporting
PyList_GET_ITEM? It seems that it circumvents refcounting, but the list
already holds a reference to nuclide, no problem.

> You can also do (again, untested)
>
> cdef compute_all_nuclides(double param) with gil:
>     cdef Nuclide nuclide
>     with nogil:
>         for i in range(nuclides.shape[0]):
>             with gil:
>                 nuclide = nuclides[i]
>             compute_nuclide(nuclide, param)

Thanks, this works. (although I'm getting significant overhead, not sure
whether caused by GIL handling or something different)

Cheers,
Matěj

Stefan Behnel

Aug 26, 2012, 7:26:36 AM
to cython...@googlegroups.com
Matěj Laitl, 26.08.2012 13:08:
> On 25. 8. 2012 Robert Bradshaw wrote:
>> from cpython.list cimport PyList_GET_ITEM
>>
>> cdef list nuclides = [...]
>> with nogil:
>>     for i in range(10):
>>         compute_nuclide(<Nuclide>nuclides[i], param)
>
> Interesting, this means that list indexing is GIL-free after cimporting
> PyList_GET_ITEM?

Nope, I think that was just a left-over from Robert's mail edits. He might
have meant to use it instead of indexing.

Note that even if you manage to get at the typed object reference without
acquiring the GIL, not holding the GIL will prevent you from doing many
interesting things with it.

If your problem really is as simple as keeping data in an array and
processing it in parallel, it might turn out to be way easier to drop
Python classes altogether and use C structs with functions instead.

Remember that your internal data structures don't have to map to Python
space 1:1. It's really not uncommon to use highly specialised low-level C
data structures internally and then write a pythonic wrapper that fakes a
totally object oriented API around them.

Stefan
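
A minimal sketch of the layout Stefan describes, assuming the per-nuclide parameters are plain doubles. The names NuclideData, NuclideTable, set_item and the two fields are hypothetical and do not come from the thread; the arithmetic is only a stand-in for the expensive work.

```cython
# cython: language_level=3
from libc.stdlib cimport malloc, free

# Plain C struct: it has no refcount, so it can be used without the GIL.
# The field names are hypothetical.
cdef struct NuclideData:
    double decay_constant
    double abundance

cdef double compute_nuclide(NuclideData *nuclide, double param) nogil:
    # Stand-in for the computationally expensive C-only work.
    return nuclide.abundance * nuclide.decay_constant * param

cdef class NuclideTable:
    """Python-facing wrapper that owns a C array of NuclideData."""
    cdef NuclideData *data
    cdef Py_ssize_t size

    def __cinit__(self, Py_ssize_t size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.data = <NuclideData *>malloc(size * sizeof(NuclideData))
        if self.data == NULL:
            raise MemoryError()
        self.size = size

    def __dealloc__(self):
        # free(NULL) is a no-op, so this is safe even if __cinit__ failed.
        free(self.data)

    def set_item(self, Py_ssize_t i, double decay_constant, double abundance):
        if not 0 <= i < self.size:
            raise IndexError(i)
        self.data[i].decay_constant = decay_constant
        self.data[i].abundance = abundance

    def compute_all(self, double param):
        cdef Py_ssize_t i
        cdef double total = 0.0
        # Only C data is touched inside the block, so no GIL is needed.
        with nogil:
            for i in range(self.size):
                total += compute_nuclide(&self.data[i], param)
        return total
```

From Python this would be used as t = NuclideTable(2); t.set_item(0, 2.0, 0.5); t.compute_all(3.0). The Python-visible methods live on the wrapper, while the inner loop only ever sees C structs.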

Dag Sverre Seljebotn

Aug 26, 2012, 10:20:34 AM
to cython...@googlegroups.com
Overhead compared to what?

When you stick data in Nuclide, you'll still need to follow that pointer
and jump wildly around in memory, which is not very efficient for the
CPU. Sticking the data directly in a NumPy array will always be
significantly faster just due to how the hardware (and Python lists) work.

In summary: Don't attempt C++-style programming in languages that aren't
C++.

Dag
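
A rough sketch of what "sticking the data directly in an array" could look like, assuming each nuclide reduces to a single double parameter. The names compute_one, compute_all, decay_constants and out are hypothetical, and the multiplication is only a placeholder for the real work.

```cython
# cython: language_level=3
from cython.parallel import prange

cdef double compute_one(double decay_constant, double param) nogil:
    # Stand-in for the computationally expensive C-only work.
    return decay_constant * param

def compute_all(double[::1] decay_constants, double[::1] out, double param):
    """Fill out[i] from decay_constants[i]; both are contiguous double buffers."""
    cdef Py_ssize_t i
    cdef Py_ssize_t n = decay_constants.shape[0]
    if out.shape[0] != n:
        raise ValueError("out must have the same length as decay_constants")
    # The parameters are stored contiguously as C doubles, so the loop body
    # touches no Python objects. Without OpenMP compiler flags (for example
    # -fopenmp) prange runs as an ordinary serial loop.
    for i in prange(n, nogil=True):
        out[i] = compute_one(decay_constants[i], param)
```

Any writable buffer of doubles can be passed in, for example a NumPy float64 array or array.array('d', ...).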

Matěj Laitl

Aug 26, 2012, 4:56:15 PM
to cython...@googlegroups.com
> Overhead compared to what?

Overhead of the parallel versus the sequential code. (prange was used in both
runs, but -fopenmp was passed only for the first.) The parallel code took twice
the wall-clock time and ten times the CPU time. This probably means that prange
was used too low in the call stack.

> When you stick data in Nuclide, you'll still need to follow that pointer
> and jump wildly around in memory, which is not very efficient for the
> CPU. Sticking the data directly in a NumPy array will always be
> significantly faster just due to how the hardware (and Python lists) work.

Nuclide is just a container for a couple of read-only parameters; the actual
result is written to a prepared Cython memory view (I just omitted that from
the example). There are only a few Nuclide instances, and the same instances
are shared by all threads. They're very likely on a memory page that doesn't
get written to, so I see no cache coherency problems.

The code in compute_nuclide(nuclide, param) is the expensive part: its total
time is approximately 1000 times greater than the total time of
compute_all_nuclides(param).

> In summary: Don't attempt C++-style programming in languages that aren't
> C++.

What do you mean by C++-style programming here? Writing results to objects?

Matěj

Feng Yu

Aug 29, 2012, 12:53:10 PM
to cython...@googlegroups.com
cdef nogil methods work: refcounting is maintained by numpy's
object array. As long as we hold a reference to the nuclides array,
the elements are (indirectly) owned.

Python methods won't work: they return a Python object, and their
parameters are passed around as Python objects,
the creation of which requires the GIL for refcounting.


>>>>>>>>>>>
import numpy
cimport numpy

cdef class Nuclide:
    cdef readonly int i
    def __init__(self, i):
        self.i = i
    cdef void cmethod(Nuclide self, int i) nogil:
        pass

cdef numpy.ndarray nuclides = numpy.empty(10, dtype='object')

for i in range(10):
    nuclides[i] = Nuclide(i)

def func():
    cdef void ** ptr
    cdef int a = 0
    cdef int i
    ptr = <void **> nuclides.data

    with nogil:
        for i in range(10):
            a = a + (<Nuclide>ptr[i]).i
            (<Nuclide>ptr[i]).cmethod(i)

    print(a)

func()