On Tue, Jan 6, 2015 at 7:30 PM, Sturla Molden <sturla...@gmail.com> wrote:
> On 07/01/15 01:59, Robert Bradshaw wrote:
>> It probably, usually, does. Even if you wanted to count on this
>> (current?) implementation detail it's a bit hackish.
>
> It seems to be ok. If the class has cdef methods there is a vtab pointer in
> the struct as well though.
>
> But yeah, it is hackish, and it depends on an implementation detail.
>
>
>> Why not
>>
>>     cdef struct FooStruct:
>>         double x
>>         double y
>>
>>     cdef class Foo:
>>         cdef FooStruct data
>>
>>     cdef Foo foo = Foo()
>>     cdef FooStruct *foo_c = &foo.data
>>
>> in your C code you would just pass around the bare FooStruct*. I
>> suppose you'd have a level of indirection for Python access, but it's
>> not too bad.
>>
>
> I thought about this a lot, but it does not really solve my problem.
>
> First a tree like cKDTree is constructed of nodes, and I need to be able to
> access the child nodes in the algorithm.

Ah, so you want to traverse an entire data structure without
refcounting. Of course you'd need the GIL and refcounting to safely
modify it... (well, I suppose you could swap nodes or do other
refcount-preserving operations). You'd still want to make sure no one
else is concurrently touching your structure.

> The second thing is that we are using heaps as priority queues. I want to do
> an incref when a node is pushed into the heap and a decref when it is popped
> off. But I cannot have any refcounting happening inside the heap.
>
> Right now scipy.spatial.cKDTree is written to use C structs:
>
> https://github.com/scipy/scipy/blob/master/scipy/spatial/ckdtree.pyx
>
> But the query method has been notorious for leaking memory. The main problem
> is use of malloc inside a complex algorithm. I have spent a lot of time
> trying to weed out the leaks, but it is really too complicated...

I wonder if using a memory arena would make things much nicer here
(though it wouldn't help with the other issues). Reminds me of a fellow
student of mine who was trying to convert someone's binary to be used
as a library, but it just malloc'd stuff left and right and assumed
process exit was right around the corner...
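
For illustration only, here is a minimal sketch of what a memory arena could look like in plain C. Nothing below comes from cKDTree: `Arena`, `ArenaBlock`, `arena_init`, `arena_alloc` and `arena_free_all` are hypothetical names, and the design (a bump pointer over a linked list of malloc'd blocks) is just one common way to do it. The point is that a query would allocate all of its scratch nodes from the arena and release them with a single call on every exit path, instead of pairing each malloc with its own free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena; all names are illustrative. */
typedef struct ArenaBlock {
    struct ArenaBlock *next;  /* previously filled block, or NULL */
    size_t used;              /* bytes already handed out from data[] */
    size_t capacity;          /* usable bytes in data[] */
    max_align_t data[];       /* payload, aligned for any object type */
} ArenaBlock;

typedef struct Arena {
    ArenaBlock *head;         /* block currently being filled */
    size_t block_size;        /* default payload size of a new block */
} Arena;

static void arena_init(Arena *a, size_t block_size)
{
    assert(block_size > 0);
    a->head = NULL;
    a->block_size = block_size;
}

/* Returns memory owned by the arena, or NULL on failure.
   Individual allocations are never freed one by one. */
static void *arena_alloc(Arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > SIZE_MAX - sizeof(ArenaBlock) - align)
        return NULL;                        /* request too large */
    size = (size + align - 1) / align * align;  /* round up to alignment */

    if (a->head == NULL || a->head->capacity - a->head->used < size) {
        /* Current block is missing or full: start a new one. */
        size_t capacity = size > a->block_size ? size : a->block_size;
        if (capacity > SIZE_MAX - sizeof(ArenaBlock))
            return NULL;
        ArenaBlock *b = malloc(sizeof(ArenaBlock) + capacity);
        if (b == NULL)
            return NULL;
        b->next = a->head;
        b->used = 0;
        b->capacity = capacity;
        a->head = b;
    }
    void *p = (unsigned char *)a->head->data + a->head->used;
    a->head->used += size;
    return p;
}

/* Releases every block at once; all pointers from arena_alloc become invalid. */
static void arena_free_all(Arena *a)
{
    ArenaBlock *b = a->head;
    while (b != NULL) {
        ArenaBlock *next = b->next;
        free(b);
        b = next;
    }
    a->head = NULL;
}
```

A caller would run `arena_init` once per query, take every temporary node from `arena_alloc`, and call `arena_free_all` before returning, including on error paths. The trade-off is that memory is only reclaimed in bulk, which suits short-lived scratch data but not long-lived nodes.
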
> So now I want to let Python deal with the memory, but it cannot impact the
> performance of the data structure.
>
> Another thing is that it would be nice to support pickle, as well as letting
> the tree be viewable from Python. This too points to using cdef classes
> instead of plain C structs.
>
> I am afraid to suggest this, but perhaps Cython needs a "borrowed" keyword
> to indicate a borrowed reference, so we could
>
> cdef object foo = Foo()
> cdef borrowed object bfoo = foo
>
> And the only thing borrowed would do is to turn off reference counting.

This has been toyed with before; one possible syntax is

    cdef Foo foo       # refcounted
    cdef Foo *foo_ref  # borrowed, one could do Foo **fffoo as well.

Seeing how much people mess up on char* conversion, I'm a bit wary of
making this too easy, as borrowed references can lead to subtle
heisenbugs.

Note that just referencing fields does not require a refcount, thus you can do

    cdef void* c_node = <void*>py_node
    print (<Foo>c_node).child

> Another possibility would be if there was a compiler directive to enforce a
> C name for a cdef class.
>
>     @cython.cname('CFoo')
>     cdef class Foo:
>         pass
>
>     cdef extern from *:
>         struct CFoo:
>             pass
>
>     cdef Foo foo = Foo()
>     cdef CFoo *cfoo = <CFoo*> (<void*> foo)

You can use the archaic

    cdef public class Foo [object FooC, type FooCType]:
        cdef int x
        cdef Foo child

(which should be available as a directive, but I'm not sure what) then write

    cdef extern from *:
        cdef struct FooC:
            int x
            FooC *child

Is that what you're looking for?
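
As a rough illustration of what the C side could then do, here is a toy sketch in plain C. It is not the struct Cython generates: `ToyNode`, its `refcnt` field and `toy_sum` are hypothetical stand-ins, with `refcnt` playing the role of the real object header. It only shows the access pattern, which is walking child pointers as borrowed references without touching any reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an extension-type struct; NOT the real CPython layout. */
typedef struct ToyNode {
    long refcnt;             /* stands in for the object header */
    int x;
    struct ToyNode *child;   /* reference owned by the parent node */
} ToyNode;

/* Sums x along the child chain through borrowed pointers.
   No reference count is read for bookkeeping or modified here, so this is
   only safe while the caller keeps the root alive and nothing else mutates
   the chain concurrently. */
static int toy_sum(const ToyNode *node)
{
    int total = 0;
    for (; node != NULL; node = node->child) {
        assert(node->refcnt > 0);  /* a borrowed pointer must target a live node */
        total += node->x;
    }
    return total;
}
```

With a real extension type the owner of the root would hold the only reference that matters during the traversal, and the usual caveat applies: if anything drops that reference while the C code is still walking the chain, the borrowed pointers dangle.
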
- Robert