Requesting clarification: when property is used to define python access to a cdef attribute and given the same name, how does cython resolve the name clash?

Alex Meakins

Jan 26, 2014, 5:10:45 PM
to cython...@googlegroups.com

I've encountered an interesting case in the code I'm developing that I can't find clarified in the cython documentation.

If I am building an extension class like so:

vector.pxd:

cdef class Vector:
    cdef x, y, z

vector.pyx:

cdef class Vector:

    <blah, blah>

    property x:
        def __get__(self):
            return self.x

        def __set__(self, double v):
            self.x = v

This compiles fine, and when I cimport the pxd in other pyx files, direct access is always generated for the cdef attributes. What I wanted to confirm is that, despite the x attribute being defined in the Python scope via property while an x attribute is simultaneously declared cdef, the Cython code will *always* use direct attribute access rather than the Python calls (when cimporting). I.e., does the cdef attribute always have priority in the generated code? I can't find any confirmation of this ordering in the documentation, so I thought I'd better check that this is a definite design choice and not a happy accident. Having the above code work makes the Cython side of my API look as tidy as the Python side (I originally had cdef _x, _y, _z as the attributes to avoid a clash, but would obviously prefer x, y, z!).

If I've missed this in the documentation, I apologise.



Stefan Behnel

Jan 27, 2014, 1:28:12 AM
to cython...@googlegroups.com
Alex Meakins, 26.01.2014 23:10:
> I've encountered an interesting case in the code I'm developing that I
> can't find clarified in the cython documentation.
>
> If I am building an extension class like so:
>
> vector.pxd:
>
>> cdef class Vector:
>>     cdef x, y, z
>>
> vector.pyx:
>
>> cdef class Vector:
>>     <blah, blah>
>>     property x:
>>         def __get__(self):
>>             return self.x
>>         def __set__(self, double v):
>>             self.x = v

Note that this can be simplified to

cdef class Vector:
    cdef public double x

But I suspect that the reason why you are using explicit properties here is
that their code is a bit more complex in your real code than in the example
above.


> This compiles fine and when I include the pxd in other pyx files, direct
> access is always generated for the cdef attributes. What I wanted to
> confirm is that, despite defining the x attribute in the python scope via
> property while simultaneously having an x attribute cdef, the cython code
> will *always* use the direct attribute access, rather than attempting to
> use the python calls (when cimporting). i.e. does the cdef attribute always
> have priority in the generated code? I can't find any confirmation of this
> ordering in the documentation so I thought I'd better check this was a
> definite design choice and not a happy accident.

As long as Cython knows that the type of the object you are accessing is a
Vector, it will use direct access to its declared fields and ignore any
defined properties.

The only problem arises when you are using untyped access, i.e. this code

def get_x(v):
    return v.x

get_x(Vector())

will use property access even when compiled with Cython. It will still
work, though.


> Having the above code
> working makes the cython side of my api look as tidy as the python side (I
> originally had cdef _x, _y, _z as the attributes to avoid a clash, but
> would obviously prefer x,y,z!).

Depends. If your properties actually do anything interesting, then you want
to be sure there is no functional difference between the access through the
property and the access through the object field. So, giving both different
names has the advantage of making it clear what happens and what is
intended in the code.

And if they don't do anything interesting, why bother implementing the
properties independently?
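
A plain-Python analogy may make the risk concrete. This is only a sketch: the class below is a hypothetical stand-in (not the raysect Vector), `_x` plays the role of the cdef field, and the `x` property does extra validation that direct field access silently skips, much as typed Cython access to a declared field skips a same-named property.

```python
import math


class Vector:
    """Hypothetical stand-in, not the raysect class."""

    def __init__(self, x):
        self._x = float(x)  # plays the role of the cdef field

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        value = float(value)
        if math.isnan(value):  # the "interesting" behaviour of the property
            raise ValueError("x must not be NaN")
        self._x = value


v = Vector(1.0)
v.x = 2.0               # goes through the setter and its check
v._x = float("nan")     # direct field access: the check never runs
print(math.isnan(v.x))  # True
```

With distinct names (`_x` and `x`) it is at least visible at the call site which of the two paths is taken.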

Stefan

Alex Meakins

Jan 27, 2014, 9:22:07 AM
to cython...@googlegroups.com, stef...@behnel.de

Thanks for the clarification Stefan.

The code I have at present would be better handled by making the cdef public as the property is the same as the direct access. I refactored my code recently and didn't spot the simplification (I've not needed public before and it didn't spring to mind... I'm still relatively new to cython). 

I have had to diverge my cython and python apis in some places, notably vector multiplication. Using __mul__(x,y) is very slow compared to a dedicated cdef Vector mul(self, double v) method - I assume this is because of the overhead required to allow this method to be redefined in a subclass. Is there any way in cython of preventing classes from being extended so that the cython compiler can safely inline the __mul__ code (knowing it can never change)? Overloading methods would be useful in this case too!

If you are interested, the sort of code I'm finding slow is as follows (everything here is a vector, dot is a cpdef):

Calculate the reflection vector (using __mul__ and __sub__):

r = i - 2*n*(n.dot(i))

Writing this as follows is ~10 times faster (d is a double):

d = n.dot(i)
r.x = i.x - 2*n.x*d
r.y = i.y - 2*n.y*d
r.z = i.z - 2*n.z*d

I'd like to avoid having to drop to such a low level for every bit of vector maths if I can.

If you want to have a look, the code (alpha!) is at: www.raysect.org, see the core/math package. Feel free to tell me I'm doing things wrong. The aim of the project is to develop an easily extensible, generic ray-tracing framework for use in high precision scientific and engineering applications. Ease of use is the primary goal, but it still needs to be fast.

There is a bodged together speed test in raysect/tests

To build for development:

dev/build.sh in the package root
dev/test.sh to run the package tests
dev/speedtest.sh to run the speed test



Alex Meakins

Jan 27, 2014, 9:33:00 AM
to cython...@googlegroups.com, stef...@behnel.de

Hmm... just had a look at the generated C for my two examples. The __sub__/__mul__ example generates a lot of temporary objects, unlike the lower-level version. Inlining __mul__ etc. won't necessarily fix the creation of the temporaries... I guess it would make a small improvement, but not as much as I would like.
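
To make the temporaries visible, here is a rough plain-Python sketch rather than the actual raysect or generated C code. The Vector class, its `created` counter and the helper functions are all hypothetical; the point is only to count how many Vector objects each formulation of the reflection expression allocates.

```python
class Vector:
    """Hypothetical stand-in for the thread's Vector; counts every instance."""

    created = 0

    def __init__(self, x, y, z):
        Vector.created += 1
        self.x, self.y, self.z = float(x), float(y), float(z)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z

    def __mul__(self, s):
        return Vector(self.x * s, self.y * s, self.z * s)

    __rmul__ = __mul__  # lets 2 * n work as well as n * 2

    def __sub__(self, other):
        return Vector(self.x - other.x, self.y - other.y, self.z - other.z)


def reflect_operators(i, n):
    # 2*n, (2*n)*d and the final subtraction each build a new Vector
    return i - 2 * n * (n.dot(i))


def reflect_components(i, n):
    # only the result itself is allocated
    d = n.dot(i)
    return Vector(i.x - 2 * n.x * d, i.y - 2 * n.y * d, i.z - 2 * n.z * d)


def count_created(func, i, n):
    """Return how many Vector objects a single call to func allocates."""
    before = Vector.created
    func(i, n)
    return Vector.created - before


i = Vector(1, -1, 0)
n = Vector(0, 1, 0)
print(count_created(reflect_operators, i, n))   # 3
print(count_created(reflect_components, i, n))  # 1
```

Both functions return the same vector; the operator form simply pays for two intermediate objects on top of the result.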

Alex Meakins

Jan 27, 2014, 6:33:07 PM
to cython...@googlegroups.com, stef...@behnel.de

Turns out I was wrong - replacing the operators with cdef methods is substantially faster. Here are the results for a test run:

Test 5: compound maths (reflection vector)

Loop of 2,500,000 operations:
- python: 16218.0 ms
- raysect via python: 13323.5 ms
- raysect via cython: 12758.5 ms
- raysect via optimised cython (high-level): 165.6 ms
- raysect via optimised cython (low-level): 88.0 ms

raysect (python scope) vs python: 1.217 times faster
raysect (cython scope) vs python: 1.271 times faster
raysect (optimised cython, high-level) vs python: 97.922 times faster
raysect (optimised cython, low-level) vs python: 184.318 times faster


The last three results correspond to:

raysect (cython scope) vs python: 1.271 times faster:

def ctest5(int n):

    cdef Vector incident, normal, reflected

    incident = Vector([1,-1,0]).normalise()
    normal = Vector([0,1,0]).normalise()

    for i in range(0, n):

        reflected = incident - 2.0 * normal * normal.dot(incident)

    return reflected

raysect (optimised cython, high-level) vs python: 97.922 times faster:

def cotest5a(int n):

    cdef Vector incident, normal, reflected

    incident = new_vector(1, -1, 0).normalise()
    normal = new_vector(0, 1, 0).normalise()

    for i in range(0, n):

        # r = i - 2*n*(n.i)
        reflected = incident.sub(normal.mul(2.0*normal.dot(incident)))

    return reflected

raysect (optimised cython, low-level) vs python: 184.318 times faster:

def cotest5b(int n):

    cdef Vector incident, normal, reflected
    cdef double d

    incident = new_vector(1, -1, 0).normalise()
    normal = new_vector(0, 1, 0).normalise()

    for i in range(0, n):

        # r = i - 2*n*(n.i)
        d = normal.dot(incident)
        reflected = new_vector(incident.x - 2.0 * normal.x * d,
                               incident.y - 2.0 * normal.y * d,
                               incident.z - 2.0 * normal.z * d)

    return reflected


Using __mul__ and __sub__ is slow compared to cdef mul and cdef sub.

I've found the @cython.final directive can prevent subclassing. In theory, would this not allow __mul__ and __sub__ etc. to be compiled in as though they are inline cdefs? I've tested my above code with Vector defined as final and it makes no change to the speed, so I take it that this isn't done? Is there something I'm missing that would prevent this optimisation? I'd be very happy to make all my core maths classes final if it meant I get a clean api and speed!

Alex Meakins

Jan 27, 2014, 7:42:02 PM
to cython...@googlegroups.com, stef...@behnel.de
Sorry to keep replying to my own post... I guess this would require some form of overloading of __mul__ etc. to gain full speed, as there would need to be proper typing?

Stefan Behnel

Jan 28, 2014, 1:35:26 AM
to cython...@googlegroups.com
Alex Meakins, 28.01.2014 00:33:
> Using __mul__ and __sub__ is slow compared to cdef mul and cdef sub.
>
> I've found the @cython.final directive can prevent subclassing. In theory
> would this not allow __mul__ and __sub__ etc.. to be compiled in as though
> they are inline cdefs? I've tested my above code with Vector defined as
> final and it makes no change to the speed, so I take it that this isn't
> done? Is there something I'm missing that would prevent this optimisation?

No, except that someone would have to implement this.

This is already done for normal method calls to final classes (as you might
have noticed with your mul() and sub() methods). However, that part isn't
very clean in the code base and it would be better to redo it properly
before adding slot support to it.

Stefan
