@cython.cdivision(True) makes things slower....

Neal Hughes

Oct 23, 2013, 6:07:21 AM
to cython...@googlegroups.com
I am using this compiler directive to speed up division in a Cython function. Judging by the generated C code / HTML annotation file, it should help, since it removes the zero-division check. But for some reason it makes my code run much slower.

Any ideas why this directive would result in slower code?
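
For readers unfamiliar with the directive, here is a rough pure-Python sketch of the two behaviours being compared for floating-point division. It is an illustration only, not the C code Cython generates; the helper names `checked_div` and `c_style_div` are made up for this example, and the C-style variant assumes IEEE 754 doubles.

```python
import math


def checked_div(num, den):
    """Python semantics (cdivision off): test the divisor, raise on zero."""
    if den == 0.0:
        raise ZeroDivisionError("float division")
    return num / den


def c_style_div(num, den):
    """C semantics (cdivision on), assuming IEEE 754: no check, no exception."""
    if den == 0.0:
        if num == 0.0 or math.isnan(num):
            return math.nan  # 0/0 and nan/0 give nan
        # x/0 gives an infinity whose sign combines both operand signs
        return math.copysign(math.inf, num) * math.copysign(1.0, den)
    return num / den


print(c_style_div(1.0, 0.0))  # inf
try:
    checked_div(1.0, 0.0)
except ZeroDivisionError as exc:
    print("raised:", exc)
```

The only extra work in the checked version is one comparison per division, which is the cost the directive is meant to remove.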

Robert Bradshaw

Oct 23, 2013, 12:01:38 PM
That shouldn't happen. Can you show us the benchmarks you're using?

Neal Hughes

Oct 23, 2013, 6:56:32 PM
OK, here is a demonstration:

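import numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
cdef double[:] example1(double[:] xi, double[:] a, double[:] b, int D):
    # C division: no zero-division check is generated.
    cdef int k
    cdef double[:] x = np.empty(D)  # allocate the output buffer

    for k in range(D):
        x[k] = (xi[k] - a[k]) / (b[k] - a[k])

    return x

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(False)
cdef double[:] example2(double[:] xi, double[:] a, double[:] b, int D):
    # Python division: the zero-division check stays in. The name and the
    # directive value are inferred from the "Without c division" benchmark;
    # the posted text repeated example1 verbatim.
    cdef int k
    cdef double[:] x = np.empty(D)  # allocate the output buffer

    for k in range(D):
        x[k] = (xi[k] - a[k]) / (b[k] - a[k])

    return x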

import time

def test_division(self):

    D = 10000
    x = np.random.rand(D)
    a = np.zeros(D)
    b = np.random.rand(D) + 1

    # Time a single call with C division enabled.
    tic = time.time()
    example1(x, a, b, D)
    toc = time.time()

    print('With c division: ' + str(toc - tic))

    # Time a single call with C division disabled.
    tic = time.time()
    example2(x, a, b, D)
    toc = time.time()

    print('Without c division: ' + str(toc - tic))

Resulting output:

With c division: 0.000194787979126
Without c division: 0.000176906585693

Robert Bradshaw

Oct 24, 2013, 12:05:11 AM
Good question. It could be an artifact of how the compiler deals with
a large expression vs. storing intermediates as temporaries (though
this *shouldn't* matter). However, in my experiments the difference
goes both ways and is certainly less than the run-to-run variance.

https://gist.github.com/robertwb/7131183

Stefan Behnel

Oct 24, 2013, 3:57:58 AM
Robert Bradshaw, 24.10.2013 06:05:
> Good question. It could be an artifact of how the compiler deals with
> a large expression vs. storing intermediates as temporaries (though
> this *shouldn't* matter). However, in my experiments the difference
> goes both ways and is certainly less than the run-to-run variance.

Yes, the timings are way too low to draw a conclusion from them.

It's quite likely that the C compiler and/or the CPU manage to drop the
execution time for the extra safety check to essentially zero by exploiting
branch prediction and pipelining effects, so the difference is simply
negligible and the general variance takes over.

Stefan
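
To make the point about variance concrete, here is a minimal sketch of a benchmark that repeats the measurement and reports the spread rather than timing a single call. It uses only the standard library; `normalise`, `bench` and `summarise` are hypothetical helpers written for this illustration, and `normalise` is a pure-Python stand-in for the Cython loop, so the absolute numbers will not match the ones posted above.

```python
import statistics
import timeit


def normalise(xi, a, b):
    """Pure-Python stand-in for the loop under test (hypothetical helper)."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(xi, a, b)]


def bench(func, *args, repeat=7, number=50):
    """Return `repeat` per-call timings in seconds, each averaged over `number` calls."""
    timer = timeit.Timer(lambda: func(*args))
    return [total / number for total in timer.repeat(repeat=repeat, number=number)]


def summarise(samples):
    """Best time, median, and max-min spread of a list of timings."""
    return {
        "best": min(samples),
        "median": statistics.median(samples),
        "spread": max(samples) - min(samples),
    }


xi = [0.5] * 1000
a = [0.0] * 1000
b = [2.0] * 1000

stats = summarise(bench(normalise, xi, a, b))
print("best %.3g s, median %.3g s, spread %.3g s"
      % (stats["best"], stats["median"], stats["spread"]))
```

If the gap between two variants is smaller than the spread reported for either of them, the measurement cannot tell them apart, which is the situation described in this thread.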
