I am calling from C++ into Python code that has been compiled with Cython, and timing the performance. For example:
Cython code:
cdef void foo(int i):
    do_something_with_i(i)
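For context, the call path from C++ into the Cython module looks roughly like this sketch. The module name foo_module and its generated header are placeholders, and it assumes foo is actually declared cdef public (or exposed via cdef api) so that Cython emits a C-callable declaration:

#include <Python.h>
#include "foo_module.h"   // hypothetical header Cython generates for a `cdef public` foo

int main() {
    // Register the compiled module before initializing the interpreter,
    // then import it so its module-level code runs.
    PyImport_AppendInittab("foo_module", PyInit_foo_module);
    Py_Initialize();
    PyObject* mod = PyImport_ImportModule("foo_module");
    foo(42);              // the Cython function, callable as plain C
    Py_DECREF(mod);
    Py_FinalizeEx();
    return 0;
}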
C++:
for (int i = 0; i < 10000; i++) {
    start_timer();
    foo(i);
    end_timer();
}
For my function foo, this averages approximately 30 microseconds per evaluation (using a high-performance timer).
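For reference, the timer is something like the following sketch (assume std::chrono::steady_clock; the real implementation may differ):

#include <chrono>
#include <vector>

static std::vector<double> samples;                // per-call times in microseconds
static std::chrono::steady_clock::time_point t0;

void start_timer() {
    t0 = std::chrono::steady_clock::now();
}

void end_timer() {
    auto t1 = std::chrono::steady_clock::now();
    samples.push_back(std::chrono::duration<double, std::micro>(t1 - t0).count());
}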
But when I slow the C++ code down with busy work between the timed calls, for example:
double tmp = 0;
for (int i = 0; i < 10000; i++) {
    start_timer();
    foo(i);
    end_timer();
    for (int j = 0; j < niter; j++) tmp = exp(-tmp);
}
The average time jumps to around 200 microseconds per evaluation! Note that the loop calling exp sits outside the timed region.
It seems that the maximum elapsed time for foo() stays roughly constant as niter changes, but more and more of the calls take that maximum time as niter increases.
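To make that concrete, I sort the recorded samples and print a few percentiles, roughly like this (reusing the hypothetical samples vector from the timer sketch above):

#include <algorithm>
#include <cstdio>

void report() {
    std::sort(samples.begin(), samples.end());
    std::printf("min    %8.2f us\n", samples.front());
    std::printf("median %8.2f us\n", samples[samples.size() / 2]);
    std::printf("p99    %8.2f us\n", samples[samples.size() * 99 / 100]);
    std::printf("max    %8.2f us\n", samples.back());
}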
Any ideas why this might be happening?