Message from discussion Best Practices for passing numpy data pointer to C ?
From: Sturla Molden <sturlamol...@yahoo.no>
Date: Sat, 28 Jul 2012 16:40:38 +0200
Local: Sat, Jul 28 2012 10:40 am
Subject: Re: [cython-users] Re: Best Practices for passing numpy data pointer to C ?

> I prepared some quick-and-dirty benchmarks of the behavior I need at
> https://github.com/jakevdp/memview_benchmarks/ -- I'd be interested if
> people more familiar with memory-views could take a look and let me
> know if I'm missing anything there.
>    Jake

I took the liberty of updating your benchmarks (see attachment). For
example, I noticed that GCC was clever enough to optimize away all the
loops in your pointer_arith.pyx...

Here are the timings I got from the updated version in the attachment. I
think this gives the correct picture:

D:\memview-benchmarks\new>python runme.py
numpy_only: 6.86 sec
cythonized_numpy: 5.74 sec
cythonized_numpy_2: 10.4 sec
cythonized_numpy_2b: 6.25 sec
cythonized_numpy_3: 2.43 sec
cythonized_numpy_4: 1.78 sec
pointer_arith: 1.79 sec
memview: 1.86 sec

There is a table in the attached PDF that should be easier to read.

The overhead in the numpy versions comes from slicing the ndarray. In
comparison, slicing a memoryview has very little overhead. If we
slice the ndarray in Cython, the result is not much better than using
plain numpy in Python. But if we use memoryviews, slicing is only
slightly slower than C-style pointer arithmetic.
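To make the slicing overhead concrete, here is a minimal sketch in plain Python with NumPy (not the benchmark code from the attachment). The array X and the helper slice_rows are hypothetical names chosen for illustration. It only shows that every ndarray slice builds a new Python-level ndarray object, even though no data is copied; the measured cost will vary by machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical data, standing in for whatever array a benchmark would use.
X = np.arange(12, dtype=np.float64).reshape(4, 3)

row_a = X[1]
row_b = X[1]

# Each slice is a freshly created ndarray object...
distinct_objects = row_a is not row_b
# ...but both are views onto the same buffer, so no data is copied.
shared_buffer = np.shares_memory(row_a, X) and np.shares_memory(row_a, row_b)

def slice_rows(n_repeats=10000):
    """Slice one row repeatedly; the work is object creation, not arithmetic."""
    for _ in range(n_repeats):
        X[1]

# Rough per-slice cost: 10 calls of 10000 slices each.
per_slice_seconds = timeit.timeit(slice_rows, number=10) / (10 * 10000)

print(distinct_objects, shared_buffer)
print("approx. cost per ndarray slice: %.0f ns" % (per_slice_seconds * 1e9))
```

A typed memoryview slice in Cython avoids creating that Python object, which would be consistent with the much smaller overhead seen in the timings above.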

And consider this: numerical code using array slicing in Fortran 90 with
gfortran is often 2x slower than the same code using pointer arithmetic
in C with GCC, at least in my experience. (Fortran 77 is another matter.)

If you wonder why using np.dot was faster than writing out the loop in
Cython, that is because NumPy in the Enthought distribution is linked
against Intel MKL, so np.dot calls an optimized BLAS routine.
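As a small illustration of the two approaches being compared, the sketch below computes the same inner product with a written-out loop and with np.dot. It is plain Python rather than Cython, the function name dot_loop and the random test vectors are made up for this example, and it checks only that the results agree, not how fast either version is.

```python
import numpy as np

# Hypothetical test vectors; the seed is fixed so the run is repeatable.
rng = np.random.default_rng(0)
a = rng.standard_normal(1000)
b = rng.standard_normal(1000)

def dot_loop(x, y):
    """Written-out inner product, accumulated element by element."""
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * y[i]
    return total

loop_result = dot_loop(a, b)
blas_result = np.dot(a, b)  # handed to the BLAS library NumPy was built with

# The two agree up to floating-point rounding.
print(np.isclose(loop_result, blas_result))
```

Calling np.show_config() reports which BLAS/LAPACK libraries a given NumPy build was linked against, which is one way to check whether MKL is involved.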

Conclusion:

Memoryviews are extremely fast, comparable to pointer arithmetic in C.

Now we need a real benchmark, e.g. some linear algebra solver or an FFT.
Something like Scimark perhaps. Cython vs. C vs. Fortran 90.

Sturla

Attachments:
  memview-benchmarks.zip (5K)
  benchmark.pdf (232K)
