On Wed, Nov 14, 2012 at 1:06 PM, David Warde-Farley
<d.warde...@gmail.com> wrote:
> On Wed, Nov 14, 2012 at 2:33 PM, Dag Sverre Seljebotn
> <d.s.se...@astro.uio.no> wrote:
>>
>> On 11/14/2012 05:46 PM, Bradley Froehle wrote:
>>>
>>> Hi Shriramana:
>>>
>>> There isn't much benefit to doing this in Cython. NumPy will
>>> (hopefully) use an optimized BLAS/LAPACK library to do the matrix solve,
>>> and that is where you will likely spend all of your time anyway.
>>
>>
>> It's worth stressing that this is actually an understatement -- unless you
>> know *a lot* about your CPU and write code tuned specifically for it,
>> anything you implement yourself in Cython will be 10x to 100x slower.
>
>
> To add to what Dag said, if Python overhead is a significant problem (i.e.
> you have lots of 3x3 or 4x4 matrices, and a lot of time is spent in stupid
> error-checking and object unboxing by Python/NumPy), it is possible to call
> BLAS and LAPACK directly from Cython.
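For the curious, the call itself isn't much code. Very roughly, something
like the sketch below (untested; it assumes your LAPACK exports the usual
Fortran symbol dgesv_ with 32-bit integers -- the exact symbol name,
integer width, and link flags depend on your build):

# solve_lapack.pyx -- rough sketch, not tested code.
import numpy as np
cimport numpy as np
np.import_array()

cdef extern from *:
    # Fortran calling convention: every argument is passed by pointer.
    void dgesv_(int *n, int *nrhs, double *a, int *lda,
                int *ipiv, double *b, int *ldb, int *info)

def solve(A, B):
    # LAPACK wants column-major storage, and dgesv_ overwrites its
    # inputs, so make Fortran-ordered copies.
    cdef np.ndarray[np.float64_t, ndim=2] a = np.array(A, dtype=np.float64, order='F')
    cdef np.ndarray[np.float64_t, ndim=2] b = np.array(B, dtype=np.float64, order='F')
    cdef int n = a.shape[0]
    cdef int nrhs = b.shape[1]
    cdef int info = 0
    cdef np.ndarray[np.int32_t, ndim=1] ipiv = np.zeros(n, dtype=np.int32)
    dgesv_(&n, &nrhs, &a[0, 0], &n, <int*>&ipiv[0], &b[0, 0], &n, &info)
    if info != 0:
        raise ValueError("dgesv_ returned info=%d" % info)
    return b  # the solution, one column per right-hand side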
Those libraries are primarily optimized for larger matrix
computations; if you really have 3x3 matrices I wouldn't be surprised
if you could write a faster specialized implementation. (I could be
wrong though.)
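If you do go the hand-rolled route for 3x3 systems, the whole solve can
be unrolled with Cramer's rule -- a sketch (again untested, and with no
pivoting, so it will be less accurate than LAPACK on ill-conditioned
systems):

# solve3.pyx -- rough sketch of a specialized 3x3 solve.
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def solve3(double[:, :] A, double[:] b, double[:] x):
    # det(A), expanded along the first row.
    cdef double d = (A[0,0]*(A[1,1]*A[2,2] - A[1,2]*A[2,1])
                   - A[0,1]*(A[1,0]*A[2,2] - A[1,2]*A[2,0])
                   + A[0,2]*(A[1,0]*A[2,1] - A[1,1]*A[2,0]))
    if d == 0.0:
        raise ZeroDivisionError("singular matrix")
    # x[i] = det(A with column i replaced by b) / det(A)
    x[0] = (b[0]*(A[1,1]*A[2,2] - A[1,2]*A[2,1])
          - A[0,1]*(b[1]*A[2,2] - A[1,2]*b[2])
          + A[0,2]*(b[1]*A[2,1] - A[1,1]*b[2])) / d
    x[1] = (A[0,0]*(b[1]*A[2,2] - A[1,2]*b[2])
          - b[0]*(A[1,0]*A[2,2] - A[1,2]*A[2,0])
          + A[0,2]*(A[1,0]*b[2] - b[1]*A[2,0])) / d
    x[2] = (A[0,0]*(A[1,1]*b[2] - b[1]*A[2,1])
          - A[0,1]*(A[1,0]*b[2] - b[1]*A[2,0])
          + b[0]*(A[1,0]*A[2,1] - A[1,1]*A[2,0])) / d

Whether that actually beats a LAPACK call on your hardware is something
you'd have to measure.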
> numpy.show_config() should be able to tell you what BLAS and LAPACK
> libraries NumPy is using on your system, so that you can link against them
> directly.
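For instance, once show_config() tells you the library names and
directories, a distutils setup script along these lines will link against
them (the "lapack"/"blas" names below are placeholders -- substitute
whatever your NumPy reports, e.g. mkl_rt or a tuned ATLAS):

# setup.py -- sketch; adjust libraries/library_dirs to match
# what numpy.show_config() reports on your system.
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
import numpy

setup(
    cmdclass={'build_ext': build_ext},
    ext_modules=[Extension("solve_lapack", ["solve_lapack.pyx"],
                           include_dirs=[numpy.get_include()],
                           libraries=["lapack", "blas"])],  # placeholders
)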
>
> On the other hand, if your matrices are sufficiently large that the time
> spent inside LAPACK functions absolutely dwarfs everything else, Cython will
> not buy you anything. You'd be better off making sure that your BLAS library
> is a high-quality multithreaded implementation such as a tuned ATLAS or the
> Intel Math Kernel Library.
+1
Note that a lot has changed since those slides were written, though
they're still a good resource.
- Robert