Using LAPACK library for matrix inversion

Zahra Sheikh

unread,

Jan 31, 2018, 6:20:26 AM1/31/18

to cython-users

I would like to use the LAPACK library for my matrix inversion operation. I wrote the following code in cython

import numpy as np
cimport numpy as np
cimport cython
import ctypes


cdef extern from "f2pyptr.h": 
     void *f2py_pointer(object) except NULL
     
import scipy.linalg.lapack
##################################
ctypedef np.float64_t REAL_t
cdef extern from "numpy/arrayobject.h":
     cdef bint PyArray_IS_F_CONTIGUOUS(np.ndarray) nogil
     
ctypedef void (*sgetri_ptr) (long *N, double *A, long *LDA, long *IPIV, double *WORK, long *LWORK, long *INFO)
cdef sgetri_ptr sgetri = <sgetri_ptr>f2py_pointer(scipy.linalg.lapack.sgetri._cpointer) 
ctypedef void (*sgetrf_ptr) (long *M, long *N, double *A, long *LDA, long *IPIV, long *INFO)
cdef sgetrf_ptr sgetrf = <sgetrf_ptr>f2py_pointer(scipy.linalg.lapack.sgetrf._cpointer) 
cdef void inverse_matrix(REAL_t* A, long n, long info):
     cdef np.ndarray[long, ndim=1] ipiv = np.empty(n, dtype=np.int64)
     cdef np.ndarray[double, ndim=1] work     = np.empty(n, dtype=np.float64)
     sgetrf(&n, &n, A, &n, &ipiv[0], &info)
     sgetri(&n, A, &n, &ipiv[0], &work[0], &n, &info)




cpdef REAL_t[::1,:] inverse(REAL_t[::1,:] A):
      if (A.shape[0] !=A.shape[1]):
         raise TypeError("The input array must be a square matrix")
      cdef long N = A.shape[0]
      cdef long info
      A=np.require(A, requirements=['F'])
      inverse_matrix(&A[0,0], N, info)
      
      if info < 0:
          raise ValueError('illegal value in %d-th argument of internal getrf' % -info)
      elif info > 0:
          raise ValueError("Diagonal number %d is exactly zero. Singular matrix." % -info)
      return A

The python code to test the cython function:

>>> import numpy
>>> from numpy.linalg import inv
>>> import timeit
>>> import numpy as np
>>> r=numpy.random.random((1000, 1000))
>>> start = timeit.default_timer()

>>> a=inv(r)
>>> stop = timeit.default_timer()
>>> print "numpy inverse :", stop - start 
numpy inverse : 0.256020069122
>>> from sampling import inverse
>>> start = timeit.default_timer()
>>> xx = np.asfortranarray(r)
>>> inverse(xx)
<MemoryView of 'ndarray' at 0x7f970c18d910>
>>> stop = timeit.default_timer()
>>> print "numpy inverse :", stop - start 
numpy inverse : 0.055704832077
>>> s=inverse(xx)
>>> np.asarray(s)
array([[           nan,            nan,            nan, ...,
        5.92300282e-01, 8.71437036e-01, 2.21625427e-02],
       [           nan,            nan,            nan, ...,
        2.04778954e-02, 1.72210970e-01, 6.43044171e-01],
       [           nan,            nan,            nan, ...,
        9.23144353e-01, 2.00409025e-01, 5.08050017e-01],
       ...,
       [           nan,            nan,            nan, ...,
        7.11929240e-01, 1.29944553e-02, 2.40730572e-01],
       [           nan,            nan,            nan, ...,
        3.85504040e-01, 6.59880766e-01, 6.39747652e-01],
       [           nan,            nan,            nan, ...,
        8.05001051e-01, 2.59658534e-01, 3.27163026e-04]])
>>> a
array([[-0.18046473,  0.14750746,  0.56098709, ..., -0.0308015 ,
         0.30598235,  0.17729771],
       [-0.95803719,  0.40170724, -1.1091461 , ...,  0.23814492,
         0.13964142, -0.43978003],
       [-0.26002019,  0.09651142,  0.60862356, ..., -0.02854744,
         0.4185963 ,  0.18771676],
       ...,
       [-0.24850209,  0.13408271, -0.63191214, ...,  0.06194161,
        -0.093394  , -0.11269955],
       [-0.31974092,  0.20012965,  0.24912217, ...,  0.00788042,
         0.31400232,  0.1736472 ],
       [-0.59926938,  0.34155187, -0.12015427, ...,  0.08699674,
         0.231144  ,  0.06787475]])

Well, it compiles without any error message but the inverted matrix that the function inverse returns is not correct, a lot of nan values. I do not know what is going wrong in my code. Any suggestion about what I did incorrectly?

Best,

Zahra

Jake Vanderplas

unread,

Jan 31, 2018, 8:17:46 AM1/31/18

to cython...@googlegroups.com

One thing pops out immediately: you're using single-precision routines (prefixed by s) with double-precision data (float64).

The first thing I would try would be either switching to double-precision routines (dgetri/dgetrf).

Best of luck,

Jake

--

---
You received this message because you are subscribed to the Google Groups "cython-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cython-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Zahra Sheikh

unread,

Jan 31, 2018, 10:33:24 AM1/31/18

to cython-users

Thanks Jake. You were right. Just by changing sgetri to dgetri, the problem solved.

I am also trying to use blas for dot product of two big arrays and this is my code

ctypedef void (*dgemm_ptr) (char *TRANSA, char *TRANSB, long *M, long *N, long *K, double *ALPHA, double *A, long *LDA, double *B, long *LDB, double *BETA, double *C, long *LDC) nogil  
cdef dgemm_ptr dgemm = <dgemm_ptr>f2py_pointer(scipy.linalg.blas.dgemm._cpointer) 
@cython.wraparound(False)
@cython.boundscheck(False)
cpdef REAL_t[::1,:] dot_product(REAL_t[::1,:] a, REAL_t[::1,:] b, char TRANSA ='N', char TRANSB='N' ):


      cdef REAL_t[::1,:] c_view = np.empty((a.shape[0], b.shape[0]), np.float64, order="F")
      cdef long M, N, K, LDA, LDB, LDC
      cdef REAL_t BETA  = 0.0
      cdef REAL_t ALPHA = 1.0
      if (a.shape[1] !=b.shape[0]):


         raise TypeError("The input array must be a square matrix")


      if (TRANSA   == 'N'):
         M = a.shape[0] #the number  of rows  of the  matrix op( A )  and of the  matrix  C
      elif (TRANSA == 'T'):
         M = a.shape[1]
      if (TRANSB   == 'N'):
         N = b.shape[0]
      elif (TRANSB == 'T'):
         N = b.shape[1]
      K = a.shape[1]
      LDA = a.shape[0]
      LDB = b.shape[0]
      LDC = a.shape[0]
      dgemm(&TRANSA, &TRANSB, &M, &N, &K, &ALPHA, &a[0,0], &LDA, &b[0,0], &LDB, &BETA, &c_view[0,0], &LDC)
      return c_view

My code gets compiled again but

>>> import numpy

>>> import timeit
>>> import numpy as np

>>> r=numpy.random.random((3000, 3000))
>>> q=numpy.random.random((3000, 1000))
>>> np.dot(r,q)
array([[769.47528279, 760.17637351, 764.06190616, ..., 779.52032551,
        774.55241478, 765.68127797],
       [750.05372546, 747.86009793, 742.79687316, ..., 756.32746171,
        751.74569381, 750.0328267 ],
       [756.26952741, 746.44368055, 749.91408482, ..., 760.24406134,
        755.83377909, 751.56632733],
       ...,
       [759.02821505, 746.80607024, 750.38860308, ..., 751.83944717,
        763.76172043, 757.3773271 ],
       [743.53450678, 739.15206982, 731.06772496, ..., 749.9263085 ,
        748.23769299, 741.45560221],
       [759.73482995, 746.73973757, 744.92631803, ..., 760.2607522 ,
        759.59618196, 759.56029153]])
>>> from sampling import inverse,dot_product 
>>> p=np.asarray(dot_product(np.asfortranarray(r),np.asfortranarray(q)))
Segmentation fault (core dumped)

But I get a Segmentation fault error. I am again clueless what is the reason for this problem.

Regards,

Zahra

To unsubscribe from this group and stop receiving emails from it, send an email to cython-users...@googlegroups.com.

Jake Vanderplas

unread,

Jan 31, 2018, 10:51:32 AM1/31/18

to cython...@googlegroups.com

I see two potential issues in your dot routine:

- the shape of c should be (a.shape[0], b.shape[1]), but also needs to depend on the value of TRANSA and TRANSB (i.e. if TRANSA and TRANSB are both equal to 'T' or 'C', then c.shape should be (a.shape[1], b.shape[0]) if I'm remembering things correctly)

- LDC only equals LDA if TRANSA == 'N'. You'll need to account for that if you want a general solution.

I'd look carefully through the DGEMM docs , taking care to distinguish between op(A) and A, op(B) and B, etc. to make certain you have all the shape parameters set correctly.

Jake

To unsubscribe from this group and stop receiving emails from it, send an email to cython-users+unsubscribe@googlegroups.com.

Zahra Sheikh

unread,

Jan 31, 2018, 2:21:48 PM1/31/18

to cython-users

Thanks a lot again Jake. The problem solved. My final question is how can I make these functions parallel? Is there any way to do that? I already installed openmpi on my machine.

Best,

Zahra

Jake Vanderplas

unread,

Jan 31, 2018, 5:39:01 PM1/31/18

to cython...@googlegroups.com

I don't have much experience with that, but I think you would need to install something like PBLAS and PLAPACK and link against those libraries instead of the built-in BLAS/LAPACK. That's about the extent of the advice I can give on that subject :)

Jake

To unsubscribe from this group and stop receiving emails from it, send an email to cython-users+unsubscribe@googlegroups.com.

Zahra Sheikh

unread,

Feb 1, 2018, 7:30:35 AM2/1/18

to cython-users

Is there any git repository/ blog that has examples of using the PBLAS or LAPACK in a cython code?

Thanks,

Zahra

Zahra Sheikh

unread,

Feb 1, 2018, 1:05:21 PM2/1/18

to cython-users

Apparently ScaLAPACK covers both the BLAS and LAPACK libraries and can parallel computation. I installed it and I am wondering how can I wrap pdgemm, pdgetrf and pdgetri from the ScaLAPACK library in my cython code?

Regards,

Zahra

Zahra Sheikh

unread,

Feb 5, 2018, 4:54:32 PM2/5/18

to cython-users

Hi,

I have been trying to use ScaLAPACK and so far I have written the attached code. The code gets compiled but when I import the cython code in python I get the following error:

>>> import scalapack_funcs
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: ./scalapack_funcs.so: undefined symbol: pdgemm_

However, I also tried to wrap pdgemm_ from mkl_pblas.h header but my attempt apparently failed. Any suggestion how I can make this code work?

Thanks in advance.

Zahra

setup_test.py

scalapack_funcs.pyx

scalapack_func.h

scalapack_wrap.f90

Reply all

Reply to author

Forward