cusp::blas routines catching "not implemented" errors


xaldu...@gmail.com

Jul 8, 2016, 2:05:50 PM7/8/16
to cusp-users
Hi!

I'm developing a scientific application that requires a matrix exponential routine (y = e^(Ht)·v, where H is a matrix, v and y are vectors, and t is a scalar parameter). H is large and sparse. I've implemented this on the CPU using libraries such as PETSc; now I want to try a GPU implementation.

At some point a dense-dense matrix multiplication is required, so I used cusp::blas::gemm for this. I got the following error:

terminate called after throwing an instance of 'cusp::not_implemented_exception'

  what():  CUSP GEMM not implemented 

As it turns out, even this very simple example triggers the error:

#include <cusp/array2d.h>
#include <cusp/print.h>
#include <cusp/gallery/poisson.h>

// include cusp blas header file
#include <cusp/blas/blas.h>

int main()
{
  // create an empty dense matrix structure
  cusp::array2d<float,cusp::host_memory> A(4, 4, 1.0);

  // allocate space for output matrix
  cusp::array2d<float,cusp::host_memory> B(A.num_rows, A.num_cols);

  // compute output matrix
  cusp::blas::gemm(A, A, B);

  // print the contents of B
  cusp::print(B);

  return 0;
}


I've tried with CUDA toolkit versions 6.5 and 7.5, using Cusp v0.5.1. Additional info:


+------------------------------------------------------+
| NVIDIA-SMI 340.32     Driver Version: 340.32         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20m          Off  | 0000:20:00.0     Off |                    0 |
| N/A   27C    P0    48W / 225W |     11MiB /  4799MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20m          Off  | 0000:8B:00.0     Off |                    0 |
| N/A   28C    P0    51W / 225W |     11MiB /  4799MiB |     73%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|  No running compute processes found                                         |
+-----------------------------------------------------------------------------+

Please help and thank you for your time.

Marlon B.

Steven Dalton

Jul 8, 2016, 2:16:34 PM7/8/16
to cusp-...@googlegroups.com
Hello Marlon,

  Cusp does not have a native implementation of GEMM at this time; it currently relies on cuBLAS to provide GEMM support. To call the operation, you need to create a cublasHandle_t in your application and pass the handle to the cusp::blas::gemm call using an execution policy. I have added a complete example to your program below. Note that this functionality was introduced after the 0.5.1 release, so you need to check out the develop branch on GitHub.

Steve

#include <cusp/array2d.h>
#include <cusp/print.h>
#include <cusp/gallery/poisson.h>

// include cusp blas header file
#include <cusp/blas/blas.h>
// include cublas specific blas dispatching routines
#include <cusp/system/cuda/detail/cublas/blas.h>

int main()
{
  // create a cublas handle
  cublasHandle_t handle;
  if(cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS)
  {
    throw cusp::runtime_exception("cublasCreate failed");
  }

  // create an empty dense matrix structure
  cusp::array2d<float,cusp::device_memory> A(4, 4, 1.0);

  // create 2D Poisson problem
  // allocate space for output matrix
  cusp::array2d<float,cusp::device_memory> B(A.num_rows, A.num_cols);

  // compute output matrix
  cusp::blas::gemm(cusp::cuda::par.with(handle), A, A, B);

  // print the contents of B
  cusp::print(B);

  // destroy the cublas handle
  cublasDestroy(handle);

  return 0;
}

// compile with (use the -arch value matching your GPU; the Tesla K20m
// above is compute capability 3.5, i.e. sm_35)
nvcc -o gemm -arch=sm_35 -lcublas -lcudart gemm.cu -I $CUSP_DIR


Steven Dalton

Jul 8, 2016, 2:34:06 PM7/8/16
to cusp-...@googlegroups.com
One more note: if you need more general GEMM operations that exploit symmetry or specialized formats, you may want to check out BLAM[1]. The Cusp interface does not provide entry points to all the operations supported by cuBLAS.

xaldu...@gmail.com

Jul 11, 2016, 5:33:38 AM7/11/16
to cusp-users
Hi Steve,

Thank you very much for your help; I really appreciate it.
Your suggestion works like a charm, by the way, even with the cusp::complex datatype, which is what I needed.
I'll take a look at BLAM.

Thanks again.

Marlon B.