CUBLASNotInitialized error

Gavin Wiggins

Sep 25, 2020, 9:11:42 PM
to PyFR Mailing List
I'm trying to run the Couette flow example using the CUDA backend via the following command:

$ pyfr run -b cuda -p couette_flow_2d.pyfrm couette_flow_2d.ini

But I get the following error:

File "/home/cades/miniconda3/lib/python3.8/site-packages/pyfr/backends/cuda/cublas.py", line 78, in _errcheck
    raise self._statuses[status]
pyfr.backends.cuda.cublas.CUBLASNotInitialized

Any suggestions on how to fix this?
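
For context, the traceback ends in an `_errcheck` hook that turns a nonzero cuBLAS status code into a Python exception. The sketch below illustrates that general ctypes pattern only. The class and function names are hypothetical and this is not PyFR's actual source; the numeric codes are a subset of the `cublasStatus_t` enum, where 1 is `CUBLAS_STATUS_NOT_INITIALIZED`.

```python
# Illustrative only: a status-code-to-exception mapping in the style the
# traceback suggests. All names here are hypothetical, not PyFR's real code.

class CUBLASError(Exception):
    """Generic error for any nonzero cuBLAS status."""


class CUBLASNotInitialized(CUBLASError):
    """Status 1: CUBLAS_STATUS_NOT_INITIALIZED."""


class CUBLASAllocFailed(CUBLASError):
    """Status 3: CUBLAS_STATUS_ALLOC_FAILED."""


class CUBLASInvalidValue(CUBLASError):
    """Status 7: CUBLAS_STATUS_INVALID_VALUE."""


# Subset of cublasStatus_t values mapped to exception classes
STATUSES = {
    1: CUBLASNotInitialized,
    3: CUBLASAllocFailed,
    7: CUBLASInvalidValue,
}


def errcheck(status, func=None, args=None):
    """Raise the mapped exception for any nonzero status.

    The (result, func, args) signature matches what ctypes passes to a
    foreign function's `errcheck` attribute.
    """
    if status != 0:
        # Fall back to the generic error for codes missing from the table
        raise STATUSES.get(status, CUBLASError)
    return status
```

With ctypes, a function like this would be attached as `fn.errcheck = errcheck` on a wrapped cuBLAS function, so a status of 1 from any call surfaces as `CUBLASNotInitialized`.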

Freddie Witherden

Sep 25, 2020, 9:58:31 PM
to pyfrmai...@googlegroups.com
Hi Gavin,
This error appeared once before on the mailing list some years ago:

https://groups.google.com/forum/#!topic/pyfrmailinglist/RWWXHC_ACHE

although the issue appears to have been configuration related (perhaps a
32-/64-bit problem?).

Are you able to run any other CUBLAS applications on the system?

Regards, Freddie.

Gavin Wiggins

Sep 25, 2020, 11:35:04 PM
to PyFR Mailing List
I ran some of the PyCUDA examples at https://github.com/inducer/pycuda/tree/master/examples without any problems.

Gavin Wiggins

Sep 26, 2020, 12:25:34 AM
to PyFR Mailing List
I used Numba to get some system info (see below). Looks like the cublas library is working fine.


System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time)                   : 2020-09-26 00:17:34.207700
UTC start time                                : 2020-09-26 04:17:34.207709
Running time (s)                              : 3.237414

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : haswell
CPU Count                                     : 20
Number of accessible CPUs                     : 20
List of accessible CPUs cores                 : 0-19
CFS Restrictions (CPUs worth of runtime)      : None

CPU Features                                  : 64bit aes avx avx2 bmi bmi2 cmov
                                                cx16 cx8 f16c fma fsgsbase fxsr
                                                invpcid lzcnt mmx movbe pclmul
                                                popcnt rdrnd sahf sse sse2 sse3
                                                sse4.1 sse4.2 ssse3 xsave xsaveopt

Memory Total (MB)                             : 58373
Memory Available (MB)                         : 57310

__OS Information__
Platform Name                                 : Linux-4.15.0-118-generic-x86_64-with-glibc2.10
Platform Release                              : 4.15.0-118-generic
OS Name                                       : Linux
OS Version                                    : #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020
OS Specific Version                           : ?
Libc Version                                  : glibc 2.27

__Python Information__
Python Compiler                               : GCC 7.3.0
Python Implementation                         : CPython
Python Version                                : 3.8.5
Python Locale                                 : en_US.UTF-8

__LLVM Information__
LLVM Version                                  : 10.0.1

__CUDA Information__
CUDA Device Initialized                       : True
CUDA Driver Version                           : 9010
CUDA Detect Output:
Found 1 CUDA devices
id 0           b'Tesla K40m'                              [SUPPORTED]
                      compute capability: 3.5
                           pci device id: 5
                              pci bus id: 0
Summary:
    1/1 devices are supported

CUDA Librairies Test Output:
Finding cublas from System
    named  libcublas.so.9.2.88
    trying to open library...   ok
Finding cusparse from System
    named  libcusparse.so.9.2.88
    trying to open library...   ok
Finding cufft from System
    named  libcufft.so.9.2.88
    trying to open library...   ok
Finding curand from System
    named  libcurand.so.9.2.88
    trying to open library...   ok
Finding nvvm from System
    named  libnvvm.so.3.2.0
    trying to open library...   ok
Finding cudart from System
    named  libcudart.so.9.2.88
    trying to open library...   ok
Finding libdevice from System
    searching for compute_20... ok
    searching for compute_30... ok
    searching for compute_35... ok
    searching for compute_50... ok


__ROC information__
ROC Available                                 : False
ROC Toolchains                                : None
HSA Agents Count                              : 0
HSA Agents:
None
HSA Discrete GPUs Count                       : 0
HSA Discrete GPUs                             : None

__SVML Information__
SVML State, config.USING_SVML                 : False
SVML Library Loaded                           : False
llvmlite Using SVML Patched LLVM              : True
SVML Operational                              : False

__Threading Layer Information__
TBB Threading Layer Available                 : True
+-->TBB imported successfully.
OpenMP Threading Layer Available              : True
+-->Vendor: GNU
Workqueue Threading Layer Available           : True
+-->Workqueue imported successfully.

__Numba Environment Variable Information__
None found.

Freddie Witherden

Sep 26, 2020, 8:16:55 AM
to PyFR Mailing List
Hi Gavin,

Looking at the output below, it appears that Numba is finding and loading the CUBLAS shared library, although it does not state whether it actually calls any of the library's functions. Would you be able to run a test case in Numba which calls down to CUBLAS?

Regards, Freddie.

--
You received this message because you are subscribed to the Google Groups "PyFR Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pyfrmailingli...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/pyfrmailinglist/a7160f22-ebdf-4266-8aab-d92c75ce8c3en%40googlegroups.com.

Gavin Wiggins

Sep 26, 2020, 10:56:09 AM
to PyFR Mailing List
I have never used Numba (other than the above) so I don't have a readily available example to use.

Gavin Wiggins

Sep 26, 2020, 11:54:09 AM
to PyFR Mailing List
I created the following example using CuPy, which uses CUDA to calculate the norm of a matrix. I think this uses CUBLAS. This example works fine.

# Compare CuPy and NumPy

import time

import cupy as cp
import numpy as np

n = 100_000_000
ns = 10_000

# Using NumPy
# reshape returns a new array, so the result must be assigned
x_cpu = (np.arange(n) - 4).reshape((ns, ns))

ti = time.perf_counter()
norm_cpu = np.linalg.norm(x_cpu)
tf = time.perf_counter()

print(f'NumPy result: \t{norm_cpu}')
print(f'NumPy time: \t{tf - ti:.4g}')

# Using CuPy
x_gpu = (cp.arange(n) - 4).reshape((ns, ns))

ti = time.perf_counter()
norm_gpu = cp.linalg.norm(x_gpu)
# GPU work is asynchronous; wait for it to finish before stopping the timer
cp.cuda.Stream.null.synchronize()
tf = time.perf_counter()

print(f'CuPy result: \t{norm_gpu}')
print(f'CuPy time: \t{tf - ti:.4g}')



Freddie Witherden

Sep 26, 2020, 12:10:26 PM
to PyFR Mailing List
Hi Gavin,

So it looks as if the cupy.linalg.norm function does not call out to CUBLAS.  See:


Regards, Freddie.

Gavin Wiggins

Sep 26, 2020, 12:20:52 PM
to PyFR Mailing List
Here's another example that uses the CuPy solve function, which does use CUBLAS, at least from what I can tell in the CuPy source code. This example works fine too.

# Compare CuPy and NumPy

import cupy as cp 
import numpy as np 
import time 

# Using NumPy
a_cpu = np.array([[4, 3, 2], [-2, 2, 3], [3, -5, 2]])
b_cpu = np.array([25, -10, -4])

ti = time.perf_counter()
x_cpu = np.linalg.solve(a_cpu, b_cpu)
tf = time.perf_counter()

print(f'NumPy result: \t{x_cpu}')
print(f'NumPy time: \t{tf - ti:.4g}')

# Using CuPy
a_gpu = cp.array([[4, 3, 2], [-2, 2, 3], [3, -5, 2]])
b_gpu = cp.array([25, -10, -4])

ti = time.perf_counter()
x_gpu = cp.linalg.solve(a_gpu, b_gpu)
tf = time.perf_counter()

print(f'CuPy result: \t{x_gpu}')
print(f'CuPy time: \t{tf - ti:.4g}')




Freddie Witherden

Sep 26, 2020, 1:50:51 PM
to pyfrmai...@googlegroups.com
Hi Gavin,

So I believe that here CuPy is using cusolver as opposed to CUBLAS to
solve the system.

Would you be able to run the following snippet for me?

from ctypes import CDLL, POINTER, c_void_p

lib = CDLL('libcublas.so')
create = lib.cublasCreate_v2
create.argtypes = [POINTER(c_void_p)]

handle = c_void_p()
print(create(handle))

This should print 0.

Regards, Freddie.
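
A slightly extended variant of the snippet above can separate "the library could not be loaded" from "the library loaded but `cublasCreate_v2` failed". This is a minimal sketch, not part of the original exchange: the function name `probe_cublas` and the stage labels are hypothetical, while `cublasCreate_v2` and `cublasDestroy_v2` are the standard cuBLAS entry points. A status of 0 means success and 1 is `CUBLAS_STATUS_NOT_INITIALIZED`.

```python
from ctypes import CDLL, POINTER, byref, c_int, c_void_p


def probe_cublas(libname="libcublas.so"):
    """Return (stage, status) describing how far cuBLAS initialisation got.

    stage is "load_failed", "create_failed", or "ok"; status is the integer
    returned by cublasCreate_v2, or None if the library never loaded.
    """
    try:
        lib = CDLL(libname)
    except OSError:
        # The dynamic loader could not find or open the shared library
        return ("load_failed", None)

    create = lib.cublasCreate_v2
    create.argtypes = [POINTER(c_void_p)]
    create.restype = c_int

    destroy = lib.cublasDestroy_v2
    destroy.argtypes = [c_void_p]
    destroy.restype = c_int

    handle = c_void_p()
    status = create(byref(handle))
    if status != 0:
        return ("create_failed", status)

    # Release the handle so the probe leaves nothing allocated
    destroy(handle)
    return ("ok", status)
```

Calling `print(probe_cublas())` on a healthy install should report the `"ok"` stage with status 0; a `"create_failed"` result with status 1 would match the exception PyFR raised.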

Gavin Wiggins

Sep 26, 2020, 2:18:55 PM
to PyFR Mailing List
Freddie, I ran your script and it prints 1.

Gavin Wiggins

Sep 26, 2020, 2:29:36 PM
to PyFR Mailing List
In my bashrc file, I'm pointing to the CUDA environment as follows:

export PATH="$PATH:/usr/local/cuda-9.2/bin"
export LD_LIBRARY_PATH="/usr/local/cuda-9.2/lib64"

The libcublas.so is located in /usr/local/cuda-9.2/lib64/ as:

lrwxrwxrwx  1 root root        16 Sep 25 19:45 libcublas.so -> libcublas.so.9.2*
lrwxrwxrwx  1 root root        19 Sep 25 19:45 libcublas.so.9.2 -> libcublas.so.9.2.88*
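
Because `LD_LIBRARY_PATH` influences which `libcublas.so` the loader picks up, it can help to list every candidate on that path together with the file each symlink finally resolves to. The following is a rough sketch under that assumption; the helper name `find_cublas_candidates` is hypothetical, and it inspects only the directories in `LD_LIBRARY_PATH`, not the `ld.so` cache or the default system paths.

```python
import glob
import os


def find_cublas_candidates(ld_library_path=None):
    """List (path, resolved_target) for libcublas files on a search path.

    ld_library_path is a colon-separated string; it defaults to the
    LD_LIBRARY_PATH environment variable. Only these directories are
    searched; the ld.so cache and default system paths are not.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")

    found = []
    for directory in ld_library_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        pattern = os.path.join(directory, "libcublas.so*")
        for path in sorted(glob.glob(pattern)):
            # realpath follows symlink chains to the actual file
            found.append((path, os.path.realpath(path)))
    return found
```

Running `find_cublas_candidates()` with the environment above would be expected to show the `libcublas.so` symlinks resolving to the 9.2.88 file; entries from more than one CUDA version would hint at a mixed install.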


Freddie Witherden

Sep 26, 2020, 3:04:26 PM
to pyfrmai...@googlegroups.com
Hi Gavin,

It appears as if either your CUDA install or your environment is broken. It
could be that the version of CUBLAS the loader is finding is incompatible
with your main CUDA version.

Regards, Freddie.
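
One data point bearing on this hypothesis: the Numba report earlier in the thread lists `CUDA Driver Version : 9010` alongside libraries named `*.so.9.2.88`. CUDA encodes versions as 1000*major + 10*minor, so 9010 means 9.1, which is older than the 9.2 libraries. As a general rule the driver needs to support at least the toolkit's CUDA version, so this could be one such incompatibility, although the thread does not confirm it was the cause. A small sketch of the comparison, with hypothetical helper names:

```python
def decode_cuda_version(value):
    """Split a CUDA integer version (1000*major + 10*minor) into a tuple."""
    return (value // 1000, (value % 1000) // 10)


def library_version(soname):
    """Extract (major, minor) from a name such as 'libcublas.so.9.2.88'."""
    suffix = soname.split(".so.", 1)[1]
    major, minor = suffix.split(".")[:2]
    return (int(major), int(minor))


def driver_supports(driver_value, soname):
    """True if the driver's CUDA version is at least the library's."""
    return decode_cuda_version(driver_value) >= library_version(soname)


# Values taken from the Numba report in this thread
print(decode_cuda_version(9010))                      # (9, 1)
print(library_version("libcublas.so.9.2.88"))         # (9, 2)
print(driver_supports(9010, "libcublas.so.9.2.88"))   # False
```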

Gavin Wiggins

Sep 30, 2020, 10:53:05 AM
to PyFR Mailing List
So I wiped the system I was working on and did a fresh install of CUDA. Everything seems to be working fine now.