Caffe "make runtest" error: Cannot create Cublas handle. Searched but did not find a solution


Zhiqi Yang

Mar 9, 2018, 3:19:58 AM
to Caffe Users
Hi guys, I'm using Caffe. Following the tutorial, I ran "make all -j8 ; make test" (both succeeded), but "make runtest" failed.

This is the error.

[ RUN      ] RNNLayerTest/3.TestGradientNonZeroCont
E0309 16:15:16.977672 182769 common.cpp:114] Cannot create Cublas handle. Cublas won't be available.
E0309 16:15:17.463361 182769 common.cpp:121] Cannot create Curand generator. Curand won't be available.
F0309 16:15:17.919613 182769 syncedmem.hpp:22] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @     0x7f56daba0778  (unknown)
    @     0x7f56daba06b2  (unknown)
    @     0x7f56daba00b4  (unknown)
    @     0x7f56daba3055  (unknown)
    @     0x7f56d0051103  caffe::SyncedMemory::mutable_cpu_data()
    @     0x7f56d0021348  caffe::Blob<>::Reshape()
    @     0x7f56d00217aa  caffe::Blob<>::Reshape()
    @           0x4911ef  caffe::RNNLayerTest<>::ReshapeBlobs()
    @           0x49161b  caffe::RNNLayerTest<>::RNNLayerTest()
    @           0x4919ab  testing::internal::TestFactoryImpl<>::CreateTest()
    @           0x8fa9e3  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8f3cf3  testing::TestInfo::Run()
    @           0x8f3e55  testing::TestCase::Run()
    @           0x8f4138  testing::internal::UnitTestImpl::RunAllTests()
    @           0x8f4413  testing::UnitTest::Run()
    @           0x46f86f  main
    @     0x7f56cf3b8b45  (unknown)
    @           0x477559  (unknown)
    @              (nil)  (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted

-----------------------------
The error points to a cuBLAS problem:

E0309 16:15:16.977672 182769 common.cpp:114] Cannot create Cublas handle. Cublas won't be available.
E0309 16:15:17.463361 182769 common.cpp:121] Cannot create Curand generator. Curand won't be available.
F0309 16:15:17.919613 182769 syncedmem.hpp:22] Check failed: error == cudaSuccess (2 vs. 0)  out of memory

But nvidia-smi works fine and CUDA is already installed. I've searched many pages but did not find a solution.

What's the problem here?



Przemek D

Mar 9, 2018, 3:45:04 AM
to Caffe Users
The problem seems pretty clear from the error message:
Check failed: error == cudaSuccess (2 vs. 0)  out of memory

Your GPU seems to have insufficient memory. What GPU are you using, and what is the output of nvidia-smi?

Zhiqi Yang

Mar 9, 2018, 4:59:55 AM
to Caffe Users
Thanks for your reply. My GPU has sufficient memory. Here is the nvidia-smi output:

Fri Mar  9 17:59:09 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 378.13                 Driver Version: 378.13                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Graphics Device     Off  | 0000:02:00.0     Off |                  N/A |
| 23%   27C    P8    16W / 250W |   5219MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Graphics Device     Off  | 0000:03:00.0     Off |                  N/A |
| 27%   47C    P2    58W / 250W |  11115MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Graphics Device     Off  | 0000:82:00.0     Off |                  N/A |
| 28%   49C    P2    60W / 250W |   2262MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Graphics Device     Off  | 0000:83:00.0     Off |                  N/A |
| 29%   51C    P2    61W / 250W |   5525MiB / 11172MiB |      0%      Default |



I also set TEST_GPUID := 1 to avoid GPU 0
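
A note for readers comparing the numbers: in the table above, the row for GPU 1 shows 11115MiB of 11172MiB already in use, so very little memory is left on the device that TEST_GPUID := 1 selects. The sketch below is only an illustration, not part of Caffe: the helper names free_memory_mib and freest_gpu are made up here, and it assumes the default nvidia-smi table layout shown in this thread. It computes the free memory per row and picks the row with the most headroom. Also note that the row index in nvidia-smi is not guaranteed to match the CUDA device index unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set.

```python
import re

# Matches the "used / total" memory column of the default nvidia-smi table.
# The power column ("16W / 250W") does not match because it lacks "MiB".
MEM_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def free_memory_mib(smi_text):
    """Return free MiB per GPU, in the order the rows appear in the table."""
    return [int(total) - int(used) for used, total in MEM_RE.findall(smi_text)]


def freest_gpu(smi_text):
    """Return the table row index of the GPU with the most free memory."""
    free = free_memory_mib(smi_text)
    if not free:
        raise ValueError("no memory rows found in nvidia-smi output")
    return max(range(len(free)), key=free.__getitem__)


# Memory rows copied from the nvidia-smi table posted in this thread.
SAMPLE = """\
| 23%   27C    P8    16W / 250W |   5219MiB / 11172MiB |      0%      Default |
| 27%   47C    P2    58W / 250W |  11115MiB / 11172MiB |      0%      Default |
| 28%   49C    P2    60W / 250W |   2262MiB / 11172MiB |      0%      Default |
| 29%   51C    P2    61W / 250W |   5525MiB / 11172MiB |      0%      Default |
"""

if __name__ == "__main__":
    print("free MiB per GPU:", free_memory_mib(SAMPLE))
    print("row with most free memory:", freest_gpu(SAMPLE))
```

On a live machine the same functions could be fed the text captured from running nvidia-smi; whether the chosen row is the right value for TEST_GPUID still depends on the device ordering caveat above.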

On Friday, March 9, 2018 at 4:45:04 PM UTC+8, Przemek D wrote:

Zhiqi Yang

Mar 9, 2018, 5:06:06 AM
to Caffe Users
I think the real problem is "Cannot create Cublas handle. Cublas won't be available", which then leads to the CUDA out-of-memory failure (because cuBLAS cannot be used), rather than insufficient memory.

On Friday, March 9, 2018 at 4:45:04 PM UTC+8, Przemek D wrote:
The problem seems pretty clear from the error message:

Zhiqi Yang

Mar 9, 2018, 5:06:45 AM
to Caffe Users
But I don't know how to solve this "Cannot create Cublas handle" problem...

On Friday, March 9, 2018 at 6:06:06 PM UTC+8, Zhiqi Yang wrote:

Zhiqi Yang

Mar 9, 2018, 5:18:11 AM
to Caffe Users
Here is my lsmod output for nouveau:

lsmod |grep nouveau
nouveau              1122508  0 
mxm_wmi                12515  1 nouveau
video                  18096  1 nouveau
ttm                    77862  2 ast,nouveau
drm_kms_helper         49210  2 ast,nouveau
drm                   249998  6 ast,ttm,drm_kms_helper,nouveau,nvidia_drm
wmi                    17339  2 mxm_wmi,nouveau
button                 12944  1 nouveau
i2c_algo_bit           12751  3 ast,igb,nouveau
i2c_core               46012  8 ast,drm,igb,i2c_i801,drm_kms_helper,i2c_algo_bit,nvidia,nouveau

Should I blacklist nouveau?
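
For anyone wanting to repeat this check in a script, here is a minimal sketch that reads lsmod-style text and reports whether a module is loaded and which modules use it. The function names is_module_loaded and users_of are hypothetical helpers written for this note, and the sample lines are copied from the output above; it assumes the usual lsmod columns (module, size, use count, used-by list).

```python
def parse_lsmod(lsmod_text):
    """Map each module name to the list of modules in its "Used by" column."""
    modules = {}
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if len(fields) < 3 or fields[0] == "Module":
            continue
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[fields[0]] = users
    return modules


def is_module_loaded(lsmod_text, name):
    """True if `name` appears as a loaded module (first column)."""
    return name in parse_lsmod(lsmod_text)


def users_of(lsmod_text, name):
    """Return the modules listed as users of `name`, or [] if none."""
    return parse_lsmod(lsmod_text).get(name, [])


# Lines copied from the lsmod output posted in this thread.
SAMPLE = """\
nouveau              1122508  0
mxm_wmi                12515  1 nouveau
drm                   249998  6 ast,ttm,drm_kms_helper,nouveau,nvidia_drm
i2c_core               46012  8 ast,drm,igb,i2c_i801,drm_kms_helper,i2c_algo_bit,nvidia,nouveau
"""

if __name__ == "__main__":
    print("nouveau loaded:", is_module_loaded(SAMPLE, "nouveau"))
    print("users of i2c_core:", users_of(SAMPLE, "i2c_core"))
```

In the posted output, both nvidia and nouveau appear as users of i2c_core, which suggests both drivers are loaded at the same time.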



On Friday, March 9, 2018 at 4:45:04 PM UTC+8, Przemek D wrote:
The problem seems pretty clear from the error message:

Przemek D

Mar 9, 2018, 8:04:07 AM
to Caffe Users
Yeah, nouveau should definitely be disabled. Indeed it's similar to the other thread. I'm not an expert on driver issues but it seems a bit weird that nvidia-smi output looks fine with nouveau running at the same time...

Zhiqi Yang

Mar 10, 2018, 8:46:44 PM
to Caffe Users
Yeah, that's where things get weird. I could not find similar cases online, and I don't know how to solve it.

On Friday, March 9, 2018 at 9:04:07 PM UTC+8, Przemek D wrote: