Is this a bug in Caffe (downloaded yesterday) related to cuDNN?


fengyanchao

Jan 8, 2015, 10:58:56 PM
to caffe...@googlegroups.com
For example, I test a model (such as bvlc_reference_caffenet) from Python. With "caffemodel.set_device(0)" everything is OK, but with "caffemodel.set_device(1)" I get:
"cudnn_conv_layer.cu:30] Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0)  CUDNN_STATUS_EXECUTION_FAILED
*** Check failure stack trace: ***"
I have run make runtest on device 1, and it passes:
[==========] 1169 tests from 190 test cases ran. (180838 ms total)
[  PASSED  ] 1169 tests.

|   0  Tesla K40m          Off  | 0000:05:00.0     Off |                    0 |
| N/A   65C    P0   141W / 235W |   3387MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40m          Off  | 0000:42:00.0     Off |                    0 |
| N/A   37C    P8

The same problem occurs on master too.

fengyanchao

Jan 8, 2015, 11:23:18 PM
to caffe...@googlegroups.com

Without cuDNN, it works on all devices.
On Friday, January 9, 2015 at 11:58:56 AM UTC+8, fengyanchao wrote:

Jonathan L Long

Jan 9, 2015, 1:19:27 AM
to fengyanchao, caffe...@googlegroups.com
Yes, I've seen this before. I can't remember if this is supposed to be fixed in dev or not (I guess not!).

A workaround is to use the environment variable CUDA_VISIBLE_DEVICES instead of using set_device.
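A minimal sketch of this workaround from Python, assuming the variable is set before Caffe (and therefore CUDA) is loaded in the process. The helper name `expose_single_gpu` is hypothetical and only for illustration. Once a single physical GPU is exposed, CUDA enumerates it as device 0, so no set_device call is needed.

```python
import os


def expose_single_gpu(physical_id):
    """Hide every GPU except `physical_id` from CUDA for this process.

    Hypothetical helper, not part of Caffe. It must run before any
    library initializes CUDA, because CUDA_VISIBLE_DEVICES is read
    only once, at initialization.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_id)
    # The single visible GPU is renumbered by CUDA as logical device 0.
    return 0


# Expose only the second physical GPU (index 1 in nvidia-smi).
logical_id = expose_single_gpu(1)

# Import Caffe only after the variable is set, and do not call set_device:
# import caffe
# ... load and run the network as usual in GPU mode ...
```

The same effect can be had from the shell by prefixing the command, e.g. `CUDA_VISIBLE_DEVICES=1 python your_script.py`, where `your_script.py` stands in for whatever script loads the model.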

JLL


Evan Shelhamer

Jan 9, 2015, 1:36:55 AM
to Jonathan L Long, fengyanchao, caffe...@googlegroups.com
Sorry, this should be fixed! Like Jon said, this was due to be fixed in master and dev but seems to have gone missing -- we'll post a fix soon. I can confirm Jon's workaround in the meantime.

I've posted this as issue #1700. Follow it there for details and for notice of when the fix is in.

Evan Shelhamer

Evan Shelhamer

Jan 9, 2015, 1:40:50 AM
to caffe...@googlegroups.com

Alfredo Nava

Aug 21, 2015, 1:37:33 AM
to Caffe Users, shel...@eecs.berkeley.edu
Where is that change made?

I am using DIGITS and I am getting that error after the first epoch.