Runtest fails after successful compilation

Philip Meier

Sep 27, 2017, 10:21:13 AM
to Caffe Users
Hi,

I'm on a system with Ubuntu 16.04 with enabled CUDA and cuDNN. I compiled Caffe the official way with

make clean
make all -j8

Compilation and

make test

both finish without errors. If I continue with

make runtest

it aborts after a few successful tests with

[ RUN      ] NeuronLayerTest/3.TestDropoutGradient
F0927 15:58:46.882012 13805 math_functions.cu:416] Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0)  CURAND_STATUS_LAUNCH_FAILURE
*** Check failure stack trace: ***
    @     0x7fa5f565d5cd  google::LogMessage::Fail()
    @     0x7fa5f565f433  google::LogMessage::SendToLog()
    @     0x7fa5f565d15b  google::LogMessage::Flush()
    @     0x7fa5f565fe1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fa5f30a65da  caffe::caffe_gpu_rng_uniform()
    @     0x7fa5f30b2862  caffe::DropoutLayer<>::Forward_gpu()
    @           0x478629  caffe::Layer<>::Forward()
    @           0x47a0bb  caffe::GradientChecker<>::CheckGradientSingle()
    @           0x489874  caffe::GradientChecker<>::CheckGradientEltwise()
    @           0x77a363  caffe::NeuronLayerTest_TestDropoutGradient_Test<>::TestBody()
    @           0x91ea83  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x91809a  testing::Test::Run()
    @           0x9181e8  testing::TestInfo::Run()
    @           0x9182c5  testing::TestCase::Run()
    @           0x91959f  testing::internal::UnitTestImpl::RunAllTests()
    @           0x9198c3  testing::UnitTest::Run()
    @           0x46d4bd  main
    @     0x7fa5f226c830  __libc_start_main
    @           0x474fd9  _start
    @              (nil)  (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)

I couldn't find any mention of this error. Does anyone know what is happening here? Is this something I need to fix before using Caffe?



Philip Meier

Sep 28, 2017, 3:07:07 AM
to Caffe Users
I've rerun the whole process, starting from a fresh git clone, and now

make runtest

fails again after a few successful tests with:

[ RUN      ] RandomNumberGeneratorTest/1.TestRngGaussian2GPU
F0928 08:57:08.017076 11362 math_functions.cu:456] Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0)  CURAND_STATUS_LAUNCH_FAILURE
*** Check failure stack trace: ***
    @     0x7f6e441f95cd  google::LogMessage::Fail()
    @     0x7f6e441fb433  google::LogMessage::SendToLog()
    @     0x7f6e441f915b  google::LogMessage::Flush()
    @     0x7f6e441fbe1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f6e41858e8e  caffe::caffe_gpu_rng_gaussian<>()
    @           0x8e4d48  caffe::RandomNumberGeneratorTest<>::RngGaussianFillGPU()
    @           0x8e1e93  caffe::RandomNumberGeneratorTest_TestRngGaussian2GPU_Test<>::TestBody()
    @           0x9d3f88  testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @           0x9cf0d7  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x9baa12  testing::Test::Run()
    @           0x9bb204  testing::TestInfo::Run()
    @           0x9bb84f  testing::TestCase::Run()
    @           0x9c0cfd  testing::internal::UnitTestImpl::RunAllTests()
    @           0x9d5165  testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @           0x9cfd88  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x9bf90c  testing::UnitTest::Run()
    @           0x4c3d14  main
    @     0x7f6e408a7830  __libc_start_main
    @           0x4c3ab9  _start
    @              (nil)  (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)


The only similarity I see is


status == CURAND_STATUS_SUCCESS (201 vs. 0)

Does anyone know more about this check? Is it vital for the correct operation of Caffe?

Przemek D

Sep 28, 2017, 6:26:00 AM
to Caffe Users
Help us help you. What GPU are you using? Which CUDA and cuDNN versions? This is not an entirely new problem; have you tried any of the known solutions?
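
For anyone gathering this information before posting, here is a minimal sketch. It assumes the NVIDIA driver tools and nvcc are on the PATH and that the cuDNN header sits in the default CUDA include directory; the `report` helper and the header path are illustrative, not something from this thread.

```shell
#!/bin/sh
# Run a diagnostic command if it exists; otherwise say so instead of failing.
report() {
    label=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $label =="
        "$@" || echo "($1 exited with an error)"
    else
        echo "== $label: $1 not found =="
    fi
}

report "GPU and driver" nvidia-smi
report "CUDA toolkit (compiler)" nvcc --version

# Assumption: cuDNN's header was copied into the default CUDA include directory.
CUDNN_HEADER=/usr/local/cuda/include/cudnn.h
if [ -f "$CUDNN_HEADER" ]; then
    echo "== cuDNN =="
    grep -E '#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) ' "$CUDNN_HEADER" \
        || echo "(no version macros found in $CUDNN_HEADER)"
else
    echo "== cuDNN: $CUDNN_HEADER not found =="
fi
```

Pasting the output of these three checks into a post usually answers the questions above in one go.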

Philip Meier

Sep 28, 2017, 7:38:03 AM
to Caffe Users
Your question about my CUDA version led to the solution: I had CUDA 8 installed, but the CUDA toolkit was version 7.5.

I found this thread, where someone had similar issues. For others who encounter the same problem:

When you install CUDA, remember to perform the post-installation actions. In my case I probably opened a new terminal afterwards, and running

nvcc -V

in it prompted me to install the CUDA toolkit through the package manager. I did so, but unfortunately that installed version 7.5 even though I have CUDA 8 installed.
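
A rough sketch of how one might check for this mismatch and apply the post-installation PATH settings. It assumes CUDA 8 is installed in the default location /usr/local/cuda-8.0; the `nvcc_release` helper and the `CUDA_HOME` and `EXPECTED` variables are names made up for this example.

```shell
#!/bin/sh
# Extract the toolkit release (e.g. "7.5" or "8.0") from `nvcc -V` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Assumption: CUDA 8 lives in the default install location.
CUDA_HOME=/usr/local/cuda-8.0
EXPECTED=8.0

# Show which nvcc the shell currently resolves and which release it reports.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc on PATH: $(command -v nvcc)"
    found=$(nvcc -V | nvcc_release)
    if [ "$found" != "$EXPECTED" ]; then
        echo "mismatch: nvcc reports release $found, expected $EXPECTED"
    fi
else
    echo "nvcc is not on PATH"
fi

# Post-installation actions: put the CUDA 8 binaries and libraries first.
export PATH="$CUDA_HOME/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

The two export lines only affect the current shell, so they would typically go into ~/.bashrc or a similar startup file. After changing the PATH, Caffe probably needs a rebuild from `make clean` so that nothing compiled with the 7.5 toolkit is left over.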

Przemek D

Sep 28, 2017, 7:54:56 AM
to Caffe Users
Glad to hear you sorted it out. And thank you for posting your solution for others to find!