Deeplab-v2's Caffe fails BatchNormLayerTest

Christopher Menart

Aug 7, 2017, 3:36:36 PM
to Caffe Users
Deeplab 2, a prominent image-labeling network, uses its own modified version of Caffe (I'm hoping some users here will be familiar with it).

I'm trying to get the code working on an Ubuntu 16 desktop where I've successfully run normal Caffe many times. When I run 'make runtest', however, one test fails, and then towards the end the whole test run crashes:

The failed test is
'[  FAILED  ] BatchNormLayerTest/2.TestGradient, where TypeParam = caffe::GPUDevice<float> (5066 ms)'

The crash, when I run make V=1 runtest, is:

'F0807 15:32:53.907034  3963 blob.cpp:163] Check failed: data_
*** Check failure stack trace: ***
    @     0x7fd0d29e05cd  google::LogMessage::Fail()
    @     0x7fd0d29e2433  google::LogMessage::SendToLog()
    @     0x7fd0d29e015b  google::LogMessage::Flush()
    @     0x7fd0d29e2e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fd0d03a214b  caffe::Blob<>::mutable_cpu_data()
    @     0x7fd0d047dbf7  caffe::BatchNormLayer<>::Forward_cpu()
    @           0x472852  caffe::Layer<>::Forward()
    @           0x62add7  caffe::BatchNormLayerTest_TestForwardInplace_Test<>::TestBody()
    @           0x8b7b13  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8b112a  testing::Test::Run()
    @           0x8b1278  testing::TestInfo::Run()
    @           0x8b1355  testing::TestCase::Run()
    @           0x8b262f  testing::internal::UnitTestImpl::RunAllTests()
    @           0x8b2953  testing::UnitTest::Run()
    @           0x46649d  main
    @     0x7fd0cf710830  __libc_start_main
    @           0x46d829  _start
    @              (nil)  (unknown)
Makefile:526: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)'

From what I've been able to find, a failed check on data_ seems to mean that a blob in some network layer did not receive proper data, possibly. But I've been unable to find out which network this test runs or what data it runs on.
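To make my reading of the trace concrete: the check fires inside caffe::Blob<>::mutable_cpu_data(), so it looks like a guard on the blob's storage pointer. Below is a toy sketch of that pattern, not Caffe's or Deeplab's actual code. ToyBlob, allocated() and Demo() are names I made up, and I am assuming the storage pointer stays null until the blob is reshaped, as I believe it does in stock Caffe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for caffe::Blob (illustration only).
// Storage is held through a shared_ptr that stays null until Reshape().
class ToyBlob {
 public:
  // Allocates zero-initialized storage for `count` elements.
  void Reshape(std::size_t count) {
    data_ = std::make_shared<std::vector<float>>(count, 0.0f);
  }

  // True once storage has been allocated.
  bool allocated() const { return static_cast<bool>(data_); }

  // Guarded accessor: returns nullptr where a CHECK(data_)-style guard
  // would abort the process instead.
  float* mutable_cpu_data() {
    if (!data_) return nullptr;
    return data_->data();
  }

 private:
  std::shared_ptr<std::vector<float>> data_;
};

// Usage: a blob that was never reshaped has no storage to hand out.
void Demo() {
  ToyBlob blob;
  assert(!blob.allocated());
  blob.Reshape(4);
  assert(blob.allocated());
}
```

If that reading is right, the crash in TestForwardInplace would mean BatchNormLayer's Forward_cpu() touched a blob that was never allocated, rather than a problem with any input data. I have not confirmed this against the Deeplab source.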



Details: I'm running CUDA 8.0 without cuDNN, because I only have a Tesla C2050, which cannot run cuDNN. I've gotten the error above virtually every time I've run the tests, but one time three tests failed as follows:

[  FAILED  ] BatchNormLayerTest/3.TestGradient, where TypeParam = caffe::GPUDevice<double> (4656 ms)

[  FAILED  ] BatchNormLayerTest/3.TestForwardInplace, where TypeParam = caffe::GPUDevice<double> (3 ms)

[  FAILED  ] BatchNormLayerTest/3.TestForward, where TypeParam = caffe::GPUDevice<double> (2 ms)

I include a more detailed copy of the error messages below.