Siamese Train: out of memory error.

Lorena Sandru

May 27, 2017, 6:07:39 PM

to Caffe Users

Hello,

I am trying to train a Siamese neural network with the train.py file from Caffe, but I get the error below when I run the command.

The dataset is in LMDB format. Each sample has shape (2, 512, 512): the first image is in the first channel and the second image is in the second channel, with a binary label (0/1).
I'm using Caffe with NCCL (2 GPUs) and I have 8 GB of RAM.
The error appears for every batch_size, including batch_size=1.
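
For scale, here is a rough sketch of how much memory a single blob needs at this input size. It assumes 4-byte float32 values; the 64-channel convolution output is a hypothetical example and is not taken from my network definition. The helper name blob_bytes is also just for illustration.

```python
def blob_bytes(num, channels, height, width, bytes_per_value=4):
    """Return the size in bytes of one dense blob of shape (num, channels, height, width).

    bytes_per_value=4 assumes float32 storage.
    """
    return num * channels * height * width * bytes_per_value


MIB = 1024 * 1024

# One input sample from the LMDB: shape (2, 512, 512), batch_size=1.
input_bytes = blob_bytes(1, 2, 512, 512)

# Hypothetical convolution output that keeps the 512x512 resolution
# and produces 64 feature maps (an assumption for illustration only).
conv_bytes = blob_bytes(1, 64, 512, 512)

print("input blob: %.1f MiB" % (input_bytes / MIB))
print("conv output blob: %.1f MiB" % (conv_bytes / MIB))
```

The total for a network is roughly the sum of this over all layer outputs, multiplied by the batch size, and it can grow further when gradients are stored alongside the data during training.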

python3 train.py --solver=siamese_solver.prototxt

F0527 23:58:22.607717 2842 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
    @     0x7f1a2eef05cd google::LogMessage::Fail()
    @     0x7f1a2eef2433 google::LogMessage::SendToLog()
    @     0x7f1a2eef015b google::LogMessage::Flush()
    @     0x7f1a2eef2e1e google::LogMessageFatal::~LogMessageFatal()
    @     0x7f1a2f43dac8 caffe::SyncedMemory::mutable_gpu_data()
    @     0x7f1a2f3fb6c2 caffe::Blob<>::mutable_gpu_data()
    @     0x7f1a2f45f010 caffe::ConvolutionLayer<>::Forward_gpu()
    @     0x7f1a2f3e5a01 caffe::Net<>::ForwardFromTo()
    @     0x7f1a2f3e5b07 caffe::Net<>::Forward()
    @     0x7f1a2f40d042 caffe::Solver<>::Test()
    @     0x7f1a2f40da5e caffe::Solver<>::TestAll()
    @     0x7f1a2f410fb7 caffe::Solver<>::Step()
    @     0x7f1a2ff9e18e boost::python::objects::caller_py_function_impl<>::operator()()
    @     0x7f1a2e41f00d boost::python::objects::function::call()
    @     0x7f1a2e41f208 (unknown)
    @     0x7f1a2e427053 boost::python::handle_exception_impl()
    @     0x7f1a2e41c409 (unknown)
    @           0x5b7167 PyObject_Call
    @           0x528d06 PyEval_EvalFrameEx
    @           0x52e12b PyEval_EvalCodeEx
    @           0x4ebdd7 (unknown)
    @           0x5b7167 PyObject_Call
    @           0x5262af PyEval_EvalFrameEx
    @           0x528814 PyEval_EvalFrameEx
    @           0x528814 PyEval_EvalFrameEx
    @           0x528814 PyEval_EvalFrameEx
    @           0x52e12b PyEval_EvalCodeEx
    @           0x4ebcc3 (unknown)
    @           0x5b7167 PyObject_Call
    @           0x4f413e (unknown)
    @           0x5b7167 PyObject_Call
    @           0x54d359 (unknown)

How can I resolve this issue?
