Siamese Train: out of memory error.

Lorena Sandru

May 27, 2017, 6:07:39 PM
to Caffe Users
Hello,

I am trying to train a Siamese neural network with the train.py file from Caffe, but I get the error below when I run the command.

The dataset is in LMDB format. Each sample has shape (2, 512, 512): the first image is in the first channel, the second image is in the second channel, and each sample carries a binary label (0/1).
I'm using Caffe with NCCL (2 GPUs) and I have 8 GB of RAM.
The error appears for every batch_size, including batch_size=1.
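
For a rough sense of how much memory a single 512x512 sample can take, here is a minimal sketch of the arithmetic. The failing check in the log is a CUDA allocation (cudaSuccess, mutable_gpu_data), so the limit that matters is GPU memory rather than host RAM. The convolution parameters below (64 outputs, 3x3 kernel, stride 1, pad 1) are hypothetical and not taken from my network. The estimate assumes float32 values and counts both the data and diff buffers of each blob.

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def blob_bytes(batch, channels, height, width, bytes_per_value=4, with_diff=True):
    """Approximate bytes held by one blob (data, plus diff if requested)."""
    count = batch * channels * height * width
    copies = 2 if with_diff else 1
    return count * bytes_per_value * copies


# Input described in the post: 2 channels of 512x512, batch_size=1.
batch_size = 1
in_channels, in_h, in_w = 2, 512, 512

# Hypothetical first convolution layer (assumption, not from the real prototxt).
num_output, kernel, stride, pad = 64, 3, 1, 1

out_h = conv_output_size(in_h, kernel, stride, pad)
out_w = conv_output_size(in_w, kernel, stride, pad)

input_mb = blob_bytes(batch_size, in_channels, in_h, in_w) / 2**20
output_mb = blob_bytes(batch_size, num_output, out_h, out_w) / 2**20

print(f"input blob:  {input_mb:.1f} MiB")
print(f"conv output: {out_h}x{out_w}, {output_mb:.1f} MiB")
```

Under these assumptions a single full-resolution convolution output already takes about 128 MiB per sample, and every further layer at that resolution adds a similar amount, so the total can be large even at batch_size=1.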


python3 train.py  --solver=siamese_solver.prototxt

F0527 23:58:22.607717  2842 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @     0x7f1a2eef05cd  google::LogMessage::Fail()
    @     0x7f1a2eef2433  google::LogMessage::SendToLog()
    @     0x7f1a2eef015b  google::LogMessage::Flush()
    @     0x7f1a2eef2e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f1a2f43dac8  caffe::SyncedMemory::mutable_gpu_data()
    @     0x7f1a2f3fb6c2  caffe::Blob<>::mutable_gpu_data()
    @     0x7f1a2f45f010  caffe::ConvolutionLayer<>::Forward_gpu()
    @     0x7f1a2f3e5a01  caffe::Net<>::ForwardFromTo()
    @     0x7f1a2f3e5b07  caffe::Net<>::Forward()
    @     0x7f1a2f40d042  caffe::Solver<>::Test()
    @     0x7f1a2f40da5e  caffe::Solver<>::TestAll()
    @     0x7f1a2f410fb7  caffe::Solver<>::Step()
    @     0x7f1a2ff9e18e  boost::python::objects::caller_py_function_impl<>::operator()()
    @     0x7f1a2e41f00d  boost::python::objects::function::call()
    @     0x7f1a2e41f208  (unknown)
    @     0x7f1a2e427053  boost::python::handle_exception_impl()
    @     0x7f1a2e41c409  (unknown)
    @           0x5b7167  PyObject_Call
    @           0x528d06  PyEval_EvalFrameEx
    @           0x52e12b  PyEval_EvalCodeEx
    @           0x4ebdd7  (unknown)
    @           0x5b7167  PyObject_Call
    @           0x5262af  PyEval_EvalFrameEx
    @           0x528814  PyEval_EvalFrameEx
    @           0x528814  PyEval_EvalFrameEx
    @           0x528814  PyEval_EvalFrameEx
    @           0x52e12b  PyEval_EvalCodeEx
    @           0x4ebcc3  (unknown)
    @           0x5b7167  PyObject_Call
    @           0x4f413e  (unknown)
    @           0x5b7167  PyObject_Call
    @           0x54d359  (unknown)

How can I resolve this issue?