Hello,
I am trying to train a Siamese neural network with the train.py file from Caffe, but I receive the error below after running the command.
The dataset is stored in LMDB format, and each sample has shape (2, 512, 512): the first image is in the first channel, the second image is in the second channel, and each sample carries a binary label (0/1).
I'm using Caffe with NCCL (2 GPUs) and I have 8 GB of RAM.
The error appears for every batch_size, including batch_size=1.
python3 train.py --solver=siamese_solver.prototxt
F0527 23:58:22.607717 2842 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7f1a2eef05cd google::LogMessage::Fail()
@ 0x7f1a2eef2433 google::LogMessage::SendToLog()
@ 0x7f1a2eef015b google::LogMessage::Flush()
@ 0x7f1a2eef2e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f1a2f43dac8 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7f1a2f3fb6c2 caffe::Blob<>::mutable_gpu_data()
@ 0x7f1a2f45f010 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7f1a2f3e5a01 caffe::Net<>::ForwardFromTo()
@ 0x7f1a2f3e5b07 caffe::Net<>::Forward()
@ 0x7f1a2f40d042 caffe::Solver<>::Test()
@ 0x7f1a2f40da5e caffe::Solver<>::TestAll()
@ 0x7f1a2f410fb7 caffe::Solver<>::Step()
@ 0x7f1a2ff9e18e boost::python::objects::caller_py_function_impl<>::operator()()
@ 0x7f1a2e41f00d boost::python::objects::function::call()
@ 0x7f1a2e41f208 (unknown)
@ 0x7f1a2e427053 boost::python::handle_exception_impl()
@ 0x7f1a2e41c409 (unknown)
@ 0x5b7167 PyObject_Call
@ 0x528d06 PyEval_EvalFrameEx
@ 0x52e12b PyEval_EvalCodeEx
@ 0x4ebdd7 (unknown)
@ 0x5b7167 PyObject_Call
@ 0x5262af PyEval_EvalFrameEx
@ 0x528814 PyEval_EvalFrameEx
@ 0x528814 PyEval_EvalFrameEx
@ 0x528814 PyEval_EvalFrameEx
@ 0x52e12b PyEval_EvalCodeEx
@ 0x4ebcc3 (unknown)
@ 0x5b7167 PyObject_Call
@ 0x4f413e (unknown)
@ 0x5b7167 PyObject_Call
@ 0x54d359 (unknown)
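For context, a rough per-sample memory estimate may help narrow this down, since the trace shows the allocation failing inside a convolution forward pass during caffe::Solver<>::Test(). The sketch below assumes 4-byte float32 blobs and a hypothetical first convolution with 64 output channels that keeps the 512x512 spatial size. The helper names blob_bytes and to_mib are made up for illustration, and the real layer sizes come from the network prototxt, which is not shown here.

```python
# Rough size of a Caffe-style blob, assuming 4-byte float32 elements.
BYTES_PER_FLOAT32 = 4


def blob_bytes(shape, with_diff=True):
    """Return the bytes needed for a blob of the given shape.

    with_diff=True counts both the data buffer and the gradient (diff) buffer.
    """
    count = 1
    for dim in shape:
        count *= dim
    buffers = 2 if with_diff else 1
    return count * BYTES_PER_FLOAT32 * buffers


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 ** 2)


batch_size = 1
input_shape = (batch_size, 2, 512, 512)      # from the dataset description
conv_out_shape = (batch_size, 64, 512, 512)  # hypothetical: 64 output channels

print(to_mib(blob_bytes(input_shape, with_diff=False)))  # 2.0
print(to_mib(blob_bytes(conv_out_shape)))                # 128.0
```

Under these assumptions the input itself is tiny, while a single full-resolution feature map already costs on the order of a hundred MiB per sample, so the layer widths and the batch_size of the test network probably matter more than the training batch_size.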
How can I resolve this issue?