CudaError when trying to train Fully Convolutional Networks


Pujan Paudel

Sep 12, 2017, 2:25:31 PM
to Caffe Users
I was trying to train voc-fcn16s on my system, and it fails with an error right after network initialization finishes. The log looks something like this:


This network produces output loss
I0912 13:17:43.730710 23926 net.cpp:255] Network initialization done.
I0912 13:17:43.730861 23926 solver.cpp:56] Solver scaffolding done.
I0912 13:17:43.926789 23926 net.cpp:744] Ignoring source layer conv1_1
I0912 13:17:43.926807 23926 net.cpp:744] Ignoring source layer conv1_2
I0912 13:17:43.926810 23926 net.cpp:744] Ignoring source layer conv2_1
I0912 13:17:43.926815 23926 net.cpp:744] Ignoring source layer conv2_2
I0912 13:17:43.926836 23926 net.cpp:744] Ignoring source layer conv3_1
I0912 13:17:43.926841 23926 net.cpp:744] Ignoring source layer conv3_2
I0912 13:17:43.926843 23926 net.cpp:744] Ignoring source layer conv3_3
I0912 13:17:43.926847 23926 net.cpp:744] Ignoring source layer conv4_1
I0912 13:17:43.926851 23926 net.cpp:744] Ignoring source layer conv4_2
I0912 13:17:43.926854 23926 net.cpp:744] Ignoring source layer conv4_3
I0912 13:17:43.926858 23926 net.cpp:744] Ignoring source layer conv5_1
I0912 13:17:43.926862 23926 net.cpp:744] Ignoring source layer conv5_2
I0912 13:17:43.926865 23926 net.cpp:744] Ignoring source layer conv5_3


Then it stops at this line:

 Check failed: error == cudaSuccess (77 vs. 0)  an illegal memory access was encountered

I wonder whether this problem is related to my GPU settings or to the FCN architecture.

Also, do the "Ignoring source layer ..." messages after "Solver scaffolding done" point to something wrong?


Hieu Do Trung

Sep 13, 2017, 6:34:50 AM
to Caffe Users
It seems that you didn't use the train.prototxt at the link you listed.
Normally, "Ignoring source layer" means that the model you are fine-tuning from (the .caffemodel file) has that layer, but the .prototxt file does not.
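
Caffe copies pretrained weights by layer name, so one way to see which layers would be ignored is to compare the names in the .caffemodel with the names in the .prototxt. Below is a rough sketch in plain Python, not taken from Caffe itself. The helper names (`prototxt_layer_names`, `ignored_source_layers`) and the example prototxt are made up for illustration, and the regex assumes that `name:` is the first field inside each `layer { ... }` block.

```python
import re

def prototxt_layer_names(prototxt_text):
    """Return layer names declared in a prototxt string.

    Assumption: `name:` is the first field inside each `layer {` block,
    which is the usual layout but is not required by the format.
    """
    pattern = re.compile(r'layers?\s*\{\s*name:\s*"([^"]+)"')
    return pattern.findall(prototxt_text)

def ignored_source_layers(source_layers, prototxt_text):
    """List source (pretrained) layers that have no same-named target layer."""
    target = set(prototxt_layer_names(prototxt_text))
    return [name for name in source_layers if name not in target]

# Hypothetical prototxt fragment: the first layer was renamed, so the
# pretrained weights for "conv1_1" would have nowhere to go.
EXAMPLE_PROTOTXT = '''
layer {
  name: "conv1_1_custom"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
}
'''

# Layer names as they might appear in the pretrained model.
source = ["conv1_1", "conv1_2", "conv2_1"]
print(ignored_source_layers(source, EXAMPLE_PROTOTXT))
# prints ['conv1_1', 'conv2_1']
```

If every conv layer shows up as ignored, as in the log above, the names in the .prototxt most likely do not match the pretrained model at all, so none of its convolution weights are being loaded.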

Pujan Paudel

Sep 13, 2017, 8:59:44 AM
to Caffe Users

Does this also have something to do with "Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered"?

Pujan Paudel

Sep 13, 2017, 3:41:49 PM
to Caffe Users
I was able to get rid of those warnings that appeared before the fatal error. But it still doesn't start training: it pauses for some 10-20 seconds and then complains again about error == cudaSuccess (77 vs. 0).

Jianyuan Shi

Jan 21, 2018, 4:19:15 AM
to Caffe Users
Hi, Pujan Paudel. Have you solved the problem? Could you please tell me how?

On Wednesday, September 13, 2017 at 2:25:31 AM UTC+8, Pujan Paudel wrote: