Check failed: target_blobs[j]->shape() == source_blob->shape()


Swami

Jul 2, 2015, 10:34:51 AM
to caffe...@googlegroups.com
I started training my network by initializing some weights from a pre-trained network (using the same variable names). The training went fine and I saved snapshots of it.

When I try to resume the training, I get the error in the title, starting from the layer which loads the pre-trained weights.

For example, say I have a network specified like this:

Layer1: weights1
Layer2: weights2
Layer3: weights3
Layer4: weights4

When I start training, I load the weights only for layers 3 and 4 from a pre-trained model and save the snapshot. When I try to resume training, I don't use the '--weights' switch; instead I use only the '--snapshot' switch to specify the snapshot I want to load from.

Any ideas as to what's happening? The specific error message is this:
------------------------------------------------------------------------------------------------------------------------------------------
I0702 10:29:15.144909 20124 solver.cpp:254] Restoring previous solver status from casia_small_scale/solverstate_matchnet_new_0701_iter_10000.solverstate
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 716251510

I0702 10:29:40.683454 20124 solver.cpp:637] SGDSolver: restoring history
I0702 10:29:51.483162 20124 solver.cpp:294] Iteration 10000, Testing net (#0)
F0702 10:29:51.483803 20124 net.cpp:684] Check failed: target_blobs[j]->shape() == source_blob->shape() 
*** Check failure stack trace: ***
    @     0x7f340bdcbdaa  (unknown)
    @     0x7f340bdcbce4  (unknown)
    @     0x7f340bdcb6e6  (unknown)
    @     0x7f340bdce687  (unknown)
    @     0x7f340c263513  caffe::Net<>::ShareTrainedLayersWith()
    @     0x7f340c33814f  caffe::Solver<>::Test()
    @     0x7f340c338015  caffe::Solver<>::TestAll()
    @     0x7f340c337086  caffe::Solver<>::Step()
    @     0x7f340c336b75  caffe::Solver<>::Solve()
    @           0x414ed8  caffe::Solver<>::Solve()
    @           0x411304  train()
    @           0x41334c  main
    @     0x7f340b2ddec5  (unknown)
    @           0x410799  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

------------------------------------------------------------------------------------------------------------------------------------------
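
The stack trace shows the check firing inside `caffe::Net<>::ShareTrainedLayersWith()`, called from `caffe::Solver<>::Test()`, so the comparison appears to be between blobs of two nets rather than between the snapshot file and the net. Below is a minimal Python sketch of what a check like `target_blobs[j]->shape() == source_blob->shape()` compares, assuming layers are paired by name across the two nets. It is not Caffe's code: the function name, the dict layout, and all shapes are hypothetical and only illustrate how a same-named layer with a different blob shape would be reported.

```python
# Simplified model of a name-matched blob shape check between two nets.
# NOT Caffe's implementation; layer names and shapes below are made up.

def find_shape_mismatches(source_net, target_net):
    """Return (layer, blob_index, source_shape, target_shape) per mismatch.

    Each net is a dict mapping layer name -> list of blob shapes (tuples).
    Layers present in only one net are skipped, as are blob indices that
    exist on only one side.
    """
    mismatches = []
    for name, source_blobs in source_net.items():
        target_blobs = target_net.get(name)
        if target_blobs is None:
            continue  # no same-named layer in the target net
        for j, (src, tgt) in enumerate(zip(source_blobs, target_blobs)):
            if tuple(src) != tuple(tgt):
                mismatches.append((name, j, tuple(src), tuple(tgt)))
    return mismatches


# Hypothetical shapes: Layer3 has a different output size in the two nets.
train_net = {
    "Layer3": [(256, 128), (256,)],
    "Layer4": [(10, 256), (10,)],
}
test_net = {
    "Layer3": [(512, 128), (512,)],
    "Layer4": [(10, 256), (10,)],
}

for layer, j, src, tgt in find_shape_mismatches(train_net, test_net):
    print(f"{layer} blob {j}: source {src} != target {tgt}")
```

Running the same kind of comparison on the real train and test nets (for example by listing each layer's blob shapes from both) may help narrow down which layer definition differs.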


codeforever

Jan 31, 2016, 12:44:07 PM
to Caffe Users

Have you solved the problem? I have run into the same thing.
On Thursday, July 2, 2015 at 10:34:51 PM UTC+8, Swami wrote: