Error when trying to train a Siamese Network


Youyou

Aug 19, 2015, 8:45:13 AM
to Caffe Users
Hi guys, I'm trying to train a Siamese network to learn whether the same object appears in a pair of images. I converted all my data to LevelDB, so I have a train_leveldb database and a test_leveldb one; I converted both the training and testing data with a batch size of 64. When I try running train_reid_siamese.sh (it's the same as train_mnist_siamese.sh, except that the paths are modified), my program ends with a "core dumped" error. Please note this is just the end of my output:
  .
  .
  .
  .
layer {
  name: "loss"
  type: "ContrastiveLoss"
  bottom: "feat"
  bottom: "feat_p"
  bottom: "sim"
  top: "loss"
  contrastive_loss_param {
    margin: 1
  }
}
I0819 14:28:44.534545 24501 layer_factory.hpp:74] Creating layer pair_data
I0819 14:28:44.534564 24501 net.cpp:90] Creating Layer pair_data
I0819 14:28:44.534574 24501 net.cpp:368] pair_data -> pair_data
I0819 14:28:44.534590 24501 net.cpp:368] pair_data -> sim
I0819 14:28:44.534603 24501 net.cpp:120] Setting up pair_data
F0819 14:28:44.534725 24501 db_leveldb.cpp:15] Check failed: status.ok() Failed to open leveldb /home/youssef/Umons/Data_Converted/test_leveldb
IO error: lock /home/youssef/Umons/Data_Converted/test_leveldb/LOCK: Resource temporarily unavailable
*** Check failure stack trace: ***
    @     0x7f8acbf4cdaa  (unknown)
    @     0x7f8acbf4cce4  (unknown)
    @     0x7f8acbf4c6e6  (unknown)
    @     0x7f8acbf4f687  (unknown)
    @     0x7f8acc29303a  caffe::db::LevelDB::Open()
    @     0x7f8acc315016  caffe::DataLayer<>::DataLayerSetUp()
    @     0x7f8acc30c199  caffe::BasePrefetchingDataLayer<>::LayerSetUp()
    @     0x7f8acc2e5733  caffe::Net<>::Init()
    @     0x7f8acc2e74a2  caffe::Net<>::Net()
    @     0x7f8acc2b6d4c  caffe::Solver<>::InitTestNets()
    @     0x7f8acc2b743b  caffe::Solver<>::Init()
    @     0x7f8acc2b7606  caffe::Solver<>::Solver()
    @           0x40c8f0  caffe::GetSolver<>()
    @           0x406541  train()
    @           0x404a81  main
    @     0x7f8acb45eec5  (unknown)
    @           0x40502d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)





I don't understand why the program fails to open /home/youssef/Umons/Data_Converted/test_leveldb/LOCK, since the file is there. I attached my prototxt files and my solver in case anyone wants to take a look. Note that I have a training set of 80000 paired images and a small testing set of 5600 paired images, with a label of 1 if the same object appears in both images and 0 otherwise. To train, I run the train_reid_siamese.sh file in a shell. If you have any clue, don't hesitate to post it, guys ;). Thanks in advance.
Reid_siamese.prototxt
Reid_siamese_solver.prototxt
Reid_siamese_train_test.prototxt
train_reid_siamese.sh

Youyou

Aug 19, 2015, 10:05:52 AM
to Caffe Users
Or is it just that I don't have enough available memory on my computer?

Abhimanyu Dubey

Aug 19, 2015, 11:01:22 AM
to Caffe Users

Youyou

Aug 19, 2015, 11:09:15 AM
to Caffe Users
Thanks Abhimanyu. That solved my problem ;)

meghna pippal

Nov 10, 2015, 12:23:24 PM
to Caffe Users
I am trying to train a Siamese network on my database.
I converted my database to LMDB format and added a backend line to the data layer,
but when I run ./examples/siamese_tree/train_mnist_siamese.sh
I get the following errors. Please help.

I1110 22:27:24.481112  4381 net.cpp:514] Sharing parameters 'conv1_w' owned by layer 'conv1', param index 0
F1110 22:27:24.481154  4381 net.cpp:533] Check failed: this_blob->shape() == owner_blob->shape() Cannot share param 'conv1_w' owned by layer 'conv1' with layer 'conv1_p'; shape mismatch.  Owner layer param shape is 20 1 5 5 (500); sharing layer expects shape 20 2 5 5 (1000)
*** Check failure stack trace: ***
    @     0x7f0d75009daa  (unknown)
    @     0x7f0d75009ce4  (unknown)
    @     0x7f0d750096e6  (unknown)
    @     0x7f0d7500c687  (unknown)
    @     0x7f0d753e8a02  caffe::Net<>::AppendParam()
    @     0x7f0d753ebc13  caffe::Net<>::Init()
    @     0x7f0d753ed685  caffe::Net<>::Net()
    @     0x7f0d753c4ada  caffe::Solver<>::InitTrainNet()
    @     0x7f0d753c5f0c  caffe::Solver<>::Init()
    @     0x7f0d753c6259  caffe::Solver<>::Solver()
    @     0x7f0d753bd7f3  caffe::Creator_SGDSolver<>()
    @           0x40f03e  caffe::SolverRegistry<>::CreateSolver()
    @           0x4076b4  train()
    @           0x4054a1  main
    @     0x7f0d7451bec5  (unknown)
    @           0x405b8d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)



I am attaching my files.
My database has 100 classes.

Thank You
mnist_siamese.prototxt
mnist_siamese_solver.prototxt
mnist_siamese_train_test.prototxt

zoey

Oct 15, 2016, 3:10:51 AM
to Caffe Users
Hi, I have run into the same problem as you. Could you please share some tips about the solution? Thanks a lot!

On Wednesday, November 11, 2015 at 1:23:24 AM UTC+8, meghna pippal wrote:

Przemek D

Oct 25, 2016, 7:56:27 AM
to Caffe Users
You input a 3-channel image and use a Slice layer to, presumably, obtain 3 single-channel images. However, your Slice definition only outputs two blobs. Since your specified slice point is 1, your input blob is cut at channel 1, resulting in one blob with 1 channel and another with the remaining 2 channels. conv1 takes the first blob as input, so its filters have shape 1 5 5 (hence the param shape 20 1 5 5 in your error log). conv1_p takes the other, but to accommodate the 2-channel input it builds filters of shape 2 5 5 (hence the 20 2 5 5 in your log). Naturally, you can't share weights of different shapes.
Solution: split the input into 3 single-channel blobs by adding another top to the Slice layer (with a second slice point, since Caffe expects one fewer slice points than tops). You'll probably want to add a Silence layer too, and redirect the unneeded blob to it (otherwise it'll clutter your output log).
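To make the shape arithmetic concrete, here is a minimal Python sketch. It does not use Caffe at all; the helper names `slice_channels` and `conv_weight_shape` are made up for illustration, and the 3-channel input, 20 outputs, and 5x5 kernel are simply the numbers from the error log above.

```python
def slice_channels(total_channels, slice_points):
    """Channel count of each blob produced by slicing along the channel axis."""
    bounds = [0] + list(slice_points) + [total_channels]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def conv_weight_shape(num_output, in_channels, kernel_size):
    """Weight blob shape of a square convolution: (num_output, in_channels, k, k)."""
    return (num_output, in_channels, kernel_size, kernel_size)


# One slice point at 1 on a 3-channel input: two blobs with 1 and 2 channels.
two_tops = slice_channels(3, [1])
shapes_two = [conv_weight_shape(20, c, 5) for c in two_tops]

# Slice points at 1 and 2: three single-channel blobs.
three_tops = slice_channels(3, [1, 2])
shapes_three = [conv_weight_shape(20, c, 5) for c in three_tops]

print(two_tops, shapes_two)
print(three_tops, shapes_three)
```

With a single slice point the two branches end up with weight shapes that differ in the channel dimension, which is the mismatch the log complains about. With two slice points every branch sees one channel, so the weight shapes agree and can be shared.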