Classification problem while using my model: "shape mismatch"


caffeuser

Jan 21, 2016, 2:44:45 AM
to Caffe Users
Hi,
I have developed a Caffe classification model based on my customized training data. Then I tried to use this model to classify a given image. I ran the command below and got an error while classifying:
Command:
./myresults/test_net.bin \
  myresults/imagenet_deploy.prototxt \
  myresults/caffe_imagenet_train_iter_4500.caffemodel \
  data/ilsvrc12/newimagenet_mean.binaryproto \
  data/ilsvrc12/my_synset_words.txt \
  examples/images/cat.jpg

Error:
Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param shape is 2 50 (100); target param shape is 2 72 (144). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
    @     0x7f1e8b0a7daa  (unknown)
    @     0x7f1e8b0a7ce4  (unknown)
    @     0x7f1e8b0a76e6  (unknown)
    @     0x7f1e8b0aa687  (unknown)
    @     0x7f1e8b430f67  caffe::Net<>::CopyTrainedLayersFrom()
    @     0x7f1e8b43b162  caffe::Net<>::CopyTrainedLayersFromBinaryProto()
    @     0x7f1e8b43b1c6  caffe::Net<>::CopyTrainedLayersFrom()
    @           0x406e35  Classifier::Classifier()
    @           0x403eab  main
    @     0x7f1e89ac3ec5  (unknown)
    @           0x40434e  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

I followed the steps below to create my customized model (https://github.com/BVLC/caffe/issues/550):
"
I'm not very knowledgeable as I just got started using Caffe as well, so folks should feel free to jump in and correct me. The documentation for the general procedure of training with your data is here: http://caffe.berkeleyvision.org/imagenet_training.html , and you will be able to do all your training by copying and modifying the files in CAFFE_ROOT_DIR/examples/imagenet, which we will call the imagenet directory. Using the imagenet architecture should yield decent out of the box results for categorizing images.

To summarize, the steps I followed to train Caffe were:

    Group your data into a training folder and a testing folder. Caffe will train on one set of images and test its accuracy on the other set. Your images should be 256x256 color JPEG files. For each set, create a text file specifying the category each picture belongs to. This text file is formatted like so,

    /home/my_test_dir/picture-foo.jpg 0
    /home/my_test_dir/picture-foo1.jpg 1

where picture-foo belongs to category 0 and picture-foo1 belongs to category 1.

    Now copy and modify create_imagenet.sh from the imagenet directory, changing the arguments to point to your folders and text files. Run create_imagenet.sh and it will generate training and testing leveldb directories. Caffe will work with these leveldb directories from now on.

    Copy and modify make_imagenet_mean.sh from the imagenet directory, changing the arguments to point at your spanking new leveldb folders. This will generate the mean binaryproto files that Caffe uses to normalize images, improving your results. I would recommend specifying absolute paths for everything to minimize headaches.

    Copy and modify imagenet_{deploy,solver,train,val}.prototxt. You'll want to change the source and mean_file parameters in imagenet_{train,val} to point to your leveldbs and your mean.prototxt files (again, absolute paths). You may also want to change the batch_size parameter based on the hardware that you'll be running caffe on. Lastly, change the solver.prototxt file to point to your newly modified train and val prototxt files! I believe you can leave deploy.prototxt alone.

    Take a step back and make sure you haven't missed anything. You will have deploy, solver, train, and val prototxt files; two image mean binaryproto files; one train_leveldb folder, and one val_leveldb folder. That's two folders and six files in total.

    You guessed it: copy and modify train_imagenet.sh! Point it to your solver prototxt file.

    Run the modified train_imagenet script. This will periodically spit out solverstate files and data files with names like caffe_train_iter_#.

    After training terminates, you can find a script in CAFFE_ROOT_DIR/build/tools called test_net.bin. test_net.bin will take your val.prototxt, a caffe_train_iter_# data file, and the number of testing iterations as arguments. It will tell you how your trained network is doing.
"

Any help is much appreciated. I am new to caffe and struggling a lot :(

Thanks




YAO CHOU

Jan 29, 2016, 11:19:52 PM
to Caffe Users
Hey caffeuser,

Got the same error for my ip1 layer. Really confused: training still works very well, but classification does not.
Did you solve your problem? Could you share your solution? Thanks.

Yao

Ahmed Ibrahim

Mar 30, 2016, 11:04:24 AM
to Caffe Users
I do not think you can pass an image like "cat.jpg" directly to caffe; also, test_net is deprecated. If you compiled caffe for Python or MATLAB, you can look up classification tutorials, which are plentiful and simple. Or, what I can suggest is to create a data layer with your test images, modify your prototxt to refer to it, and then use
caffe test ...
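For the data-layer route Ahmed describes, a rough sketch of what the TEST-phase data layer in your prototxt might look like (the lmdb source path, batch size, and layer names here are placeholders, not from the original post):

```protobuf
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    mean_file: "data/ilsvrc12/newimagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/my_test_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
```

and then something along the lines of `caffe test -model train_val.prototxt -weights caffe_imagenet_train_iter_4500.caffemodel -iterations 100`.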

Ma'mur Abdullayev

Apr 16, 2016, 11:10:41 PM
to Caffe Users
Hi Yao. I got the error "Cannot copy param 0 weights from layer 'ip1'; shape mismatch.  Source param shape is 500 186050 (93025000); target param shape is 500 800 (400000)." in "ip1". Did you solve it? If you did, can you show me how to solve it, please?
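Incidentally, those particular numbers are consistent with one net built around 28x28 inputs (giving ip1 a width of 800) and the other receiving 256x256 inputs (giving 186050). A quick sanity check in plain Python (the conv/pool geometry here is an assumption based on the standard Caffe LeNet example):

```python
def lenet_ip1_width(s):
    """Flattened input width of ip1 for an s-by-s input image, assuming
    conv 5x5/1 -> pool 2x2/2 -> conv 5x5/1 -> pool 2x2/2, 50 output channels."""
    for kernel, stride in ((5, 1), (2, 2), (5, 1), (2, 2)):
        s = (s - kernel) // stride + 1
    return 50 * s * s

print(lenet_ip1_width(28))   # 800: matches the "500 800" param shape
print(lenet_ip1_width(256))  # 186050: matches the "500 186050" param shape
```

So the fix is to feed the net images of the size it was trained with (or retrain the FC layer for the new size).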



On Saturday, January 30, 2016 at 13:19:52 UTC+9, YAO CHOU wrote:

Jan

Apr 19, 2016, 4:53:27 AM
to Caffe Users
It is really simple actually:

The message says that the shape of the FC layer you want to use does not match the shape of the layer in the caffemodel (already trained weights). That usually happens when you feed data with a shape/size different from the data the network was trained on. Conv layers can adapt to changes in the size of the feature maps, because the size of the kernels does not depend on the size of the input. The weight matrices of FC layers, on the other hand, _do_ depend on the size of the input, so you get a shape mismatch. There is also usually no good way to map the trained weights to the differently sized new weight matrix.

In finetuning you usually just initialize that layer randomly and retrain it, keeping the weights of the conv kernels.
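Concretely, the error message's advice ("rename the layer") means giving the mismatched FC layer a new name in your prototxt so Caffe initializes it from scratch instead of copying the old weights. A sketch, assuming a standard InnerProduct layer (the bottom/top names and num_output here are placeholders):

```protobuf
layer {
  name: "fc6_new"            # renamed: no weights copied, layer is re-learned
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6_new"
  inner_product_param {
    num_output: 4096
  }
}
```

For finetuning you would then train again with the old caffemodel passed as `-weights`; only the renamed layer starts from random initialization.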

Jan