How to successfully train on a subset of ImageNet?


Steven H

Aug 17, 2016, 12:13:23 PM
to Caffe Users
Hello,

While I have worked with deep learning before (primarily through the MATLAB toolkit MatConvNet), I am just getting started with Caffe. I have gone through the Caffe tutorials and have run the LeNet example code without issue. Now, to gain experience using my own data in Caffe, I decided to reimplement one of the first projects I did in MatConvNet. The project is a simple one: collect ImageNet data for 14 different object classes and train the AlexNet architecture on it for classification. In essence, it is very similar to the ImageNet competition, but with fewer classes and less training data to cut down on training time. The dataset I collected has 32,498 training images and 3,580 test images. I was able to implement this in MatConvNet and reach high accuracy on the test set.

However, I have been unable to reproduce this experiment in Caffe, and I am unsure what the issue is. I used this page as an example of how to train a network on my own dataset. I used the Caffe tool convert_imageset.cpp to create LMDB files for my training and test sets, and the Caffe tool compute_image_mean.cpp to create the image-mean .binaryproto file from my training data. Next, I made a copy of the CaffeNet train_val.prototxt from models/bvlc_reference_caffenet/train_val.prototxt, making only minor changes: updating the paths to the data files, changing the batch size, and changing num_output from 1000 to 14 on the last InnerProduct layer. Finally, I copied the solver file from models/bvlc_reference_caffenet/solver.prototxt, making a few changes so the network trains for a shorter time. In MatConvNet I only need to train for around 30 epochs to reach high accuracy, so given that 127 iterations = 1 epoch for my data (I use a batch size of 256 during training), I set my solver to train for about 30 epochs as well (3810 iterations).
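For reference, here is a sketch of the data-prep commands I mean; the image directory, list files, and output names below are placeholders, not my actual paths:

```shell
# Build train/test LMDB databases from image list files, where each
# list line maps an image path to a class index 0..13:
build/tools/convert_imageset --resize_height=256 --resize_width=256 --shuffle \
    /path/to/images/ train.txt food_train_lmdb
build/tools/convert_imageset --resize_height=256 --resize_width=256 --shuffle \
    /path/to/images/ test.txt food_test_lmdb

# Compute the mean image over the training LMDB:
build/tools/compute_image_mean food_train_lmdb food_mean.binaryproto
```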

The problem is that after training, the accuracy reported by Caffe is only around 0.07 (i.e., 1/14, no better than random guessing). I tried training for much longer (72,390 iterations), but the accuracy does not significantly increase with iterations, nor does the loss significantly decrease. Being new to Caffe, I am unsure what I need to change to get this network to perform well. Do you have any ideas? I have posted the solver file below in case you think the problem may lie in my choice of hyperparameters. Any help would be appreciated; feel free to ask if further clarification is needed.
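As a sanity check, the epoch-to-iteration arithmetic I used (with the dataset sizes above) works out as:

```shell
TRAIN_IMAGES=32498   # training images in the collected subset
BATCH=256            # training batch size
# Ceiling division: iterations needed to see every training image once
ITERS_PER_EPOCH=$(( (TRAIN_IMAGES + BATCH - 1) / BATCH ))
THIRTY_EPOCHS=$(( 30 * ITERS_PER_EPOCH ))
echo "$ITERS_PER_EPOCH iterations per epoch, $THIRTY_EPOCHS iterations for 30 epochs"
```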

net: "food_train_test.prototxt"
test_iter: 60
test_interval: 260
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 23000
display: 20
max_iter: 72390
momentum: 0.9
weight_decay: 0.0005
snapshot: 9880
snapshot_prefix: "../results/food-data-long/"
solver_mode: GPU
