Accuracy not improving above 70%, what else can i do?


Diego Rueda

Feb 19, 2015, 9:29:43 AM2/19/15
to caffe...@googlegroups.com
Hi, I am using the following network to train on http://www.robots.ox.ac.uk/~vgg/data/flowers/17/. I am using one conv-pool combo and two fully connected layers with dropout. At best I get about 70% accuracy on the test set. I have trained for up to 1M iterations, reducing the LR every 30K iterations, and it is still not improving.
Can you give me advice on what I should do next to improve this accuracy?

Thanks.

Michael Wilber

Feb 20, 2015, 8:04:08 AM2/20/15
to caffe...@googlegroups.com
70% rank-1 accuracy on a 17-category dataset really is not that bad.

Try fine-tuning an existing model to this task! The "Model Zoo" contains many networks that are already great at classifying ImageNet images. These models are *very* similar to models that are likely to do well at your flower task, so I expect a few iterations of fine-tuning could make quite a difference. http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html

My suggestion:
- Start from, say, pre-trained AlexNet or Network-in-Network in the model zoo
- Since you only have <1,500 images, set the learning rate multipliers of everything but the last layer to 0 (or extremely close to 0)
- Change the number of outputs to 17
- Lower the base learning rate of this layer too, to avoid overfitting
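As a rough sketch of what those steps might look like in a net prototxt (layer names here are illustrative, not taken from any specific model definition): freeze a pre-trained layer by zeroing its learning-rate multipliers, and give the replacement output layer a new name so Caffe re-initializes it instead of trying to load the 1000-way ImageNet weights.

```
# Frozen pre-trained layer: lr_mult 0 means its weights never update
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 0 }   # weights: frozen
  param { lr_mult: 0 }   # biases: frozen
  inner_product_param { num_output: 4096 }
}

# New final layer: the fresh name (hypothetical) forces random
# re-initialization, and num_output matches the 17 flower classes
layer {
  name: "fc8_flowers"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flowers"
  param { lr_mult: 10 }  # the only layer that actually learns
  param { lr_mult: 20 }
  inner_product_param { num_output: 17 }
}
```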

With only 1 convolution layer, you probably can't learn very discriminative features, especially since your dataset has fewer than 1,500 images.

"You young whippersnappers don't know how hard it used to be! Back in MY day, we had to design our OWN features! By hand! Up hill, both ways!"

good luck.

Michael Wilber

Feb 20, 2015, 12:00:58 PM2/20/15
to Diego Rueda, caffe...@googlegroups.com
In my experience, fine-tuning a large pre-trained network while keeping
everything but the output layer fixed (or almost fixed) is one great way
to avoid overfitting to a smaller dataset.

Think of it this way: ImageNet contains millions of natural images and
you can learn awesome features from them. Then, when it's time to pick
which flower you have, the last layer can just reuse the previous
(excellent) ImageNet-trained features and adapt to your specific (small)
flower dataset.

Diego Rueda <ing.die...@gmail.com> writes:
> I have tried using more convolutional layers and max pooling, but after
> training I notice the accuracy drops to roughly 67%. That's why I have
> this simple network; I don't know how to add to it without reducing
> accuracy.

ath...@ualberta.ca

Feb 20, 2015, 7:02:52 PM2/20/15
to caffe...@googlegroups.com
Hi Diego,

Tip: always try generic features with linear SVMs first to get a baseline before net engineering (it's way easier).

For comparison, I ran a basic test on Flowers17 and got 95.6%: extract L2-normalized fc6 features from AlexNet and train linear (one-vs-rest) SVMs with (early) flip augmentation. Even without optimizing on a validation set, you can get 92+% this way using pool5, fc7, etc. Since this is so close to 100%, it's better to try Flowers102, where I get 93.7% using the same basic approach. Net engineering should improve upon these results, but this way you have a starting point.
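The structure of that baseline (L2-normalize features, then train one linear scorer per class) can be sketched in a few lines. This is a hedged, self-contained stand-in: random clustered vectors play the role of fc6 features, and a simple perceptron-style update replaces a real SVM solver, just to show the one-vs-rest mechanics.

```python
import numpy as np

def l2_normalize(feats):
    """Scale each feature row to unit L2 norm (as done to fc6 features)."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)

def train_one_vs_rest(feats, labels, n_classes, epochs=20, lr=0.1):
    """One linear scorer per class: +1 for the class, -1 for the rest.
    A perceptron stand-in for per-class linear SVMs (no margin term here)."""
    n, d = feats.shape
    W = np.zeros((n_classes, d))
    b = np.zeros(n_classes)
    for c in range(n_classes):
        y = np.where(labels == c, 1.0, -1.0)
        for _ in range(epochs):
            for i in range(n):
                if y[i] * (feats[i] @ W[c] + b[c]) <= 0:  # misclassified
                    W[c] += lr * y[i] * feats[i]
                    b[c] += lr * y[i]
    return W, b

def predict(W, b, feats):
    """Pick the class whose linear scorer responds most strongly."""
    return np.argmax(feats @ W.T + b, axis=1)

# Toy data standing in for fc6 features of the 17 flower classes
rng = np.random.default_rng(0)
n_classes, dim = 17, 64
centers = rng.normal(size=(n_classes, dim))
labels = np.repeat(np.arange(n_classes), 10)
feats = centers[labels] + 0.1 * rng.normal(size=(len(labels), dim))
feats = l2_normalize(feats)

W, b = train_one_vs_rest(feats, labels, n_classes)
acc = np.mean(predict(W, b, feats) == labels)
```

In practice you would swap the perceptron loop for a proper linear SVM and feed in real CNN features; the normalize-then-one-vs-rest skeleton stays the same.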

Best,
Andy Hess

Steven Clark

Feb 25, 2015, 2:58:27 PM2/25/15
to caffe...@googlegroups.com, ing.die...@gmail.com
For newbies out there (like me), Michael's comment below is important and bears repeating. Training from scratch on a 5-class image problem, top-1 performance was ~75% after 10K iterations, whereas by fine-tuning the pre-trained GoogLeNet network, top-1 performance was 92% after just 1K iterations (and still improving)!

This was taking the approach of dividing base_lr by 10 and multiplying the blob_lr of the new, final layer(s) by 10.
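In solver terms, that recipe might look like the following sketch (paths and step values are illustrative, assuming a stock base_lr of 0.01 and a renamed final layer whose params carry lr_mult: 10):

```
# solver.prototxt for fine-tuning: global rate cut by 10x, while the
# new layer's lr_mult: 10 restores a roughly normal rate there
net: "models/finetune_flowers/train_val.prototxt"
base_lr: 0.001            # original 0.01 divided by 10
lr_policy: "step"
gamma: 0.1
stepsize: 20000
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot_prefix: "models/finetune_flowers/snap"
```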