Why did I get 99% accuracy, but when I deploy the model the result is different?


jaba marwen

May 30, 2017, 7:25:23 AM
to Caffe Users
Hi everyone,

I fine-tuned GoogLeNet to classify images into three types of objects (a Lamborghini, a cylinder head, and a piece of a plane); link to the dataset. I split the dataset as follows: 4998 images for training and 1002 for testing.

I set batch_size to 8 during training and 10 during testing. I renamed the last three fully connected layers, changed num_output to 3, and set lr_mult to 10 and 20 (along the lines of the sketch below).
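
(For context, a renamed final classifier would look roughly like this in train_val.prototxt. This is a sketch only: the layer name "loss3/classifier_new" is illustrative, and the actual names are in the attached file.)

layer {
  name: "loss3/classifier_new"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier_new"
  param { lr_mult: 10 decay_mult: 1 }  # weights: 10x the base learning rate
  param { lr_mult: 20 decay_mult: 0 }  # bias: 20x the base learning rate
  inner_product_param {
    num_output: 3  # three object classes
  }
}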

This is the config of my solver.prototxt:

net: "/home/jaba/caffe/data/diota_model/train_val.prototxt"
test_iter: 100
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 500
display: 50
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 200
snapshot_prefix: "/home/jaba/caffe/data/diota_model/train_val.prototxt"
solver_mode: GPU

After training, I got 99% accuracy (you can check the attached log file).

But the problem is that when I deploy the model, I do not get 99%. In fact, I tried to test my model on the val_lmdb directory using this script:


import caffe
import lmdb
import numpy as np
from caffe.proto import caffe_pb2

MODEL_FILE = 'deploy.prototxt'
PRETRAINED = 'train_val.prototxt_iter_200.caffemodel'

caffe.set_mode_cpu()

# load the model
net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)

# load input and configure preprocessing
mean_file = np.array([104, 117, 123])

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_mean('data', mean_file)
transformer.set_transpose('data', (2, 0, 1))
transformer.set_channel_swap('data', (2, 1, 0))
transformer.set_raw_scale('data', 255.0)
# NB: an LMDB written by convert_imageset already stores images as BGR in the
# 0-255 range, so set_channel_swap and set_raw_scale here may be applied on
# top of data that is already swapped and scaled; worth double-checking.

# fix the batch size to a single image
net.blobs['data'].reshape(1, 3, 227, 227)

lmdb_env = lmdb.open('/home/bme/jaba/test/val_lmdb')
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()
datum = caffe_pb2.Datum()

for key, value in lmdb_cursor:
    datum.ParseFromString(value)
    label = datum.label
    data = caffe.io.datum_to_array(datum)  # C x H x W
    image = np.transpose(data, (1, 2, 0))  # back to H x W x C for the transformer
    net.blobs['data'].data[...] = transformer.preprocess('data', image)
    out = net.forward()
    out_put = out['prob'].argmax()
    print('{},{}'.format(key, label))
    print('prediction ::: ' + str(out_put))
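
(For reference, a minimal sketch of how the printing loop above could be rewritten to count an overall accuracy instead of just printing predictions; it reuses the same net, transformer, and LMDB setup:)

correct = 0
total = 0
for key, value in lmdb_cursor:
    datum.ParseFromString(value)
    data = caffe.io.datum_to_array(datum)
    image = np.transpose(data, (1, 2, 0))
    net.blobs['data'].data[...] = transformer.preprocess('data', image)
    prediction = net.forward()['prob'].argmax()
    if prediction == datum.label:
        correct += 1
    total += 1
print('accuracy: {}/{} = {:.2%}'.format(correct, total, float(correct) / total))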

After executing this script on the val_lmdb directory, I did not get the 99% accuracy I expected.

Could you tell me why I get a result like this? 99% accuracy during training, but not the same accuracy when I test on val_lmdb?


Thanks in advance!
Attachments: log.log, deploy.prototxt, train_val.prototxt

Jonathan R. Williford

May 30, 2017, 11:40:39 AM
to jaba marwen, Caffe Users
What is the performance on the validation set? You can plot the training accuracy vs validation accuracy during training and see how they evolve (e.g. by parsing the log, as in the sketch below). Your model is probably overfitting.
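
(For example, a minimal sketch that pulls the test accuracies out of the attached log.log, assuming the standard Caffe log lines such as "Iteration 500, Testing net (#0)" and "Test net output #0: accuracy = 0.99":)

import re

iters, accs = [], []
cur_iter = None
with open('log.log') as f:
    for line in f:
        m = re.search(r'Iteration (\d+), Testing net', line)
        if m:
            cur_iter = int(m.group(1))
        m = re.search(r'Test net output #\d+: accuracy = ([\d.]+)', line)
        if m and cur_iter is not None:
            iters.append(cur_iter)
            accs.append(float(m.group(1)))

for i, a in zip(iters, accs):
    print('iteration {}: test accuracy {}'.format(i, a))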

I wasn't able to check your examples beyond the first dozen or so. They looked like the same object photographed from different angles, which I'm not sure would generalize well. I would make sure that the training data is not substantially different from the validation data. You could randomize which images are selected for training and validation. However, if this improves the validation accuracy a lot, it might mean that future examples will not generalize well and that you need more or better training examples.

Jonathan


Przemek D

May 31, 2017, 4:23:54 AM
to Caffe Users, marwe...@gmail.com
It's either what Jonathan is saying about your model overfitting the data, or your validation dataset is statistically different from the training one (i.e. you're testing your net on something other than what you trained it on).

Jonathan R. Williford

May 31, 2017, 9:53:49 AM
to jaba marwen, Caffe Users
Hi Jaba,

I think Przemek explained it better.

By generalize, I also meant that the training would not generalize to the validation data. I think the validation data is probably too different from the training data (given the variability within the training examples). I would split the videos into small chunks (e.g. 10 frames) and then interleave which segments are used for training and validation, as sketched below. Do *not* take the first 70% of frames of a video as training and the remaining frames as validation.
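
(A rough sketch of that split; the chunk size and the validation ratio below are illustrative, not prescriptive:)

def interleaved_split(frames, chunk=10, val_every=3):
    """Split an ordered list of frames into train/val by interleaving chunks,
    so both sets see frames from all parts of the video."""
    train, val = [], []
    for i in range(0, len(frames), chunk):
        block = frames[i:i + chunk]
        if (i // chunk) % val_every == val_every - 1:
            val.extend(block)   # every val_every-th chunk goes to validation
        else:
            train.extend(block)
    return train, val

# e.g. 200 frames extracted from one video
frames = ['frame_%04d.jpg' % i for i in range(200)]
train, val = interleaved_split(frames)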

Once you are interested in the model generalizing beyond this setup, create a validation set from separate videos.

Best,
Jonathan

On Wed, May 31, 2017 at 3:23 PM, jaba marwen <marwe...@gmail.com> wrote:
First of all, I would like to thank you for your answer. I changed the script above to calculate the accuracy on the validation_lmdb folder and found 59.38% accuracy (595 correct classifications among 1002 test images).

As for the dataset, I know that it is not well representative. In fact, I took 200 frames from a video that shows a single object (that's why you see the same object photographed from different angles), then augmented the dataset using different transformations (zoom, random noise, rotation, translation) to obtain 1000 images for each category (roughly along the lines of the sketch below).
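
(For reference, illustrative OpenCV versions of those transformations; the function names and parameter values below are made up for the sketch:)

import cv2
import numpy as np

def rotate(img, deg=15):
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), deg, 1.0)
    return cv2.warpAffine(img, M, (w, h))

def translate(img, dx=10, dy=10):
    h, w = img.shape[:2]
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(img, M, (w, h))

def zoom(img, factor=1.2):
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), 0, factor)
    return cv2.warpAffine(img, M, (w, h))

def add_noise(img, sigma=10):
    noise = np.random.normal(0, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)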

As you said, the model will not generalize well. For the moment, I just want a simple prototype, even if it does not generalize beyond the validation set. But when I calculate the accuracy as follows:

- total number of test samples = test_iter * batch_size = N
- total number of correctly predicted samples = M (out of the N images)
-> accuracy = M/N

I do not get the same accuracy that is printed during training.

Do you think it could be a configuration problem, given that test_iter = 100, batch_size = 10, and the validation set contains 1002 images? (See the arithmetic sketched below.)
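
(A quick check of that arithmetic: one test pass covers test_iter * batch_size images, and if that product does not equal the dataset size, part of the set is missed on each pass. Since the data layer keeps its LMDB cursor between passes, successive passes also start at shifting offsets.)

test_iter = 100   # from solver.prototxt
batch_size = 10   # TEST-phase batch size
val_size = 1002

covered = test_iter * batch_size
print('images per test pass: {}'.format(covered))                   # 1000
print('images left over per pass: {}'.format(val_size - covered))   # 2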