Different accuracy with command line and python

Daniela G

Apr 14, 2016, 6:46:12 AM
to Caffe Users
Hello,

I'm trying to train and test a CNN with my own data. For that, I already ran a bash script from the command line and got an accuracy of 0.867.

Now I want to analyze the results using Python, so I wrote a script to run the net there. I used some of the code from this example, but I got an accuracy of 0.27 (test_acc). After adding a new variable "test_acc1" that reads the value from the accuracy blob, I get 0.867 again. I don't understand why the "test_acc" value is different, since the calculation is the same one the accuracy layer performs, and I would like to know which value is the correct one.


niter = 10000
test_interval = 500
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))
test_acc1 = zeros(int(np.ceil(niter / test_interval)))
output = zeros((niter, 8, 2))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # store the output on the first test batch
    # (start the forward pass at conv1 to avoid loading new data)
    solver.test_nets[0].forward(start='conv1')
    output[it] = solver.test_nets[0].blobs['score'].data[:8]

    # each output is (batch size, feature dim, spatial dim)
    [(k, v.data.shape) for k, v in solver.net.blobs.items()]

    # just print the weight sizes (we'll omit the biases)
    [(k, v[0].data.shape) for k, v in solver.net.params.items()]

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4
        test_acc1[it // test_interval] = solver.test_nets[0].blobs['accuracy'].data


Also, I have 90 images for training and 30 for testing. The data layer batch_size is 90 and 30, respectively.
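In case it helps, this is roughly how I inspect what one test batch contains from Python (just a quick sketch; I'm assuming the input blob is called 'data', as in the example net, while 'label' is the name used in my loop above):

# Quick sanity check (sketch): what does one test batch look like?
# 'data' is assumed to be the name of the input blob; 'label' is the
# label blob used in the loop above.
test_net = solver.test_nets[0]
test_net.forward()                                   # load one test batch
print 'test batch size:', test_net.blobs['data'].data.shape[0]
print 'labels in this batch:', test_net.blobs['label'].data.flatten()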

Thank you in advance!

Daniela G

Apr 19, 2016, 5:20:39 AM
to Caffe Users
Anyone?

Jan

Apr 19, 2016, 6:33:27 AM
to Caffe Users
Given that your test set has 10000 examples (i.e. your batch size is 100, which is what the division by 1e4 assumes), your test_acc should be correct.

The problem is the following: for test_acc1 you only read the value of the accuracy blob after the last forward() pass, so the accuracy values of all batches but the last one are lost. The averaging over several batches is not done by the accuracy layer but by the solver. Since you are not using the solver's Test() function, but feeding the test_net directly, this averaging never takes place.
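If you want to reproduce that averaging yourself, you can simply accumulate the accuracy blob over several forward passes, which is essentially what Test() does internally. A minimal sketch (the number of test batches here is only illustrative and should be chosen so that it covers your test set, like test_iter in the solver prototxt):

# Sketch: average the accuracy blob over several test batches by hand.
# test_iters plays the same role as test_iter in the solver prototxt and
# should be chosen so that test_iters * batch_size covers the test set.
test_iters = 100
acc = 0.0
for _ in range(test_iters):
    solver.test_nets[0].forward()                              # next test batch
    acc += float(solver.test_nets[0].blobs['accuracy'].data)   # per-batch accuracy
acc /= test_iters
print 'mean accuracy over', test_iters, 'batches:', acc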

It seems that the Test() function is not exposed through the pycaffe interface, so your "test_acc" approach is the best way to compute the accuracy over a number of batches.

Jan

Daniela G

Apr 22, 2016, 8:23:10 AM
to Caffe Users
Thank you for your reply!

So when running through the command line, the accuracy value is just the last one? Isn't it using the Test() function you talked about?

Jan

Apr 25, 2016, 9:46:00 AM
to Caffe Users
Depends. The "caffe" command-line utility just creates a Solver instance from the configuration you give it, e.g. solver.prototxt. Then it calls the solver's methods, such as Step() to do a number of training steps, which in turn calls Test(), the function I mentioned, at intervals specified by test_interval; Test() computes an average loss/accuracy value over the validation/test set.

So the outputs caffe prints on the terminal for the test net(s) are indeed averages over all test iterations (specified by test_iter). All other outputs caffe prints (such as the training loss) are values computed for the current training batch only! So if you specify display: 10 in the solver, caffe will display the output of every 10th training iteration, but the outputs of the 9 training iterations in between are neither saved nor averaged in any way; they are just computed and never displayed. This is also why the training statistics output by caffe are not really meaningful, while the test net outputs do have some real significance.
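In pycaffe, the closest equivalent to what the command-line tool does looks roughly like this (a sketch only; 'solver.prototxt' is a placeholder for your own solver file):

import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu()

# 'solver.prototxt' is a placeholder. In it, test_interval controls how often
# the averaged test pass runs, test_iter how many test batches are averaged,
# and display how often the (per-batch, unaveraged) training loss is printed.
solver = caffe.get_solver('solver.prototxt')

solver.solve()  # run to max_iter, like 'caffe train' does
# Alternatively, call solver.step(1) in a loop: the averaged test pass still
# runs automatically every test_interval iterations.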

Jan

Daniela G

Apr 29, 2016, 4:55:35 AM
to Caffe Users
Thank you Jan :)