How to output the testing labels using Caffe?


wanzheng zhu

Mar 17, 2015, 3:16:30 AM3/17/15
to caffe...@googlegroups.com

I am doing a classification problem using Caffe, but all I can see is the training and testing accuracy. How do I see the output label for each sample, so that I can find out where it goes wrong?

By the way, how do I see the inner kernel weights/features?

Thank you!

Axel Angel

Mar 17, 2015, 5:50:05 AM3/17/15
to caffe...@googlegroups.com
You can try my testing Python script: it shows the misclassified samples and the confusion matrix.

Launch it like this: python ../src/convnet_test_lmdb.py --proto lenet.prototxt --model snapshots/lenet_mnist_v3-id_iter_1000.caffemodel --lmdb ../caffe/examples/mnist/mnist_test_lmdb/

import sys
import caffe
import numpy as np
import lmdb
import argparse
from collections import defaultdict

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--proto', type=str, required=True)
    parser.add_argument('--model', type=str, required=True)
    parser.add_argument('--lmdb', type=str, required=True)
    args = parser.parse_args()

    count = 0
    correct = 0
    matrix = defaultdict(int)  # (real, predicted) -> count
    labels_set = set()

    caffe.set_mode_cpu()
    net = caffe.Net(args.proto, args.model, caffe.TEST)
    lmdb_env = lmdb.open(args.lmdb)
    lmdb_txn = lmdb_env.begin()
    lmdb_cursor = lmdb_txn.cursor()

    for key, value in lmdb_cursor:
        # Decode the serialized Datum into a label and a C x H x W array.
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(value)
        label = int(datum.label)
        image = caffe.io.datum_to_array(datum)
        image = image.astype(np.uint8)

        # Forward pass; 'prob' is the name of the net's output blob.
        out = net.forward_all(data=np.asarray([image]))
        plabel = int(out['prob'][0].argmax(axis=0))

        count += 1
        iscorrect = label == plabel
        correct += (1 if iscorrect else 0)
        matrix[(label, plabel)] += 1
        labels_set.update([label, plabel])

        if not iscorrect:
            print("\rError: key=%s, expected %i but predicted %i"
                  % (key, label, plabel))

        sys.stdout.write("\rAccuracy: %.1f%%" % (100. * correct / count))
        sys.stdout.flush()

    print("")  # newline after the "\r" progress line
    print("%i out of %i were classified correctly" % (correct, count))
    print("")
    print("Confusion matrix:")
    print("(r , p) | count")
    for l in labels_set:
        for pl in labels_set:
            print("(%i , %i) | %i" % (l, pl, matrix[(l, pl)]))

Nikiforos Pittaras

Mar 17, 2015, 11:46:23 AM3/17/15
to caffe...@googlegroups.com
Axel, does that code work with any .prototxt model definition?

Axel Angel

Mar 17, 2015, 2:19:44 PM3/17/15
to caffe...@googlegroups.com
I don't know; probably not *all* imaginable prototxts, but I'm sure it works on most of them. It should be easy to adapt anyway.

Nikiforos Pittaras

Mar 18, 2015, 5:13:29 AM3/18/15
to caffe...@googlegroups.com
Do you perhaps have any guidelines for converting prototxt files for feature extraction? I assume you use the train_val prototxt?

Nikiforos Pittaras

Mar 18, 2015, 6:39:27 AM3/18/15
to caffe...@googlegroups.com
For example, using your code and the net definition from examples/feature_extraction, I am getting:

Exception: Input blob arguments do not match net inputs.

and by printing, I am getting:

kwargs:
['data']
set(self.inputs):
[]

Axel Angel

Mar 19, 2015, 9:53:13 AM3/19/15
to caffe...@googlegroups.com
That one is used only for training; you need to use the deploy one (in LeNet it's just "lenet.prototxt").
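
For reference, a deploy prototxt declares its input blob explicitly instead of using a Data layer, which is what forward_all needs. A minimal sketch of the input section (the dimensions below are an assumption for MNIST-sized, single-channel images; adjust them to your net):

input: "data"
input_dim: 64   # batch size
input_dim: 1    # channels
input_dim: 28   # height
input_dim: 28   # width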

wanzheng zhu

Mar 19, 2015, 8:59:49 PM3/19/15
to caffe...@googlegroups.com
Thank you guys very much for your help! @Axel Angel @Nikiforos Pittaras

wanzheng zhu

Mar 20, 2015, 7:28:35 AM3/20/15
to caffe...@googlegroups.com
Thanks, @Axel Angel!
I can run your script on the MNIST example, but when I use it on my own example I get the following error:

raise Exception('Input blob arguments do not match net inputs.')
Exception: Input blob arguments do not match net inputs.

Do you know what happened?
BTW, I think one reason may be that I resized all the pictures before training. How do I add resizing when using your script? Thank you a lot!

On Thursday, 19 March 2015 21:53:13 UTC+8, Axel Angel wrote:
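
Regarding the resize question above: one option is to resize each image inside the loop before the forward pass, e.g. with caffe.io.resize_image. A sketch, assuming the net expects 64x64 inputs (adjust to your deploy prototxt); note that datum_to_array returns C x H x W while resize_image expects H x W x C:

image = caffe.io.datum_to_array(datum).astype(np.float32)
image = image.transpose((1, 2, 0))              # C x H x W -> H x W x C
image = caffe.io.resize_image(image, (64, 64))  # resize to the net input size
image = image.transpose((2, 0, 1))              # back to C x H x W
out = net.forward_all(data=np.asarray([image]))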

Toru Hironaka

Aug 11, 2015, 2:32:33 PM8/11/15
to Caffe Users
I used your code with MNIST and it worked perfectly, but not with my own data. It seems to misclassify; it looks like it predicts only certain labels. Do you have any idea?

Harsh Wardhan

Oct 15, 2015, 12:57:03 AM10/15/15
to Caffe Users
Thanks, @Axel Angel. Your code was really helpful, but the accuracy I'm getting with CIFAR-10 Quick is around 55%, while it should be around 75%. I downloaded the dataset in binary format from this link, and then followed the standard training guidelines from the BVLC website. Any thoughts on this?

Fradaric Joseph

Feb 23, 2016, 10:20:33 AM2/23/16
to Caffe Users
Hi Harsh, have you solved your issue? I am stuck with the same problem; I am getting 54% for cifar_quick with this test setup.

Harsh Wardhan

Feb 23, 2016, 11:24:18 AM2/23/16
to Caffe Users
See my answer here.

Harsh Wardhan

Feb 23, 2016, 11:28:30 AM2/23/16
to Caffe Users
I forgot to explain one thing here. You guys are not subtracting the mean, which results in low accuracy. The code linked above takes care of that. Apart from this, there's nothing wrong with your approach.
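
For reference, the mean written by Caffe's compute_image_mean tool can be loaded from its mean.binaryproto output and subtracted before the forward pass. A minimal sketch (the file path is an assumption, and the mean's shape must match your images):

# Load the mean image once, before the loop.
blob = caffe.proto.caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:  # hypothetical path
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]  # C x H x W

# Inside the loop, subtract it before the forward pass.
image = caffe.io.datum_to_array(datum).astype(np.float32)
image -= mean
out = net.forward_all(data=np.asarray([image]))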

Guillem C

May 9, 2016, 11:21:51 AM5/9/16
to Caffe Users
It looks like your model is always predicting 0 as output, which may be caused by an imbalanced dataset. If your testing data is representative of your training data, your dataset is definitely skewed towards the 0 class, so the model may have learned to always predict 0, because that already keeps the error low.

Consider a batch of 100 samples: if 95% of them are the same class and the model always predicts that class, it gets 95% of the samples correct, so the error is very low. Try to balance your dataset, or balance the number of positive and negative samples in each batch.
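
A quick way to check for such an imbalance is to count the labels stored in the training LMDB, along the lines of the script above (the database path 'train_lmdb' is an assumption):

import lmdb
import caffe
from collections import Counter

counts = Counter()
with lmdb.open('train_lmdb', readonly=True) as env:
    with env.begin() as txn:
        for _, value in txn.cursor():
            datum = caffe.proto.caffe_pb2.Datum()
            datum.ParseFromString(value)
            counts[int(datum.label)] += 1
print(counts)  # a heavily skewed count indicates class imbalance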

On Wednesday, May 4, 2016 at 12:46:24 PM UTC+2, monjoy saha wrote:
@Axel Angel: I am using your code for the confusion matrix. Unfortunately I am getting errors like:

Error: 10.jpg, expected 1 but predicted 0

Accuracy: 96.5%
1378 out of 1428 were classified correctly

Confusion matrix:
(real , predicted) | count
(0 , 0) | 1378
(0 , 1) | 0
(1 , 0) | 50
(1 , 1) | 0

This means all true positive images are misclassified. Could you kindly suggest what the issue may be? The training and testing code work fine. Here the LMDB images are used directly; do I have to subtract the mean image, as I did during training? Please help.