Replicating Digits results with Caffe Python interface


Geoffrey Ulman

May 11, 2015, 6:23:52 PM5/11/15
to digits...@googlegroups.com

I have a model trained with DIGITS and am trying to use it with the Caffe Python interface. However, similar to this issue (https://github.com/NVIDIA/DIGITS/issues/62), I'm seeing different results in DIGITS versus the Caffe Python API.

The relevant piece of my Python code is:

caffe.set_mode_gpu()
net = caffe.Classifier( MODEL_DEFINITION,
                        MODEL_WEIGHTS,
                        mean=mean_pixel,
                        raw_scale=255)
input_images = [ caffe.io.load_image( TEST_IMAGE, color=False ) ]
predictions = net.predict( input_images, oversample=False )
 
My training / test images are 256x256 greyscale images.

Some questions:

  1. I didn't see any way to determine exactly what mean pixel value DIGITS was using, so I simply downloaded the mean image it displays and calculated its mean value. It appears that the mean value needs to be on 0-255, not 0-1, but that was one thing I wasn't positive about.

  2. This tutorial indicates that caffe.Classifier.predict does "10 predictions, cropping the center and corners of the image as well as their mirrored versions, and average over the predictions". Is this what Digits is doing as well to generate its classification?

  3. Are there any other obvious/known gotchas when trying to replicate Digits results in Caffe that I might be running into? Is there a tutorial or good example code somewhere?

Thanks again for any help,

Geoff

Geoffrey Ulman

May 11, 2015, 6:43:17 PM5/11/15
to digits...@googlegroups.com

I'm currently looking in digits/model/tasks/caffe_train.py, as that appears to be where the results reported by the "Test one image" button are calculated, but I haven't yet found anything I'm doing glaringly differently.

Allison Gray

May 11, 2015, 6:48:59 PM5/11/15
to digits...@googlegroups.com, geoff...@gmail.com
Hi Geoffrey, 

I have not had this problem before, although I have not tried to use the mean image displayed in the GUI. Have you tried taking the mean.binaryproto, converting it to a *.npy file, and using that for your mean? It can be found in your jobs directory, which was set up the first time you ran DIGITS. Mine is located in /home/ubuntu/.digits/jobs. You can access all of the files that DIGITS creates for its jobs there.
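For reference, the conversion only takes a few lines with caffe's Python bindings (caffe.proto.caffe_pb2.BlobProto plus caffe.io.blobproto_to_array). Something like the sketch below should work when caffe is installed; the 104.0 value at the bottom is just a made-up placeholder to show that a single mean pixel is the average of the mean image, kept on the 0-255 scale:

```python
import numpy as np

def binaryproto_to_npy(binaryproto_path, npy_path):
    """Convert a DIGITS mean.binaryproto into a .npy file (needs caffe)."""
    # caffe is imported lazily so the demo below runs without it installed
    from caffe.proto import caffe_pb2
    import caffe.io

    blob = caffe_pb2.BlobProto()
    with open(binaryproto_path, 'rb') as f:
        blob.ParseFromString(f.read())
    # blobproto_to_array returns shape (N, C, H, W); drop the batch axis
    mean = caffe.io.blobproto_to_array(blob)[0]
    np.save(npy_path, mean)
    return mean

# A single mean *pixel* is just the average of the mean image, on the
# 0-255 scale. Demonstrated on a synthetic grayscale mean image:
fake_mean = np.full((1, 256, 256), 104.0)  # (C, H, W), hypothetical values
mean_pixel = fake_mean.mean(axis=(1, 2))   # one value per channel
print(mean_pixel)  # [104.]
```

With a real job it would be something like binaryproto_to_npy() pointed at the mean.binaryproto inside your job's directory under .digits/jobs.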

Let me know if this helps.

Geoffrey Ulman

May 11, 2015, 7:20:40 PM5/11/15
to digits...@googlegroups.com, geoff...@gmail.com
Whoops! Looks like it was a silly oversight on my end. I was providing input images with pixel values on 0-255, while caffe was expecting to receive images on 0-1 and then scale them to 0-255 itself (since I had set raw_scale=255).


I did take another look at mean.binaryproto since you mentioned it (I'd seen it originally, but dealing with the protocol buffer format scared me away at first). For those who find this thread later, there's a good example of reading the file in the last comment here: https://github.com/BVLC/caffe/issues/290.

As it turns out (and as one would hope), the resulting data is basically the same whether you download the jpg from the DIGITS webpage for the dataset or read mean.binaryproto.

Geoffrey Ulman

May 11, 2015, 7:28:01 PM5/11/15
to digits...@googlegroups.com

As an aside, I do think it would be incredibly useful to have a simple tutorial demonstrating how to take the various model files generated by DIGITS and apply them to classify an image in Python code. Or maybe something already exists and I missed it?

It just seems like the logical next step for the many people whose first experience with DNN training is DIGITS. Eventually they'll want to take their model and use it in some code.

Luke Yeager

May 11, 2015, 7:41:42 PM5/11/15
to digits...@googlegroups.com, geoff...@gmail.com
> I'm seeing different results in DIGITS versus the Caffe Python API.
How much different? Which one seems to be working better?

> I didn't see any way to determine exactly what mean pixel value DIGITS was using, so I simply downloaded the mean image it displays and calculated its mean value. It appears that the mean value needs to be on 0-255, not 0-1, but that was one thing I wasn't positive about.
It should be in the [0-255] range, not 0-1. 
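The reason is the order of operations: pycaffe's preprocessing multiplies by raw_scale first and subtracts the mean afterwards, so the mean has to be on the same 0-255 scale as the rescaled image. A quick sketch in plain Python (104 is just a hypothetical mean value):

```python
raw_scale = 255.0
mean_pixel = 104.0  # hypothetical mean value, on the 0-255 scale

# caffe.io.load_image returns floats in [0, 1]
pixel = 0.5

# pycaffe rescales first, then subtracts the mean
scaled = pixel * raw_scale      # now on [0, 255]
centered = scaled - mean_pixel  # correctly centered
print(centered)  # 23.5
```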

NOTE: if you're using the default LeNet model, there's a bug in the models provided with DIGITS 1.0 that was fixed in this commit. I assume that's not your problem since you're working with 256x256 images.

> This tutorial indicates that caffe.Classifier.predict does "10 predictions, cropping the center and corners of the image as well as their mirrored versions, and average over the predictions". Is this what DIGITS is doing as well to generate its classification?
DIGITS doesn't use oversampling. That's going to give you different results.

> Are there any other obvious/known gotchas when trying to replicate DIGITS results in Caffe that I might be running into? Is there a tutorial or good example code somewhere?
We don't have a tutorial in DIGITS. 

Allison Gray

May 11, 2015, 7:44:29 PM5/11/15
to digits...@googlegroups.com, geoff...@gmail.com
That is a good idea. It wouldn't take long to do either. We can look into it and get something posted.



Geoffrey Ulman

May 11, 2015, 9:11:59 PM5/11/15
to digits...@googlegroups.com, geoff...@gmail.com

>> I'm seeing different results in DIGITS versus the Caffe Python API.
> How much different? Which one seems to be working better?

The issue was the input image data (I was giving caffe images with 0-255 pixel values instead of 0-1, then telling it to use raw_scale=255).

Now I'm getting identical values in DIGITS and python/caffe (at least to the precision that the DIGITS web interface reports).
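In numbers, the mistake looks like this (the pixel and mean values below are made up for illustration):

```python
raw_scale = 255.0
mean_pixel = 104.0  # hypothetical mean, on the 0-255 scale

correct = 0.5   # what caffe.io.load_image produces: floats in [0, 1]
wrong = 127.5   # what I was feeding: the same pixel already on 0-255

# caffe applies raw_scale either way, then subtracts the mean
print(correct * raw_scale - mean_pixel)  # 23.5, as intended
print(wrong * raw_scale - mean_pixel)    # 32408.5, wildly out of range
```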

Stephane

Sep 2, 2015, 9:30:26 AM9/2/15
to DIGITS Users, geoff...@gmail.com
Hello,
I'm facing a quite similar problem and have no idea how to solve it. I'm having difficulty understanding how you solved yours.

I trained networks based on LeNet and AlexNet with DIGITS. They give good results when I classify new images.
Then, when I train the same networks on the same images with caffe alone (using the caffe train command), the results are bad, with no convergence at all: random guesses or NaN losses. The results are the same with caffe from the NVIDIA branch or the BVLC branch.
The difference is that I use the C++ interface.

When you say "The issue was the input image data (I was giving caffe images with 0-255 pixel values instead of 0-1, then telling it to use raw_scale=255)":
Where is this located in caffe? I can't find where in the code it is done.
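My best guess at what the Python interface does, mimicked in plain numpy below (the function name and all values are hypothetical, so please correct me if this is wrong):

```python
import numpy as np

def preprocess(img_01, raw_scale=255.0, mean_pixel=104.0):
    """My guess at pycaffe's preprocessing chain, for a grayscale image.

    img_01: H x W x C float image in [0, 1], as caffe.io.load_image gives.
    """
    data = img_01 * raw_scale       # 1. rescale to [0, 255]
    data = data - mean_pixel        # 2. subtract the mean
    return data.transpose(2, 0, 1)  # 3. H x W x C -> C x H x W

img = np.full((256, 256, 1), 0.5, dtype=np.float32)  # fake grayscale image
out = preprocess(img)
print(out.shape)     # (1, 256, 256)
print(out[0, 0, 0])  # 23.5
```

If that is right, I still don't see where the C++ caffe train path does the equivalent.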

Is there a simple procedure to format data in the exact same way for caffe and digits and to obtain the same results?

Thanks in advance for any help.
Regards,
Stephane