Caffe Bug? Or my misunderstanding: Euclidean loss


David Freelan

May 7, 2016, 5:46:09 PM
to Caffe Users
I am using the Euclidean loss layer as the last layer of my network.
As shown here: http://caffe.berkeleyvision.org/tutorial/layers.html

To my understanding, this sums the squared error over each dimension of the output, then divides by twice the number of dimensions.
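Concretely, the docs give the loss as 1/(2N) * sum ||x1 - x2||^2, with N being the batch size. Here is a small NumPy sketch of that formula (the helper name is my own, not Caffe's):

```python
import numpy as np

def euclidean_loss(pred, label):
    """Caffe-style Euclidean loss: sum of squared differences over
    all elements, divided by 2 * batch size."""
    pred = np.asarray(pred, dtype=np.float32)
    label = np.asarray(label, dtype=np.float32)
    n = pred.shape[0]  # batch size (first axis)
    return float(np.sum((pred - label) ** 2) / (2 * n))

# single sample, single output dimension: reduces to (error^2) / 2
print(euclidean_loss([[0.3]], [[0.5]]))  # ≈ 0.02
```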


Here's my python program trying to confirm that:
import numpy as np
import h5py
import caffe

with h5py.File('/home/dfreelan/dev/caffe/datasets/HearthstoneTesting.h5','r') as hf:
    data = hf.get('data')
    labels = hf.get('label')
    np_data = np.array(data)
    np_labels = np.array(labels)
    caffe.set_mode_cpu()
    net = caffe.Net("singleLayerExample.prototxt","_iter_350000.caffemodel",caffe.TEST)
    for i in range (5,10):
        net.blobs['data']= np_data[i];
        net.blobs['label'] = np_labels[i]
        net.forward()
        print('label is' , labels[i])
        print ('euclid loss' ,+net.blobs["EuclidLoss"].data)
        error = net.blobs['tanhFinal'].data[0][0]-np_labels[i][0];
        error = (error*error)/2
        print('my calculated error: ' , error)

I print the label to show that my labels are one-dimensional, and you can see how I calculate the error: subtract expected from actual, square it, and divide by 2.

But the results I get look like this:
('label is', array([ 0.52552474], dtype=float32))
('euclid loss', 0.00089161232)
('my calculated error: ', 0.11557598412036896)
('label is', array([-0.38310024], dtype=float32))
('euclid loss', 0.10378566)
('my calculated error: ', 0.086627870798110962)
('label is', array([ 0.30064058], dtype=float32))
('euclid loss', 0.00017976735)
('my calculated error: ', 0.0015344701241701841)
('label is', array([ 0.45833334], dtype=float32))
('euclid loss', 0.0055524549)
('my calculated error: ', 0.082913808524608612)
('label is', array([ 0.125], dtype=float32))
('euclid loss', 6.3007174e-05)
('my calculated error: ', 0.141860231757164)



It's not just off by a constant factor; it's all over the place.


Am I doing something wrong? (I really hope so!)



mySetup.prototxt

David Freelan

May 7, 2016, 7:10:04 PM
to Caffe Users
I have started putting print statements inside the Euclidean loss layer.
What I've found so far is that the operation is being done correctly, but with the wrong layer input!

I have no idea where it's getting the invalid data from. It is reading correctly from the "tanhFinal" layer, but there is clearly something wrong with the "label" input -- it doesn't appear to be reading that blob at all. However, my prototxt looks fine to me.

In an effort to rule out a problem with my HDF5 file, I tried storing the labels as both 32-bit and 64-bit floating point; neither makes a difference.



My next debugging step may be to figure out where it's getting this incorrect blob from, so I can make it grab the right one, but the Blob object in blob.cpp has no good identifier -- no names like we see in the prototxt file.


Any assistance in debugging from here, or otherwise, would be appreciated.

Jan

May 9, 2016, 5:29:14 AM
to Caffe Users
Oh man, I just prepared a nice answer and it vanished into the Google nirvana. Whatever, here is the short version:

You need

       net.blobs['data'].data[...] = np_data[i]
       net.blobs['label'].data[...] = np_labels[i]


instead of

       net.blobs['data']= np_data[i];
       net.blobs['label'] = np_labels[i]

Also, you should be using a "deploy"-style definition of your network, or the net.forward() call will overwrite all the data you put into the blobs via the Python API (because the HDF5Data layers will simply read their own data and put it into those blobs). A "deploy" config is basically the same network, but with the data layers removed and replaced by an "Input" layer that declares the input blobs along with their dimensions.
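For example, a minimal pair of "Input" layer declarations could look like this (the blob shapes here are placeholders -- adapt them to your actual data dimensions):

```
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 1 dim: 30 } }
}
layer {
  name: "label"
  type: "Input"
  top: "label"
  input_param { shape: { dim: 1 dim: 1 } }
}
```

The rest of the network (your tanh layers, the loss layer) stays exactly as in your training prototxt.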

As a data type for anything fed through the network, you should always use np.float32.
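For instance, h5py will typically hand back float64 arrays if the file stores doubles, so a quick cast before filling the blobs avoids surprises (the array values here are just placeholders):

```python
import numpy as np

# Python floats default to float64 in NumPy, as do doubles read via h5py
labels = np.array([0.52552474, -0.38310024])
print(labels.dtype)    # float64

# cast once, up front, before copying into the net's blobs
labels32 = labels.astype(np.float32)
print(labels32.dtype)  # float32
```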

One last thing: the Euclidean loss layer computes 1/(2N) * sum ||x1 - x2||^2, where N is the batch size, so for a single sample with a one-dimensional output it reduces to exactly your (error^2)/2.

Jan