caffe test: each iteration loads the same image


Leonid Berov

Jan 25, 2016, 11:49:19 AM
to Caffe Users
I finetuned a network and want to test it on my test set using the command line interface.
However, on each iteration the batch seems to load the same image.

The log:
build/tools/caffe test -model models/somenet/train_val.prototxt -weights models/somenet/googlenet_1epoch.caffemodel -gpu 0 -iterations 10
...
I0125 17:16:27.605752  5047 caffe.cpp:263] Batch 0, loss1/loss = 0.0242291
I0125 17:16:27.605821  5047 caffe.cpp:263] Batch 0, loss2/loss1 = 4.37631e-07
I0125 17:16:27.605839  5047 caffe.cpp:263] Batch 0, loss3/loss1 = 3.36683
I0125 17:16:27.704399  5047 caffe.cpp:263] Batch 1, loss1/loss = 0.0242291
I0125 17:16:27.704460  5047 caffe.cpp:263] Batch 1, loss2/loss1 = 4.37631e-07
I0125 17:16:27.704514  5047 caffe.cpp:263] Batch 1, loss3/loss1 = 3.36683
I0125 17:16:27.803939  5047 caffe.cpp:263] Batch 2, loss1/loss = 0.0242291
I0125 17:16:27.804002  5047 caffe.cpp:263] Batch 2, loss2/loss1 = 4.37631e-07
I0125 17:16:27.804018  5047 caffe.cpp:263] Batch 2, loss3/loss1 = 3.36683
I0125 17:16:27.903465  5047 caffe.cpp:263] Batch 3, loss1/loss = 0.0242291
I0125 17:16:27.903524  5047 caffe.cpp:263] Batch 3, loss2/loss1 = 4.37631e-07
I0125 17:16:27.903540  5047 caffe.cpp:263] Batch 3, loss3/loss1 = 3.36683
...

Using the python interface I found out that this is actually the result that my net produces for the first image in my test file. 
The batch size for testing is set to 1, so at least that number makes sense.
What I do not understand is why each iteration uses the same image for every batch instead of, well, iterating over them.

My source file certainly has more than one line:
/some/place/154544_3-92700729927.jpg 3.92700729927
/some/place/231678_3-93203883495.jpg 3.93203883495
/some/place/129610_4-12605042017.jpg 4.12605042017
...


So how would I get build/tools/caffe test to iterate over my test data (while still using a batch size of 1)?

Jan C Peters

Jan 26, 2016, 5:04:11 AM
to Caffe Users
Actually it should iterate over your test data. It could be that the loss values are indeed the same for all your images due to a bad weight combination. Use the python interface to make sure whether that is the case (i.e. also feed the second and the third image through your net and see what comes out).
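
A quick way to do that (a rough sketch, assuming you have a deploy.prototxt for your net next to the weights and the usual GoogLeNet-style preprocessing; the paths are placeholders):

import numpy as np
import caffe

net = caffe.Net('models/somenet/deploy.prototxt',  # hypothetical deploy file
                'models/somenet/googlenet_1epoch.caffemodel',
                caffe.TEST)

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
transformer.set_mean('data', np.array((104, 117, 123)))

# feed the first few images through one by one and compare the raw outputs
for path in ['/some/place/154544_3-92700729927.jpg',
             '/some/place/231678_3-93203883495.jpg']:
    net.blobs['data'].data[0] = transformer.preprocess('data',
                                                       caffe.io.load_image(path))
    print(path, net.forward())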

Jan

Leonid Berov

Jan 26, 2016, 12:46:40 PM
to Caffe Users
So I used the deploy net in python to make sure the predictions change for the first test images, and they do.

Of course I have no access to the loss layers in deploy mode, so I started playing with the SGDSolver and train_val.prototxt in Python. Here things get really strange.

1. I run the test_net for 10 iterations:
As with the command line tool, the loss stays the same. I check the pictures that are loaded into the input layer - they are all the same, the first image of my training set.
What freaks me out: the loss is constant, but significantly different from the one I get on the command line (as reported above)! Oo

loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271

loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271

loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271

2. So I get nervous and see what happens if I run the train net from the Python wrapper for 10 iterations.
The reported loss changes from iteration to iteration, which is also the case with the command line. However, when I check the pictures in the input layer... they are again all the same.
That means the reported changes in loss are due to the changing net parameters, at least in Python.
As was already the case with testing, the loss reported by the Python wrapper differs from what I get from the command line. So I have no idea if the input data ever changed during my training runs.

This is really confusing to me!
Let's see:
 1. My test and train files contain lines of the form <img path> <float value>. Each line is terminated by \n.
 2. For visualising the content of the input layer in python I use:

import numpy as np
import matplotlib.pyplot as plt
import caffe

solver = caffe.SGDSolver('models/somenet/solver.prototxt')
solver.net.copy_from('models/somenet/googlenet_1epoch.caffemodel')
net = solver.test_nets[0]

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))  # transforms from (h,w,c) to (c,h,w)
transformer.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2,1,0))  # the reference model has channels in BGR order instead of RGB
transformer.set_mean('data', np.array((104, 117, 123)))

forward_pics = []
for i in range(10):
    net.forward()
    forward_pics.append(transformer.deprocess('data', net.blobs['data'].data[0]))

# visualising the captured data:
plt.imshow(transformer.deprocess('data', net.blobs['data'].data[0]))

3. I made sure that I use the same caffemodel for the command line and the python wrapper. The solver.prototxt I use for solver instantiation in python references the same train_val.prototxt I use for the command line.
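
By the way, for the train net I do the analogous check; just step the solver once and look at its data blob (a sketch, reusing the transformer from above and assuming the train net's data blob has the same shape as the test net's):

solver.step(1)  # runs one training iteration on solver.net
plt.imshow(transformer.deprocess('data', solver.net.blobs['data'].data[0]))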


Any idea what I am seeing here?

Leonid Berov

Jan 26, 2016, 3:23:56 PM
to Caffe Users
I found the problem. The ImageData layer only accepts input files of the form "<image path> <int>\n".
If it gets a float label it silently fails and returns the dataset as it stands at that moment, in my case consisting of 1 image and its int-truncated label.

In order to work with float labels it is apparently suggested to use HDF5 input layers. I just used the code from the following PR, works like a charm.
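
For reference, building the HDF5 input with h5py is simple enough. A minimal sketch of the idea (the file names and the 224x224 input size are placeholders for my setup; note that, unlike ImageData, the HDF5 data layer does no preprocessing itself, so scaling, channel order and mean subtraction have to be baked into the file):

import h5py
import numpy as np
import caffe

# read the "<image path> <float>" lines of the original source file
paths, labels = [], []
with open('test_source.txt') as f:  # hypothetical file name
    for line in f:
        path, value = line.strip().rsplit(' ', 1)
        paths.append(path)
        labels.append(float(value))

# load the images into an N x C x H x W float32 array, with the same
# preprocessing the net expects (BGR, [0,255], mean-subtracted)
data = np.zeros((len(paths), 3, 224, 224), dtype=np.float32)
for i, path in enumerate(paths):
    img = caffe.io.load_image(path)               # H x W x C, RGB, [0,1]
    img = caffe.io.resize_image(img, (224, 224))
    data[i] = img.transpose(2, 0, 1)[::-1] * 255  # C x H x W, BGR, [0,255]
data -= np.array((104, 117, 123), dtype=np.float32).reshape(1, 3, 1, 1)

# dataset names must match the top blob names of the HDF5Data layer
with h5py.File('test_data.h5', 'w') as f:
    f['data'] = data
    f['label'] = np.array(labels, dtype=np.float32)

# the layer's "source" parameter points to a text file listing .h5 files
with open('test_h5_list.txt', 'w') as f:
    f.write('test_data.h5\n')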

Jan C Peters

Jan 27, 2016, 3:12:09 AM
to Caffe Users
Huh, ok. Yeah, I guess the code from that PR is reasonable and does not mess up anything important. But keep in mind that for regression problems you should use an appropriate loss function, like the Euclidean loss.

Jan

p.Paul

Feb 16, 2017, 9:02:34 AM
to Caffe Users
How did you check the images and labels loaded into the net while it was training? You said 'I check the pictures that are loaded in the input layer'.

Leonid Berov

Feb 16, 2017, 9:30:56 AM
to Caffe Users
The code I used for visualizing is posted in the lower part of the post you refer to; it should be executable and self explanatory. :)
This obviously only works for the python wrapper, not for the command line tools.

p.Paul

Feb 16, 2017, 10:04:51 AM
to Caffe Users
Thank you very much!