So I used the deploy net in Python to make sure the predictions change for the first few test images, and they do.
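The check looked roughly like this (a minimal sketch; the deploy.prototxt path and the image file names are placeholders, the mean/scale settings are the ones I use further down):
import numpy as np
import caffe

net = caffe.Net('models/somenet/deploy.prototxt',             # assumed path to my deploy definition
                'models/somenet/googlenet_1epoch.caffemodel',
                caffe.TEST)

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))
transformer.set_mean('data', np.array((104, 117, 123)))

# feed two different test images and compare the outputs
for path in ['test_img_0.jpg', 'test_img_1.jpg']:             # placeholder file names
    img = caffe.io.load_image(path)                           # RGB float image in [0, 1]
    net.blobs['data'].data[0] = transformer.preprocess('data', img)
    out = net.forward()
    print('%s -> %s' % (path, out))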
Of course I have no access to the loss layers in deploy mode, so I started playing with the SGDSolver and train_val.prototxt in Python. This is where things get really strange.
1. I run the test_net for 10 iterations:
As with the command-line tool, the loss stays the same. I check the pictures that are loaded into the input layer - they are all the same, namely the first image of my training set.
What freaks me out: the loss is constant, but significantly different from the one I get on the command line (as reported above)!
loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271
loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271
loss1/loss 1.04371011257
loss2/loss1 4.22122335434
loss3/loss1 2.7532222271
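For reference, this is roughly how I read those values (a sketch; the loss blob names are taken from the log above):
import caffe

solver = caffe.SGDSolver('models/somenet/solver.prototxt')
solver.net.copy_from('models/somenet/googlenet_1epoch.caffemodel')
test_net = solver.test_nets[0]

for i in range(10):
    test_net.forward()
    # loss blob names as they appear in the log output
    for name in ['loss1/loss', 'loss2/loss1', 'loss3/loss1']:
        print('%s %s' % (name, float(test_net.blobs[name].data)))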
2. So I get nervous and check what happens if I run the train net from the Python wrapper for 10 iterations. The reported loss changes from iteration to iteration, which is also the case with the command line. However, when I check the pictures in the input layer... they are again all the same.
That means the reported changes in loss are due to the changing net parameters, at least in Python.
As with the test run, the loss reported by the Python wrapper differs from what I get from the command line. So I have no idea whether the input data ever changed during my training runs.
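A minimal sketch of that training run (assuming the loss blob name loss3/loss1 from the log; solver setup as above):
import numpy as np
import caffe

solver = caffe.SGDSolver('models/somenet/solver.prototxt')
solver.net.copy_from('models/somenet/googlenet_1epoch.caffemodel')

prev_batch = None
for i in range(10):
    solver.step(1)                                        # one forward/backward/update pass
    loss = float(solver.net.blobs['loss3/loss1'].data)    # blob name as in the log
    batch = solver.net.blobs['data'].data.copy()          # input batch used in this iteration
    same = prev_batch is not None and np.array_equal(batch, prev_batch)
    print('iter %d  loss %.6f  same input as previous iter: %s' % (i, loss, same))
    prev_batch = batch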
This is really confusing to me!
Let's see:
1. My test and train files contain lines of the form <img path> <float value>, one entry per line (separated by \n).
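For illustration, a few lines from such a file would look like this (paths and values are made up):
/path/to/images/img_00001.jpg 0.73
/path/to/images/img_00002.jpg 1.25
/path/to/images/img_00003.jpg 0.04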
2. For visualising the content of the input layer in Python I use (a slightly shortened version):
import numpy as np
import matplotlib.pyplot as plt
import caffe

solver = caffe.SGDSolver('models/somenet/solver.prototxt')
solver.net.copy_from('models/somenet/googlenet_1epoch.caffemodel')
net = solver.test_nets[0]

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))        # transforms from (h, w, c) to (c, h, w)
transformer.set_raw_scale('data', 255)              # the reference model operates on images in [0, 255] range instead of [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))     # the reference model has channels in BGR order instead of RGB
transformer.set_mean('data', np.array((104, 117, 123)))

forward_pics = []
for i in range(10):
    net.forward()
    forward_pics.append(transformer.deprocess('data', net.blobs['data'].data[0]))

# visualising the captured data (first image of the last batch):
plt.imshow(transformer.deprocess('data', net.blobs['data'].data[0]))
plt.show()
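To verify that the captured frames really are identical I then compare them (continuing directly from the loop above, which fills forward_pics):
for i in range(1, len(forward_pics)):
    print('frame %d identical to frame 0: %s'
          % (i, np.array_equal(forward_pics[i], forward_pics[0])))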
3. I made sure that I use the same caffemodel for the command line and the Python wrapper. The solver.prototxt I use to instantiate the solver in Python references the same train_val.prototxt I use on the command line.
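For reference, the relevant part of my solver.prototxt looks roughly like this (all values except the net path are placeholders):
net: "models/somenet/train_val.prototxt"
test_iter: 100        # placeholder
test_interval: 1000   # placeholder
base_lr: 0.01         # placeholder
solver_mode: GPU      # placeholder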
Any idea what I am seeing here?