In the process of debugging the training of a model, I created a super-simple model with just an HDF5Data layer:
name: 'train'
layer {
  name: 'data'
  type: 'HDF5Data'
  top: 'cls-label'
  hdf5_data_param {
    source: './seg-mnist-dataset/seg-mnist-2x2-all-trn-classification-hdf5.txt'
    batch_size: 1
    shuffle: true
  }
}
The source file "seg-mnist-2x2-all-trn-classification-hdf5.txt" lists two HDF5 files, one example each (kept this small for debugging purposes). Each file contains a "data" and a "cls-label" array / blob:
seg-mnist-2x2-all/hdf5/trn/0/seg-mnist-2x2-all_trn_0000000.cls.hdf5
seg-mnist-2x2-all/hdf5/trn/0/seg-mnist-2x2-all_trn_0000001.cls.hdf5
The shapes are 1x1x32x32 for "data" and 1x10x1x1 for "cls-label". I initially tried to use the shapes 1x32x32 and 10x1x1, but apparently the first dimension must be equal for both.
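The shape constraint can be illustrated with a small sketch. It assumes, based only on the behaviour described above, that the HDF5Data layer treats axis 0 of every dataset as the number of examples. The function name `first_dims_agree` is made up for this illustration and is not part of Caffe.

```python
def first_dims_agree(shapes):
    """Return True if every dataset has the same size along axis 0."""
    counts = {shape[0] for shape in shapes.values()}
    return len(counts) == 1


# Shapes I tried first: axis 0 is 1 for "data" but 10 for "cls-label".
rejected = {"data": (1, 32, 32), "cls-label": (10, 1, 1)}
# Shapes that load: both datasets hold exactly one example along axis 0.
accepted = {"data": (1, 1, 32, 32), "cls-label": (1, 10, 1, 1)}

print(first_dims_agree(rejected))
print(first_dims_agree(accepted))
```

With the leading axis of size 1 added, both arrays agree on the number of examples, which would explain why only the second pair of shapes was accepted.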
Unexpectedly, when I run the training, the average of the two label vectors appears to be used as the input (see the test net output below):
I1108 15:49:53.744596 37395 solver.cpp:337] Iteration 0, Testing net (#0)
I1108 15:49:53.774808 37395 solver.cpp:404] Test net output #0: cls-label = 0.5
I1108 15:49:53.774857 37395 solver.cpp:404] Test net output #1: cls-label = 1
I1108 15:49:53.774866 37395 solver.cpp:404] Test net output #2: cls-label = 0.5
I1108 15:49:53.774874 37395 solver.cpp:404] Test net output #3: cls-label = 0.5
I1108 15:49:53.774881 37395 solver.cpp:404] Test net output #4: cls-label = 0.5
I1108 15:49:53.774888 37395 solver.cpp:404] Test net output #5: cls-label = 0.5
I1108 15:49:53.774921 37395 solver.cpp:404] Test net output #6: cls-label = 0
I1108 15:49:53.774930 37395 solver.cpp:404] Test net output #7: cls-label = 0
I1108 15:49:53.774937 37395 solver.cpp:404] Test net output #8: cls-label = 0
I1108 15:49:53.774945 37395 solver.cpp:404] Test net output #9: cls-label = 0.5
I1108 15:49:53.775005 37395 solver.cpp:228] Iteration 0, loss = 0
I1108 15:49:53.775049 37395 solver.cpp:244] Train net output #0: cls-label = 1
I1108 15:49:53.775068 37395 solver.cpp:244] Train net output #1: cls-label = 1
I1108 15:49:53.775076 37395 solver.cpp:244] Train net output #2: cls-label = 0
I1108 15:49:53.775084 37395 solver.cpp:244] Train net output #3: cls-label = 0
I1108 15:49:53.775090 37395 solver.cpp:244] Train net output #4: cls-label = 1
I1108 15:49:53.775097 37395 solver.cpp:244] Train net output #5: cls-label = 1
I1108 15:49:53.775106 37395 solver.cpp:244] Train net output #6: cls-label = 0
I1108 15:49:53.775113 37395 solver.cpp:244] Train net output #7: cls-label = 0
I1108 15:49:53.775120 37395 solver.cpp:244] Train net output #8: cls-label = 0
I1108 15:49:53.775127 37395 solver.cpp:244] Train net output #9: cls-label = 0
I1108 15:49:53.775135 37395 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I1108 15:49:53.804088 37395 solver.cpp:228] Iteration 100, loss = 0
I1108 15:49:53.804152 37395 solver.cpp:244] Train net output #0: cls-label = 0
I1108 15:49:53.804164 37395 solver.cpp:244] Train net output #1: cls-label = 1
I1108 15:49:53.804172 37395 solver.cpp:244] Train net output #2: cls-label = 1
I1108 15:49:53.804180 37395 solver.cpp:244] Train net output #3: cls-label = 1
I1108 15:49:53.804188 37395 solver.cpp:244] Train net output #4: cls-label = 0
I1108 15:49:53.804194 37395 solver.cpp:244] Train net output #5: cls-label = 0
I1108 15:49:53.804201 37395 solver.cpp:244] Train net output #6: cls-label = 0
I1108 15:49:53.804209 37395 solver.cpp:244] Train net output #7: cls-label = 0
I1108 15:49:53.804216 37395 solver.cpp:244] Train net output #8: cls-label = 0
I1108 15:49:53.804224 37395 solver.cpp:244] Train net output #9: cls-label = 1
I1108 15:49:53.804231 37395 sgd_solver.cpp:106] Iteration 100, lr = 0.001
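To make the observation concrete, here is a minimal check of the numbers in the log. It assumes that the two "Train net output" vectors printed at iterations 0 and 100 are the two examples in the dataset; the variable names are only for illustration.

```python
# Label vectors copied from the "Train net output" lines in the log above.
train_iter_0 = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
train_iter_100 = [0, 1, 1, 1, 0, 0, 0, 0, 0, 1]
# Values copied from the "Test net output" lines at iteration 0.
test_output = [0.5, 1, 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0.5]

# Element-wise mean of the two train vectors.
mean = [(a + b) / 2 for a, b in zip(train_iter_0, train_iter_100)]

# True if the test output equals the mean of the two label vectors.
matches = mean == test_output
print(matches)
```

The element-wise mean of the two train vectors is identical to the reported test net output, which is what made me think the two examples are being averaged.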
Any ideas on what is going on or how to fix this?
Thanks!
Jonathan