Hey, I've been trying to figure this out for quite a while, but there are too many weird things about using HDF5 multi-label data for training. Right now I'm just trying to test my old convolutional network architecture with HDF5, replacing the single input label with 37 floats. Once that works, I want to move on to the actual task I planned for it: bounding-box regression.
My current dataset is built like this:
out_filename = 'train';
fileIDtrain = fopen([out_filename '.txt']);
nNumberCols = 37;
format = ['%s' repmat(' %f', [1 nNumberCols])];
A = textscan(fileIDtrain,format);
fclose(fileIDtrain); % release the file handle once the list is read
ind = randperm(numel(A{1}));
A_shuffled = cellfun(@(x) x(ind), A, 'UniformOutput', 0); %in case hdf5 shuffle doesn't work
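n_files = numel(A_shuffled{1});
labels = zeros(nNumberCols, n_files, 'single');
for k = 1:n_files
if mod(k, 5000) == 0
fprintf('%d images read...\n', k); % progress message every 5000 images
end
filename = strcat([out_filename '/'], A_shuffled{1}{k});
I = single(mat2gray(imread(filename))); % rescale each image to [0, 1]
if k == 1
% preallocate once; growing data with cat() copies the whole array on every iteration
data = zeros([size(I, 1) size(I, 2) size(I, 3) n_files], 'single');
end
data(:, :, :, k) = I;
for l = 1:nNumberCols
labels(l, k) = A_shuffled{1 + l}(k); % column 1 is the file name, columns 2..38 are the labels
end
end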
if exist([out_filename '.h5'], 'file')
fprintf('Warning: replacing existing file %s \n', [out_filename '.h5']);
delete([out_filename '.h5']);
end
dat_dims=size(data);
lab_dims=size(labels);
num_samples=dat_dims(end);
assert(lab_dims(end)==num_samples, 'Number of samples should be matched between data and labels');
h5create([out_filename '.h5'], '/data', [dat_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [dat_dims(1:end-1) 1000]); % width, height, channels, number
h5create([out_filename '.h5'], '/label', [lab_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [lab_dims(1:end-1) 1000]); % labels per sample, number
startloc.data=[1 1 1 1];
startloc.lab=[1 1];
h5write([out_filename '.h5'], '/data', single(data), startloc.data, size(data));
h5write([out_filename '.h5'], '/label', single(labels), startloc.lab, size(labels));
and the same for "test".
dat_dims == [40 26 1 49950] //40x26 grayscale x numSamples images
lab_dims == [37 49950] //37 single x numSamples
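One thing worth checking at this point is the dimension order. MATLAB is column-major and HDF5 is row-major, so a row-major reader such as Caffe sees the dimensions reversed relative to what MATLAB reports. The sketch below writes a tiny dataset the same way as above and prints both orders; the scratch file name and the sample count of 8 are made up for illustration.

```matlab
% Sketch: write a small dataset like the one above, then inspect its shape.
% 'shape_check.h5' is a hypothetical scratch file, not part of the real dataset.
tmp_file = fullfile(tempdir, 'shape_check.h5');
if exist(tmp_file, 'file')
    delete(tmp_file);
end

dims = [40 26 1 8];  % same layout as dat_dims, but only 8 samples
h5create(tmp_file, '/data', [dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [dims(1:end-1) 4]);
h5write(tmp_file, '/data', zeros(dims, 'single'), [1 1 1 1], dims);

info = h5info(tmp_file, '/data');
matlab_order = info.Dataspace.Size;   % order as MATLAB reports it
caffe_order = fliplr(matlab_order);   % order as a row-major reader sees it

fprintf('MATLAB order: %s\n', mat2str(matlab_order));
fprintf('Caffe order:  %s\n', mat2str(caffe_order));
delete(tmp_file);
```

Applied to the real file, the reversed order would be 49950 x 1 x 26 x 40 for data and 49950 x 37 for label, which is consistent with the 1 x 1 x 26 x 40 input used in the deploy attempt under point 4.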
Caffe seems to accept it, but:
1) With a softmax loss layer at the end, it says:
I0630 11:32:52.984699 3244 layer_factory.hpp:74] Creating layer ip3
I0630 11:32:52.985677 3244 net.cpp:90] Creating Layer ip3
I0630 11:32:52.985677 3244 net.cpp:410] ip3 <- ip2
I0630 11:32:52.986654 3244 net.cpp:368] ip3 -> ip3
I0630 11:32:52.986654 3244 net.cpp:120] Setting up ip3
I0630 11:32:52.987629 3244 net.cpp:127] Top shape: 600 37 (22200)
I0630 11:32:52.989583 3244 layer_factory.hpp:74] Creating layer loss
I0630 11:32:52.989583 3244 net.cpp:90] Creating Layer loss
I0630 11:32:52.990559 3244 net.cpp:410] loss <- ip3
I0630 11:32:52.990559 3244 net.cpp:410] loss <- label
I0630 11:32:52.990559 3244 net.cpp:368] loss -> loss
I0630 11:32:52.991536 3244 net.cpp:120] Setting up loss
I0630 11:32:52.991536 3244 layer_factory.hpp:74] Creating layer loss
F0630 11:32:52.992512 3244 softmax_loss_layer.cpp:42] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (600 vs. 22200) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
600 is the batch size.
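The two numbers in the failed check can be reproduced from the shapes in the log. This is only my reading of the error message, written out as arithmetic: the layer expects one integer class label per sample, while the HDF5 file supplies 37 floats per sample. The variable names are made up for illustration.

```matlab
% Shapes taken from the log above.
batch_size = 600;
num_outputs = 37;
pred_shape = [batch_size num_outputs];    % top shape of ip3: 600 37
label_shape = [batch_size num_outputs];   % 37 floats per sample from the HDF5 file

% With softmax axis == 1, the check expects one label per sample:
outer_num = pred_shape(1);                % everything before the class axis
inner_num = prod(pred_shape(3:end));      % everything after it (empty here, so 1)
expected_labels = outer_num * inner_num;  % left-hand side of the check
actual_labels = prod(label_shape);        % bottom[1]->count()

fprintf('expected %d labels, got %d\n', expected_labels, actual_labels);
```

This gives 600 vs. 22200, the same pair as in the log. With a batch of 100, the same arithmetic gives the 100 vs. 3700 reported by the accuracy layer under point 3.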
2) I have to use EuclideanLoss instead, which is a pain to train. It gets stuck at a loss of 0.5 unless I set a crazy learning rate like 0.1. Even then the results are still questionable (around 0.07 loss, and I can't test it, see point 4).
3) I cannot use an accuracy layer as the hdf5_classification example suggests. I get:
F0630 11:39:55.590533 5180 accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (100 vs. 3700) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
4) Even if I train a net with only EuclideanLoss at the end and a super-high learning rate, I can't deploy it by removing EuclideanLoss and replacing the HDF5 input with:
name: "AZFACE"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 26
input_dim: 40
It says:
F0630 11:46:20.542387 4464 blob.cpp:455] Check failed: ShapeEquals(proto) shape mismatch (reshape not set)
But if I don't remove EuclideanLoss, I get:
F0630 11:48:53.482650 9416 insert_splits.cpp:35] Unknown blob input label to layer 1
That is the only part that makes sense. I don't think I need EuclideanLoss while predicting, but how do I remove it painlessly?
I really hope you guys can explain all this, or point me to a consistent explanation of how this is supposed to be done.
My current train proto is attached.