HDF5 woes compilation: how functional is it?


Белый Охотник


Jun 30, 2015, 1:54:52 AM

to caffe...@googlegroups.com

Hey, I've been trying to figure this out for quite a while, but there are too many weird things about using HDF5 multi-label data for training. What I'm trying to achieve right now is just to test my old convolutional network architecture with HDF5, replacing the single input label with 37 floats. Once that works, I want to move on to the actual task I planned for it: bounding-box regression.

My current dataset is built like this:

out_filename = 'train';
fileIDtrain = fopen([out_filename '.txt']);


nNumberCols = 37;
format = ['%s' repmat(' %f', [1 nNumberCols])];
A = textscan(fileIDtrain,format);
fclose(fileIDtrain); % close the list file once it has been read


ind = randperm(numel(A{1}));
A_shuffled = cellfun(@(x) x(ind), A, 'UniformOutput', 0); %in case hdf5 shuffle doesn't work


data = single([]);
labels = single(zeros(nNumberCols, length(A_shuffled{1})));
for k = 1:length(A_shuffled{1})
    R = mod(k, 5000);
    if R == 0
        disp('5k...');
    end


    filename = strcat([out_filename '/'], A_shuffled{1}{k});
    I = mat2gray(imread(filename));


    %data(k, :, :, :) = I;
    data = cat(4, data, I);
    
    for l = 1:nNumberCols
        labels(l, k) = A_shuffled{1,1 + l}(k);
    end
end


if exist([out_filename '.h5'], 'file')
   fprintf('Warning: replacing existing file %s \n', [out_filename '.h5']);
   delete([out_filename '.h5']);
end


dat_dims=size(data);
lab_dims=size(labels);
num_samples=dat_dims(end);


assert(lab_dims(end)==num_samples, 'Number of samples should be matched between data and labels');


h5create([out_filename '.h5'], '/data', [dat_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [dat_dims(1:end-1) 1000]); % width, height, channels, number 
h5create([out_filename '.h5'], '/label', [lab_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [lab_dims(1:end-1) 1000]); % label values, number


startloc.data=[1 1 1 1];
startloc.lab=[1 1]; 


h5write([out_filename '.h5'], '/data', single(data), startloc.data, size(data));
h5write([out_filename '.h5'], '/label', single(labels), startloc.lab, size(labels));

The same is done for "test".

dat_dims == [40 26 1 49950] //40x26 grayscale x numSamples images

lab_dims == [37 49950] //37 single x numSamples
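To double-check how those MATLAB sizes map to what Caffe sees, a tiny throwaway file can be written the same way and inspected with h5info. This is only a sketch: the temporary file name and the 5-sample size are made up for illustration, and the reversed order reflects MATLAB being column-major while HDF5 readers such as Caffe see the dimensions in row-major order.

```matlab
% Sketch only: writes a small throwaway file, NOT the real train.h5.
fname  = [tempname '.h5'];               % hypothetical temporary file
data   = single(rand(40, 26, 1, 5));     % same layout as above, 5 samples
labels = single(rand(37, 5));            % 37 floats per sample

h5create(fname, '/data',  [40 26 1 Inf], 'Datatype', 'single', 'ChunkSize', [40 26 1 5]);
h5create(fname, '/label', [37 Inf],      'Datatype', 'single', 'ChunkSize', [37 5]);
h5write(fname, '/data',  data,   [1 1 1 1], size(data));
h5write(fname, '/label', labels, [1 1],     size(labels));

d = h5info(fname, '/data');
l = h5info(fname, '/label');
matlab_data_dims  = d.Dataspace.Size;           % sizes as MATLAB reports them
matlab_label_dims = l.Dataspace.Size;
caffe_data_dims   = fliplr(matlab_data_dims);   % number, channels, then the two image axes
caffe_label_dims  = fliplr(matlab_label_dims);  % number, label values
delete(fname);                                  % clean up the throwaway file
```

With the real file, the same flip would turn [37 49950] into 49950 x 37 on the Caffe side, which matches the "Top shape: 600 37" per batch in the log.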

Caffe seems to accept it, but:

1) I cannot use SoftmaxWithLoss, as in this example: https://github.com/BVLC/caffe/blob/master/examples/hdf5_classification/train_val.prototxt

It fails with:

I0630 11:32:52.984699  3244 layer_factory.hpp:74] Creating layer ip3
I0630 11:32:52.985677  3244 net.cpp:90] Creating Layer ip3
I0630 11:32:52.985677  3244 net.cpp:410] ip3 <- ip2
I0630 11:32:52.986654  3244 net.cpp:368] ip3 -> ip3
I0630 11:32:52.986654  3244 net.cpp:120] Setting up ip3
I0630 11:32:52.987629  3244 net.cpp:127] Top shape: 600 37 (22200)


I0630 11:32:52.989583  3244 layer_factory.hpp:74] Creating layer loss
I0630 11:32:52.989583  3244 net.cpp:90] Creating Layer loss
I0630 11:32:52.990559  3244 net.cpp:410] loss <- ip3
I0630 11:32:52.990559  3244 net.cpp:410] loss <- label
I0630 11:32:52.990559  3244 net.cpp:368] loss -> loss
I0630 11:32:52.991536  3244 net.cpp:120] Setting up loss
I0630 11:32:52.991536  3244 layer_factory.hpp:74] Creating layer loss
F0630 11:32:52.992512  3244 softmax_loss_layer.cpp:42] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (600 vs. 22200) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***

600 is the batch size.
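As far as I can tell from the message, the check boils down to the arithmetic below. It is a sketch that assumes the default softmax axis of 1 (0-based, Caffe style) and that Caffe sees the label blob as 600 x 37; the variable names are mine, not Caffe's.

```matlab
% Sketch of the failing check, using the shapes from the log above.
pred_shape   = [600 37];   % ip3 top shape (N, C)
label_shape  = [600 37];   % assumed label blob shape as Caffe sees it
softmax_axis = 1;          % 0-based axis, assumed default

% Product of dims before the softmax axis, and after it (empty product is 1).
outer_num = prod(pred_shape(1:softmax_axis));
inner_num = prod(pred_shape(softmax_axis+2:end));

expected_labels = outer_num * inner_num;   % what the loss layer wants: 600
actual_labels   = prod(label_shape);       % what the HDF5 file supplies: 22200
fprintf('%d vs. %d\n', expected_labels, actual_labels);
```

So SoftmaxWithLoss appears to want one integer class index per sample (600 values), while the file supplies 37 floats per sample (22200 values). The same arithmetic with a batch of 100 gives the 100 vs. 3700 reported by the accuracy check.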

2) I have to use EuclideanLoss instead, which is a ***** to train. It gets stuck at a loss of 0.5 unless I set a crazy learning rate like 0.1. Even then the results are still questionable (around 0.07 loss, and I can't test it, see point 4).

3) I cannot use the Accuracy layer as the hdf5_classification example suggests. I get:
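For interpreting those loss values, here is a minimal sketch of how I understand Caffe's EuclideanLoss to be computed: the squared differences are summed over all outputs and divided by twice the batch size, so the value is not averaged over the 37 outputs. The toy batch size and the constant 0.1 error are made-up numbers for illustration.

```matlab
% Sketch of the Euclidean loss on a toy batch (values are illustrative only).
N = 4;                              % toy batch size
C = 37;                             % outputs per sample, as in the post
pred  = zeros(C, N, 'single');      % pretend predictions
label = 0.1 * ones(C, N, 'single'); % pretend targets, each off by 0.1

diff = pred - label;
loss = sum(diff(:).^2) / (2 * N);   % sum over all outputs, divided by 2N

% Rough per-output mean squared error implied by a given loss value.
per_output_mse = 2 * loss / C;
fprintf('loss = %.4f, per-output MSE = %.4f\n', loss, per_output_mse);
```

Under this reading, a reported loss of 0.07 with 37 outputs would correspond to a per-output mean squared error of roughly 2 * 0.07 / 37.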

F0630 11:39:55.590533  5180 accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (100 vs. 3700) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

4) Even if I train a net with only EuclideanLoss at the end and a super-high learning rate, I can't deploy it by removing EuclideanLoss and replacing the HDF5 input with:

name: "AZFACE"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 26
input_dim: 40

It says "F0630 11:46:20.542387 4464 blob.cpp:455] Check failed: ShapeEquals(proto) shape mismatch (reshape not set)"

But if I don't remove EuclideanLoss, I get "F0630 11:48:53.482650 9416 insert_splits.cpp:35] Unknown blob input label to layer 1". That is the only part that makes sense. I don't think I need EuclideanLoss while predicting, but how do I remove it painlessly?

I really hope you guys can explain all this, or point me to a consistent explanation of how this is supposed to be done.

My current train prototxt is attached.

train.txt
