Killed by OS when feeding large HDF5 input files

adzen

May 6, 2016, 6:09:28 AM
to Caffe Users
I am trying to do regression to predict the quality score of an input image.

The dataset comes from here:


First, I modified the "Siamese Network Tutorial" example to accept two channels of input:
one for the original image and one for the distorted image.


The entire dataset is divided into 4 HDF5 files:
train_ori.h5, train_dis.h5, 
test_ori.h5, test_dis.h5.
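
Each file stores an image array plus the quality scores; roughly, my writer looks like the sketch below (the "data"/"label" key names and the single-channel (N, 1, H, W) layout here are illustrative, and inside the network the "ori" and "dis" blobs are concatenated into the two-channel input):

import h5py
import numpy as np

def write_h5(path, images, scores):
    # images: list of H x W arrays; scores: list of floats.
    # Stack into an (N, 1, H, W) float32 array, the blob layout
    # Caffe's HDF5Data layer expects for single-channel images.
    data = np.stack(images).astype(np.float32)[:, np.newaxis, :, :]
    label = np.asarray(scores, dtype=np.float32)
    with h5py.File(path, 'w') as f:
        f.create_dataset('data', data=data)
        f.create_dataset('label', data=label)

# Example with dummy data:
write_h5('train_ori.h5', [np.zeros((32, 32))] * 4, [0.5] * 4)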


For a smaller input (all input files under 3 GB),
Caffe accepted the data and finished the training process.

But since this uses only about half of the dataset, the regression error is high.

So I decided to use all images in the dataset.

Therefore, the 4 HDF5 files become:
train_ori.h5 (5.9 GB), train_dis.h5 (5.9 GB),
test_ori.h5 (1.2 GB), test_dis.h5 (1.2 GB).

At the beginning of training, Caffe failed to load the dataset.



Someone in this discussion group mentioned, "For large HDF5 file, you need to separate them."

So I wrote a script to divide each file into 5 smaller files.
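
Here is the essence of that script (the "data"/"label" keys are assumed as above; the text file written at the end is the list that the HDF5Data layer's "source" parameter points to, one .h5 path per line):

import h5py
import numpy as np

def split_h5(src_path, n_parts=5):
    # Split one HDF5 file into n_parts files along the first axis.
    out_paths = []
    with h5py.File(src_path, 'r') as src:
        data, label = src['data'], src['label']
        bounds = np.linspace(0, data.shape[0], n_parts + 1, dtype=int)
        for i in range(n_parts):
            lo, hi = bounds[i], bounds[i + 1]
            out_path = src_path.replace('.h5', '_%d.h5' % i)
            with h5py.File(out_path, 'w') as dst:
                # Slicing an h5py dataset reads only that range from disk,
                # so the whole source file is never in memory at once.
                dst.create_dataset('data', data=data[lo:hi])
                dst.create_dataset('label', data=label[lo:hi])
            out_paths.append(out_path)
    return out_paths

# Regenerate the list file for the HDF5Data layer after splitting.
parts = split_h5('train_ori.h5', n_parts=5)
with open('train_ori_list.txt', 'w') as f:
    f.write('\n'.join(parts) + '\n')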


However, Caffe still failed to load the files; the process was killed by the OS.


Do I need to change the file format? Or is there anything else I can try to make it work?


Thanks!