First, I modified the "Siamese Network Tutorial" example to accept two channels of input,
one for the original image and another for the distorted image.
The entire dataset is divided into 4 HDF5 files:
train_ori.h5, train_dis.h5,
test_ori.h5, test_dis.h5.
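(For context: Caffe's HDF5Data layer does not take the .h5 file directly; its source parameter points to a plain-text file listing one HDF5 path per line. A sketch of one such layer for the "original" stream — the layer/blob names and batch size here are just placeholders, not from the tutorial:)

```
layer {
  name: "data_ori"
  type: "HDF5Data"
  top: "data_ori"        # blob produced from the "data" dataset in the .h5 file
  hdf5_data_param {
    source: "train_ori_list.txt"   # text file listing train_ori.h5 (or its parts)
    batch_size: 64
  }
  include { phase: TRAIN }
}
```

With this setup, splitting a large .h5 file only requires adding the extra part files to the list file; the layer reads them in order.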
With a smaller input (all files under 3 GB), Caffe accepted the data and finished training.
But that meant I was only using about half of the dataset, and the regression error was high.
So I decided to use all images of the dataset.
Therefore, the 4 HDF5 files become:
train_ori.h5 (5.9 GB), train_dis.h5 (5.9 GB),
test_ori.h5 (1.2 GB), test_dis.h5 (1.2 GB).
At the beginning of training, it failed to load the dataset.
Someone in this discussion group mentioned "For large HDF5 file, you need to separate them".
So I wrote a script to divide each file into 5 smaller files.
However, it still failed to load the files (the process was killed by the OS).
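For completeness, the splitting script is essentially the following (a sketch using h5py; function and output names are my own, and it assumes every dataset in the file shares the same first axis, as in the Siamese tutorial's "data"/"label" layout):

```python
import h5py
import numpy as np

def split_h5(src_path, dst_prefix, n_parts=5):
    """Split one HDF5 file into n_parts smaller files along the first axis."""
    with h5py.File(src_path, "r") as src:
        keys = list(src.keys())               # e.g. ["data", "label"]
        n = src[keys[0]].shape[0]             # number of samples
        bounds = np.linspace(0, n, n_parts + 1, dtype=int)
        for i in range(n_parts):
            lo, hi = bounds[i], bounds[i + 1]
            with h5py.File("{}_{}.h5".format(dst_prefix, i), "w") as dst:
                for k in keys:
                    # copy only this slice, so peak memory stays at one chunk
                    dst.create_dataset(k, data=src[k][lo:hi])

# e.g. split_h5("train_ori.h5", "train_ori_part") -> train_ori_part_0.h5 ... _4.h5
```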
Do I need to change the file format? Or is there anything else I can try to make it work?
Thanks!