First, I modified the "Siamese Network Tutorial" example to accept two channels of input,
one for the original image and another for the distorted image.
The entire dataset is divided into 4 HDF5 files:
train_ori.h5, train_dis.h5,
test_ori.h5, test_dis.h5.
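(For context: Caffe's HDF5Data layer does not take the .h5 file directly; its source parameter points to a plain-text file listing one HDF5 path per line. A sketch of one such layer for the "original" stream — the layer/blob names and batch size here are just placeholders, not from the tutorial:)

```
layer {
  name: "data_ori"
  type: "HDF5Data"
  top: "data_ori"        # blob produced from the "data" dataset in the .h5 file
  hdf5_data_param {
    source: "train_ori_list.txt"   # text file listing train_ori.h5 (or its parts)
    batch_size: 64
  }
  include { phase: TRAIN }
}
```

With this setup, splitting a large .h5 file only requires adding the extra part files to the list file; the layer reads them in order.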
With a smaller input (all files under 3 GB), Caffe accepted the data and finished training.
But that meant I was only using about half of the dataset, and the regression error was high.
So I decided to use all images of the dataset.
Therefore, the 4 HDF5 files become:
train_ori.h5 (5.9 GB), train_dis.h5 (5.9 GB),
test_ori.h5 (1.2 GB), test_dis.h5 (1.2 GB).
At the beginning of training, it failed to load the dataset.
Someone in this discussion group mentioned "For large HDF5 file, you need to separate them".
So I wrote a script to divide each file into 5 smaller files.
However, it still failed to load the files (the process was killed by the OS).
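For completeness, the splitting script is essentially the following (a sketch using h5py; function and output names are my own, and it assumes every dataset in the file shares the same first axis, as in the Siamese tutorial's "data"/"label" layout):

```python
import h5py
import numpy as np

def split_h5(src_path, dst_prefix, n_parts=5):
    """Split one HDF5 file into n_parts smaller files along the first axis."""
    with h5py.File(src_path, "r") as src:
        keys = list(src.keys())               # e.g. ["data", "label"]
        n = src[keys[0]].shape[0]             # number of samples
        bounds = np.linspace(0, n, n_parts + 1, dtype=int)
        for i in range(n_parts):
            lo, hi = bounds[i], bounds[i + 1]
            with h5py.File("{}_{}.h5".format(dst_prefix, i), "w") as dst:
                for k in keys:
                    # copy only this slice, so peak memory stays at one chunk
                    dst.create_dataset(k, data=src[k][lo:hi])

# e.g. split_h5("train_ori.h5", "train_ori_part") -> train_ori_part_0.h5 ... _4.h5
```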
Do I need to change the file format? Or is there anything else I can try to make it work?
Thanks!