HDF5 files < 2 GB but training still aborts every 5000 iterations

haoch...@gmail.com

Jan 20, 2017, 12:19:30 AM
to Caffe Users
Dear all,
I use HDF5 files as input. There are 41 HDF5 files in total, each about 800 MB, which is clearly smaller than 2 GB, but the training process still aborts.
Interestingly and surprisingly, training first aborted at iteration 6500. When I fine-tuned from that snapshot, it aborted again after another 5000 iterations, which means it aborts at iterations 6500, 11500, 16500, ..., as shown below:

I0118 23:52:47.280005 17727 sgd_solver.cpp:106] Iteration 6500, lr = 5e-06
F0118 23:53:08.550027 17727 blob.cpp:34] Check failed: shape[i] <= 2147483647 / count_ (224 vs. 221) blob size exceeds INT_MAX
.
.
.
I0118 23:52:47.280005 17727 sgd_solver.cpp:106] Iteration 11500, lr = 5e-06
F0118 23:53:08.550027 17727 blob.cpp:34] Check failed: shape[i] <= 2147483647 / count_ (224 vs. 221) blob size exceeds INT_MAX
.
.
.
I0120 11:39:07.996150 25766 sgd_solver.cpp:106] Iteration 16500, lr = 1.25e-05
F0120 11:39:29.272754 25766 blob.cpp:34] Check failed: shape[i] <= 2147483647 / count_ (224 vs. 221) blob size exceeds INT_MAX

Can somebody tell me why this happens and how to solve it?
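
In case it helps with diagnosing this, below is a minimal sketch (assuming Python with h5py and NumPy installed; the trainset_*.h5 glob and the flat top-level dataset layout are placeholders for my actual files) that prints each dataset's shape and total element count and flags anything that would trip Caffe's INT_MAX check:

import glob

import h5py
import numpy as np

INT_MAX = 2147483647  # the limit enforced by the check in Caffe's blob.cpp

# Print the shape and total element count of every top-level dataset
# in each HDF5 file, flagging anything that exceeds INT_MAX on its own.
for path in sorted(glob.glob("trainset_*.h5")):  # placeholder pattern for my 41 files
    with h5py.File(path, "r") as f:
        for name, node in f.items():
            if not isinstance(node, h5py.Dataset):
                continue  # skip groups; I assume a flat layout here
            count = int(np.prod(node.shape))
            print("{} /{}: shape={} count={}".format(path, name, node.shape, count))
            if count > INT_MAX:
                print("  WARNING: /{} exceeds INT_MAX".format(name))

For what it's worth, each of my 800 MB files holds roughly 200 million float32 elements, which is well below INT_MAX, so I would not expect any single file to trip the check.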