Floating point exception, Caffe with HDF5 (possibly blob reshape error)

Ankit Dhall

Apr 25, 2016, 12:58:35 PM
to Caffe Users
I am a novice with Caffe. I was trying to get a very basic setup working with an HDF5 file as input.
I got a 'Floating point exception (core dumped)' when I ran the script.

It seems the problem may be in the shape of the HDF5 data that is passed to Caffe.

The terminal looked like this:

root@ankit:/home/ankit/caffe# ./examples/_temp/train_lenet.sh
I0425 22:01:22.738116 10225 caffe.cpp:185] Using GPUs 0
I0425 22:01:22.773546 10225 caffe.cpp:190] GPU 0: GeForce 920M
I0425 22:01:22.973305 10225 solver.cpp:48] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/_temp/lenet"
solver_mode: GPU
device_id: 0
net: "examples/_temp/2.prototxt"
I0425 22:01:22.973559 10225 solver.cpp:91] Creating training net from net file: examples/_temp/2.prototxt
I0425 22:01:22.973896 10225 net.cpp:49] Initializing net from parameters:
name: "Tester"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/_temp/train_h5_list.txt"
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
  }
}
layer {
  name: "fullyCon1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fullyCon1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fullyCon1"
  top: "prob"
}
I0425 22:01:22.974227 10225 layer_factory.hpp:77] Creating layer data
I0425 22:01:22.974256 10225 net.cpp:91] Creating Layer data
I0425 22:01:22.974269 10225 net.cpp:399] data -> data
I0425 22:01:22.974293 10225 net.cpp:399] data -> label
I0425 22:01:22.974334 10225 hdf5_data_layer.cpp:79] Loading list of HDF5 filenames from: examples/_temp/train_h5_list.txt
I0425 22:01:22.974364 10225 hdf5_data_layer.cpp:93] Number of HDF5 files: 1
I0425 22:01:22.975319 10225 hdf5.cpp:32] Datatype class: H5T_FLOAT
*** Aborted at 1461601882 (unix time) try "date -d @1461601882" if you are using GNU date ***
PC: @     0x7fe2880cf76e caffe::Blob<>::Reshape()
*** SIGFPE (@0x7fe2880cf76e) received by PID 10225 (TID 0x7fe288803a40) from PID 18446744071697135470; stack trace: ***
    @     0x7fe2866312f0 (unknown)
    @     0x7fe2880cf76e caffe::Blob<>::Reshape()
    @     0x7fe2880fecea caffe::HDF5DataLayer<>::LayerSetUp()
    @     0x7fe2880baee3 caffe::Net<>::Init()
    @     0x7fe2880bc138 caffe::Net<>::Net()
    @     0x7fe28808fc2a caffe::Solver<>::InitTrainNet()
    @     0x7fe288091221 caffe::Solver<>::Init()
    @     0x7fe2880915aa caffe::Solver<>::Solver()
    @     0x7fe2881ddfe3 caffe::Creator_SGDSolver<>()
    @           0x41366c caffe::SolverRegistry<>::CreateSolver()
    @           0x40aef3 train()
    @           0x4080ed main
    @     0x7fe28661ca40 (unknown)
    @           0x408969 _start
    @                0x0 (unknown)
Floating point exception (core dumped)


The script that created the HDF5 file is:

import h5py
import caffe
import numpy as np

SIZE = 256  # fixed size for all images

with open('file_list.txt', 'r') as T:
    lines = T.readlines()

# If you do not have enough memory, split the data into
# multiple batches and generate multiple separate h5 files.
X = np.zeros((len(lines), 3, SIZE, SIZE), dtype='f4')
y = np.zeros((len(lines), 1), dtype='f4')

for i, l in enumerate(lines):
    sp = l.split(' ')  # each line is "<image path> <label>"
    # print('iter: ', i)
    # print(sp[0], sp[1])
    img = caffe.io.load_image(sp[0])
    img = caffe.io.resize(img, (SIZE, SIZE, 3))  # resize to fixed size
    img = np.transpose(img, (2, 0, 1))  # H x W x C -> C x H x W
    # print(img.shape)
    # you may apply other input transformations here...
    X[i] = img
    y[i] = int(sp[1])

with h5py.File('train.h5', 'w') as H:
    H.create_dataset('data', data=X)   # dataset name 'data' matches the layer's "data" top
    H.create_dataset('label', data=y)  # dataset name 'label' matches the layer's "label" top

with open('train_h5_list.txt', 'w') as L:
    L.write('examples/_temp/train.h5')  # list all h5 files you are going to use

print('Done')


And 'file_list.txt' contains the image file paths and their labels:
/home/ankit/caffe/examples/images/cat.jpg 0
/home/ankit/caffe/examples/images/fish-bike.jpg 1
/home/ankit/caffe/examples/images/cat_gray.jpg 0

 
Any help/suggestion would be great.

Thanks and regards,
Ankit

Jan

Apr 26, 2016, 5:15:36 AM
to Caffe Users
I have seen that error before, but I am not sure what the root cause was. It could be that it was some memory issue or capability issue...

Oh, on second look I see that you haven't set a batch_size for the data layer! You should definitely do that. I don't know what Caffe does when it is not set; the proto gives no default value for it...

Jan
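
A possible explanation for the SIGFPE, offered as a sketch rather than a confirmed diagnosis: if an unset batch_size is read as 0, the data blob's first dimension is 0, and an overflow check of the form "dim <= INT_MAX / count" inside the reshape would divide by zero once the running count reaches 0. In C++ on x86 an integer division by zero is reported as SIGFPE, which matches the trace above. The function blob_count below is a hypothetical Python stand-in for that logic, not Caffe's actual code.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed int


def blob_count(shape):
    """Hypothetical stand-in for an overflow-guarded element count."""
    count = 1
    for dim in shape:
        # Overflow guard: compare the next dimension with INT_MAX // count.
        # Once a zero-sized dimension makes count == 0, this divides by zero.
        if dim > INT_MAX // count:
            raise OverflowError("blob size exceeds INT_MAX")
        count *= dim
    return count


# Assumed batch size of 10 with the 3 x 256 x 256 images from the script.
print(blob_count((10, 3, 256, 256)))

# Assumed case: batch_size left unset and read as 0.
try:
    blob_count((0, 3, 256, 256))
except ZeroDivisionError as err:
    print("division by zero:", err)
```

Python raises ZeroDivisionError where the compiled code would receive a signal, but the arithmetic is the same.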

Ankit Dhall

Apr 28, 2016, 6:08:30 AM
to Caffe Users
I added the batch_size specification and it seems to work now.
Thanks a lot, Jan. :)

Regards,
Ankit
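
For anyone hitting the same crash, here is a minimal sketch of what the fix might look like and how to check a net definition for it before training. The thread does not show the corrected prototxt, so FIXED_LAYER and its batch_size value of 3 are illustrative assumptions, and hdf5_layers_missing_batch_size is a hypothetical helper based on simple regular expressions, not a real prototxt parser.

```python
import re

# Assumed corrected data layer; batch_size: 3 is an illustrative value.
FIXED_LAYER = '''
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/_temp/train_h5_list.txt"
    batch_size: 3
  }
}
'''

# The layer as posted in the question, with no batch_size line.
BROKEN_LAYER = FIXED_LAYER.replace("    batch_size: 3\n", "")


def hdf5_layers_missing_batch_size(prototxt):
    """Return names of HDF5Data layers that lack a positive batch_size."""
    missing = []
    # Split on top-level "layer {" lines; element 0 is the text before them.
    for block in re.split(r'(?m)^layer\s*\{', prototxt)[1:]:
        if not re.search(r'type\s*:\s*"HDF5Data"', block):
            continue
        if not re.search(r'\bbatch_size\s*:\s*[1-9]\d*', block):
            name = re.search(r'name\s*:\s*"([^"]*)"', block)
            missing.append(name.group(1) if name else "<unnamed>")
    return missing


print(hdf5_layers_missing_batch_size(BROKEN_LAYER))
print(hdf5_layers_missing_batch_size(FIXED_LAYER))
```

The check only looks at layers whose definitions start at the beginning of a line, which is how Caffe example prototxt files are usually laid out.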