caffe, label and image dataset


rsl

Feb 18, 2016, 10:47:04 AM
to Caffe Users
I'm new to Caffe and confused after reading too many posts about how to pass data to the network.
Let me explain my problem: I have a set of images and a CSV file in which each row stores an image name (containing numbers and letters) and its multi-label (the feature vector of the image). Because the multi-label feature vector contains float numbers, I have to convert the data to HDF5. My first question: can an HDF5 file handle image names that contain letters? My second question: do I need to convert the image set to HDF5 as well, or is converting the CSV file of image labels to HDF5 enough?
Thanks,

Jan C Peters

Feb 19, 2016, 4:06:04 AM
to Caffe Users
If you want to use HDF5 as a data source for Caffe, the images themselves really need to be in there, not only references to image files. So you need to write your own script that reads the filenames and labels from the CSV, loads the images, possibly preprocesses them a bit, and then stores the images and the labels in the HDF5 file. That should be easy and short in Python.
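
A minimal sketch of the reading and loading steps, assuming a comma-separated file with the image name in the first column and the float label values in the remaining columns. The function name `build_arrays` and the `load_image` argument are hypothetical: `load_image` stands in for whatever loader and preprocessing you use and is assumed to return an array of shape (channels, height, width). The image names are only used to find the files, so letters in them are not a problem.

```python
import csv
import numpy as np

def build_arrays(csv_path, load_image, label_len):
    """Read (image name, label vector) rows and return stacked float32 arrays.

    load_image is a hypothetical callable: name -> array of shape (C, H, W).
    """
    names, images, labels = [], [], []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            name, values = row[0], row[1:]
            if len(values) != label_len:
                raise ValueError(
                    "row for %r has %d label values, expected %d"
                    % (name, len(values), label_len)
                )
            names.append(name)
            images.append(np.asarray(load_image(name), dtype="f4"))
            labels.append(np.array([float(v) for v in values], dtype="f4"))
    X = np.stack(images)  # shape (num_images, C, H, W)
    y = np.stack(labels)  # shape (num_images, label_len)
    return names, X, y
```

The returned `X` and `y` are the two arrays that would then be stored in the HDF5 file.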

Jan

rsl

Feb 19, 2016, 8:32:09 AM
to Caffe Users
Thanks, Jan, for your complete answer. I have already used this script: http://stackoverflow.com/a/31808324/1525479 to convert the image dataset, but I do not know how to modify it to accept a vector of labels.

Jan C Peters

Feb 19, 2016, 9:07:44 AM
to Caffe Users
That is quite simple. Let N be the length of your label vector:

- Change the shape of y:
y = np.zeros( (len(lines), N), dtype='f4' )  # the leading dimension of 1 in the SO code does not make sense, just leave it out

- Assign a whole N-element vector (array) to y[i] in the loop, e.g.
labelvec = np.array(sp[1:], dtype='f4')  # sp is the split line; sp[0] is the image name
y[i] = labelvec

That is pretty much it.
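
Put together, the modified label handling and the write step might look like the sketch below. It assumes h5py is installed, uses two made-up comma-separated lines in place of the real label file, and uses a zero-filled X in place of real image data. The dataset names X and y follow the linked Stack Overflow script; the temporary output directory is only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

N = 3  # length of each label vector (assumed for this example)

# Stand-in for the lines of the label file: image name followed by N floats.
lines = ["img01a.png,0.5,1.25,-2.0", "img02b.png,3.0,4.0,5.5"]

X = np.zeros((len(lines), 3, 8, 8), dtype="f4")  # placeholder image data
y = np.zeros((len(lines), N), dtype="f4")        # one label row per image

for i, l in enumerate(lines):
    sp = l.strip().split(",")
    # sp[0] is the image name, sp[1:] are the N label values
    y[i] = np.array([float(v) for v in sp[1:]], dtype="f4")

out_dir = tempfile.mkdtemp()  # hypothetical output location
h5_path = os.path.join(out_dir, "train.h5")
with h5py.File(h5_path, "w") as H:
    H.create_dataset("X", data=X)
    H.create_dataset("y", data=y)

# Read the file back to check what was stored.
with h5py.File(h5_path, "r") as H:
    stored_shape = H["y"].shape
    print(H["X"].shape, stored_shape)
```

Each row of y then holds the full label vector for the image at the same index in X.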

Jan