Audio data organisation for input into Caffe (HDF5)

Jack Hanson

Jul 31, 2015, 3:59:25 AM

to Caffe Users

Hello all and thanks for reading,

I am very new to Caffe and was wondering how to organise my data for input.

I currently have several thousand 10-second, single-channel music samples belonging to 6 classes, sampled at 8 kHz. Each sample is therefore 1 x 80000 (10 s x 8000 samples/s), which I suspect is too large to process in one iteration. How would you recommend feeding this data to the CNN in HDF5 format? I am currently using the Matlab functions h5create and h5write to create the databases, and I think I understand how to reference them in the CNN train.prototxt file for command-line training and (later on) testing.
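As a rough illustration of the sizes involved, here is a minimal Python sketch of the arithmetic above. The sample rate and clip length come from the post. The helper names and the 1-second window length are hypothetical, chosen only to show one way a long clip could be cut into shorter fixed-length pieces. It is not a recommendation from the Caffe documentation.

```python
SAMPLE_RATE_HZ = 8000  # from the post: 8 kHz
CLIP_SECONDS = 10      # from the post: 10-second clips


def samples_per_clip(rate_hz, seconds):
    """Number of samples in one single-channel clip."""
    return rate_hz * seconds


def split_into_windows(clip, window_len):
    """Split a clip into non-overlapping windows, dropping any incomplete tail."""
    n_windows = len(clip) // window_len
    return [clip[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


# A silent placeholder clip stands in for real audio data.
clip = [0.0] * samples_per_clip(SAMPLE_RATE_HZ, CLIP_SECONDS)

# Hypothetical choice: 1-second windows of 8000 samples each.
windows = split_into_windows(clip, 8000)

print(len(clip), len(windows), len(windows[0]))
```

If clips were split this way, each window would presumably inherit the class label of the clip it came from, so the label array would need one entry per window rather than one per clip.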

I also read that HDF5 uses a different dimension order than Matlab (Numsamples x Kchannels x Rows x Columns rather than C x R x K x N), so I figured that in Matlab I should use h5create to initialise the traindata.h5 dataset with size (8000 x 1 x 1 x Numsamples) and the trainlabels.h5 dataset with size (1 x Numsamples). Is this correct?
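The dimension reversal can be checked with a small Python sketch. It assumes, as I understand the Matlab HDF5 documentation, that Matlab stores arrays column-major while the HDF5 C library (and so Caffe) reads them row-major, so the same bytes on disk appear with the dimension list reversed. The function names and the sample count of 5 are hypothetical, used only for illustration.

```python
def caffe_shape_from_matlab(matlab_size):
    """Dimension list as a row-major reader would see it (reversed)."""
    return tuple(reversed(matlab_size))


def column_major_offset(index, size):
    """Linear offset of an element in a column-major (Matlab-style) array."""
    offset, stride = 0, 1
    for i, n in zip(index, size):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(index, size):
    """Linear offset of an element in a row-major (C-style) array."""
    offset = 0
    for i, n in zip(index, size):
        offset = offset * n + i
    return offset


num_samples = 5                        # hypothetical small count
matlab_size = (8000, 1, 1, num_samples)
caffe_shape = caffe_shape_from_matlab(matlab_size)

# Value 123 (0-based) of sample 2, addressed both ways.
matlab_index = (123, 0, 0, 2)
caffe_index = tuple(reversed(matlab_index))

print(caffe_shape)
print(column_major_offset(matlab_index, matlab_size),
      row_major_offset(caffe_index, caffe_shape))
```

Both offsets come out equal, which is why no data needs to be shuffled: only the order in which the dimensions are listed changes. Under the same assumption, a (1 x Numsamples) label array in Matlab would be read as Numsamples x 1.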

I'm sure I will have many more questions in the future, but thanks again in advance.

Jack
