Audio data organisation for input into caffe (hdf5)

Jack Hanson

Jul 31, 2015, 3:59:25 AM
to Caffe Users
Hello all and thanks for reading,

I am very new to Caffe and am wondering how to organise my data for input.

I currently have several thousand 10-second, single-channel music samples belonging to 6 classes, sampled at 8 kHz. This means that each sample is 1 x 80000 (10 s x 8000 samples/s), which I figure is too large to process in one iteration. I would like to know how you would recommend feeding this data to the CNN in HDF5 format. I am currently using the Matlab functions h5create and h5write to create the databases, and I think I understand how to put them into the CNN train.prototxt file for command-line training and (later on) testing.
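To make the sizes above concrete, here is a minimal Python sketch of the arithmetic. It also shows one possible way to cut a clip into shorter pieces. The windowing step, the 1-second window length and the name `split_into_windows` are assumptions for illustration only; the post does not say how (or whether) the clips should be split.

```python
SAMPLE_RATE_HZ = 8000   # from the post: audio sampled at 8 kHz
CLIP_SECONDS = 10       # from the post: 10-second clips


def split_into_windows(signal, window_len):
    """Split a 1-D sequence into consecutive, non-overlapping windows.

    Hypothetical helper: any tail shorter than window_len is dropped.
    """
    n_windows = len(signal) // window_len
    return [signal[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


clip_len = SAMPLE_RATE_HZ * CLIP_SECONDS      # 80000 values per clip
clip = [0.0] * clip_len                       # placeholder for one mono clip

# Assumed window length: 1 second = 8000 samples.
windows = split_into_windows(clip, SAMPLE_RATE_HZ)

print(clip_len, len(windows), len(windows[0]))  # 80000 10 8000
```

If the clips were split this way, each window would need to carry the class label of the clip it came from.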

I also read that HDF5 data, as read by Caffe, has a different dimension order than Matlab (Numsamples x Kchannels x Rows x Columns rather than C x R x K x N, where C = columns, R = rows, K = channels and N = number of samples), so I figured that for the Matlab h5create call I should initialise the traindata.h5 file to be of size (8000 x 1 x 1 x Numsamples) and the trainlabels.h5 file to be of size (1 x Numsamples). Is this correct?
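The dimension reversal asked about here follows from storage order: Matlab is column-major and HDF5 is row-major, so a size vector given in Matlab appears reversed to a row-major reader. The sketch below checks this in plain Python, with no HDF5 library involved. The function names and the sample count of 5 are hypothetical and chosen only for illustration.

```python
def shape_seen_by_row_major_reader(matlab_size):
    """Reverse a Matlab size vector to get the shape a row-major reader sees."""
    return tuple(reversed(matlab_size))


def column_major_offset(index, size):
    """Linear offset of a 0-based index in column-major (Matlab) order."""
    offset, stride = 0, 1
    for i, n in zip(index, size):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(index, shape):
    """Linear offset of a 0-based index in row-major (C / HDF5) order."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


# Size from the post, with an assumed Numsamples of 5.
matlab_size = (8000, 1, 1, 5)          # columns x rows x channels x samples
caffe_shape = shape_seen_by_row_major_reader(matlab_size)
print(caffe_shape)                     # (5, 1, 1, 8000)

# The same element sits at the same offset under both views.
matlab_index = (123, 0, 0, 3)          # value 123 of sample 3 (0-based)
caffe_index = tuple(reversed(matlab_index))
assert column_major_offset(matlab_index, matlab_size) == row_major_offset(caffe_index, caffe_shape)
```

By the same reasoning, a label array of size (1 x Numsamples) in Matlab would be seen as (Numsamples x 1) by a row-major reader.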

I'm sure I will have many more questions in the future, but thanks again in advance.

Jack

