CDBN for Audio Data


John Bell

Jan 1, 2016, 6:42:09 AM

to Caffe Users

Hello,

I am trying to replicate the results of this paper:

http://papers.nips.cc/paper/3674-unsupervised-feature-learning-for-audio-classification-using-convolutional-deep-belief-networks.pdf

I have a CDBN implementation:

https://github.com/yusugomori/DeepLearning/blob/master/python/CDBN.py

And a Python script to create spectrograms from MP3 files:

http://stackoverflow.com/questions/15311853/plot-spectogram-from-mp3

So it looks like the tools are coming together: the pre-processing (making a spectrogram and applying PCA) seems OK. I am a little unclear, though, on how to train the network layer by layer. The paper describes the protocol as follows:

"First, we extracted the spectrogram from each utterance of the TIMIT training data [13]. The spectrogram had a 20 ms window size with 10 ms overlaps. The spectrogram was further processed using PCA whitening (with 80 components) to reduce the dimensionality. We then trained 300 first-layer bases with a filter length (n_W) of 6 and a max-pooling ratio (local neighborhood size) of 3. We further trained 300 second-layer bases using the max-pooled first-layer activations as input, again with a filter length of 6 and a max-pooling ratio of 3."
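For reference, here is a minimal sketch of how I read the pre-processing part of that passage. It assumes NumPy and SciPy, a 16 kHz sample rate (the rate TIMIT is distributed at), and a log-magnitude spectrogram; the paper excerpt does not say whether a log is taken, and the function names are my own, so treat all of that as assumptions rather than the authors' code.

```python
import numpy as np
from scipy.signal import spectrogram


def log_spectrogram(signal, sample_rate=16000, window_ms=20, overlap_ms=10):
    """Return a (frames, frequency_bins) log-power spectrogram."""
    nperseg = int(sample_rate * window_ms / 1000)    # 320 samples at 16 kHz
    noverlap = int(sample_rate * overlap_ms / 1000)  # 160 samples at 16 kHz
    _, _, sxx = spectrogram(signal, fs=sample_rate,
                            nperseg=nperseg, noverlap=noverlap)
    # Assumption: log compression; the small constant avoids log(0).
    return np.log(sxx.T + 1e-10)


def fit_pca_whitening(frames, n_components=80, eps=1e-5):
    """Fit PCA whitening on (frames, bins) data; return (mean, projection)."""
    mean = frames.mean(axis=0)
    cov = np.cov(frames - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    # Scale each principal direction to (approximately) unit variance.
    projection = eigvecs[:, order] / np.sqrt(eigvals[order] + eps)
    return mean, projection


def apply_pca_whitening(frames, mean, projection):
    """Project (frames, bins) data onto the whitened components."""
    return (frames - mean) @ projection


# Usage with white noise standing in for a real utterance.
rng = np.random.default_rng(0)
noise = rng.standard_normal(16000 * 3)        # 3 seconds at 16 kHz
spec = log_spectrogram(noise)
mean, projection = fit_pca_whitening(spec)
whitened = apply_pca_whitening(spec, mean, projection)
print(spec.shape, whitened.shape)
```

In practice the whitening transform would presumably be fitted once on frames pooled from the whole training set and then applied to every utterance, rather than fitted per file as in the toy usage above.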

So how do we train each layer as per the protocol?
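My current reading is that training is greedy: fit the first layer on the whitened spectrogram, freeze it, push the data through it with max-pooling, and fit the second layer on the pooled output. The sketch below only shows that data flow and the resulting shapes. The CRBM training step is a hypothetical placeholder that returns random filters, plain max-pooling over sigmoid activations stands in for the probabilistic max-pooling a CDBN uses, and biases are omitted, so none of this should be taken as the paper's method.

```python
import numpy as np


def correlate_time(x, filters):
    """Slide filters along time (no padding).

    x: (frames, channels); filters: (n_bases, filter_len, channels).
    Returns (frames - filter_len + 1, n_bases).
    """
    filter_len = filters.shape[1]
    # windows[p, c, l] == x[p + l, c]
    windows = np.lib.stride_tricks.sliding_window_view(x, filter_len, axis=0)
    return np.einsum("pcl,blc->pb", windows, filters)


def max_pool_time(h, ratio):
    """Max over non-overlapping groups of `ratio` time steps."""
    n = (h.shape[0] // ratio) * ratio          # drop the ragged tail
    return h[:n].reshape(-1, ratio, h.shape[1]).max(axis=1)


def train_crbm_placeholder(x, n_bases, filter_len, rng):
    """Hypothetical stand-in: real CRBM training would go here."""
    return 0.01 * rng.standard_normal((n_bases, filter_len, x.shape[1]))


def layer_forward(x, filters, pool_ratio):
    """Sigmoid activations followed by deterministic max-pooling."""
    hidden = 1.0 / (1.0 + np.exp(-correlate_time(x, filters)))
    return max_pool_time(hidden, pool_ratio)


rng = np.random.default_rng(0)
x = rng.standard_normal((100, 80))   # stand-in for 100 whitened frames

# Layer 1: 300 bases, filter length 6, pooling ratio 3.
w1 = train_crbm_placeholder(x, n_bases=300, filter_len=6, rng=rng)
p1 = layer_forward(x, w1, pool_ratio=3)      # shape (31, 300)

# Layer 2: trained on the pooled layer-1 output, same settings.
w2 = train_crbm_placeholder(p1, n_bases=300, filter_len=6, rng=rng)
p2 = layer_forward(p1, w2, pool_ratio=3)     # shape (8, 300)
```

If that reading is right, the second layer sees 300 input channels (one per first-layer base) instead of the 80 PCA components, and the first layer's weights are not updated while the second layer trains. Is that correct?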

Regards,

Daniel
