I have a CDBN...
A Python script to create spectrograms from MP3...
So it looks like the tools are coming together: the pre-processing seems OK at making a spectrogram and applying PCA. I am a little unclear on how to train the network layer by layer...
"First, we extracted the spectrogram from each utterance of the TIMIT training data [13]. The spec-trogram had a 20 ms window size with 10 ms overlaps. The spectrogram was further processed using PCA whitening (with 80 components) to reduce the dimensionality. We then trained 300 first-layer bases with a filter length (n W ) of 6 and a max-pooling ratio (local neighborhood size) of 3. We further trained 300 second-layer bases using the max-pooled first-layer activations as input, again with a filter length of 6 and a max-pooling ratio of 3."
So how do we train each layer as per the protocol?
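One way to read the quoted protocol is as greedy layer-wise training: fit the first layer on the whitened spectrogram, freeze it, max-pool its activations, and use those as the training input for the second layer. The sketch below only illustrates that data flow and the resulting shapes. `train_crbm_placeholder` is a hypothetical stand-in that returns random filters instead of actually training a convolutional RBM, and plain deterministic max-pooling is used here as a simplifying assumption.

```python
import numpy as np

def conv_layer_activations(x, filters, bias):
    """x: (T, C) input; filters: (K, n_w, C). Returns sigmoid activations (T - n_w + 1, K)."""
    n_w = filters.shape[1]
    n_out = x.shape[0] - n_w + 1
    # Every length-n_w window along the time axis: (n_out, n_w, C)
    windows = np.stack([x[t:t + n_w] for t in range(n_out)])
    pre = np.einsum("twc,kwc->tk", windows, filters) + bias
    return 1.0 / (1.0 + np.exp(-pre))

def max_pool(h, ratio=3):
    """Max over non-overlapping groups of `ratio` time steps; leftover steps are dropped."""
    n = (h.shape[0] // ratio) * ratio
    return h[:n].reshape(-1, ratio, h.shape[1]).max(axis=1)

def train_crbm_placeholder(data, n_bases, n_w, rng):
    # HYPOTHETICAL: a real implementation would learn these filters
    # (e.g. with contrastive divergence); here they are just random.
    filters = 0.01 * rng.standard_normal((n_bases, n_w, data.shape[1]))
    bias = np.zeros(n_bases)
    return filters, bias

rng = np.random.default_rng(0)
x = rng.standard_normal((99, 80))      # placeholder for a PCA-whitened spectrogram

# Layer 1: 300 bases, filter length 6, pooling ratio 3
f1, b1 = train_crbm_placeholder(x, n_bases=300, n_w=6, rng=rng)
p1 = max_pool(conv_layer_activations(x, f1, b1), ratio=3)

# Layer 2: trained on the pooled layer-1 activations, with layer 1 held fixed
f2, b2 = train_crbm_placeholder(p1, n_bases=300, n_w=6, rng=rng)
p2 = max_pool(conv_layer_activations(p1, f2, b2), ratio=3)

print(p1.shape, p2.shape)
```

The point of the sketch is the ordering: the second layer never sees the spectrogram directly, only the pooled output of the already-trained first layer.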
Regards,
Daniel