MFCC vs FBANK for chain models?


Titouan Parcollet

Jul 14, 2019, 2:19:04 PM
to kaldi-help
Hi there!

I was wondering why it is still preferred to use mfcc_hires over standard fbanks for training nnet3 chain models. With neural networks, we don't need the features to be decorrelated. Has anyone compared both feature types on a medium/big dataset?

Thank you!

Daniel Povey

Jul 14, 2019, 2:21:08 PM
to kaldi-help
The reason we use MFCCs is that they are more easily compressible, being decorrelated; we dump them to disk with compression to 1 byte per coefficient. But we dump all the coefficients, so it's equivalent to filterbanks times a full-rank matrix; no information is lost. For convolutional architectures, we convert them back into filterbanks inside the network; see idct-layer.

Dan
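Dan's two points above, that the features are stored at 1 byte per coefficient and that keeping all the coefficients makes the DCT a lossless full-rank transform, can be sketched in numpy. This is an illustrative toy, not Kaldi's actual MFCC or CompressedMatrix code: it uses random stand-in features and skips liftering and the real mel filterbank.

```python
import numpy as np
from scipy.fftpack import dct, idct

rng = np.random.default_rng(0)

# Toy log-mel filterbank features: 100 frames x 40 mel bins
# (random stand-in for real features).
log_fbank = rng.standard_normal((100, 40))

# "hires" MFCCs keep ALL 40 coefficients, so the DCT is a square,
# orthonormal (full-rank) matrix and the transform is exactly invertible.
mfcc = dct(log_fbank, type=2, norm='ortho', axis=1)
recovered = idct(mfcc, type=2, norm='ortho', axis=1)
assert np.allclose(recovered, log_fbank)  # no information lost

# Sketch of 1-byte-per-coefficient storage: per-column uniform quantization
# (a simplification of what Kaldi's compressed storage does).
lo, hi = mfcc.min(axis=0), mfcc.max(axis=0)
q = np.round((mfcc - lo) / (hi - lo) * 255).astype(np.uint8)
dq = q.astype(np.float64) / 255 * (hi - lo) + lo
print("max abs quantization error:", np.abs(dq - mfcc).max())
```

Because the coefficients are decorrelated, each column has its own compact range, which is what makes this per-coefficient quantization cheap.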


--
Go to http://kaldi-asr.org/forums.html to find out how to join.
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/67c3ec69-412e-488d-8391-3541f3920ffc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Titouan Parcollet

Jul 14, 2019, 2:30:13 PM
to kaldi-help
Thank you! 




Lu Fan

Jul 15, 2019, 4:04:50 AM
to kaldi-help

Hi Dan,

Is idct-layer only compatible with convolutional layers, or can it be used before any layer, e.g. tdnn, tdnnf, and so on?



Daniel Povey

Jul 15, 2019, 2:01:49 PM
to kaldi-help
There's no point using idct-layer before layers like tdnn-f.  The only difference it might make is by reversing the 'liftering'
whereby, in normal MFCCs, the higher-frequency components are scaled up.  But I suspect that would degrade, not
improve, WERs, if anything.
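A toy numpy sketch of why the idct-layer makes no difference ahead of a plain affine (TDNN-style) layer: the orthonormal IDCT can be folded into the layer's weight matrix, so either input parameterization can represent the same function. (The matrices `D`, `W`, and `b` here are illustrative, not from any recipe; liftering, being a diagonal scaling, could likewise be absorbed.)

```python
import numpy as np
from scipy.fftpack import idct

rng = np.random.default_rng(1)
mfcc = rng.standard_normal((5, 40))   # toy MFCC frames
W = rng.standard_normal((40, 64))     # toy affine-layer weights
b = rng.standard_normal(64)

# The orthonormal IDCT matrix D (inverse DCT of each basis vector).
D = idct(np.eye(40), type=2, norm='ortho', axis=0)

# An affine layer applied AFTER an idct-layer...
out_with_idct = idct(mfcc, type=2, norm='ortho', axis=1) @ W + b

# ...equals an affine layer on raw MFCCs with the IDCT folded into its
# weights, so training can reach the same function either way.
out_absorbed = mfcc @ (D.T @ W) + b
assert np.allclose(out_with_idct, out_absorbed)
```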



Lu Fan

Jul 15, 2019, 9:51:11 PM
to kaldi-help
The conv layers just compute 3*3 over the input (time x frequency), while a TDNN computes over all frequencies of the concatenated frames like (-1,0,1). So isn't the TDNN computation like a DCT or IDCT?
Another question: is SpecAugment compatible with MFCCs?


Daniel Povey

Jul 15, 2019, 10:15:15 PM
to kaldi-help
The conv layers just compute 3*3 over the input (time x frequency), while a TDNN computes over all frequencies of the concatenated frames like (-1,0,1). So isn't the TDNN computation like a DCT or IDCT?

No.
 
Another question: is SpecAugment compatible with MFCCs?

You would do it on filterbanks using idct-layer; see the example in mini_librispeech.
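For intuition, here is a minimal numpy sketch of SpecAugment-style masking itself (the function and parameter names are illustrative, not Kaldi's layer): the frequency mask zeroes a contiguous band of mel bins, which is only meaningful after converting MFCCs back into filterbanks.

```python
import numpy as np

def spec_augment(feats, max_freq_bins=8, max_time_frames=20, rng=None):
    """Zero one random frequency band and one random time block (toy sketch)."""
    rng = rng or np.random.default_rng()
    out = feats.copy()
    n_frames, n_bins = out.shape
    # Frequency mask: a contiguous band of mel bins. A band of MFCC
    # coefficients would not correspond to a frequency region, which is
    # why the masking is done in the filterbank domain.
    f = int(rng.integers(0, max_freq_bins + 1))
    f0 = int(rng.integers(0, n_bins - f + 1))
    out[:, f0:f0 + f] = 0.0
    # Time mask: a contiguous block of frames.
    t = int(rng.integers(0, min(max_time_frames, n_frames) + 1))
    t0 = int(rng.integers(0, n_frames - t + 1))
    out[t0:t0 + t, :] = 0.0
    return out

masked = spec_augment(np.ones((50, 40)), rng=np.random.default_rng(0))
```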
 

Lu Fan

Jul 16, 2019, 2:18:48 AM
to kaldi-help
Yes, I want to use a tdnn_lstm chain model with SpecAugment, not the conv-tdnnf model.
MFCC -> idct-layer -> SpecAugment -> tdnn_lstm: is this flow correct?
But you said there's no point using idct-layer before layers like tdnn-f, so I was confused.


Daniel Povey

Jul 16, 2019, 2:03:32 PM
to kaldi-help
Yes, that's fine.
There is no point before TDNN-F, no, because it doesn't care, but there is a point when it comes before SpecAugment.
Dan



Lu Fan

Jul 16, 2019, 8:01:56 PM
to kaldi-help
I see. Thanks very much.
