A few thoughts on sMBR

xiaofeng wu

Feb 17, 2017, 5:42:03 AM

to kaldi-help

Hi Dan,

I notice that in sMBR training the default acoustic_scale is 0.1, which means that, compared to the language model, the acoustic model has much less impact on sMBR training.

Is the intuition behind this that sMBR is trained after the normal training (i.e. CE or CTC), which by then cannot learn any more from the acoustic signal?

Another question: if the training domain and the test domain are very different, should we include the test data's text when building the language model used to generate den_lattice for sMBR training?

I can't think clearly about what impact this would have. Say the test text is included: some new paths might then appear in the den_lattice, but these paths would have low probability on the training data, so sMBR would very likely penalize them.

Furthermore, is it a good idea to first do some data selection based on the test set's domain and then train the acoustic model on the selected data, or to use this selected subset only for the subsequent sMBR training?

Thanks in advance.

Best

Daniel Povey

Feb 17, 2017, 12:31:05 PM

to kaldi-help

I notice that in sMBR training the default acoustic_scale is 0.1, which means that, compared to the language model, the acoustic model has much less impact on sMBR training.
Is the intuition behind this that sMBR is trained after the normal training (i.e. CE or CTC), which by then cannot learn any more from the acoustic signal?

Not sure what you're saying, but the acoustic scale should be similar to what you will use at test time. For regular CE systems with a normal frame rate, we use an acoustic scale of 0.1 in discriminative training. For CE systems with a one-third frame rate, we use an acoustic scale of 0.333 in discriminative training (not checked in yet). For 'chain' systems, we use an acoustic scale of 1.0 in discriminative training.
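To make the effect of the scale concrete, here is a minimal Python sketch. It is not Kaldi code. It assumes the usual log-linear combination, where each lattice path is scored as acoustic_scale * acoustic log-likelihood + language-model log-likelihood. The dictionary keys, the function name `path_posteriors` and the toy log-likelihoods are all made up for illustration; only the three scale values come from the reply above.

```python
import math

# Acoustic scales quoted in the reply above. The keys are hypothetical
# labels chosen for this sketch, not Kaldi option names.
ACOUSTIC_SCALE = {
    "ce_normal_frame_rate": 0.1,
    "ce_one_third_frame_rate": 0.333,
    "chain": 1.0,
}


def path_posteriors(paths, acoustic_scale):
    """Return normalized posteriors for (acoustic_loglike, lm_loglike) pairs.

    Each path is scored as acoustic_scale * acoustic_loglike + lm_loglike,
    and the scores are then normalized with a softmax.
    """
    scores = [acoustic_scale * ac + lm for ac, lm in paths]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy "lattice" with two competing paths (made-up log-likelihoods):
# path 0 is preferred acoustically, path 1 is preferred by the LM.
toy_paths = [(-1000.0, -20.0), (-1010.0, -18.0)]

for name, scale in ACOUSTIC_SCALE.items():
    posts = path_posteriors(toy_paths, scale)
    print(f"{name}: scale={scale} posteriors={[round(p, 4) for p in posts]}")
```

With these toy numbers, a scale of 0.1 lets the language-model difference decide the winner, while a scale of 1.0 lets the acoustic difference dominate. This is why the scale used in discriminative training should match the one the system is decoded with.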

Another question: if the training domain and the test domain are very different, should we include the test data's text when building the language model used to generate den_lattice for sMBR training?

I don't recommend this.

Furthermore, is it a good idea to first do some data selection based on the test set's domain and then train the acoustic model on the selected data, or to use this selected subset only for the subsequent sMBR training?

Thanks in advance.

sMBR and other discriminative training methods tend to be sensitive to the amount of data (they need a lot of data, or else they won't generalize well), so I don't recommend doing any data selection. But if you really have tons of data (thousands of hours), some data selection might be useful. The language model you use should reflect the data you are training on, but the script already does that for you.

Dan

xiaofeng wu

Feb 17, 2017, 10:19:40 PM

to kaldi-help, dpo...@gmail.com

Thank you very much!