worse result on lattice rescoring

mili lali

Sep 27, 2019, 1:09:03 PM

to kaldi-help

Hi

I want to use lattice rescoring.

I have a text corpus of about 30M words and built 3-gram and 4-gram language models with SRILM.

I use utils/build_const_arpa_lm.sh to convert the LMs and steps/lmrescore_const_arpa.sh to rescore.

I use chain models based on wsj 1h.

The baseline WER is 19.69%,

and the WER after rescoring is 36%!

Why does rescoring make the results worse?

Daniel Povey

Sep 27, 2019, 1:12:37 PM

to kaldi-help

Make sure the word list is the same, or you can't do LM rescoring
(it would be complicated, at least), i.e. check that words.txt is the
same.

And you should probably check that the perplexity given your new
language model is better. And make sure that the "old LM" you give
to lmrescore_const_arpa.sh is the same one you decoded with.

Dan
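
One way to act on the word-list check is to diff the two symbol tables. The sketch below is only an illustration and is not part of Kaldi: it assumes each words.txt line has the form `<word> <integer-id>`, and the helper names and toy tables are hypothetical stand-ins for the words.txt files of the lang directory used for decoding and the one used for rescoring.

```python
def read_word_table(lines):
    """Parse words.txt-style lines ('<word> <integer-id>') into a dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, idx = parts
        table[word] = int(idx)
    return table


def compare_word_tables(old, new):
    """Return words missing from either table and words whose ids differ."""
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    id_mismatch = sorted(w for w in set(old) & set(new) if old[w] != new[w])
    return only_old, only_new, id_mismatch


# Toy tables (assumption): in practice, read the two words.txt files instead.
old_lines = ["<eps> 0", "hello 1", "world 2", "#0 3"]
new_lines = ["<eps> 0", "hello 1", "there 2", "world 3", "#0 4"]

old_table = read_word_table(old_lines)
new_table = read_word_table(new_lines)
only_old, only_new, id_mismatch = compare_word_tables(old_table, new_table)

print("only in old:", only_old)
print("only in new:", only_new)
print("same word, different id:", id_mismatch)
```

If any of the three lists is non-empty, the two lang directories do not share a word list, and the rescoring result should not be trusted until that is resolved.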

> --
> Go to http://kaldi-asr.org/forums.html find out how to join
> ---
> You received this message because you are subscribed to the Google Groups "kaldi-help" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/9860aa13-5f3f-4377-8246-640fcb73046c%40googlegroups.com.

mili lali

Sep 27, 2019, 1:18:53 PM

to kaldi-help

> And you should probably check that the perplexity given your new
> language model is better.

I think the perplexity of the new language model is higher than that of the old one.

I thought that a bigger LM corpus gives a higher perplexity.

So what is the right approach to lattice rescoring?

Daniel Povey

Sep 27, 2019, 1:20:47 PM

to kaldi-help

bigger == worse, for perplexity.
I don't understand your question.
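
To make "bigger == worse" concrete: perplexity is the inverse geometric mean of the probabilities a model assigns to the tokens of a text, so a model that gives the text lower probabilities gets a higher perplexity. The sketch below is a minimal illustration with made-up per-token probabilities; the function name and the numbers are assumptions, not output from any real model.

```python
import math


def perplexity(log10_probs):
    """Perplexity from per-token log10 probabilities: 10 ** (-mean log10 p)."""
    return 10 ** (-sum(log10_probs) / len(log10_probs))


# Hypothetical probabilities two models assign to the same 4-token text.
good_model = [math.log10(p) for p in (0.1, 0.1, 0.1, 0.1)]
poor_model = [math.log10(p) for p in (0.01, 0.01, 0.01, 0.01)]

ppl_good = perplexity(good_model)
ppl_poor = perplexity(poor_model)

print("model giving higher probabilities:", ppl_good)
print("model giving lower probabilities: ", ppl_poor)
```

The model that assigns each token a probability ten times smaller ends up with a perplexity about ten times larger, which is why a higher number indicates a worse fit to the text.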

mili lali

Sep 27, 2019, 1:24:04 PM

to kaldi-help

> bigger == worse, for perplexity.

Yes, I mean bigger = worse for perplexity.

What I mean is that we use a big LM for lattice rescoring, and I think a big LM has worse perplexity.

So what is the right approach to lattice rescoring?

Ho Yin Chan

Sep 28, 2019, 11:54:52 AM

to kaldi-help

A big LM usually has lower perplexity and better performance. It may be that your test data does not match your training text, or you may have had some issue while building the big LM.

mili lali

Sep 28, 2019, 12:23:47 PM

to kaldi-help

I use this script to train the language model:

https://github.com/kaldi-asr/kaldi/blob/8ce3a95761e0eb97d95d3db2fcb6b2bfb7ffec5b/egs/babel/s5d/local/train_lms_srilm.sh

Is it old? Is there a newer way or a newer script to train a language model?

mili lali

Oct 3, 2019, 11:29:58 AM

to kaldi-help

Hi

Instead of steps/lmrescore_const_arpa.sh, I used steps/lmrescore.sh and it works: the WER decreased.

What is the difference between them?

Did I do something wrong?

Best regards

Ho Yin Chan

Oct 3, 2019, 11:53:18 PM

to kaldi-help

From the information you provided, there are two issues.

1) The perplexity of your 4-gram is bigger, which means worse (please also let people know how big your test set is).

2) const ARPA rescoring doesn't seem to work well in your setup. Please check your commands for ./utils/build_const_arpa_lm.sh and steps/lmrescore_const_arpa.sh again to see if there is any mistake.

On Thursday, October 3, 2019 at 11:29:58 PM UTC+8, mili lali wrote:

mili lali

Oct 4, 2019, 5:44:21 AM

to kaldi-help

Hi

Thanks.

> 1) Perplexity of your 4-gram is bigger, which means worse (please also let people know how big is your test set).

My test set has about 10 hours of audio.

I have 3 language models. Here are the WER on the test set and the perplexity on the train text for each:

1- LM trained on a text corpus of 5M words (WER ~20%; ppl= 1185.709 ppl1= 4110.826)

2- LM 1 interpolated with an LM built from the train-set text (WER ~10%; ppl= 16.57505 ppl1= 27.14281)

3- LM trained on a text corpus of 50M words, used for rescoring (WER ~36%; ppl= 2499.126 ppl1= 9876.815)
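
A note on reading the ppl and ppl1 figures, assuming they come from SRILM's `ngram -ppl`: SRILM defines ppl = 10^(-logprob / (W + S)) and ppl1 = 10^(-logprob / W), where W is the number of scored words (words minus OOVs and zeroprobs) and S is the number of sentences, so ppl also counts the end-of-sentence tokens. It follows that log(ppl1) / log(ppl) = 1 + S / W, which lets the average number of scored words per sentence be recovered from the two figures alone. The sketch below applies that identity to the numbers above; the function name is hypothetical.

```python
import math


def scored_words_per_sentence(ppl, ppl1):
    """Recover W / S from SRILM-style ppl and ppl1.

    log(ppl1) / log(ppl) = (W + S) / W = 1 + S / W, so W / S = 1 / (ratio - 1).
    """
    ratio = math.log10(ppl1) / math.log10(ppl)
    return 1.0 / (ratio - 1.0)


# (ppl, ppl1) pairs copied from the three models listed above.
reported = {
    "LM 1": (1185.709, 4110.826),
    "LM 2": (16.57505, 27.14281),
    "LM 3": (2499.126, 9876.815),
}

lengths = {
    name: scored_words_per_sentence(ppl, ppl1)
    for name, (ppl, ppl1) in reported.items()
}
for name, value in lengths.items():
    print(f"{name}: about {value:.2f} scored words per sentence")
```

All three pairs imply the same value, a little under six scored words per sentence, which is what one would expect if the three models were evaluated on the same text and that text consists of short utterances.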

> 2) const arpa re-scoring seems doesn't work well in your work. Please check your command procedures again with ./utils/build_const_arpa_lm.sh and steps/lmrescore_const_arpa.sh to see if any mistake

Sorry, I made a mistake. At first I decoded with the big LM and rescored with the interpolated LM.

In the results above, I decoded with the big LM and rescored with the bigger LM.

I think my big text corpus is very far from my test data.

I think my speech recognition system depends strongly on the language model.

Ho Yin Chan

Oct 4, 2019, 7:13:07 AM

to kaldi-help

Your test set has 10 hours of audio, which means its text probably has only several tens of thousands of tokens. The 50M-word text can be used as training data as long as it is in the same language and is segmented and normalized in the same way as the test set text. I suspect the 50M-word text is mismatched with your test data text.
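
One simple symptom of such a mismatch is a high out-of-vocabulary (OOV) rate: test tokens that never appear in the LM's vocabulary, for example because of different normalization or segmentation. The sketch below is a minimal illustration under that assumption; the function name, the toy vocabulary, and the toy test sentences are all hypothetical, and a low OOV rate alone does not prove the domains match.

```python
def oov_rate(test_sentences, vocabulary):
    """Return the fraction of test tokens missing from the vocabulary,
    together with the sorted list of distinct missing tokens."""
    tokens = [tok for sent in test_sentences for tok in sent.split()]
    if not tokens:
        return 0.0, []
    oov = [tok for tok in tokens if tok not in vocabulary]
    return len(oov) / len(tokens), sorted(set(oov))


# Toy data (assumption): a formal-news vocabulary and informal test text.
vocab = {"the", "market", "closed", "higher", "today"}
test_text = ["the market closed higher", "gonna check the market", "ok"]

rate, missing = oov_rate(test_text, vocab)
print(f"OOV rate: {rate:.1%}")
print("missing tokens:", missing)
```

Running the same check with the real LM word list and the test transcripts gives a quick first indication of whether the two texts were normalized and segmented consistently.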

mili lali

Oct 5, 2019, 1:23:42 PM

to kaldi-help

Many thanks.

I think you are right. My test set is semi-informal, but my text corpus is formal news text with long sentences.

Also, my test set contains isolated words and short sentences.

1- Setting the informal case aside: for short sentences and isolated words, what do you think is a better approach to building a language model?

2- Is it a rule that a bigger corpus gives better perplexity (when train and test are from the same domain)?

Best regards
