Creating a Small, Deterministic Language Model


ark...@onvego.com

Oct 7, 2017, 8:51:06 AM
to kaldi-help
Hi everyone,
I would like to build a system that uses a small vocabulary and a limited set of possible phrases, such as a dialog system.
My first question: can the decoding graph be reduced so that the system runs faster and can use a larger beam?
(I have looked at this guide but have not been able to work out what is needed.)
My second question: given a small vocabulary, would it be better to use a grammar-based language model rather than an n-gram?
(Since I have only a few phrases, a statistical model would be less accurate, wouldn't it?)

Thanks


Daniel Povey

Oct 7, 2017, 12:24:39 PM
to kaldi-help
If you have few words and/or a simple language model or grammar, your
graph will naturally be much smaller and you can increase the beam
(notice that in the RM setup, the beam is larger than the default;
there is a file in conf/ that we configure this with I think).

Your choice of statistical model vs. grammar will normally be driven
by the application, and by considerations of what you want to happen
when the user says something that is not in the grammar.
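
To illustrate the grammar option: a small phrase list can be written out as an OpenFst text-format acceptor, the kind of file that is typically compiled into G.fst. This is only a minimal sketch. The function name `phrases_to_fst_text` and the example phrases are hypothetical, all paths are left unweighted, and it assumes every word also appears in the words.txt symbol table used at compile time.

```python
def phrases_to_fst_text(phrases):
    """Return OpenFst text-format lines accepting exactly the given phrases.

    Arc lines are "src dst ilabel olabel"; a line holding a single state
    number marks that state as final. Shared prefixes reuse the same states.
    """
    next_state = 1   # state 0 is the start state
    trie = {}        # (source state, word) -> destination state
    arcs = []
    finals = []
    for phrase in phrases:
        state = 0
        for word in phrase.split():
            key = (state, word)
            if key not in trie:
                trie[key] = next_state
                arcs.append(f"{state} {next_state} {word} {word}")
                next_state += 1
            state = trie[key]
        if state not in finals:
            finals.append(state)
    return arcs + [str(s) for s in finals]


if __name__ == "__main__":
    # Hypothetical dialog-system phrases.
    for line in phrases_to_fst_text(["turn on", "turn off", "stop"]):
        print(line)
```

The output could then be compiled with OpenFst's fstcompile, passing words.txt as --isymbols and --osymbols. With only a handful of phrases the resulting graph has very few states, which is why a larger beam becomes affordable.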

ark...@onvego.com

Oct 7, 2017, 1:29:15 PM
to kaldi-help
Hi Dan,
Thank you very much for the quick answer.
When I used the n-gram, I added the symbol <unk> for words that are OOV.
Now that I am using a grammar, can I no longer do this?
I do not understand what you mean by "by considerations of what you want to happen
when the user says something that is not in the grammar".

Daniel Povey

Oct 7, 2017, 1:31:03 PM
to kaldi-help
What I mean is, if you limit the decoding to the grammar, then even if
someone says something different, it will be decoded as something in
the grammar, which may not be what you want.
I think you should read and think about the HTK Book-- unless you
understand the basic framework of how speech recognition works, you'll
be lost.

ark...@onvego.com

Oct 7, 2017, 1:48:48 PM
to kaldi-help
I understand your point, and I have read the book.
As I said, when I work with an n-gram,
I add a special symbol for words that are not in my language model (corpus/lexicon). So I am asking whether there is a similar option for a grammar.

Daniel Povey

Oct 7, 2017, 1:49:44 PM
to kaldi-help
Of course you are free to add the "unk" word to your grammar at any
point; it is a normal word.
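
One way to picture this: because "unk" is an ordinary word, it can sit on its own path in the grammar next to the in-grammar phrases, so out-of-grammar speech has somewhere to go instead of being forced onto a valid phrase. A minimal sketch, assuming a non-empty OpenFst text-format grammar whose start state is 0. The helper name `add_unk_path` and the cost of 2.0 are assumptions for illustration, not values from any Kaldi recipe.

```python
def add_unk_path(fst_lines, unk="<unk>", cost=2.0):
    """Append a parallel path that accepts one or more unk words.

    fst_lines uses OpenFst text format: arc lines are
    "src dst ilabel olabel [weight]" and final-state lines are
    "state [weight]". State 0 is assumed to be the start state.
    """
    states = set()
    for line in fst_lines:
        fields = line.split()
        states.add(int(fields[0]))
        if len(fields) >= 4:          # arc line: second field is a state too
            states.add(int(fields[1]))
    unk_state = max(states) + 1       # fresh state for the unk path
    return fst_lines + [
        f"0 {unk_state} {unk} {unk} {cost}",            # enter the unk path
        f"{unk_state} {unk_state} {unk} {unk} {cost}",  # repeat unk
        f"{unk_state}",                                 # unk path may end here
    ]


if __name__ == "__main__":
    # Hypothetical two-phrase grammar: "yes" or "no".
    grammar = ["0 1 yes yes", "0 2 no no", "1", "2"]
    for line in add_unk_path(grammar):
        print(line)
```

For this to be useful the lexicon would also need a pronunciation for <unk> (many Kaldi recipes map it to a spoken-noise phone), and the cost would need tuning so that <unk> does not absorb valid phrases.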