create my own grammar G.fst


rcd...@hotmail.com

Dec 9, 2015, 6:50:30 AM
to kaldi-help
Hi, all

Can anyone give me an example Kaldi command showing how to create our own grammar G.fst?
Any sentence is okay, like "Hello world".

Daniel Povey

Dec 9, 2015, 2:34:41 PM
to kaldi-help
A grammar from one sentence doesn't really make sense.
Read www.openfst.org to understand OpenFst.  What you want is an acceptor (input and output symbols the same).

Here is an example grammar FST:

cat <<EOF >words.txt
<eps> 0
monday 1
tuesday 2
black 3
EOF

cat <<EOF | fstcompile --isymbols=words.txt --osymbols=words.txt --keep_isymbols=false --keep_osymbols=false >G.fst
0    1  black  black  0.0
1    2  monday monday 0.0
1    2  tuesday tuesday 0.0
2  0.0
EOF

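This will recognize either 'black monday' or 'black tuesday'.  Note that the 0, 1 and 2 in the leading columns are state numbers: each arc line reads "source-state destination-state input-word output-word cost", and the line "2  0.0" marks state 2 as final with cost 0.0.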
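
If you have a list of allowed sentences rather than a hand-drawn graph, the same text format can be generated mechanically. The following is only a sketch under a few assumptions: the file names sentences.txt and G.txt are made up for illustration, each sentence becomes its own linear path out of state 0, and the determinize/minimize/arcsort steps are an optional tidy-up that is not part of the example above. The compile step is skipped if OpenFst is not on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: build words.txt and a text-format acceptor from a sentence list.
# sentences.txt and G.txt are illustrative names, not Kaldi conventions.
set -euo pipefail

cat <<EOF >sentences.txt
black monday
black tuesday
EOF

# Symbol table: <eps> must be 0; words are numbered in order of first appearance.
awk 'BEGIN { print "<eps> 0" }
     { for (i = 1; i <= NF; i++)
         if (!($i in seen)) { seen[$i] = ++n; print $i, n } }' sentences.txt >words.txt

# One linear path per non-empty sentence, all leaving state 0.
# OpenFst takes the source state of the first line as the start state.
awk 'NF > 0 {
       src = 0
       for (i = 1; i <= NF; i++) {
         dst = ++last                       # allocate a fresh state
         print src, dst, $i, $i, "0.0"      # same input and output word: an acceptor
         src = dst
       }
       print src, "0.0"                     # end of sentence: final state
     }' sentences.txt >G.txt

# Compile only if OpenFst is installed. Determinizing and minimizing merges
# the shared "black" prefix (an assumed, optional clean-up step).
if command -v fstcompile >/dev/null 2>&1; then
  fstcompile --isymbols=words.txt --osymbols=words.txt \
    --keep_isymbols=false --keep_osymbols=false G.txt |
    fstdeterminize | fstminimize | fstarcsort --sort_type=ilabel >G.fst
fi
```

Every word in words.txt also has to appear in your lexicon, otherwise composing G.fst with L.fst will produce an empty or broken graph.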

Dan



d wk

Dec 9, 2015, 2:52:17 PM
to kaldi-help, dpo...@gmail.com
If you are working with scripts from egs/, many (such as voxforge) generate language models in lm.arpa format. You can make your own such models from a corpus of text, e.g.

$ ~/kaldi/tools/srilm/bin/i686-m64/ngram-count -text corpus.txt -lm lm.arpa

The utils/mkgraph.sh and related scripts take this lm.arpa and create G.fst etc all the way to HCLG.fst. See egs/voxforge/s5/run.sh.

Julian Hall

May 30, 2017, 6:38:33 AM
to kaldi-help
Replying to this because this thread is one of the highest Google search results for how to make G.fst, and it doesn't contain the correct answer:

To make a G.fst from an lm.arpa.gz file, the appropriate script is utils/format_lm.sh.  Then utils/mkgraph.sh takes that G.fst and composes it with the lexicon and other FSTs to produce HCLG.fst.
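
As a rough sketch, the two steps could be chained like this. All directory names (data/lang, data/local/dict/lexicon.txt, data/lang_test, exp/tri1) are assumptions in the style of the egs/ recipes, not something stated in this thread, and the argument order is the one I believe the scripts' usage messages give, so run each script without arguments in your own checkout to confirm. The run helper is hypothetical: it prints every command and only executes it when utils/format_lm.sh is present in the current directory.

```shell
#!/usr/bin/env bash
# Sketch: ARPA LM -> G.fst -> HCLG.fst. Paths below are assumed, adjust to your setup.
set -euo pipefail

lang=data/lang                        # assumed: lang dir built by utils/prepare_lang.sh
lm=lm.arpa.gz                         # gzipped ARPA language model
lexicon=data/local/dict/lexicon.txt   # assumed: lexicon used to build $lang
lang_test=data/lang_test              # output: copy of $lang with G.fst added
model=exp/tri1                        # assumed: directory of a trained model

# Hypothetical helper: show the command, run it only inside a Kaldi recipe dir.
run() {
  echo "+ $*"
  if [ -x utils/format_lm.sh ]; then "$@"; fi
}

# Step 1: convert the ARPA LM into G.fst inside $lang_test.
run utils/format_lm.sh "$lang" "$lm" "$lexicon" "$lang_test"

# Step 2: compose G.fst with the lexicon, context and HMM FSTs into HCLG.fst.
run utils/mkgraph.sh "$lang_test" "$model" "$model/graph"
```

If everything works, G.fst should end up in data/lang_test and HCLG.fst in exp/tri1/graph.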