Computational time reduction of large language models


Sana Khamekhem

Sep 18, 2018, 5:04:03 AM
to kaldi-help
Hi all,

I'm using n-gram rescoring at the decoding stage of my recognition system. In my work, I need to compose the FST graph for each utterance.
So decoding all the utterances takes a lot of time and consumes a lot of memory.

I'm asking whether there is any research proposing a way to reduce the size of the FST, which would in turn reduce the computational time of decoding.

Daniel Povey

Sep 18, 2018, 12:41:49 PM
to kaldi-help
It depends on the specific scenario.  Why do you need to construct the FST for each utterance-- what is different each time?


Sana Khamekhem

Sep 19, 2018, 5:11:08 AM
to kaldi-help
I need to reconstruct the FST because the lexicon used changes for each utterance (dynamic lexicon).

Daniel Povey

Sep 19, 2018, 1:57:39 PM
to kaldi-help
There is a new way to do that fairly efficiently, see
egs/mini_librispeech/s5/local/grammar/extend_vocab_demo.sh
However, I'm not guaranteeing that you will find it easy to use.



Armando

Jan 16, 2019, 11:08:48 AM
to kaldi-help
I was thinking of experimenting with this framework for a different but related scenario, which is quite common in speech processing.
At run-time, a list of new words is provided together with some text (several hours of speech, or maybe some web-collected material), and the system is adapted to that data to improve recognition on in-domain data.
Usually I'd build a new LM with the old lexicon + new words and then interpolate with the old LM. But the decoding graph construction would take too much time and memory for an on-the-fly scenario. So I was actually building the new n-gram LM, with its "small" G.fst and HCLG.fst, and then using grammar-fst to stitch it to the old, big HCLG.fst through the non-terminal in the old G.fst (which can replace the word <unk>, or alternatively a different word that I'm not interested in seeing in the decoded output and that does not appear "too much" in the old LM).
I got some interesting results, and even a big improvement on the new test data, depending on the order of the new LM and also on the word replaced by the non-terminal in the old LM, without the new graph construction taking too much time or memory.

The thing is, I am only able to run make-grammar-fst smoothly if I comment out the line
KALDI_ERR << "Two arcs had the same left-context phone.";
in grammar-fst.cc

otherwise it exits with:

ERROR (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:168) Two arcs had the same left-context phone.

And if that line is there, I guess it needs to be there. I do not get that error if, as in the mini_librispeech demo, I build the non-top-level G.fst with only the new words and a simple uniform probability as the grammar.
As far as I know, my code is up to date.

The new G.fst (for the non-top-level graph) is built from an n-gram model exactly like the big one of the top-level graph.
I also followed the recommendation of concatenating #nonterm_begin and #nonterm_end at the start/end of the non-top-level G.fst (indeed, all the ilabel sequences from G.fst start with #nonterm_begin and end with #nonterm_end).

Any idea why that happens? (And whether using this framework for on-the-fly LM adaptation makes sense in general?)

Daniel Povey

Jan 16, 2019, 12:57:52 PM
to kaldi-help
It is likely a bug in the code or scripts somewhere, but can you please do a little work in gdb to give me some information about the call stack and any local variables that you think might be relevant?  e.g. entry_state, expected_nonterminal_symbol, left_context_phone?  (and please use phones.txt to translate the symbols to text form so I know what they mean).

Armando

Jan 17, 2019, 6:42:56 AM
to kaldi-help
The entry state is always 0;
the other information is printed in the loop over the arcs of the non-top-level FST:

src/fstbin//make-grammar-fst --write-as-grammar=false --nonterm-phones-offset=436 graph_extvocab_top/HCLG.fst 440 graph_extvocab_nontop/HCLG.fst graph_extvocab_combined/HCLG.fst
LOG (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:163) nonterminal:[#nonterm_begin] expected_nonterminal_symbol:[#nonterm_begin] left_context_phone:[+BREATH+_E]
LOG (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:163) nonterminal:[#nonterm_begin] expected_nonterminal_symbol:[#nonterm_begin] left_context_phone:[+BREATH+_S]
[ ... analogous LOG lines follow, one for every remaining left-context phone ... ]
LOG (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:163) nonterminal:[#nonterm_begin] expected_nonterminal_symbol:[#nonterm_begin] left_context_phone:[#nonterm_bos]
LOG (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:163) nonterminal:[#nonterm_begin] expected_nonterminal_symbol:[#nonterm_begin] left_context_phone:[+BREATH+_E]
ERROR (make-grammar-fst[5.5.239~2-48b5]:InitEntryOrReentryArcs():grammar-fst.cc:172) Two arcs had the same left-context phone.

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::FatalMessageLogger::~FatalMessageLogger()
fst::GrammarFst::InitEntryOrReentryArcs(fst::ConstFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, unsigned int> const&, int, int, std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, std::allocator<std::pair<int const, int> > >*)
fst::GrammarFst::Init()
fst::GrammarFst::GrammarFst(int, fst::ConstFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, unsigned int> const&, std::vector<std::pair<int, fst::ConstFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, unsigned int> const*>, std::allocator<std::pair<int, fst::ConstFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, unsigned int> const*> > > const&)
main
__libc_start_main
/people/amuscariello/git/kaldi/vecsys/src/fstbin//make-grammar-fst() [0x40e392]



Daniel Povey

Jan 17, 2019, 5:37:13 PM
to kaldi-help
OK, so the problem is that the start state of the sub-FST has more than one arc leaving it with the same phonetic-context label, which happened because of disambiguation symbols.
It looks like I need to add some stuff in the grammar-FST preparation code to handle this case; it's not hard and you can help test it.

Dan

Daniel Povey

Jan 17, 2019, 7:42:00 PM
to kaldi-help
Please see whether
fixes it.
You'll have to recompile and rebuild the graph (the smaller one).

Armando

Jan 18, 2019, 12:16:22 PM
to kaldi-help
I obtain the same result as before.
Also, if I try re-building the top-level graph (even if not necessary for this test) with
fstrmsymbols --only-at-start=true
determinization takes too much time and I have to stop it.

Daniel Povey

Jan 18, 2019, 1:57:07 PM
to kaldi-help
Oh, OK.  I think I have to fix the issue a different way, more like what I was originally talking about.  It will be done by tonight.

Dan


Daniel Povey

Jan 19, 2019, 12:58:29 AM
to kaldi-help

Armando

Jan 19, 2019, 1:53:00 PM
to kaldi-help
Well, the graph is now built,
but I cannot decode anymore with the combined HCLG.fst.



ASSERTION_FAILED (nnet3-latgen-faster-parallel[5.5.194~1-5635]:TopSortTokens():lattice-faster-decoder.cc:998) : 'loop_count < max_loop && "Epsilon loops exist in your decoding " "graph (this is not allowed!)"'

[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::FatalMessageLogger::~FatalMessageLogger()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::LatticeFasterDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::StdToken>::TopSortTokens(kaldi::decoder::StdToken*, std::vector<kaldi::decoder::StdToken*, std::allocator<kaldi::decoder::StdToken*> >*)
kaldi::LatticeFasterDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::StdToken>::GetRawLattice(fst::VectorFst<fst::ArcTpl<fst::LatticeWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::LatticeWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::LatticeWeightTpl<float> > > > >*, bool) const
kaldi::DecodeUtteranceLatticeFasterClass::operator()()
kaldi::TaskSequencer<kaldi::DecodeUtteranceLatticeFasterClass>::RunTask(kaldi::TaskSequencer<kaldi::DecodeUtteranceLatticeFasterClass>::RunTaskArgsList*)
std::thread::_Impl<std::_Bind_simple<void (*(kaldi::TaskSequencer<kaldi::DecodeUtteranceLatticeFasterClass>::RunTaskArgsList*))(kaldi::TaskSequencer<kaldi::DecodeUtteranceLatticeFasterClass>::RunTaskArgsList*)> >::_M_run()


clone

LOG (kld_lattice-scale[5.4.16~3-4f22]:main():lattice-scale.cc:90) Done 21 lattices.
Aborted

Daniel Povey

Jan 19, 2019, 1:57:53 PM
to kaldi-help
OK, I think that may be something you have to fix at the level of your decoding-graph building.
It is a requirement of Kaldi decoding graphs that you cannot have epsilon loops, and that could happen if your sub-graph can accept an empty sequence and your higher-level graph allows the sub-graph to occur in a loop with no real words in between.

Dan
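One way to check a combined graph for this condition, sketched here with plain OpenFst calls (the HCLG.fst path is a placeholder and this check is only an illustration, not something from the thread): keep only the arcs whose input label is epsilon and try to topologically sort the result; if the sort fails, the graph has epsilon loops.

  #include <fst/fstlib.h>
  #include <iostream>
  #include <memory>
  #include <vector>

  int main() {
    // Kaldi decoding graphs are plain OpenFst files; read as a generic Fst and
    // copy into a mutable VectorFst.  "HCLG.fst" is a placeholder path.
    std::unique_ptr<fst::Fst<fst::StdArc>> graph(
        fst::Fst<fst::StdArc>::Read("HCLG.fst"));
    if (graph == nullptr) return 1;
    fst::StdVectorFst eps_only(*graph);

    // Keep only arcs with an epsilon input label (ilabel == 0).
    for (fst::StateIterator<fst::StdVectorFst> siter(eps_only); !siter.Done();
         siter.Next()) {
      std::vector<fst::StdArc> kept;
      for (fst::ArcIterator<fst::StdVectorFst> aiter(eps_only, siter.Value());
           !aiter.Done(); aiter.Next())
        if (aiter.Value().ilabel == 0) kept.push_back(aiter.Value());
      eps_only.DeleteArcs(siter.Value());
      for (const auto &arc : kept) eps_only.AddArc(siter.Value(), arc);
    }

    // TopSort() returns false iff the FST is cyclic, i.e. epsilon loops exist.
    std::cout << (fst::TopSort(&eps_only) ? "no epsilon-input loops"
                                          : "epsilon-input loops present")
              << std::endl;
    return 0;
  }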


Armando

Jan 21, 2019, 11:54:03 AM
to kaldi-help
Oh, I think I'm doing something wrong in the generation of the G.fst of the subgraph (after all, a general n-gram backoff LM is the thing that is not shown in the demo).
If I take a list of new words and generate a simple G.fst as in the demo, the graph has no epsilon cycles. But if I do the same with arpa2fst, followed by start/end concatenation of #nonterm_begin and #nonterm_end, it looks like I am indeed introducing a path with only epsilons on the output side.
This is what I do:


  cat $newlm | \
  $lmbin/arpa2fst --disambig-symbol=#0 --read-symbol-table=$lang_ext/words.txt - $lang_ext/G.nobeos.fst
 
  echo "0 1 #nonterm_begin <eps>" > $lang_ext/nonterm_begin.txt
  echo "1" >> $lang_ext/nonterm_begin.txt
  echo "0 1 #nonterm_end <eps>" > $lang_ext/nonterm_end.txt
  echo "1" >> $lang_ext/nonterm_end.txt

  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt $lang_ext/nonterm_begin.txt > $lang_ext/nonterm_begin.fst
  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt $lang_ext/nonterm_end.txt > $lang_ext/nonterm_end.fst
 
  fstconcat  $lang_ext/nonterm_begin.fst $lang_ext/G.nobeos.fst |\
  fstconcat  - $lang_ext/nonterm_end.fst | fstarcsort --sort_type=ilabel > $lang_ext/G.fst


and this is what I have in G.fst:
0    1    6821    0
1    2    0    0
2    3    0    0    2.64242363
2    2    6824    6824    11.3357344
2    2    6825    6825    11.2054815
2    2    6826    6826    10.0648232
2    2    6827    6827    11.3288708
2    2    6828    6828    10.6278124
2    2    6829    6829    11.3357344
.
.
.
2    2    6935    6935    11.2054815
2    2    6936    6936    11.3357344
2    2    6937    6937    11.3357344
2    2    6938    6938    10.9181242
3    4    6822    0
4

So the path [0 1 2 3 4] has only <eps> on the output side.
Do you know the correct way of concatenating the nonterminals at the start/end? I think that's where I'm introducing it.

Daniel Povey

Jan 21, 2019, 12:04:51 PM
to kaldi-help
A general language model will allow an empty sequence.  To disallow that you may have to compose with an FST that allows one or more words.
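A minimal sketch of that kind of filter, in plain OpenFst (the file names and the word-id list are placeholders, not taken from the thread): build a two-state acceptor that accepts one or more real words and compose it with G before concatenating #nonterm_begin/#nonterm_end, so the all-epsilon path disappears.

  #include <fst/fstlib.h>
  #include <memory>
  #include <vector>

  // Acceptor over the given word ids that accepts sequences of one or more words.
  fst::StdVectorFst OneOrMoreWords(const std::vector<int> &word_ids) {
    fst::StdVectorFst f;
    int s0 = f.AddState(), s1 = f.AddState();
    f.SetStart(s0);
    f.SetFinal(s1, fst::TropicalWeight::One());  // final only after >= 1 word
    for (int w : word_ids) {
      f.AddArc(s0, fst::StdArc(w, w, fst::TropicalWeight::One(), s1));
      f.AddArc(s1, fst::StdArc(w, w, fst::TropicalWeight::One(), s1));
    }
    fst::ArcSort(&f, fst::ILabelCompare<fst::StdArc>());  // needed by Compose()
    return f;
  }

  int main() {
    // word_ids: all real word ids from words.txt, excluding <eps>, <s>, </s>, #0
    // and the #nonterm_* symbols; filling this list in is omitted here.
    std::vector<int> word_ids = { /* ... */ };
    std::unique_ptr<fst::Fst<fst::StdArc>> g(
        fst::Fst<fst::StdArc>::Read("G.nobeos.fst"));
    fst::StdVectorFst filter = OneOrMoreWords(word_ids), g_nonempty;
    // Match G's output word sequence against the filter; paths producing no
    // words (the all-epsilon path) are removed, everything else keeps its weight.
    fst::Compose(*g, filter, &g_nonempty);
    g_nonempty.Write("G.nobeos.noempty.fst");
    return 0;
  }

The same composition can of course also be done on the command line with fstcompose, given the filter FST in text form.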


Armando

Feb 1, 2019, 1:48:01 PM
to kaldi-help
I forgot to follow up with some experiments I have been doing. I wrote the FST of the LM (bigram, trigram) of the non-top-level graph so that no epsilon path exists, to avoid epsilon cycles in the combined graph.
I have done an experiment where I evaluate a baseline system on a (relatively) out-of-domain corpus, then add new words (115 words) using an in-domain corpus (~15 hours) to adapt the LM. I train the LM on those in-domain 15 hours and build the corresponding graph to be used as the non-top-level graph.
I used either a bigram or a trigram for the new LM (actually, I did not even include back-off arcs in the following experiments) and tried different words as nonterminals in the top-level graph (words that I am not interested in seeing in the decoded output anyway, such as different types of filler words, which are nonetheless modeled both acoustically and linguistically, like noise, breath, spoken noise, etc.).
I indicate with:

lines: number of lines in which this word appears in the LM of the top-level graph (where it is replaced by the nonterminal label)
time: time to build the combined graph (with make-grammar-fst)
mem: peak memory occupation during construction of the combined graph (I think these are important parameters in the on-the-fly scenario)
size: size of the combined LM
WER

Words to be used as nonterminals:
NOISE
CONV (kind of a spoken noise, like unintelligible words, or foreign words...)
FW (hesitation, mostly)
<unk>


Baseline ---> WER = 12.3
The size of the top-level graph is 716M.

                LINES    TIME        MEM     SIZE    WER
1) CONV    2G   250      few sec     < 1G    142M    9.4
           3G            few sec     < 1G    173M    8.5

2) <unk>   2G   1860     ~1 min      ~3G     1.2G    9.9
           3G            ~1.5 mins   ~5G     1.7G    9.1

3) NOISE   2G   12049    ~2.5 mins   12.5G   3G      9.5
           3G            ~4.5 mins   ~20G    4.4G    8.4

4) FW      2G   9843     ~3 mins     13G     3.3G    9.5
           3G            ~4.5 mins   19G     4.8G    8.7


In the case of a uniform probability over only the newly added words (as in the mini_librispeech demo) I get 11.8.

So maybe, being careful about certain parameters, there is potential to use this same framework for adapting the decoding graph to a new domain on the fly, where building the big graph from scratch is not feasible, for both time and memory reasons.

Actually, for the sake of comparison, I even tried building the "big" decoding graph from scratch, with an LM obtained by interpolating the out-of-domain LM and the new, smaller LM (estimated over the in-domain 15 hours) on a small in-domain development corpus.
The final WER was no better than those obtained with the previous "on-the-fly" adaptations (I think it was 9.9).

Daniel Povey

Feb 2, 2019, 2:42:10 PM
to kaldi-help
OK.  The 'make-grammar-fst' binary is not the only way to do that; the framework is designed so that it's easy to construct those combined graphs in memory, and the key thing is that you don't have to touch the big graph to do that, you can just pass it in as a const pointer.  The time taken to construct the combined GrammarFst would then be bounded by the time required to construct the small graph.

Also, make-grammar-fst has different options.  If you use --write-as-grammar=false it will expand the whole thing and write as a regular FST, which duplicates the small graph many times and is inefficient.  That option was provided for debugging purposes.

Dan
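For reference, a rough sketch of that in-memory construction, based on the GrammarFst constructor visible in the stack traces earlier in the thread (the header path is an assumption; the graph names and the values 436/440 are taken from the make-grammar-fst command above):

  #include "decoder/grammar-fst.h"   // header path assumed

  #include <fst/fstlib.h>
  #include <memory>
  #include <utility>
  #include <vector>

  int main() {
    // Same values as in the make-grammar-fst command earlier in the thread.
    int nonterm_phones_offset = 436;
    int nonterm_symbol = 440;

    // The graphs are plain OpenFst files; load them and hold them as ConstFst.
    std::unique_ptr<fst::Fst<fst::StdArc>> top_raw(
        fst::Fst<fst::StdArc>::Read("graph_extvocab_top/HCLG.fst"));
    std::unique_ptr<fst::Fst<fst::StdArc>> nontop_raw(
        fst::Fst<fst::StdArc>::Read("graph_extvocab_nontop/HCLG.fst"));
    fst::ConstFst<fst::StdArc> top_fst(*top_raw);
    fst::ConstFst<fst::StdArc> nontop_fst(*nontop_raw);

    // One (nonterminal symbol, sub-FST) pair per nonterminal.  GrammarFst keeps
    // only references to these graphs; the big top-level graph is not copied or
    // modified, so constructing the combined object is roughly as cheap as
    // building the small sub-graph.
    std::vector<std::pair<int, const fst::ConstFst<fst::StdArc> *>> ifsts;
    ifsts.push_back({nonterm_symbol, &nontop_fst});

    fst::GrammarFst grammar_fst(nonterm_phones_offset, top_fst, ifsts);
    // grammar_fst can now be handed to the grammar-aware decoder.
    return 0;
  }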


Armando

Feb 4, 2019, 10:18:57 AM
to kaldi-help
Right; indeed, as a grammar FST the construction is much more efficient in terms of disk and memory occupation and processing time. Since it is important for me that the decoding graph be shared by different processes, I thought that was not possible in grammar-fst format, as it does not inherit from OpenFst; but I see that the Read function loads the top-level FST as a ConstFst anyway, so shared memory access is possible.
Though, grammar-fst needs a specific decoder that supports this type; I'm not sure whether everything that can be done for ConstFst can be applied in a straightforward manner to grammar-fst (e.g., would the GPU-based decoding support grammar-fst?).
...

Daniel Povey

unread,
Feb 4, 2019, 1:46:42 PM2/4/19
to kaldi-help
No, the GPU-based decoding won't support it, for the time being at least; but it may be at least a couple months till that is merged anyway.




Armando

unread,
Feb 18, 2019, 5:17:31 AM2/18/19
to kaldi-help
I was looking to extend multi-threading to GrammarFst decoding as well, to have an nnet3-latgen-grammar-parallel.

I templated DecodeUtteranceLatticeFasterClass on the FST type, but it always crashes in Decode() called from operator(), whenever num-threads > 1,
so I was actually wondering whether there is, in principle, something in the structure of GrammarFst that makes it unsuited for multi-threading, or whether it is supposed to work like the other FST types
...

Armando

unread,
Feb 18, 2019, 6:58:44 AM2/18/19
to kaldi-help
oh, ok, it's written in the header file itself

""
   Caution: this class is not thread safe, i.e. you shouldn't access the same
   GrammarFst from multiple threads.  We can fix this later if needed.

Daniel Povey

unread,
Feb 18, 2019, 12:50:11 PM2/18/19
to kaldi-help
Yes, it's not thread safe.  But the GrammarFst object is very lightweight, it uses very little memory.  So you could easily construct a separate GrammarFst object for each thread without incurring a significant performance penalty.


Daniel Povey

unread,
Feb 18, 2019, 12:52:54 PM2/18/19
to kaldi-help
Oh, wait.  If you are reading the GrammarFst from disk then there might be a problem, as there is currently no way to copy that object, and reading it separately from disk will re-read all the component FSTs.  I will add some code right now that will let you do it fairly easily though.

Dan

Daniel Povey

unread,
Feb 18, 2019, 1:07:28 PM2/18/19
to kaldi-help

Armando

unread,
Feb 19, 2019, 5:08:15 AM2/19/19
to kaldi-help
I'm calling MakeThreadSafe after reading the grammar fst from disk,

    fst::GrammarFst fst;
    ReadKaldiObject(grammar_fst_rxfilename, &fst);
    //modif
    fst.MakeThreadSafe();

but attempts at decoding with multiple threads nonetheless fail with segmentation errors
...

Daniel Povey

unread,
Feb 19, 2019, 11:51:19 AM2/19/19
to kaldi-help
Just realized there was a bug in that code, a stray !.  Please pull and retry.


Armando

unread,
Feb 19, 2019, 12:34:49 PM2/19/19
to kaldi-help
It does not change; the code does not satisfy the condition

if (entry_arcs_[i].empty())

but

InitEntryArcs(0)

is already called in Init()

Daniel Povey

unread,
Feb 19, 2019, 12:41:27 PM2/19/19
to kaldi-help
Oh sorry, I realize the class has more mutable state than I thought.  I'll try to come up with a way to make it thread-safe.


Daniel Povey

unread,
Feb 19, 2019, 1:11:40 PM2/19/19
to kaldi-help
OK, I pushed some more changes to that branch.
Now the way to make it thread safe is different: you need to create copies of the GrammarFst objects,
which share the underlying ConstFst objects.

Dan

Armando

unread,
Feb 20, 2019, 3:06:29 AM2/20/19
to kaldi-help
thanks!

this works well
there's a slight increase in memory usage, but that seems negligible, and anyway it does not depend on the size of the FSTs used

Armando

unread,
Mar 15, 2019, 6:57:05 AM3/15/19
to kaldi-help
I'm just reviving the thread to add that, for rescoring (I mean, for subtracting the LM scores determined during decoding), the G.fst to subtract can indeed be built with fstreplace


nonterm_unk=$(grep nonterm:unk $lang/words.txt | awk '{print $2}')

fstreplace toplevel.G.fst 100000000 non-top-level.G.fst $nonterm_unk > decoding.G.fst

but without applying --remove-from-output=true with fstrmsymbols when producing toplevel.G.fst (at least the one used to generate decoding.G.fst)


although this way decoding.G.fst is expanded completely (but usually the G.fst for decoding is relatively small and can be generated quickly and without much resource usage)

Tae-young Jo

unread,
Mar 27, 2019, 9:39:31 AM3/27/19
to kaldi-help
Hello, Armando and Dan.

Actually I am building exactly the same system as described by Armando:
building the grammar-FST sub-graph from an n-gram ARPA LM.

With the recent update, the `Two arcs had the same left-context phone.` issue is cleared.
But I still have an epsilon cycle in the combined graph.
That means for some audio the decoding finishes without error,
but for some audio it fails with

ASSERTION_FAILED (nnet3-latgen-grammar[5.5.266~2-77ac7]:TopSortTokens():lattice-faster-decoder.cc:998) Assertion failed: (loop_count < max_loop && "Epsilon loops exist in your decoding " "graph (this is not allowed!)")

Could you let me know how to build the sub-graph from the ARPA successfully, so that no epsilon path exists?

I tried the following:

  gunzip -c $lm |\
    arpa2fst --read-symbol-table=$lang_ext/words.txt - | \
    fstprint --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - |\
    awk -F $'\t' '{
    if($3 == "<s>" )  {$3 = "#nonterm_begin"; $4 = "<eps>" }
    if($3 == "</s>")  {$3 = "#nonterm_end"; $4 = "<eps>" }
    if($3 == "<eps>") {$3 = "#0" }
    print $0 }' |\
    fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - |\
    fstarcsort --sort_type=ilabel >$lang_ext/G.fst


On Thursday, January 31, 2019 at 9:48:01 PM UTC-6, Armando wrote:
...

Armando

unread,
Mar 27, 2019, 9:56:21 AM3/27/19
to kaldi-help
I did not use arpa2fst, because you always have a path with epsilon symbols on the input side from the initial to the final state


You can write your own LM FST to avoid that.
In the following I did not add the backoff arcs, because I'm doing a light adaptation.

cat lm | this_script.pl > G.txt.fst

Then compile it to an FST with the appropriate symbol table to obtain G.fst (a sketch of this compile step follows after the script below);
this is the LM FST you use to build the non-top-level HCLG

my $ngram = 0; # the order of ngram we are visiting
my $n_ugram = 0;
my $n_bgram = 0;
my $cost;
my $cost_ugram;
my %Ustate;
my %Pugram;
my $bofstate = 4;
my $state_cnt = 5;
my $start_state = 0;
my $nonterm_begin_state = 1;
my $nonterm_end_state = 2;
my $end_state = 3;
my $unigram;
my %Bstate;

#I read the lm line-by-line; I store the unigram-bof and Prob in hashes keyed by the word id
#I print a state in ascending order starting from 5 for the first unigram and save the state info for a unigram in State hash indexed by word id
#0 -> 1 for the begin,  2 -> 3 for the end, 4 for bof state
print STDOUT "$start_state     $nonterm_begin_state     #nonterm_begin <eps>\n";
print STDOUT "$nonterm_end_state     $end_state    #nonterm_end <eps>\n";
print STDOUT "$end_state\n";
while(<STDIN>){

  $n_ugram = $1 if /ngram 1=(\d+)/;
  $n_bgram = $1 if /ngram 2=(\d+)/;
  if (m/^\\1-grams:$/) { $ngram = 1; }
  if (m/^\\2-grams:$/) { $ngram = 2; }
  if (m/^\\3-grams:$/) { $ngram = 3; }
  my @line = split(/\s+/, $_);
  if (@line > 1 && $ngram == 1 && $line[1] ne '<unk>'){
    $unigram = $line[1]; 
    $cost = -log(10**($line[2]));

    $cost_ugram = -log(10**($line[0]));
    $Pugram{$unigram} = $cost_ugram;
    $Ustate{$unigram} = $state_cnt; 

    print STDOUT "$nonterm_begin_state     $state_cnt     $unigram $unigram\n";
    print STDOUT "$state_cnt     $nonterm_end_state     <eps> <eps> $cost_ugram\n";
    $state_cnt++;
  }
  if (@line > 1 && $ngram == 2 && $line[1] ne '<unk>' && $line[2] ne '<unk>'){
    my $cost = -log(10**($line[0]));

    print STDOUT "$Ustate{$line[1]}     $state_cnt     $line[2] $line[2] $cost\n";
    print STDOUT "$state_cnt     $nonterm_end_state     <eps> <eps>\n";
    $Bstate{"$line[1] $line[2]"} = $state_cnt;
    $state_cnt++;
  }
  if (@line > 1 && $ngram == 3 && $line[1] ne '<unk>' && $line[2] ne '<unk>' && $line[3] ne '<unk>'){
    my $cost = -log(10**($line[0]));
    my $bigram = "$line[1] $line[2]";
    print STDOUT "$Bstate{$bigram} $nonterm_end_state $line[3] $line[3] $cost\n";
  }
    
}
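(Filling in the compile step mentioned above, in the same style as the other pipelines in this thread; the symbol-table path is just an example:)

# compile the text FST produced by the script above into the G.fst used for
# the non-top-level graph (words.txt must contain #nonterm_begin / #nonterm_end)
cat lm | this_script.pl |\
  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - |\
  fstarcsort --sort_type=ilabel > $lang_ext/G.fst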

Lucasjo Jo

unread,
Mar 27, 2019, 10:33:10 AM3/27/19
to kaldi-help
Okay, I see. Thank you for your brief example! Will look into this and build my own.

Anyway, the key is not to have any path with only epsilons... on the input side? the output side? I'm slightly confused


On Wednesday, March 27, 2019 at 10:56:21 PM UTC+9, Armando wrote:

Armando

unread,
Mar 27, 2019, 10:45:01 AM3/27/19
to kaldi-help
I'd say both input and output; it's an acceptor

Daniel Povey

unread,
Mar 27, 2019, 11:37:06 AM3/27/19
to kaldi-help
Or you could just use arpa2fst and then do `fstdifference` with the following FST:

0 0.0

(i.e. that only accepts the empty path).



Daniel Povey

unread,
Mar 27, 2019, 11:39:54 AM3/27/19
to kaldi-help
Actually the `fstdifference` usage message says:

Subtracts an unweighted DFA from an FSA.

so that would only work if the arpa2fst output was an acceptor-- which it isn't, quite, because of the disambig symbol #0.  So you'd have to do `fstproject` to project on the input (copy the #0 to the output side), and then after fstdifference, remove the #0 from the output side either by piping it through an appropriate awk script, or `fstrmsymbols` with appropriate args.

Armando

unread,
Mar 27, 2019, 12:00:09 PM3/27/19
to kaldi-help
thanks, Dan, I'll also try that

Lucasjo Jo

unread,
Mar 27, 2019, 2:01:01 PM3/27/19
to kaldi-help
Thank you, Dan

I did as follows, like you said, but I still get the `Epsilon loops exist in your decoding graph` error in the decoding phase (not for all files).

What did I miss?


  gunzip -c $lm |\
    arpa2fst --disambig-symbol=#0 --read-symbol-table=$lang_ext/words.txt - $lang_ext/G.nobeos.fst

  echo "0 1 #nonterm_begin <eps>" > $lang_ext/nonterm_begin.txt
  echo "1" >> $lang_ext/nonterm_begin.txt
  echo "0 1 #nonterm_end <eps>" > $lang_ext/nonterm_end.txt
  echo "1" >> $lang_ext/nonterm_end.txt

  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt \
    $lang_ext/nonterm_begin.txt > $lang_ext/nonterm_begin.fst
  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt \
    $lang_ext/nonterm_end.txt > $lang_ext/nonterm_end.fst

  fstconcat $lang_ext/nonterm_begin.fst $lang_ext/G.nobeos.fst |\
    fstconcat - $lang_ext/nonterm_end.fst | fstarcsort --sort_type=ilabel > $lang_ext/G.fst.tmp

  fstproject $lang_ext/G.fst.tmp $lang_ext/G.fst.tmp.project # copy input to output

  # acceptor only accept empty path
  cat << EOF > $lang_ext/empty.fst.txt
0 0.0
EOF
  fstcompile $lang_ext/empty.fst.txt $lang_ext/empty.fst

  fstdifference $lang_ext/G.fst.tmp.project $lang_ext/empty.fst $lang_ext/G.fst.tmp.diff

  fstprint --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt \
    $lang_ext/G.fst.tmp.diff | awk '{if($4=="#0") $4 = "<eps>"; print $0}' |\
    fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - |\
    fstarcsort --sort_type=ilabel >$lang_ext/G.fst



On Tuesday, March 26, 2019 at 6:39:54 PM UTC-6, Dan Povey wrote:

Daniel Povey

unread,
Mar 27, 2019, 2:40:56 PM3/27/19
to kaldi-help
Oh.  That's probably due to disambig symbols.  
Subtracting an FST like this would probably work.

0  0  #0  #0  0.0
0 0.0

Lucasjo Jo

unread,
Mar 27, 2019, 10:47:42 PM3/27/19
to kaldi-help
It works, Dan!

Slightly tricky, though, because fstdifference only takes an FSA.

Besides the disambiguation symbol, #nonterm_begin and #nonterm_end self-loops were also needed.

Thanks a lot.


On Thursday, March 28, 2019 at 3:40:56 AM UTC+9, Dan Povey wrote:

Daniel Povey

unread,
Mar 27, 2019, 11:01:51 PM3/27/19
to kaldi-help
Then maybe you just need to do it earlier on in the process of creating the FST, i.e. before adding the nonterm_begin.  As I said, to make it deal with FSAs, you might have to temporarily copy the #0 to the output side.
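(For concreteness, an untested sketch of that reordering, reusing the variable names from Lucasjo's script earlier in the thread and the acceptor from the previous message; this is not something posted in the thread itself:)

  # project the raw arpa2fst output so #0 appears on both sides (acceptor)
  gunzip -c $lm |\
    arpa2fst --disambig-symbol=#0 --read-symbol-table=$lang_ext/words.txt - |\
    fstproject - $lang_ext/G.nobeos.acceptor.fst

  # acceptor matching the empty path and paths made only of #0
  cat << EOF > $lang_ext/eps_or_disambig.fst.txt
0 0 #0 #0 0.0
0 0.0
EOF
  fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt \
    $lang_ext/eps_or_disambig.fst.txt $lang_ext/eps_or_disambig.fst

  # subtract those paths, then move #0 back to the input side only
  fstdifference $lang_ext/G.nobeos.acceptor.fst $lang_ext/eps_or_disambig.fst |\
    fstprint --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - |\
    awk '{if ($4 == "#0") $4 = "<eps>"; print $0}' |\
    fstcompile --isymbols=$lang_ext/words.txt --osymbols=$lang_ext/words.txt - \
      > $lang_ext/G.nobeos.fst

  # then concatenate nonterm_begin.fst / nonterm_end.fst and arcsort into G.fst as before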


Armando

unread,
Apr 1, 2019, 12:16:33 PM4/1/19
to kaldi-help
Hi

I see 

awk -v start=$highest_number '{print $1, NR+start}' <$tmpdir/extra_disambig.txt >>$dir/words.txt
in line 134 of

is that $dir/phones.txt instead?

Daniel Povey

unread,
Apr 1, 2019, 12:32:36 PM4/1/19
to kaldi-help
Yes you're right, can you make a PR?


Armando

unread,
Apr 1, 2019, 12:39:23 PM4/1/19
to kaldi-help
ok
...

Lucas Jo

unread,
Apr 2, 2019, 8:09:05 PM4/2/19
to kaldi-help
In extend_lang.sh, after adding the extra disambiguation symbol to phones.txt, I think the symbol also needs to be added to disambig.txt in the extended lang folder, or the validation step reports an error.

Daniel Povey

unread,
Apr 2, 2019, 8:23:24 PM4/2/19
to kaldi-help
Thanks, I hope one of you can make a PR.

On Tue, Apr 2, 2019 at 8:09 PM Lucas Jo <jty...@gmail.com> wrote:
In extend_lang.sh, after adding the extra disambiguation symbol to phones.txt, I think the symbol also needs to be added to disambig.txt in the extended lang folder, or the validation step reports an error.


Armando

unread,
Apr 4, 2019, 12:43:16 PM4/4/19
to kaldi-help
Hi

I see in
line 212
it looks to me that start_state is not initialized; can we replace it with loop_state instead?


On Wednesday, April 3, 2019 at 2:23:24 AM UTC+2, Dan Povey wrote:
Thanks, I hope one of you can make a PR.

On Tue, Apr 2, 2019 at 8:09 PM Lucas Jo <jty...@gmail.com> wrote:
In extend_lang.sh, after adding the extra disambiguation symbol to phones.txt, I think the symbol also needs to be added to disambig.txt in the extended lang folder, or the validation step reports an error.


Daniel Povey

unread,
Apr 4, 2019, 12:59:41 PM4/4/19
to kaldi-help
Yes you're right, please make a PR.



Lucas Jo

unread,
Apr 5, 2019, 3:00:32 AM4/5/19
to kaldi-help
Hi, Dan. 

Can I build a grammar FST in a nested way?

I mean, a sub-graph's sub-graph.

If there is any technical problem, please let me know.

Best,
Lucas


On Wednesday, April 3, 2019 at 7:59:41 PM UTC-6, Dan Povey wrote:

Daniel Povey

unread,
Apr 5, 2019, 12:35:40 PM4/5/19
to kaldi-help
There is nothing that prevents you from using nested sub-grammars.  In principle you can even do that recursively, as long as you never attempt to copy to a regular FST.  (I.e. you'd need to use the grammar version of the decoder).


Lucas Jo

unread,
Apr 5, 2019, 6:49:37 PM4/5/19
to kaldi-help
Okay :) it is very nice to hear that.

But I have a question here:

if a nonterm:ABC class is defined in a sub-graph, not in the top graph,

then to define a nested graph, is all I have to do to pass all the (symbol, sub-graph) pairs together with the top graph into the GrammarFst initializer (without any hierarchical info)?

Or do I have to build it step by step, i.e. building the low-level grammar FST first and defining the high-level one later?


On Saturday, April 6, 2019 at 1:35:40 AM UTC+9, Dan Povey wrote:

Daniel Povey

unread,
Apr 5, 2019, 6:58:20 PM4/5/19
to kaldi-help
The order doesn't matter, except for the very top level.
There can be cyclic dependencies.


Daniel Povey

unread,
Apr 5, 2019, 6:59:24 PM4/5/19
to kaldi-help
... but you should probably make sure there is no "left-recursion", that might blow up.
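(For concreteness, a hypothetical nested setup in command-line form; the same (symbol, FST) pairs could equally be passed to the GrammarFst constructor in code. The names, ids and offset below are made up for illustration:)

# #nonterm:address is referenced inside the HCLG built for #nonterm:contact,
# which in turn is referenced inside the top-level HCLG; the order of the
# pairs doesn't matter, only the top-level FST comes first
id_contact=$(grep '#nonterm:contact' lang/words.txt | awk '{print $2}')
id_address=$(grep '#nonterm:address' lang/words.txt | awk '{print $2}')
offset=$(grep '#nonterm_bos' lang/phones.txt | awk '{print $2}')

# leave --write-as-grammar at its default (true): a recursive grammar cannot
# be expanded into a regular FST, per the note above
make-grammar-fst --nonterm-phones-offset=$offset \
  graph_top/HCLG.fst \
  $id_contact graph_contact/HCLG.fst \
  $id_address graph_address/HCLG.fst \
  graph_combined/HCLG.gra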

Lucas Jo

unread,
Apr 5, 2019, 7:07:42 PM4/5/19
to kaldi-help
Thank you, will try and get back to you, if there is an issue!


On Saturday, April 6, 2019 at 7:59:24 AM UTC+9, Dan Povey wrote:

Armando

unread,
Apr 16, 2019, 1:46:28 PM4/16/19
to kaldi-help
Well, all in all, after doing some experiments, LM rescoring of the decoder lattices with a higher-order LM degrades the performance of the decoder alone in most cases.
I do not seem to subtract the original LM scores correctly.

The lm-to-subtract is obtained as I suggested earlier: by using fstreplace to replace the nonterminal points of the top-level G.fst with the non-top-level G.fst (the one I used to build the non-top-level HCLG fst, which works well in decoding).
I thought that, in principle, this should allow recovering exactly the LM scores for an LM history as in decoding.

If I subtract and then add back the LM scores from this FST to the decoder lattices, I should get the same lattices, but they look like a smaller version, i.e. many paths have been discarded.
I can see this with high verbosity, for example

 VLOG[2] (lattice-lmrescore-pruned[5.5.369~121-47da0]:Compose():compose-lattice-pruned.cc:930) Input lattice had 215/186 arcs/states; output lattice has 90/84 (before pruning: 94/88)


so the final WER is worse; the degradation is observed even if I add a higher-order LM FST,
so I suspect the lm-to-subtract is just built in the wrong way.
Is there some easier way to see what's happening in rescoring than looking at the code? The composition is very complicated

Daniel Povey

unread,
Apr 16, 2019, 3:15:47 PM4/16/19
to kaldi-help
The OpenFst composition algorithm is quite complicated, I think, due to capabilities for things like caching and lookahead.  In principle composition is quite simple though.
If subtracting the original LM scores is not working, you can try a different approach where you just zero them out and then reintroduce the lexicon/silence scores by composing with the lexicon FST.  There is one of the 'modes' in the lmrescore.sh that does something like that.

Dan


Armando

unread,
Apr 17, 2019, 8:09:55 AM4/17/19
to kaldi-help
Yes, indeed, mode 4, thanks
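(For reference, and assuming the standard script interface, mode 4 would be selected roughly like this; the paths are placeholders, so check steps/lmrescore.sh itself for the exact arguments:)

# lattice rescoring with mode 4, i.e. the approach described in Dan's previous
# message (zero out the old LM scores, reintroduce the lexicon/silence scores
# by composing with the lexicon FST, then add the new G scores)
steps/lmrescore.sh --mode 4 --cmd "$decode_cmd" \
  data/lang_old data/lang_new data/test \
  exp/chain/tdnn/decode_test exp/chain/tdnn/decode_test_rescored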
