mkgraph_lookahead.sh stuck on fstdeterminizestar for HCL for specific words

Mustafa Ebrar Aktaş

Oct 13, 2021, 12:21:48 PM
to kaldi-help
I am trying to create a lookahead graph for my already-trained, position-independent Kaldi model. I first followed mkgraph_lookahead.sh, and it got stuck on the composition of Ha.fst and CL_$N_$P.fst.
There are similar issues below:
        It is suggested to use a chain model, but I am already using one.
- https://github.com/kaldi-asr/kaldi/issues/4143
        It is suggested to omit fstdeterminizestar.

I then tried to narrow down the issue. I observed that with some mini lexicons I can create HCLr.fst and the script completes successfully.

Here are the different mini lexicons and their results:
  • Words are "GO" and "I" ("I" has a single-phone pronunciation): cannot complete.
GO G OW
I AY
<UNK> SPN

  • Words are "GO" and "IS", completes successfully.
GO G OW
IS IH Z
<UNK> SPN
  • As a small hack, I changed the pronunciation of "I" as below: completes successfully.
GO G OW
I AY AY
<UNK> SPN

I thought it was related to single-phone pronunciations. However, the lexicon below also gets stuck, while it completes successfully if I remove any one of the words.
AS AE Z
AS EH Z
AT AE T
FOR F AO R
FOR F ER0
FOR F R ER0
<UNK> SPN
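
The "remove any one word and it completes" observation can be checked systematically. The following is only a sketch: `build_finishes` is a hypothetical callable that you would supply yourself (for example, a wrapper that writes the lexicon, runs the graph script with a timeout, and returns True if it finished). It is not part of Kaldi.

```python
def is_minimal_failing(entries, build_finishes, keep=("<UNK>",)):
    """Return True if the full lexicon fails to build but every
    leave-one-word-out lexicon builds.

    entries        : list of lexicon lines, e.g. "AS AE Z"
    build_finishes : hypothetical user-supplied callable(list[str]) -> bool
    keep           : words that are never removed (assumed: <UNK>)
    """
    if build_finishes(entries):
        return False  # the full lexicon is not a failing case at all
    words = sorted({e.split()[0] for e in entries} - set(keep))
    for word in words:
        # Drop every pronunciation of this word, keep the rest.
        reduced = [e for e in entries if e.split()[0] != word]
        if not build_finishes(reduced):
            return False  # still fails without this word, so not minimal
    return True
```

A lexicon for which this returns True is a compact reproduction case to attach to a bug report.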

I think it is related to transitions introduced by specific contexts, which make HCL non-determinizable for my tree and model. Even with single-word lexicons, "I AY" fails while "I AY AY" completes. The former has the extra contexts below (the full lists are in the attached files):
SPN/AY/SIL
SPN/AY/SPN
SPN/AY/<eps>
SIL/AY/SIL
SIL/AY/SPN
SIL/AY/<eps>
<eps>/AY/SIL
<eps>/AY/SPN
<eps>/AY/<eps>
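
The nine extra contexts can be reproduced with a simplified model of triphone windows. This is a sketch under stated assumptions, not how Kaldi builds the context FST: it assumes that a phone at a word edge can be neighboured by SPN, SIL or `<eps>`, and it ignores word-to-word adjacency. The function name `centre_contexts` is made up for illustration.

```python
from itertools import product

# Assumed neighbours of a phone that sits at a word edge.
BOUNDARY = ("SPN", "SIL", "<eps>")

def centre_contexts(pron, phone, boundary=BOUNDARY):
    """Return the set of 'left/phone/right' strings for each
    occurrence of `phone` in the pronunciation `pron`."""
    contexts = set()
    for i, p in enumerate(pron):
        if p != phone:
            continue
        # Inside the word the neighbour is fixed; at an edge it can be
        # any of the boundary symbols.
        lefts = [pron[i - 1]] if i > 0 else boundary
        rights = [pron[i + 1]] if i + 1 < len(pron) else boundary
        for left, right in product(lefts, rights):
            contexts.add(f"{left}/{p}/{right}")
    return contexts

single = centre_contexts(["AY"], "AY")        # lexicon entry "I AY"
double = centre_contexts(["AY", "AY"], "AY")  # lexicon entry "I AY AY"
for ctx in sorted(single - double):
    print(ctx)
```

Under this model, only the single-phone pronunciation yields contexts in which both neighbours are boundary symbols, which matches the list above.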

My question: is it possible to make HCL determinizable? Omitting fstdeterminizestar makes the output HCLr.fst much bigger, which adds redundancy and slows down decoding.

I can share my model and tree if it is necessary.
i_ay_context_symbols_list_sorted.txt
i_ay_ay_context_symbols_list_sorted.txt

Daniel Povey

Oct 13, 2021, 11:11:28 PM
to kaldi-help
Are you sure you are using a chain model? The phone contexts you displayed are triphone, and all the examples of chain models that we have use left-biphone context.


Mustafa Ebrar Aktaş

Oct 14, 2021, 2:58:17 AM
to kaldi-help
Yes, it is a chain model. I have changed the model configuration for specific usage. Here is the output of tree-info:
num-pdfs 5872
context-width 3
central-position 1
Is the issue caused by using triphone context? 
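
For anyone checking their own model, the two relevant tree-info fields can be read off mechanically. This is a minimal sketch that parses output saved as text. The mapping of (context-width, central-position) pairs to names is an assumption based on commonly used values, and the helper names are made up for illustration.

```python
import re

def parse_tree_info(text):
    """Extract integer fields such as context-width from tree-info output."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([\w-]+)\s+(-?\d+)\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def describe_context(fields):
    """Name the context type for common (width, central-position) pairs."""
    key = (fields.get("context-width"), fields.get("central-position"))
    names = {
        (1, 0): "monophone",
        (2, 1): "left-biphone",  # assumed: the usual chain-recipe setting
        (3, 1): "triphone",
    }
    return names.get(key, "other")

sample = """num-pdfs 5872
context-width 3
central-position 1"""
info = parse_tree_info(sample)
print(describe_context(info))
```

With the values posted above, this reports a triphone tree.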

Daniel Povey

Oct 14, 2021, 5:15:52 AM
to kaldi-help
Yes, that's likely the problem.

Mustafa Ebrar Aktaş

Oct 14, 2021, 12:33:59 PM
to kaldi-help
Using left-biphone context seems to be the solution; I could successfully create a determinized HCLr with that.

Thanks.
