How independent are final.mdl and HCLG.fst?

jhennrich89

Jun 20, 2018, 4:18:26 AM
to kaldi-help
I'm building a subject-dependent system for online decoding (gmm-latgen-faster) that allows for swapping out different language models. My understanding is that the trained models for the users (final.mdl) and the language models (HCLG.fst) are mostly independent of each other, meaning that I can train several user models and language models:

final_user1.mdl
final_user2.mdl
...

HCLG_1.fst + words_1.txt
HCLG_2.fst + words_2.txt
...

and use any triplet of them (final_user*.mdl, HCLG_*.fst, words_*.txt) for online decoding. Is this correct?

My approach to building the language models and user models is the following (please correct me if I got something wrong):

- use prepare_lang.sh
- run the training script several times with training data from different users to get the user models (final_user1.mdl, final_user2.mdl, etc.) and the tree
(the tree should be the same for each run, right?)

- then for each new language model I change words_*.txt and run prepare_lang.sh again. I know I need to make sure to use the same phone set (pass "--phone-symbol-table phones.txt"). Is there anything else to watch out for?
- then I run mkgraph.sh using the new L_disambig.fst, G.fst, disambig.int and ANY of the (tree) and (final_user*.mdl) files to build HCLG_*.fst
- Is it a problem, if the disambiguation symbols created by prepare_lang.sh (disambig.int, L_disambig.fst) differ from the ones used for training the user-models?
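
For concreteness, the graph-building steps above might be sketched like this (the directory names and the "<UNK>" OOV entry are placeholder assumptions, not from an actual setup):

```shell
# Build a second lang directory for the new word list, reusing the
# phone symbol table so phone IDs match the already-trained models.
utils/prepare_lang.sh --phone-symbol-table data/lang/phones.txt \
  data/local/dict_2 "<UNK>" data/local/lang_tmp_2 data/lang_2

# Compile the decoding graph against one of the model directories
# (mkgraph.sh reads tree and final.mdl from exp/user1).
utils/mkgraph.sh data/lang_2 exp/user1 exp/user1/graph_2
```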

Daniel Povey

Jun 20, 2018, 3:13:17 PM
to kaldi-help
> I'm building a subject-dependent system for online decoding
> (gmm-latgen-faster) that allows for swapping out different language models.
> My understanding is that the trained models for the users (final.mdl) and
> the language models (HCLG.fst) are mostly independent of each other, meaning
> that I can train several user-models and language-models:
>
> final_user1.mdl
> final_user2.mdl
> ...
>
> HCLG_1.fst + words_1.txt
> HCLG_2.fst + words_2.txt
> ...
>
> and use any triplet of them (final_user*.mdl, HCLG_*.fst, words_*.txt) for
> online decoding. Is this correct?

That seems reasonable.


> My approach to building the language-models and user-models is the following
> (please correct me if I got something wrong):
>
> - use prepare_lang.sh
> - run the training script several times with training data of different
> users to get the user-models (final_user1.mdl, final_user2.mdl, etc.) and
> (tree)
> (the tree should be the same for each pass, right?)


You won't be able to use the same graph if the tree is not the same.
It's unusual to train models per user; this is unlikely to work well
unless you have quite a lot of data per user (e.g. more than 10
hours). I would just train the normal way using adaptation, via
train_sat.sh.
If you have a reasonable amount of data per user (e.g. at least an
hour or two), though, it might be worthwhile to try MAP adaptation of
the speaker-independent model to the individual speaker's
characteristics using train_map.sh.
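
A hedged sketch of that pipeline (the leaf/Gaussian counts, data and alignment directory names below are illustrative; check each script's usage message for the exact arguments):

```shell
# Speaker-adaptive training on the pooled data from all users.
steps/train_sat.sh 2500 15000 data/train data/lang exp/tri2_ali exp/tri3

# Align one user's data with the SAT model, then MAP-adapt the model
# to that user; low-count parameters back off to the shared model.
steps/align_fmllr.sh data/train_user1 data/lang exp/tri3 exp/tri3_ali_user1
steps/train_map.sh data/train_user1 data/lang exp/tri3_ali_user1 exp/tri3_map_user1
```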

> - then for each new language model I change words_*.txt and run
> prepare_lang.sh again. I know I need to make sure to use the same phoneset
> (pass "--phone-symbol-table phones.txt"). Is there anything else to watch
> out for?

What you describe should work.

> - then I run mkgraph.sh using the new L_disambig.fst, G.fst, disambig.int
> and ANY of the (tree) and (final_user*.mdl) files to build HCLG_*.fst
> - Is it a problem, if the disambiguation symbols created by prepare_lang.sh
> (disambig.int, L_disambig.fst) differ from the ones used for training the
> user-models?

That doesn't matter.


Dan

jhennrich89

Jun 21, 2018, 7:25:40 AM
to kaldi-help
Thanks, that helped!

About the user-dependent models: I am not doing typical ASR. The data and features are quite different, and since the task itself is much simpler than typical ASR while the differences between users are bigger, I decided to try a user-dependent system first. But I will look into adaptation techniques like fMLLR in the future!

Right now I am thinking about adapting a very basic model to new users by retraining (several iterations of gmm-align-compiled + gmm-acc-stats-ali + gmm-est) models previously trained on other users.
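
A minimal sketch of such a retraining loop, assuming the training graphs and features have already been prepared (the paths, beam values, and iteration count are illustrative):

```shell
# Re-estimate a model trained on other users with a few EM iterations
# on the new user's data: align, accumulate stats, update.
feats="ark:exp/adapt/feats.ark"
cur=exp/base/final.mdl
for iter in 1 2 3; do
  gmm-align-compiled --beam=10 --retry-beam=40 $cur \
    ark:exp/adapt/fsts.ark "$feats" ark:exp/adapt/ali.$iter.ark
  gmm-acc-stats-ali $cur "$feats" ark:exp/adapt/ali.$iter.ark exp/adapt/$iter.acc
  gmm-est $cur exp/adapt/$iter.acc exp/adapt/$iter.mdl
  cur=exp/adapt/$iter.mdl
done
```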

Daniel Povey

Jun 21, 2018, 3:40:56 PM
to kaldi-help
>
> About the user-dependent models: I am not doing typical ASR. The data and
> features are quite different and since the task itself is a lot simpler than
> typical ASR but the differences between users are bigger I decided to try a
> user-dependent system first. But I will look into adaptation techniques like
> fMLLR in the future!
>
> Right now I am thinking about adapting a very basic model to new users by
> retraining (several iterations of gmm-align-compiled + gmm-acc-stats-ali +
> gmm-est ) old models trained on different users.

What train_map.sh does is a smarter (and time-tested) way of doing a
similar thing, where it backs off to the parameters estimated on the
other users' stats if the counts are very low.

Dan

jhennrich89

Jun 24, 2018, 9:27:45 AM
to kaldi-help
I have another question: why are there disambiguation symbols added to the training graphs in train_mono.sh? Those graphs only contain phoneme sequences for a single word, so disambiguation symbols should not be required.

compile-train-graphs --read-disambig-syms=$lang/phones/disambig.int $dir/tree $dir/0.mdl  $lang/L.fst [...]

Daniel Povey

Jun 24, 2018, 2:02:21 PM
to kaldi-help
Because L.fst already has them, and they need to be removed during a
certain stage of the graph compilation.

Daniel Povey

Jun 24, 2018, 2:04:42 PM
to kaldi-help
Actually, that's wrong: L_disambig.fst has them but L.fst does not.
It's possible that they are not needed there; I don't recall now. In
any case, they are harmless as they are only removed, not inserted.
Dan

jhennrich89

Jun 25, 2018, 5:57:16 AM
to kaldi-help
Alright, I had thought the --read-disambig-syms parameter added the symbols.