How to interpret .mdl file in Kaldi

iam2...@gmail.com

Dec 2, 2015, 12:01:27 PM
to kaldi-help
Hi,

I am trying to extract the GMM-HMM parameters (means, covariances, and transition probabilities) from a model trained with Kaldi, so I have been looking at the details of the .mdl file.

However, I am confused by the format. I understand the topology part, which defines a prototype for the GMM-HMM. After that comes the "<Triples>" section. Could anyone tell me what this part is about? Next, I have 172 phone states but only 134 pdfs. How can I tell which state each pdf represents? Finally, how can I get the transition probabilities between the states?

I have attached my .mdl file here. Could anyone help me interpret it, please? Thank you.
test.mdl

Daniel Povey

Dec 2, 2015, 3:48:05 PM
to kaldi-help
Read this page of documentation:
The output of 'show-transitions' may also be helpful in interpreting the model file.
Dan
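
For the transition-probability question, here is a minimal sketch, assuming the model has first been converted to text format (for example with gmm-copy --binary=false) and that the transition model stores its values as natural-log probabilities in a <LogProbs> section indexed by transition-id, with entry 0 unused. The function name and the sample values are hypothetical and only illustrate the idea; 'show-transitions' remains the more reliable way to see which transition each id refers to.

```python
import math
import re


def read_transition_probs(mdl_text):
    """Map transition-id -> probability from the <LogProbs> section of a text-format model."""
    match = re.search(r"<LogProbs>\s*\[(.*?)\]", mdl_text, re.DOTALL)
    if match is None:
        raise ValueError("no <LogProbs> section found; is the model in text format?")
    log_probs = [float(tok) for tok in match.group(1).split()]
    # Assumption: entry 0 is a placeholder because transition-ids start at 1.
    return {tid: math.exp(lp) for tid, lp in enumerate(log_probs) if tid > 0}


# Hypothetical excerpt, not taken from a real model.
sample = "<LogProbs>\n [ 0 -0.6931472 -0.6931472 -0.2231436 -1.609438 ]\n</LogProbs>"
probs = read_transition_probs(sample)
for tid, p in probs.items():
    print(tid, round(p, 3))
```
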


--
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

iam2...@gmail.com

Dec 2, 2015, 5:31:37 PM
to kaldi-help, dpo...@gmail.com
Thank you. It is very clear.

But a quick question: the model file has the labels <INV_VARS> and <MEANS_INVVARS>. Based on the labels and the values of the numbers, I guess these are the inverses of the variances and the inverses of the means. Is that right?

Thank you.

Daniel Povey

Dec 2, 2015, 5:44:53 PM
to iam2...@gmail.com, kaldi-help
You could have easily got this from the code, but it's the means, and the products of the means with the inverse variances.
Dan

iam2...@gmail.com

Dec 3, 2015, 10:53:43 AM
to kaldi-help, iam2...@gmail.com, dpo...@gmail.com
Thanks, Dan. However, when I checked the code, it says:
"mean_invvars(mix, d) contains the mean times inverse variance."
"note that mean_invvars(mix, d)*mean_invvars(mix, d)/inv_vars(mix, d) is the mean-squared times inverse variance"

I just want to confirm with you that they are the inverse variances (not the means) and the products of the means and the inverse variances, right?

Thank you.
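
Going by the code comments quoted above, the ordinary means and variances can be recovered per dimension as variance = 1 / inv_var and mean = mean_invvar / inv_var. A minimal sketch, assuming the <INV_VARS> and <MEANS_INVVARS> rows of one mixture component have already been read into Python lists (the function name and the numbers are hypothetical):

```python
def recover_gaussian(inv_vars, means_invvars):
    """Convert one component's stored rows back to plain means and variances."""
    variances = [1.0 / iv for iv in inv_vars]
    # mean * inv_var was stored, so dividing by inv_var gives the mean back.
    means = [miv / iv for miv, iv in zip(means_invvars, inv_vars)]
    return means, variances


# Hypothetical three-dimensional component, not taken from a real model.
inv_vars = [0.25, 4.0, 1.0]
means_invvars = [0.5, -8.0, 3.0]
means, variances = recover_gaussian(inv_vars, means_invvars)
print(means)
print(variances)
```
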

Kaldi_new_uSER

Aug 31, 2017, 2:30:04 AM
to kaldi-help, dpo...@gmail.com
I have a related question about the interpretation of triples.

In my GMM .mdl file, phone ID 6 has the following triples:

6 0 pdfNo1
6 0 pdfNo2
  ..
6 0 pdfNo12

6 1 pdfNo1
6 1 pdfNo2
  ..
6 1 pdfNo7

6 2 pdfNo1
6 2 pdfNo2
  ..
6 2 pdfNo13

Question 1: How come this phone has 12 pdfs in HMM state 0, 7 pdfs in HMM state 1 and 13 pdfs in HMM state 2? Does that mean the GMM has 12 components in state 0 of this phone, 7 components in state 1, and so on?

Question 2: What factor or training variable determines how many pdfs there will be in a state?

Daniel Povey

Aug 31, 2017, 2:39:10 PM
to Kaldi_new_uSER, kaldi-help
>
> 6 0 pdfNo1
> 6 0 pdfNo2
> ..
> 6 0 pdfNo12
>
> 6 1 pdfNo1
> 6 1 pdfNo2
> ..
> 6 1 pdfNo7
>
> 6 2 pdfNo1
> 6 2 pdfNo2
> ..
> 6 2 pdfNo13
>
> Question1 - How come this phone has 12 pdfs in HMMstate0, 7 pdfs in
> HMMstate1 and 13 pdfs in HMMstate2 ? Does that mean the GMM has 12
> components in state0 of this phone and 7 components in state1 and so on?


Read http://kaldi-asr.org/doc/hmm.html for more understanding.
This has nothing to do with mixture components (the individual
Gaussians within a GMM); it has to do with state clustering.

> Question2 What factor/training variable determines how many pdfs will be
> there in a state?

pdf and state are the same thing. If you mean how many pdfs per
phone, that's ultimately controlled by num-leaves. If you mean how
many mixture-components/Gaussians per pdf, that's ultimately
controlled by 'totgauss' and 'power' in the training scripts.
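
To illustrate how 'totgauss' and 'power' could interact, here is a simplified sketch, not Kaldi's actual code: every pdf starts with one Gaussian, and each remaining Gaussian goes to the pdf with the largest occupancy**power divided by its current Gaussian count. The function name, the occupancy counts and the default power of 0.25 are assumptions made for illustration; the real training scripts may differ in detail.

```python
import heapq


def allocate_gaussians(occupancies, total_gauss, power=0.25):
    """Spread total_gauss Gaussians across pdfs, favouring pdfs with more data."""
    if total_gauss < len(occupancies):
        raise ValueError("need at least one Gaussian per pdf")
    counts = [1] * len(occupancies)
    # heapq is a min-heap, so priorities are negated to pop the largest first.
    heap = [(-(occ ** power), i) for i, occ in enumerate(occupancies)]
    heapq.heapify(heap)
    for _ in range(total_gauss - len(occupancies)):
        _, i = heapq.heappop(heap)
        counts[i] += 1
        # Priority shrinks as a pdf receives more Gaussians.
        heapq.heappush(heap, (-(occupancies[i] ** power) / counts[i], i))
    return counts


# Hypothetical occupancy counts for three pdfs.
counts = allocate_gaussians([10000.0, 100.0, 1.0], total_gauss=10)
print(counts)
```

With a power below 1, pdfs that saw more data get more Gaussians, but far fewer than a split proportional to the raw counts would give them.
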

Kaldi_new_uSER

Aug 31, 2017, 9:29:25 PM
to kaldi-help, dummycom...@gmail.com, dpo...@gmail.com
Thanks for the information.

>pdf and state are the same thing

Is that so? The only mention of the term triple on the page http://kaldi-asr.org/doc/hmm.html says

"Each possible triple of (phone, hmm-state, pdf) maps to a unique transition-state"

That suggests an hmm-state can have multiple pdfs associated with it.

I think the documentation should be clearer and ideally walk through a complete .mdl file with an example. Currently it only explains the topology part with an example; the parts below that (triples, DiagGMM, etc.) should also be explained with a concrete example rather than in an abstract way.

Daniel Povey

Aug 31, 2017, 9:32:32 PM
to Kaldi_new_uSER, kaldi-help
The mdl file is not really meant to be human readable.

Daniel Povey

Aug 31, 2017, 9:46:11 PM
to Kaldi_new_uSER, kaldi-help
You are right though, the documentation on that stuff is not very easy to understand.  I'll see if I can rework it.

Pani Prithvi Raj

Mar 15, 2018, 2:16:27 AM
to kaldi-help
Hi. Thanks for your documentation. Could I please get clarification on the following? There are several triples for a particular (phone, hmm-state) pair, each carrying a different pdf-id (that is, a different GMM). What does that mean? How should I understand a particular HMM state of a phone having multiple possible emission GMMs?
Thanks in advance.

Daniel Povey

Mar 15, 2018, 3:05:25 PM
to kaldi-help
It has to do with the decision tree.  Maybe read the HTK Book to get familiar with the basic ideas of ASR.



kaldi-user

Mar 20, 2018, 3:12:08 AM
to kaldi-help
As far as I understand decision trees, in order to reduce the number of GMMs, several HMM states are tied together so that they share a common GMM (or pdf-id). This is clear to me. For instance, in the part of the .mdl file I am attaching below, state 0 of each of phones 6, 7, 8 and 9 is tied to pdf-id 972. This is a consequence of the decision tree (as I understand it).
What I am not clear about is how state 0 of phone 6 can have several pdf-ids (such as 1, 274, 860 and 972). I looked in the HTK Book for clarification, but could not resolve my confusion.
6 0 1 
6 0 274 
6 0 860 
6 0 972 
6 1 52 
6 1 118 
6 1 256 
....
7 0 964 
7 0 972 
Thanks for your patience in answering...
hmm.mdl

Daniel Povey

Mar 20, 2018, 3:58:40 PM
to kaldi-help
It has to do with phonetic context dependency.  I don't have time to explain further.


Sidifen Koe

Mar 20, 2018, 4:55:42 PM
to kaldi-help
Isn't it because a phone can occur in various contexts of other phones (e.g., triphones), and different contexts can have their own pdf-id? The triples in the model file cannot represent the context (they list only a single phone id, that of the center phone), so the same (phone, hmm-state) pair appears with multiple pdf-ids.
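
One way to see both effects at once (several pdf-ids under one phone state, and one pdf-id shared by several phones) is to index the triples in both directions. This is a minimal sketch, assuming the triples have been copied out of a text-format model as "phone hmm-state pdf-id" lines; the function name is hypothetical, and the data is the excerpt quoted earlier in this thread.

```python
from collections import defaultdict


def group_triples(lines):
    """Index (phone, hmm-state, pdf-id) triples by state and by pdf-id."""
    pdfs_by_state = defaultdict(list)  # (phone, hmm_state) -> [pdf_id, ...]
    states_by_pdf = defaultdict(list)  # pdf_id -> [(phone, hmm_state), ...]
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or not all(f.isdigit() for f in fields):
            continue  # skip tags, blank lines and elisions such as "...."
        phone, hmm_state, pdf_id = map(int, fields)
        pdfs_by_state[(phone, hmm_state)].append(pdf_id)
        states_by_pdf[pdf_id].append((phone, hmm_state))
    return pdfs_by_state, states_by_pdf


triples = """6 0 1
6 0 274
6 0 860
6 0 972
6 1 52
6 1 118
6 1 256
....
7 0 964
7 0 972""".splitlines()

pdfs_by_state, states_by_pdf = group_triples(triples)
print(pdfs_by_state[(6, 0)])  # pdf-ids reachable from state 0 of phone 6
print(states_by_pdf[972])     # phone states that share pdf-id 972
```

Under the context-dependency reading, each pdf-id listed for (6, 0) would correspond to a different set of neighbouring phones, which the triple itself does not record.
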