About adding silence phones with different HMM topologies


賴禹邵

Aug 16, 2016, 12:12:02 AM
to kaldi-help
Hi, I've seen that Kaldi provides two phone types, non-silence phones and silence phones, each of which can contain several phone units, listed in nonsilence_phones.txt and optional_silence.txt respectively in the directory data/dict/.

Now I have two silence phones intended to handle different silence conditions in speech, and I need to assign them different HMM topologies, i.e. different numbers of states and different transition probabilities. I modified utils/gen_topo.pl and successfully created the topo file I want, but prepare_lang.sh then fails with the error message below:

ERROR: data/lang/topo's silence section doesn't correspond to silence.txt

This seems to tell me that giving silence phones different topologies is not allowed. Am I right?
If so, is there another way to implement this, even if it is not easy to achieve?
I am willing to try any alternative that is proposed.
Thanks for any advice.
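
For reference, a topo file with a different number of states per phone group can also be written by a small script instead of editing gen_topo.pl. The following is a minimal Python sketch under stated assumptions, not the logic of gen_topo.pl: `left_to_right_entry` and `make_topo` are hypothetical helper names, every entry is a plain left-to-right chain with a self-loop probability of 0.75, and the phone IDs and state counts are example values only.

```python
def left_to_right_entry(phone_ids, num_states, self_loop=0.75):
    """Build one <TopologyEntry> with `num_states` emitting states in a chain."""
    lines = [
        "<TopologyEntry>",
        "<ForPhones>",
        " ".join(str(p) for p in phone_ids),
        "</ForPhones>",
    ]
    for s in range(num_states):
        # Each emitting state either loops on itself or moves to the next state.
        lines.append(
            f"<State> {s} <PdfClass> {s} "
            f"<Transition> {s} {self_loop:g} "
            f"<Transition> {s + 1} {1 - self_loop:g} </State>"
        )
    # The last state is the non-emitting final state and has no transitions.
    lines.append(f"<State> {num_states} </State>")
    lines.append("</TopologyEntry>")
    return lines


def make_topo(groups):
    """groups: list of (phone_ids, num_states) pairs (example layout, assumed)."""
    lines = ["<Topology>"]
    for phone_ids, num_states in groups:
        lines.extend(left_to_right_entry(phone_ids, num_states))
    lines.append("</Topology>")  # exactly one closing tag, at the very end
    return "\n".join(lines) + "\n"


# Example values: non-silence phones 3..40 with 3 states,
# one silence phone with 5 states, another with a single state.
topo = make_topo([(range(3, 41), 3), ([1], 5), ([2], 1)])
print(topo)
```

Generating all entries in one loop keeps the opening and closing tags paired and the state numbering consecutive, which is easy to get wrong when editing the file by hand.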


Daniel Povey

Aug 16, 2016, 12:25:54 AM
to kaldi-help
You can just modify the validate_lang.pl script by commenting out the
bit that's failing. It's just the validation script that doesn't like
this, the rest of Kaldi is OK with it.
Dan

賴禹邵

Aug 16, 2016, 3:02:17 AM
to kaldi-help
Thanks Dan, I've modified validate_lang.pl slightly, and the validation error is gone.
But now I get a new error message in the subsequent monophone model training:

ERROR (gmm-init-mono:GetStubMap():build-tree-utils.cc:906) Assertion failed: len > 0

I'm not sure whether this is related to my topo settings:

<Topology>
<TopologyEntry>
<ForPhones>
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
</ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.75 <Transition> 1 0.25 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.75 <Transition> 2 0.25 </State>
<State> 2 <PdfClass> 2 <Transition> 2 0.75 <Transition> 3 0.25 </State>
<State> 3 </State>
</TopologyEntry>
<TopologyEntry>
<ForPhones>
1
</ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.25 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 2 <PdfClass> 2 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 3 <PdfClass> 3 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 4 <PdfClass> 4 <Transition> 4 0.75 <Transition> 5 0.25 </State>
<State> 5 </State>
</TopologyEntry>
</Topology>
<TopologyEntry>
<ForPhones>
2
</ForPhones>
<State> 0 <PdfClass> 0 </State>
<State> 0 <PdfClass> 0 <Transition> 0 0.75 <Transition> 1 0.25 </State>
<State> 1 </State>
</TopologyEntry>
</Topology>

The phone with ID 2 is the other silence phone I mentioned (call it "sil_2"), which differs from the original silence phone (phone ID 1, called "sil") in its HMM topology. Yes, it has only one state, which may not be a good idea, but I want to try it anyway.
I have also prepared the files that involve silence, namely lexicon.txt, silence_phones.txt and optional_silence.txt, as follows:

lexicon.txt:
zhi zh FNULL1
chi ch FNULL1
...
...
ei eh yi3
me m e
sil sil
sil_2 sil_2

silence_phones.txt:
sil
sil_2

optional_silence.txt:
sil_2

Did I miss something that is needed to train a monophone model successfully?
Thanks.

賴禹邵

Aug 16, 2016, 10:43:52 AM
to kaldi-help
I just found that the format of my topo file was wrong.
After correcting it, the program runs perfectly.
The problem is solved now, thanks.
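
The topo posted earlier in this thread has a `</Topology>` tag before the entry for phone 2 and two `<State> 0` lines inside that entry; the thread does not say which of these was the mistake that got fixed. A rough way to catch this kind of slip before training is sketched below in Python. It is an independent sanity check under my own assumptions, not what Kaldi's Read() in hmm-topology.cc does, and `check_topo` is a hypothetical name.

```python
import re


def check_topo(text):
    """Return a list of human-readable problems found in a topo file's text."""
    problems = []
    tokens = text.split()

    # There should be one opening tag and the closing tag should come last.
    if tokens.count("<Topology>") != 1:
        problems.append("expected exactly one <Topology> tag")
    if "</Topology>" not in tokens:
        problems.append("missing </Topology>")
    elif tokens.index("</Topology>") != len(tokens) - 1:
        problems.append("content found after the first </Topology>")

    # Inside each entry, state indices should run 0, 1, 2, ... with no repeats.
    for entry in re.findall(r"<TopologyEntry>(.*?)</TopologyEntry>", text, re.S):
        phones = re.search(r"<ForPhones>(.*?)</ForPhones>", entry, re.S)
        label = " ".join(phones.group(1).split()) if phones else "?"
        states = [int(s) for s in re.findall(r"<State>\s+(\d+)", entry)]
        if states != list(range(len(states))):
            problems.append(
                f"phones [{label}]: state indices {states} are not consecutive from 0"
            )
    return problems


# Shortened sample with the same two structural slips as the posted file.
sample = """<Topology>
<TopologyEntry>
<ForPhones>
1
</ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.75 <Transition> 1 0.25 </State>
<State> 1 </State>
</TopologyEntry>
</Topology>
<TopologyEntry>
<ForPhones>
2
</ForPhones>
<State> 0 <PdfClass> 0 </State>
<State> 0 <PdfClass> 0 <Transition> 0 0.75 <Transition> 1 0.25 </State>
<State> 1 </State>
</TopologyEntry>
</Topology>
"""

for problem in check_topo(sample):
    print(problem)
```

On the sample this reports the early `</Topology>` and the repeated state index for phone 2; a file with one closing tag at the end and consecutive state indices yields an empty list.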

Daniel Povey

Aug 16, 2016, 3:15:05 PM
to kaldi-help
I just looked at the Read() function in hmm-topology.cc and it looks
like it should have rejected your bad topology file [see line 75 of
hmm-topology.cc]. So I'm surprised it even got that far.
Dan