Re: [kaldi-help] Merge sil phone in state-level alignment


Daniel Povey

Jul 27, 2019, 11:41:10 PM
to kaldi-help
Without knowing precisely how you obtained the information you're showing, I couldn't say.
Likely there was some assumption in the code you wrote that was not satisfied.
It might be because the silence models have a non-left-to-right topology.

On Sat, Jul 27, 2019 at 7:43 PM Hao LIANG <howe....@outlook.com> wrote:
Hello everyone,

I'm trying to get the state-level alignments for text-to-speech training from Kaldi. 

The problem is that for a silence in the middle of a sentence, I get multiple silence phones identified (labelled SIL(1) to SIL(5) below).

I want to know how to merge these SILs so that I get only one SIL (with 5 states in my example) between two non-silence phones.

My procedure: 1) feature extraction and forced alignment (e.g. with the tri2 model); 2) extract the state alignment with ali-to-pdf and show-transitions.
For a phone-level alignment, I could simply merge SIL(1) to SIL(5) by taking SIL[0]_start (of SIL(1)) and SIL[4]_end (of SIL(5)). But how can I do that at the state level?

start end phone hmm-state
3.11 3.23 u4 [0]
3.23 3.31 u4 [1]
3.31 3.32 u4 [2]
3.32 3.34 u4 [3]
3.34 3.35 u4 [4]
3.35 3.35 sil [0]    SIL(1)
3.35 3.36 sil [1]
3.36 3.38 sil [4]
3.38 3.39 sil [0]    SIL(2)
3.39 3.47 sil [1]
3.47 3.51 sil [2]
3.51 3.65 sil [1]    SIL(3)
3.65 3.68 sil [2]
3.68 3.76 sil [3]
3.76 3.78 sil [2]    SIL(4)
3.78 3.89 sil [1]    SIL(5)
3.89 3.98 sil [2]
3.98 3.98 sil [4]
3.98 4.02 f [0]
4.02 4.03 f [1]
4.03 4.05 f [2]
4.05 4.05 f [3]
4.05 4.07 f [4]
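
The phone-level merge described above (take the start of the first silence row and the end of the last one) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `merge_silence_runs` and the row layout `(start, end, phone, hmm_state)` are hypothetical, chosen to mirror the table, and the sketch does not by itself answer how the time should be distributed over the 5 silence states.

```python
from typing import List, Optional, Tuple

# One row of the table above: (start, end, phone, hmm_state).
Row = Tuple[float, float, str, Optional[int]]


def merge_silence_runs(rows: List[Row], sil: str = "sil") -> List[Row]:
    """Collapse every run of consecutive silence rows into one row.

    Non-silence rows are passed through unchanged. A merged silence row
    keeps the start of its first row and the end of its last row; its
    state is set to None because it no longer refers to a single state.
    """
    merged: List[Row] = []
    for start, end, phone, state in rows:
        if phone == sil and merged and merged[-1][2] == sil:
            # Extend the silence span that is already open.
            merged[-1] = (merged[-1][0], end, sil, None)
        elif phone == sil:
            # First row of a new silence span.
            merged.append((start, end, sil, None))
        else:
            merged.append((start, end, phone, state))
    return merged


# A few rows copied from the table above.
rows = [
    (3.34, 3.35, "u4", 4),
    (3.35, 3.35, "sil", 0),
    (3.35, 3.36, "sil", 1),
    (3.36, 3.38, "sil", 4),
    (3.38, 3.39, "sil", 0),
    (3.89, 3.98, "sil", 2),
    (3.98, 3.98, "sil", 4),
    (3.98, 4.02, "f", 0),
]
merged_rows = merge_silence_runs(rows)
```

With these rows, `merged_rows` contains a single silence entry running from 3.35 to 3.98 between the `u4` and `f` rows.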

Any suggestions or solutions?

Many thanks for your help.
Hao.

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/ae4bfcbe-0e9f-4556-8cc0-73b3a0b2480a%40googlegroups.com.

Hao LIANG

Jul 29, 2019, 4:30:30 AM
to kaldi-help
Thanks, Dan.

I first convert the alignments into pdf-ids with ali-to-pdf, then use show-transitions to get the corresponding phones and states. The start/end times are calculated from the ali-to-pdf output.
The sentence-middle silence appears as follows in the ali-to-pdf output file:

... 367 0 1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 0 1 1 1 1 1 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 2707 ...

The show-transitions output for these pdf-ids is:
Transition-state 1: phone = sil hmm-state = 0 pdf = 0
Transition-state 2: phone = sil hmm-state = 1 pdf = 1
Transition-state 3: phone = sil hmm-state = 2 pdf = 2
Transition-state 4: phone = sil hmm-state = 3 pdf = 3
Transition-state 5: phone = sil hmm-state = 4 pdf = 4
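
The lookup described above (pdf-id to phone and HMM state, then frame runs to times) can be sketched as below. This is a minimal sketch, not the script used in this thread. It assumes the show-transitions line format shown above, that each pdf-id belongs to exactly one (phone, hmm-state) pair, and a 10 ms frame shift. The names `pdf_to_state` and `state_segments` are hypothetical. Note that the grouping makes no assumption about the order in which states occur.

```python
import re
from itertools import groupby

# Matches lines such as:
#   Transition-state 1: phone = sil hmm-state = 0 pdf = 0
TRANSITION_STATE_RE = re.compile(
    r"Transition-state \d+: phone = (\S+) hmm-state = (\d+) pdf = (\d+)"
)


def pdf_to_state(show_transitions_text):
    """Map pdf-id -> (phone, hmm_state), keeping the first match per pdf-id."""
    mapping = {}
    for match in TRANSITION_STATE_RE.finditer(show_transitions_text):
        phone = match.group(1)
        state = int(match.group(2))
        pdf = int(match.group(3))
        mapping.setdefault(pdf, (phone, state))
    return mapping


def state_segments(pdf_ids, mapping, frame_shift=0.01):
    """Run-length encode per-frame pdf-ids into (start, end, phone, state).

    frame_shift is in seconds; 0.01 is an assumed default, not a value
    taken from this thread.
    """
    segments = []
    frame = 0
    for pdf, run in groupby(pdf_ids):
        length = len(list(run))
        phone, state = mapping[pdf]
        start = round(frame * frame_shift, 3)
        end = round((frame + length) * frame_shift, 3)
        segments.append((start, end, phone, state))
        frame += length
    return segments


transitions_text = """\
Transition-state 1: phone = sil hmm-state = 0 pdf = 0
Transition-state 2: phone = sil hmm-state = 1 pdf = 1
Transition-state 3: phone = sil hmm-state = 2 pdf = 2
Transition-state 4: phone = sil hmm-state = 3 pdf = 3
Transition-state 5: phone = sil hmm-state = 4 pdf = 4
"""
mapping = pdf_to_state(transitions_text)
segments = state_segments([0, 1, 4, 4, 4, 0, 1, 1], mapping)
```

For the short pdf-id list in the example, `segments` holds five rows with the state order 0, 1, 4, 0, 1, the same pattern as at the start of the silence above.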

So this generates results like:
xxx xxx sil[0]
xxx xxx sil[1]
xxx xxx sil[4]
xxx xxx sil[0]
xxx xxx sil[1]
xxx xxx sil[2]
...

I checked the model and found that the (default) silence model does not have a left-to-right topology. The "real" phones, on the other hand, do not have this issue.
If this is the reason, I guess that in my case I need to modify the topology of the silence model or create one by hand, but I have no idea whether this will lead to accuracy loss.

<TopologyEntry>
<ForPhones>
1
</ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.25 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 2 <PdfClass> 2 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 3 <PdfClass> 3 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 4 <PdfClass> 4 <Transition> 4 0.75 <Transition> 5 0.25 </State>
<State> 5 </State>
</TopologyEntry>
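
One way to see why the state index can go down inside a single silence phone is to list the transitions in this entry that point to a lower-numbered state. The following is a small sketch with hypothetical names (`backward_transitions`, `SIL_TOPO`); it only parses the text format shown above and does not call any Kaldi code.

```python
import re

# A <State> ... </State> element and the <Transition> items inside it.
TOPO_STATE_RE = re.compile(r"<State>\s*(\d+)(.*?)</State>", re.S)
TOPO_TRANSITION_RE = re.compile(r"<Transition>\s*(\d+)\s+[\d.eE+-]+")


def backward_transitions(topology_entry):
    """Return (source, destination) pairs where destination < source."""
    backward = []
    for state_match in TOPO_STATE_RE.finditer(topology_entry):
        source = int(state_match.group(1))
        for trans in TOPO_TRANSITION_RE.finditer(state_match.group(2)):
            destination = int(trans.group(1))
            if destination < source:
                backward.append((source, destination))
    return backward


# The silence entry quoted above.
SIL_TOPO = """\
<TopologyEntry>
<ForPhones>
1
</ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.25 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 2 <PdfClass> 2 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 3 <PdfClass> 3 <Transition> 1 0.25 <Transition> 2 0.25 <Transition> 3 0.25 <Transition> 4 0.25 </State>
<State> 4 <PdfClass> 4 <Transition> 4 0.75 <Transition> 5 0.25 </State>
<State> 5 </State>
</TopologyEntry>
"""

print(backward_transitions(SIL_TOPO))  # prints [(2, 1), (3, 1), (3, 2)]
```

States 2 and 3 can move back to earlier states, so a sequence such as sil[3] followed by sil[2] is allowed within one silence phone and does not have to mark the start of a new one.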


On Sunday, July 28, 2019 at 11:41:10 AM UTC+8, Dan Povey wrote:

Daniel Povey

Jul 29, 2019, 7:17:41 PM
to kaldi-help
I doubt that it will affect the accuracy.
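
For reference, a left-to-right silence entry of the kind discussed above could be generated with a short script like the one below. This is only a sketch: the function name `left_to_right_entry` is hypothetical, phone id 1 is taken from the entry quoted earlier, and the 0.75/0.25 probabilities simply copy state 4 of that entry. They are starting values, not tuned ones.

```python
def left_to_right_entry(phone_ids, num_states=5, self_loop=0.75):
    """Build a <TopologyEntry> in which each state only loops or moves forward.

    phone_ids  : integer phone ids the entry applies to
    num_states : number of emitting states (the final state is added on top)
    self_loop  : initial self-loop probability for every emitting state
    """
    lines = [
        "<TopologyEntry>",
        "<ForPhones>",
        " ".join(str(p) for p in phone_ids),
        "</ForPhones>",
    ]
    for state in range(num_states):
        lines.append(
            f"<State> {state} <PdfClass> {state} "
            f"<Transition> {state} {self_loop:g} "
            f"<Transition> {state + 1} {1 - self_loop:g} </State>"
        )
    # Non-emitting final state.
    lines.append(f"<State> {num_states} </State>")
    lines.append("</TopologyEntry>")
    return "\n".join(lines)


entry = left_to_right_entry([1])
print(entry)
```

With an entry like this, each silence state can only repeat or advance, so the states of one silence phone appear in increasing order in the alignment.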
