about phones-to-prons


何怡

Aug 18, 2016, 9:49:28 AM
to kaldi-help

Hi, 

I have a problem when using phones-to-prons.

The usage is: phones-to-prons [options] <L_align.fst> <word-start-sym> <word-end-sym> <phones-rspecifier> <words-rspecifier> <prons-wspecifier>
My command is: phones-to-prons ../../../data/1000h_fbank/lang_uniword/L.fst 3468 3469 ark:phone.test ark:text.test ark:prons.test

phone.test is:
tsh102_seg_3013_2_19_65_3190cc1929f51d2c_0 84 84 84 84 84 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 1 1 1 1 1 1 36 36 36 36 36 36 55 55 55 55 55 1 1 1 1 1 2 2 2 2 30 30 30 30 30 30 19 19 19 19 25 25 25 82 82 82 82 82 82 82 82 82 82 1 1 1 1 1 57 57 57 57 57 57 57 57 57 7 7 7 7 1 1 1 1 1 60 60 60 30 30 30 19 19 19 25 25 25 25 100 100 100 100 100 100 18 18 18 18 61 61 61 61 61 61 52 52 52 19 19 19 19 19 19 19 98 98 98 98 98 98 98 98 4 4 4 4 4 4 4 57 57 57 57 57 57 57 57 57 57 57 57 57 57 57 57 57 57 57 84 84 84 84 84 84
text.test is:
tsh102_seg_3013_2_19_65_3190cc1929f51d2c_0 668 3132 231 7 2066 231 1131 39 1623

So what should <words-rspecifier> be? I have tried the word list above, and also words.txt, but neither worked.
The message is: phnx2word FST for utterance tsh102_seg_3013_2_19_65_3190cc1929f51d2c_0is empty (either decoding for this utterance did not reach end-state, or mismatched lexicon.)


Looking forward to your replies!
Thanks
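
One thing that may be worth checking (an assumption, not something confirmed in this thread): the phone.test line quoted above repeats each phone ID many times, which looks like one ID per frame rather than one ID per phone, and a lexicon FST would not normally match such a sequence. The Python sketch below only illustrates the difference by collapsing runs of identical IDs. The function name collapse_runs is hypothetical and not part of Kaldi, and merging runs would also merge two genuinely adjacent identical phones, so it is no substitute for generating the phone sequence with Kaldi's own tools.

```python
from itertools import groupby


def collapse_runs(line):
    """Split 'utt-id p p p q q ...' into the key and a list of
    (phone_id, n_frames) runs. Hypothetical helper for illustration."""
    key, *ids = line.split()
    # groupby groups consecutive equal items, so each group is one run.
    runs = [(int(p), len(list(g))) for p, g in groupby(ids)]
    return key, runs


# Shortened stand-in for the phone.test line (not the full utterance).
key, runs = collapse_runs("utt1 84 84 84 100 100 1 1 1 1 36")
phones = [p for p, _ in runs]  # one ID per run instead of one per frame
print(key, phones)
print(runs)
```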

Daniel Povey

Aug 18, 2016, 1:08:44 PM
to kaldi-help
Firstly, I suspect you don't really need the answer to this question
because you have misunderstood the purpose of the program
'phones-to-prons'.
It seems to only be used in egs/babel/s5c/local/ali_to_rttm.sh, which
is a very special-purpose script.
You can read the Kaldi docs at kaldi-asr.org/doc/, the section about
Kaldi I/O, to understand what 'rspecifier' means.
I won't answer further because knowing the answer to your question
will not help you.
Dan
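
As a rough illustration of what an rspecifier such as ark:text.test points at: it names a table to read, that is, a set of entries each indexed by a key (here the utterance ID). The Python sketch below assumes a text-form archive of integer vectors in which every line is a key followed by integers, as in the text.test line from the first message. The function name read_int_vector_archive is hypothetical and not part of Kaldi.

```python
def read_int_vector_archive(lines):
    """Parse text-form lines 'key int int ...' into {key: [int, ...]}.
    Hypothetical helper; assumes one table entry per line."""
    table = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        table[parts[0]] = [int(x) for x in parts[1:]]
    return table


# The text.test line from the first message.
words = read_int_vector_archive([
    "tsh102_seg_3013_2_19_65_3190cc1929f51d2c_0 668 3132 231 7 2066 231 1131 39 1623",
])
print(words)
```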

何怡

Aug 18, 2016, 11:14:00 PM
to kaldi-help, dpo...@gmail.com
Thanks for your reply, Dan.

I am not actually running the Kaldi example scripts; I am just trying to get word alignments from state alignments.
So I found the binary prons-to-wordali, and from its usage message I got to ali-to-phones.

After a quick search through the code, I guessed that the words-rspecifier format of ali-to-phones is a key followed by a word-ID list, just as in the align scripts.
But it failed.

So is there another way to get word alignments from state alignments?

Daniel Povey

Aug 19, 2016, 12:28:02 AM
to 何怡, kaldi-help
It's probably easiest to get the word-level alignment by converting
the alignment to a 1-best path with linear-to-nbest, and then
doing lattice alignment with lattice-align-words [or
lattice-align-words-lexicon if you're not using position-dependent
phones; see steps/get_ctm.sh for an example].

Actually, the script steps/get_train_ctm.sh does exactly this.
Dan
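
The scripts named above write their result as a CTM file. A minimal Python sketch for turning such a file into word spans follows. It assumes the usual CTM layout of utterance ID, channel, start time, duration, and word on each line, which should be checked against your own output. The names WordSpan and parse_ctm are hypothetical, and the sample lines are made up for illustration rather than taken from a real run.

```python
from typing import NamedTuple


class WordSpan(NamedTuple):
    utt: str
    word: str
    start: float  # seconds
    end: float    # seconds


def parse_ctm(lines):
    """Parse CTM-style lines '<utt> <channel> <start> <duration> <word>'
    into WordSpan tuples. Lines with fewer than five fields are skipped."""
    spans = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        utt, _chan, start, dur, word = fields[:5]
        spans.append(WordSpan(utt, word, float(start), float(start) + float(dur)))
    return spans


# Made-up CTM lines, for illustration only.
spans = parse_ctm(["utt1 1 0.25 0.50 hello", "utt1 1 0.75 0.25 world"])
for s in spans:
    print(s.utt, s.word, s.start, s.end)
```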

何怡

Aug 19, 2016, 2:48:03 AM
to kaldi-help, sagit...@126.com, dpo...@gmail.com
Thanks, Dan.
I have got what I want.


andrew...@gmail.com

Jan 3, 2019, 1:36:28 AM
to kaldi-help
Hello heyi,
I have come across the same problem as you: “phnx2word FST for utterance tsh102_seg_3013_2_19_65_3190cc1929f51d2c_0is empty (either decoding for this utterance did not reach end-state, or mismatched lexicon.)”
I would like to know how you solved it.
Thank you.
