How to visualize test data alignment

Saranya V

Sep 12, 2019, 2:30:46 AM

to kaldi-help

Hi All,

Can anyone please help me understand how to visualize the acoustic model's phone-level alignment output for test data?

Regards,
Saranya

Hang Lyu

Sep 12, 2019, 2:53:53 AM

to kaldi-help

If you want a human-readable form, please use "bin/show-alignments", which provides the phone-level information.

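Actually, an alignment is a vector of integers, one vector per utterance. When you generate it, you can use an "ark,t:" specifier to write it in text format, or you can use the "copy-int-vector" binary to convert an existing binary file to a text file. With either method, the integers you see are transition-ids, so you need to understand what a "TransitionId" is in order to interpret them.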
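
As a rough illustration of what the text form looks like once you have it, here is a minimal Python sketch that reads such a file into memory. It assumes the usual text-archive layout of one utterance per line: the utterance id followed by space-separated integers. The function name and the example values are made up for illustration; real transition-ids depend on your model.

```python
from typing import Dict, List


def parse_int_vector_archive(text: str) -> Dict[str, List[int]]:
    """Parse a text-format int-vector archive into {utterance_id: [ints]}.

    Assumed layout, one utterance per line: "<utt-id> 12 12 14 ...".
    """
    alignments: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        utt_id, values = fields[0], fields[1:]
        alignments[utt_id] = [int(v) for v in values]
    return alignments


# Made-up example content, not taken from a real model.
example = "utt1 2 2 2 8 8 10\nutt2 4 4 6\n"
for utt_id, transition_ids in parse_int_vector_archive(example).items():
    print(utt_id, len(transition_ids), "frames, first ids:", transition_ids[:3])
```

The raw integers are not phones; they still have to be mapped through the model, which is what the Kaldi tools mentioned above do for you.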

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/ccc06042-29b8-414b-a11b-e3576c2fa557%40googlegroups.com.

Saranya V

Sep 12, 2019, 5:55:19 AM

to kaldi...@googlegroups.com

Thanks for the reply.

How do I get phone-level alignments (ali.1.gz) of the test data for the neural network model?

Regards,

Saranya

Hang Lyu

Sep 12, 2019, 8:13:19 AM

to kaldi-help

The "nnet3-align-compiled" binary is the key. You can use steps/nnet3/align.sh.

Saranya V

Sep 16, 2019, 6:02:54 AM

to kaldi...@googlegroups.com

Thanks for the reply.

I think alignment with "nnet3-align-compiled" uses the 'text' file in the test data directory, so the test data would be treated as just another dataset with supervision.

I don't want to make use of the supervision, i.e. I want the alignments to be derived from the decoding output for the test data.

Regards,

Saranya

Hang Lyu

Sep 16, 2019, 6:55:04 AM

to kaldi-help

Oh, in that case I think you can use "nnet3bin/nnet3-latgen-faster" to decode the test set and write out the "alignment" at the same time (check the usage message). Bear in mind that this "alignment" comes from the best path. After that, you can use "show-alignments" to get the phone-level information.
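
If you want something closer to a picture than a list of ids, a small script can help. The Python sketch below assumes you have already converted the best-path alignment to one phone label per frame (for instance with Kaldi's "ali-to-phones" and its per-frame output, if that fits your setup). The function name, the example labels, and the 0.01 s frame shift are assumptions for illustration; adjust the frame shift to your features and any frame subsampling.

```python
from itertools import groupby
from typing import List, Tuple


def frames_to_segments(phones: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame phone labels into (phone, start_frame, num_frames) runs.

    Note: two adjacent instances of the same phone are merged into one run,
    because per-frame labels alone cannot tell them apart.
    """
    segments = []
    start_frame = 0
    for phone, run in groupby(phones):
        n_frames = len(list(run))
        segments.append((phone, start_frame, n_frames))
        start_frame += n_frames
    return segments


# Made-up per-frame labels for one utterance.
per_frame = ["sil", "sil", "sil", "hh", "hh", "ah", "ah", "ah", "ah", "sil"]
frame_shift = 0.01  # seconds per frame; an assumed value

for phone, start, n_frames in frames_to_segments(per_frame):
    # One text row per segment: phone, start time, and a bar as long as the run.
    print(f"{phone:>4}  {start * frame_shift:6.2f}s  {'#' * n_frames}")
```

This only gives a crude text timeline, but the same segment list can be fed to any plotting tool you prefer.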

Hang

Saranya V

Sep 16, 2019, 7:14:36 AM

to kaldi...@googlegroups.com

Thanks for the immediate response. But "nnet3bin/nnet3-latgen-faster" is used to generate lattices.

Basically, I want to see the acoustic-level output for the test data to check the acoustic model's performance: how it performs without an LM, and how the performance changes once the LM is added.

Do you have any suggestions?

Saranya

Hang Lyu

Sep 16, 2019, 7:53:06 AM

to kaldi-help

Yes, "nnet3bin/nnet3-latgen-faster" is used to generate lattices. A lattice is the result of decoding and contains all the information from decoding. From the lattice, you can get the best path and so on. The alignment is simply the sequence of input labels ("ilabel") along a path in the lattice. With this binary, you can obtain the alignment, i.e. the sequence of ilabels of the best path. Check the usage message; we already provide an option for writing the alignment.

Why do you want to check the acoustic model performance with "test set" rather than "dev set"?

I think the key is that, if you want to check the performance, you need to have a criterion to define what's good.

Obviously, frame accuracy and WER are the most straightforward ways to judge it. Bear in mind that the two are correlated (better frame accuracy tends to go with better WER) but not directly proportional, and this holds for both ML and DT models.
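
As a minimal sketch of the frame-accuracy idea, the Python snippet below compares two per-frame phone sequences for the same utterance: a reference (for example from forced alignment against the transcript) and a hypothesis (for example from the best path of decoding). This is only one possible definition, and the function name and example labels are assumptions for illustration.

```python
from typing import Sequence


def frame_accuracy(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return the fraction of frames whose phone label matches the reference."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")
    if not reference:
        raise ValueError("cannot compute accuracy on zero frames")
    correct = sum(r == h for r, h in zip(reference, hypothesis))
    return correct / len(reference)


# Made-up per-frame labels for one utterance.
ref = ["sil", "sil", "hh", "hh", "ah", "ah", "ah", "sil"]
hyp = ["sil", "hh", "hh", "hh", "ah", "ah", "sil", "sil"]
print(f"frame accuracy: {frame_accuracy(ref, hyp):.2f}")
```

Both sequences must come from the same features with the same frame rate, otherwise the frame counts will not line up.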

In addition, maybe you can use some kind of "confidence measure" to help you check the performance.

Saranya V

Sep 17, 2019, 1:07:32 AM

to kaldi...@googlegroups.com

Thanks.

Saranya
