Test the Accuracy of ASpIRE Chain Model

Kayas

Jul 5, 2018, 1:30:41 PM

to kaldi-help

Hi All,

I am very new to Kaldi.

I just want to measure the accuracy of the newest Kaldi model (the ASpIRE chain model) on the LibriSpeech datasets and find its WER (word error rate).

From this post: https://chrisearch.wordpress.com/2017/03/11/speech-recognition-using-kaldi-extending-and-using-the-aspire-model/, I learned how to use the model.

I use the command below to get the transcript of a .wav file:

online2-wav-nnet3-latgen-faster \
--online=true \
--do-endpointing=false \
--frame-subsampling-factor=1 \
--config=exp/tdnn_7b_chain_online/conf/online.conf \
--max-active=7000 \
--beam=15.0 \
--lattice-beam=6.0 \
--acoustic-scale=1.0 \
--word-symbol-table=exp/tdnn_7b_chain_online/graph_pp/words.txt \
exp/tdnn_7b_chain_online/final.mdl \
exp/tdnn_7b_chain_online/graph_pp/HCLG.fst \
'ark:echo utterance-id1 utterance-id1|' \
'scp:echo utterance-id1 <wavpath>|' \
'ark:/dev/null'
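For scripting, the invocation above can be assembled per file. The following is only a sketch in Python: it reuses exactly the options and paths from the command above, the helper name build_decode_command is made up for illustration, and it assumes the WAV path contains no spaces or shell metacharacters (the 'scp:echo ...|' argument is run through a shell pipe). The last argument is where the lattices are written; 'ark:/dev/null' discards them.

```python
import shlex

# Model directory copied from the command above.
MODEL_DIR = "exp/tdnn_7b_chain_online"


def build_decode_command(utt_id, wav_path, lattice_wspecifier="ark:/dev/null"):
    """Return the argument list for decoding one WAV file.

    Hypothetical helper: it only re-creates the command shown above with a
    different utterance id, WAV path, and lattice output.
    """
    return [
        "online2-wav-nnet3-latgen-faster",
        "--online=true",
        "--do-endpointing=false",
        "--frame-subsampling-factor=1",
        f"--config={MODEL_DIR}/conf/online.conf",
        "--max-active=7000",
        "--beam=15.0",
        "--lattice-beam=6.0",
        "--acoustic-scale=1.0",
        f"--word-symbol-table={MODEL_DIR}/graph_pp/words.txt",
        f"{MODEL_DIR}/final.mdl",
        f"{MODEL_DIR}/graph_pp/HCLG.fst",
        f"ark:echo {utt_id} {utt_id}|",      # speaker-to-utterance map
        f"scp:echo {utt_id} {wav_path}|",    # utterance id -> WAV path
        lattice_wspecifier,                  # lattice output
    ]


if __name__ == "__main__":
    cmd = build_decode_command("utterance-id1", "audio/sample.wav")
    # Print a copy-pasteable shell command; subprocess.run(cmd) would run it
    # if the Kaldi binaries are on PATH.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids a second layer of shell quoting around the 'ark:...|' and 'scp:...|' arguments.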

Now, I have several questions, and I would be grateful for any help with the answers.

1. I cannot find any place where the parameters are well described (e.g., --lattice-beam=6.0, 'ark:echo utterance-id1 utterance-id1|'). Is there any documentation describing the parameters for invoking the binary "online2-wav-nnet3-latgen-faster"? The features of this binary are also not very clear. For example, is writing the transcript to a text file supported?

2. I want to write some scripts (Python/C++/Bash) to automate the accuracy measurement. I already have Python scripts that read the ground truth from the LibriSpeech dataset. Now I need scripts that generate the transcript from an audio file using the ASpIRE chain model. Is there any official support for measuring the accuracy of that model?
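For the measurement itself, WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. Below is a minimal Python sketch under stated assumptions: the function names are made up for illustration, and the reference and hypothesis are assumed to be already normalised the same way (same case, no punctuation) before scoring. Kaldi also ships a compute-wer binary, so this is only meant to show the arithmetic.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) counted in words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen so
    # far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs):
    """WER in percent over an iterable of (reference, hypothesis) strings."""
    total_errors = 0
    total_words = 0
    for reference, hypothesis in pairs:
        errors, words = word_errors(reference, hypothesis)
        total_errors += errors
        total_words += words
    if total_words == 0:
        raise ValueError("no reference words to score")
    return 100.0 * total_errors / total_words


if __name__ == "__main__":
    pairs = [("the cat sat on the mat", "the cat sit on mat")]
    print(f"WER: {corpus_wer(pairs):.2f}%")
```

Errors are summed over the whole test set before dividing, rather than averaging per-utterance WERs, so long and short utterances are weighted by their word counts.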

Thanks in advance for your time.

Regards,

Kayas

Daniel Povey

Jul 5, 2018, 4:13:48 PM

to kaldi-help

I think your question is based on a misunderstanding of what Kaldi is all about.

Kaldi isn't a single speech recognition model that we sometimes "update"; it is a set of tools, and we provide example scripts for different setups. And it doesn't really make sense to evaluate the accuracy of the ASpIRE model on Librispeech when Kaldi actually has scripts to build models trained on Librispeech, and we report results for them.

E.g. just look at the file

egs/librispeech/s5/local/chain/run_tdnn.sh

which has results as follows:

# local/chain/compare_wer.sh exp/chain_cleaned/tdnn_1b_sp exp/chain_cleaned/tdnn_1c_sp
# System                      tdnn_1b_sp  tdnn_1c_sp
# WER on dev(fglarge)               3.77        3.35
# WER on dev(tglarge)               3.90        3.49
# WER on dev(tgmed)                 4.89        4.30
# WER on dev(tgsmall)               5.47        4.78
# WER on dev_other(fglarge)        10.05        8.76
# WER on dev_other(tglarge)        10.80        9.26
# WER on dev_other(tgmed)          13.07       11.21
# WER on dev_other(tgsmall)        14.46       12.47
# WER on test(fglarge)              4.20        3.87
# WER on test(tglarge)              4.28        4.08
# WER on test(tgmed)                5.31        4.80
# WER on test(tgsmall)              5.97        5.25
# WER on test_other(fglarge)       10.44        8.95
# WER on test_other(tglarge)       11.05        9.41
# WER on test_other(tgmed)         13.36       11.52
# WER on test_other(tgsmall)       14.90       12.66
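To read the two columns against each other, one common summary is the relative WER reduction, (old - new) / old. A small Python sketch, with a few rows copied from the table above; the function name is made up for illustration:

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent, going from baseline to new."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# (tdnn_1b_sp, tdnn_1c_sp) WER pairs copied from the table above.
RESULTS = {
    "dev(fglarge)": (3.77, 3.35),
    "test(fglarge)": (4.20, 3.87),
    "test_other(fglarge)": (10.44, 8.95),
}

if __name__ == "__main__":
    for name, (old, new) in RESULTS.items():
        print(f"{name}: {relative_wer_reduction(old, new):.1f}% relative")
```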

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/5e43de10-b84b-41c6-9775-2b2bb310887e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kayas

Jul 5, 2018, 5:57:37 PM

to kaldi-help

Thanks for the answer. If I want to measure the WER of the ASpIRE model, what should I do?

And one more thing, when I try to run:

egs/librispeech/s5/local/chain/run_tdnn.sh

I get the following error:

run_tdnn.sh: 61: .: Can't open ./cmd.sh

Do I need to download the model that is trained with Librispeech? Can you please tell me which model used Librispeech for training?

Daniel Povey

Jul 5, 2018, 6:22:50 PM

to kaldi-help

It's supposed to be run from the recipe's top-level directory, egs/librispeech/s5 (that is where cmd.sh lives), as local/chain/run_tdnn.sh. But to run it you should be on a big server-type machine, or preferably a cluster running GridEngine.

Maybe try the "Kaldi for dummies" tutorial first.
