Hi all,
I am new to Kaldi and am running the VoxForge example. Below is part of egs/voxforge/s5/run.sh; I ran it and have some questions.
How do I calculate the total hours of training data?
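(For context, one common approach is to sum per-utterance frame counts from feats.scp, e.g. with Kaldi's feat-to-len, at the default 10 ms per frame. The snippet below is only a minimal sketch of that arithmetic: the printf lines are made-up stand-ins for the output of `feat-to-len scp:data/train/feats.scp ark,t:-`, which needs the compiled Kaldi binaries.)

```shell
# Stand-in input: feat-to-len prints "<utt-id> <num-frames>" per utterance.
# Hours = total frames * 0.01 s per frame / 3600 s per hour.
printf 'utt1 360000\nutt2 720000\n' \
  | awk '{sum += $2} END {printf "%.2f hours\n", sum * 0.01 / 3600}'
# → 3.00 hours
```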
I do not know how many features are used in training the monophone model, tri1 (first triphone pass), and tri2a (deltas + delta-deltas). How can I find out?
After running it, I see that exp/mono/decode, exp/tri1/decode, and exp/tri2a/decode contain files wer_9, wer_10, wer_11, ..., wer_20. What does the index (9, 10, 11, ..., 20) mean? And which of these is the word error rate on the test set?
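(For reference, a minimal sketch for tabulating those files side by side; the printf lines below are made-up stand-ins for the real wer_N contents, which start with "%WER <value> ...". In practice the input would come from `grep WER exp/mono/decode/wer_*` after a finished run.)

```shell
# Stand-in for "grep WER exp/mono/decode/wer_*": each line looks like
# "wer_<N>:%WER <value> [ ... ]". Extract "<N> <value>" pairs for comparison.
printf 'wer_9:%%WER 52.10 [ 521 / 1000 ]\nwer_10:%%WER 50.30 [ 503 / 1000 ]\n' \
  | sed 's/^wer_\([0-9]*\):%WER \([0-9.]*\).*/\1 \2/'
# → 9 52.10
# → 10 50.30
```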
Please help me answer these questions.
Thank you so much.
#!/bin/bash
# Copyright 2012 Vassil Panayotov
# Apache 2.0
# NOTE: You will want to download the data set first, before executing this script.
# This can be done for example by:
# 1. Setting the DATA_ROOT variable to point to a directory with enough free
# space (at least 20-25GB currently (Feb 2014))
# 2. Running "getdata.sh"
# The second part of this script comes mostly from egs/rm/s5/run.sh
# with some parameters changed
. ./path.sh || exit 1
# If you have a cluster of machines running GridEngine you may want to
# change the train and decode commands in the file below
. ./cmd.sh || exit 1
# The number of parallel jobs to be started for some parts of the recipe
# Make sure you have enough resources (CPUs and RAM) to accommodate this number of jobs
njobs=2
#delta_opts=0
# The number of randomly selected speakers to be put in the test set
nspk_test=2
# Test-time language model order
lm_order=2
# Word position dependent phones?
pos_dep_phones=true
# Removing previously created data (from last run.sh execution)
rm -rf exp mfcc data/train/spk2utt data/train/cmvn.scp data/train/feats.scp data/train/split1 data/test/spk2utt data/test/cmvn.scp data/test/feats.scp data/test/split1 data/local/lang data/lang data/local/tmp data/local/dict/lexiconp.txt data/local/lm.arpa data/local/spk2gender data/local/spk2gender.tmp data/local/test.spk2utt data/local/test.utt2spk data/local/test_wav.scp data/local/train.spk2utt data/local/train.utt2spk data/local/train_wav.scp
# The user of this script could change some of the above parameters. Example:
# /bin/bash run.sh --pos-dep-phones false
. utils/parse_options.sh || exit 1
[[ $# -ge 1 ]] && { echo "Unexpected arguments"; exit 1; }
# Initial normalization of the data
local/voxforge_data_prep.sh --nspk_test ${nspk_test} ${DATA_ROOT} || exit 1
# Prepare ARPA LM and vocabulary using SRILM
local/voxforge_prepare_lm.sh --order ${lm_order} || exit 1
# Prepare data/lang and data/local/lang directories
utils/prepare_lang.sh --position-dependent-phones $pos_dep_phones \
data/local/dict '!SIL' data/local/lang data/lang || exit 1
# Prepare G.fst and data/{train,test} directories
local/voxforge_format_data.sh || exit 1
# Now make MFCC features.
# mfccdir should be some place with a largish disk where you
# want to store MFCC features.
mfccdir=${DATA_ROOT}/mfcc
for x in train test; do
steps/make_mfcc.sh --cmd "$train_cmd" --nj $njobs \
data/$x exp/make_mfcc/$x $mfccdir || exit 1;
steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir || exit 1;
done
# Train monophone models on a subset of the data
utils/subset_data_dir.sh data/train 1400 data/train.1k4 || exit 1;
steps/train_mono.sh --nj $njobs --cmd "$train_cmd" data/train.1k4 data/lang exp/mono || exit 1;
# Monophone decoding
utils/mkgraph.sh --mono data/lang_test exp/mono exp/mono/graph || exit 1
# note: steps/decode.sh runs the decoding in parallel jobs and then
# scores the output, writing the results into (in this case)
# exp/mono/decode/
steps/decode.sh --config conf/decode.config --nj $njobs --cmd "$decode_cmd" \
exp/mono/graph data/test exp/mono/decode
# Get alignments from monophone system.
steps/align_si.sh --nj $njobs --cmd "$train_cmd" \
data/train data/lang exp/mono exp/mono_ali || exit 1;
# train tri1 [first triphone pass]
steps/train_deltas.sh --cmd "$train_cmd" \
2000 11000 data/train data/lang exp/mono_ali exp/tri1 || exit 1;
# decode tri1
utils/mkgraph.sh data/lang_test exp/tri1 exp/tri1/graph || exit 1;
steps/decode.sh --config conf/decode.config --nj $njobs --cmd "$decode_cmd" \
exp/tri1/graph data/test exp/tri1/decode
#draw-tree data/lang/phones.txt exp/tri1/tree | dot -Tps -Gsize=8,10.5 | ps2pdf - tree.pdf
# align tri1
steps/align_si.sh --nj $njobs --cmd "$train_cmd" \
--use-graphs true data/train data/lang exp/tri1 exp/tri1_ali || exit 1;
# train tri2a [delta+delta-deltas]
steps/train_deltas.sh --cmd "$train_cmd" 2000 11000 \
data/train data/lang exp/tri1_ali exp/tri2a || exit 1;
# decode tri2a
utils/mkgraph.sh data/lang_test exp/tri2a exp/tri2a/graph
steps/decode.sh --config conf/decode.config --nj $njobs --cmd "$decode_cmd" \
exp/tri2a/graph data/test exp/tri2a/decode
#score
for x in exp/*/decode*; do [ -d $x ] && grep WER $x/wer_* | utils/best_wer.sh; done
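The last loop above picks the best WER per system via utils/best_wer.sh. A minimal stand-alone sketch of that min-selection is below; the printf lines are made-up stand-ins for real "grep WER .../wer_*" output, which needs a finished run:

```shell
# Stand-in input: one "%WER" line per scoring weight, lowest WER wins.
printf 'wer_9:%%WER 52.10 [ 521 / 1000 ]\nwer_10:%%WER 50.30 [ 503 / 1000 ]\nwer_11:%%WER 51.00 [ 510 / 1000 ]\n' \
  | awk -F'%WER ' '{split($2, a, " "); if (best == "" || a[1] + 0 < best) {best = a[1] + 0; line = $0}} END {print line}'
# → wer_10:%WER 50.30 [ 503 / 1000 ]
```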