tri2 or tri3 Decoding

solr...@gmail.com

Apr 29, 2019, 5:26:32 AM

to kaldi-help

Hi All,

I have trained a model by following the mini_librispeech example.

When I use the trained model to decode a wav file to text, the "gmm-decode-faster" decoder works with my tri1 model.

But when I decode with the tri2 and tri3 models, I get a Dim mismatch error.

Here is my decoding script:

gmm-decode-faster \
--word-symbol-table=exp/tri2b/graph/words.txt \
exp/tri2b/final.mdl \
exp/tri2b/graph/HCLG.fst \
ark:Decode/decode_comparison/delta-feats-VL190102104706108.ark \
ark,t:Decode/decode_comparison/gmm-decode-VL190102104706108.ark

Here is the error message that I get:

ERROR (gmm-decode-faster[5.4.198~1-be7c1]:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40

[ Stack-Trace: ]
0   libkaldi-base.dylib                 0x0000000105871b77 _ZN5kaldiL18KaldiGetStackTraceEv + 263
1   libkaldi-base.dylib                 0x00000001058717be _ZN5kaldi13MessageLogger13HandleMessageERKNS_18LogMessageEnvelopeEPKc + 2926
2   libkaldi-base.dylib                 0x0000000105870bfb _ZN5kaldi13MessageLoggerD2Ev + 1499
3   libkaldi-base.dylib                 0x0000000105871a65 _ZN5kaldi13MessageLoggerD1Ev + 21
4   libkaldi-gmm.dylib                  0x00000001054cd436 _ZN5kaldi26DecodableAmDiagGmmUnmapped22LogLikelihoodZeroBasedEii + 870
5   gmm-decode-faster                   0x00000001040baa55 _ZN5kaldi24DecodableAmDiagGmmScaled13LogLikelihoodEii + 85
6   libkaldi-decoder.dylib              0x000000010449f396 _ZN5kaldi13FasterDecoder15ProcessEmittingEPNS_18DecodableInterfaceE + 582
7   libkaldi-decoder.dylib              0x000000010449f12c _ZN5kaldi13FasterDecoder6DecodeEPNS_18DecodableInterfaceE + 92
8   gmm-decode-faster                   0x00000001040acd2e main + 3774
9   libdyld.dylib                       0x00007fff9b1555c9 start + 1
10  ???                                 0x000000000000000a 0x0 + 10

WARNING (gmm-decode-faster[5.4.198~1-be7c1]:~HashList():util/hash-list-inl.h:117) Possible memory leak: 1021 != 1024: you might have forgotten to call Delete on some Elems

Can someone advise me on how to decode with the tri2 and tri3 models?

Thank you in advance.

Regards,

YiYang

Daniel Povey

Apr 29, 2019, 1:11:11 PM

to kaldi-help

You need to use commands that are based on the commands used in the decoding scripts for those respective directories; you can't use a command from one for the other.

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/e6d402ef-5c6e-486b-b2f9-cc26b5fa73fe%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

solr...@gmail.com

Apr 29, 2019, 9:49:59 PM

to kaldi-help

Hi Dan,

Yes, each decoding script refers to its own model directory, but I still get the Dim mismatch error with tri2 and tri3.

Here is my decoding script for tri2:

gmm-decode-faster \
--word-symbol-table=exp/tri2b/graph_nosp_tgsmall/words.txt \
exp/tri2b/final.mdl \
exp/tri2b/graph_nosp_tgsmall/HCLG.fst \
ark:Decode/decode_comparison/delta-feats-VL190102104706108.ark \
ark,t:Decode/decode_comparison/gmm-decode-VL190102104706108.ark

Decoding script for tri3:

gmm-decode-faster \
--word-symbol-table=exp/tri3b/graph_tgsmall/words.txt \
exp/tri3b/final.mdl \
exp/tri3b/graph_tgsmall/HCLG.fst \
ark:Decode/decode_comparison/delta-feats-VL190102104706108.ark \
ark,t:Decode/decode_comparison/gmm-decode-VL190102104706108.ark

Thanks and Regards,

YiYang

Daniel Povey

Apr 29, 2019, 9:54:13 PM

to kaldi-help

tri3 probably doesn't use delta feats, it probably uses LDA+MLLT; check its decoding log output.
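
The numbers in the original error already point in this direction. As a rough check (a sketch only, assuming 13-dimensional base MFCCs, which the thread does not state, and using a hypothetical helper name), delta plus delta-delta features give 13 x 3 = 39 dimensions. That matches the "data dim = 39" in the error, while the model expects 40:

```shell
#!/usr/bin/env bash
# Arithmetic only; this does not call any Kaldi binary.
# Assumption: the base MFCC dimension is 13 (not stated in the thread).

# delta_dim BASE [ORDER]: dimension after appending deltas up to ORDER
# (ORDER=2 means static + delta + delta-delta).
delta_dim() {
  local base=$1
  local order=${2:-2}
  echo $(( base * (order + 1) ))
}

data_dim=$(delta_dim 13)   # dimension of the delta features fed to the decoder
model_dim=40               # "model dim" reported in the error message

if [ "$data_dim" -ne "$model_dim" ]; then
  echo "mismatch: data dim = $data_dim vs. model dim = $model_dim"
fi
```

If the Kaldi binaries are on the PATH, running feat-to-dim on the feature archive and gmm-info on final.mdl should report the actual dimensions, so the guess about the base dimension can be verified rather than assumed.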

solr...@gmail.com

Apr 30, 2019, 1:38:13 AM

to kaldi-help

Hi Dan,

I tried to apply transform-feats to the features of the wav file I want to decode, and I got this warning:

WARNING (transform-feats[5.4.198~1-be7c1]:main():transform-feats.cc:110) Transform matrix for utterance VL190102104706108 has bad dimension 40x91 versus feat dim 117

Here is my script:

compute-mfcc-feats \
--config=conf/mfcc.conf \
'scp:echo VL190102104706108 ./Decode/decode_comparison/VL190102104706108.wav|' \
ark,scp:Decode/decode_comparison/VL190102104706108.ark,Decode/decode_comparison/VL190102104706108.scp

splice-feats \
scp:Decode/decode_comparison/VL190102104706108.scp \
ark:Decode/decode_comparison/splice-feats-VL190102104706108.ark

transform-feats \
exp/tri3b/final.mat \
ark:Decode/decode_comparison/splice-feats-VL190102104706108.ark \
ark:Decode/decode_comparison/transform-feats-VL190102104706108.ark

Is this the right way to generate the LDA+MLLT features for the wav file I want to decode?

Regards,

YiYang
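
The two dimensions in the warning above can be reproduced by hand. The sketch below assumes 13-dimensional MFCCs and that splice-feats, when given no options, uses 4 frames of context on each side (both are assumptions, not stated in the thread); spliced_dim is a hypothetical helper, not a Kaldi tool.

```shell
#!/usr/bin/env bash
# Arithmetic only; this does not call any Kaldi binary.
# Assumptions: base MFCC dim is 13; splice-feats defaults to 4 left + 4 right frames.

# spliced_dim BASE LEFT RIGHT: dimension after stacking LEFT past frames,
# the current frame, and RIGHT future frames.
spliced_dim() {
  local base=$1 left=$2 right=$3
  echo $(( base * (left + 1 + right) ))
}

# Compare with "feat dim 117" in the warning.
echo "splice with 4 left, 4 right: $(spliced_dim 13 4 4)"
# Compare with the 91 columns of the 40x91 transform matrix.
echo "splice with 3 left, 3 right: $(spliced_dim 13 3 3)"
```

If that is what happened here, the features were spliced with a wider context than the one final.mat was trained with. Passing the training-time splice options to splice-feats (for example --left-context=3 --right-context=3) should make the dimensions agree.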

solr...@gmail.com

May 2, 2019, 7:52:26 AM

to kaldi-help

Hi All,

I am able to decode with the tri2 and tri3 models now, but the decoding output of my tri3 model is worse than that of the tri2 model.

Is the tri3 model supposed to give more accurate decoding output than the tri2 model?

Below is my decoding script for tri2:


splice-feats --left-context=3 --right-context=3 \
scp:Decode/decode_comparison/VL190102102426150.scp \
ark:Decode/decode_comparison/splice-feats-VL190102102426150.ark

transform-feats \
exp/tri2b/final.mat \
ark:Decode/decode_comparison/splice-feats-VL190102102426150.ark \
ark:Decode/decode_comparison/tri2/transform-feats-VL190102102426150.ark

gmm-decode-faster \
--beam=16.0 --max-active=4000 \
--acoustic-scale=0.0769 --allow-partial=false \
--word-symbol-table=exp/tri2b/graph/words.txt \
exp/tri2b/final.mdl \
exp/tri2b/graph/HCLG.fst \
ark:Decode/decode_comparison/tri2/transform-feats-VL190102102426150.ark \
ark,t:Decode/decode_comparison/tri2/gmm-decode-VL190102102426150_tri2.ark

Below is my decoding script for tri3:

transform-feats \
exp/tri3b/final.mat \
ark:Decode/decode_comparison/splice-feats-VL190102102426150.ark \
ark:Decode/decode_comparison/tri3/transform-feats-VL190102102426150.ark

gmm-decode-faster \
--beam=16.0 --max-active=4000 \
--acoustic-scale=0.0769 --allow-partial=false \
--word-symbol-table=exp/tri3b/graph/words.txt \
exp/tri3b/final.mdl \
exp/tri3b/graph/HCLG.fst \
ark:Decode/decode_comparison/tri3/transform-feats-VL190102102426150.ark \
ark,t:Decode/decode_comparison/tri3/gmm-decode-VL190102102426150_tri3.ark

Regards,

YiYang

Yi Yang

May 17, 2019, 1:55:09 AM

to kaldi-help

Hi All,

When I decode my wav files, the tri3 model gives less accurate output than the tri2 model. Is that because the tri3 model uses fMLLR adaptation and therefore cannot decode one wav file at a time?

Below is the script I use for decoding a wav file to text with the tri3 model:

compute-mfcc-feats \
--config=conf/mfcc.conf \
'scp:echo wav_1 ./Decode/decode_comparison/wav_1.wav|' \
ark,scp:Decode/decode_comparison/wav_1.ark,Decode/decode_comparison/wav_1.scp

splice-feats --left-context=3 --right-context=3 \
scp:Decode/decode_comparison/wav_1.scp \
ark:Decode/decode_comparison/splice-feats-wav_1.ark

transform-feats \
exp/tri3b/final.mat \
ark:Decode/decode_comparison/splice-feats-wav_1.ark \
ark:Decode/decode_comparison/tri3/transform-feats-wav_1.ark


gmm-decode-faster \
--beam=16.0 --max-active=4000 \
--acoustic-scale=0.0769 --allow-partial=false \
--word-symbol-table=exp/tri3b/graph/words.txt \
exp/tri3b/final.mdl \
exp/tri3b/graph/HCLG.fst \
ark:Decode/decode_comparison/tri3/transform-feats-wav_1.ark \
ark,t:Decode/decode_comparison/tri3/gmm-decode-wav_1_tri3.ark

Thanks and regards,

YiYang

Daniel Povey

May 17, 2019, 1:41:29 PM

to kaldi-help

It looks to me like you aren't including any fMLLR transform. That model is intended to be used with adaptation, which requires two decoding passes (and at least one command that estimates the fMLLR transform matrix in between).

Look at what the standard decoding script (decode_fmllr.sh) does. Not all of what it does is necessary, though, e.g. the fact that it does 2 passes of fMLLR estimation instead of just one.

Dan

Yi Yang

Jun 2, 2019, 10:31:25 PM

to kaldi-help

Hi Dan,

I have tried decoding with the tri3 model by first running "gmm-latgen-faster", then estimating and applying the fMLLR transform, and then getting the final decoding output with "gmm-decode-faster".

The decoding output is much better now, but it takes up to 10 minutes to decode a wav file that is about 1 minute and 30 seconds long.

What can I do to decode the wav file faster?

Here is the script I use to decode the wav file with the tri3 model:

...

gmm-latgen-faster --beam=16.0 --acoustic-scale=0.0769 exp/tri3b/final.mdl \
exp/tri3b/graph/HCLG.fst ark:Decode/tri3/wavFiles/wav1_transform-feats.ark "ark,t:|gzip -c > Decode/tri3/wavFiles/wav1.lats.gz"

silphonelist=`cat exp/tri3b/graph/phones/silence.csl`
splice_opts=`cat exp/tri3b/splice_opts 2>/dev/null`
cmvn_opts=`cat exp/tri3b/cmvn_opts 2>/dev/null`
sifeats="ark,s,cs:apply-cmvn $cmvn_opts --utt2spk=ark:Decode/tri3/wavFiles/utt2spk scp:Decode/tri3/wavFiles/wav1_cmvn.scp scp:Decode/tri3/wavFiles/wav1_feats.scp ark:- | splice-feats $splice_opts ark:- ark:- | transform-feats exp/tri3b/final.mat ark:- ark:- |";
run.pl --max-jobs-run 25 Decode/tri3/wavFiles/wav1_fmllr_pass1.log \
gunzip -c Decode/tri3/wavFiles/wav1.lats.gz \| \
lattice-to-post --acoustic-scale=0.0769 ark:- ark:- \| \
weight-silence-post 0.01 $silphonelist exp/tri3b/final.alimdl ark:- ark:- \| \
gmm-post-to-gpost exp/tri3b/final.alimdl "$sifeats" ark:- ark:- \| \
gmm-est-fmllr-gpost --fmllr-update-type=full \
--spk2utt=ark:Decode/tri3/wavFiles/spk2utt exp/tri3b/final.mdl "$sifeats" ark,s,cs:- \
ark:Decode/tri3/wavFiles/pre_trans.wav1.fmllr

feats="$sifeats transform-feats --utt2spk=ark:Decode/tri3/wavFiles/utt2spk ark:Decode/tri3/wavFiles/pre_trans.wav1.fmllr ark:- ark:- |"
run.pl --max-jobs-run 25 Decode/tri3/wavFiles/wav1_fmllr_pass2.log \
gmm-decode-faster --beam=16.0 --acoustic-scale=0.0769 \
--word-symbol-table=exp/tri3b/graph/words.txt exp/tri3b/final.mdl exp/tri3b/graph/HCLG.fst \
"$feats" ark,t:Decode/tri3/wavFiles/wav1_fmmlr.tra ark,t:Decode/tri3/wavFiles/wav1_fmmlr.ali 2>Decode/tri3/wavFiles/wav1_fmmlr_decode.log

Thanks and regards,

YiYang

Daniel Povey

Jun 3, 2019, 1:34:40 PM

to kaldi-help

It shouldn't take that long. I wouldn't be able to say why it's so slow until you figure out which specific program is taking most of the time, and show some of its log output.

Yi Yang

Jun 3, 2019, 9:41:43 PM

to kaldi-help

Hi Dan,

I managed to make it decode faster by adding the "--max-active=4000" option.

Also, is it OK to do the decoding the way I did above?

Thanks and regards,

YiYang

Daniel Povey

Jun 3, 2019, 10:18:27 PM

to kaldi-help

Yes, it's fine. You may get a very small degradation in WER; you would have to check.

I suspect the reason for the slowness is that you are using an acoustic scale of about 0.075, which is on the low side; so you should reduce the beam proportionally. E.g. I'd use a beam of maybe 12 with that.
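
To put a number on "proportionally": a lower acoustic scale shrinks the score differences between competing hypotheses, so more of them survive a fixed beam. The sketch below scales a reference beam by the ratio of acoustic scales. The reference pairing (beam 16 at acoustic scale 0.1) is an assumption chosen for illustration, and scaled_beam is a hypothetical helper, not a Kaldi tool.

```shell
#!/usr/bin/env bash
# Arithmetic only; this does not call any Kaldi binary.
# Assumption: a beam of 16 is appropriate at a reference acoustic scale of 0.1.

# scaled_beam REF_BEAM ACWT REF_ACWT: shrink the beam in proportion
# to the acoustic scale, printed with one decimal place.
scaled_beam() {
  awk -v b="$1" -v a="$2" -v r="$3" 'BEGIN { printf "%.1f\n", b * a / r }'
}

scaled_beam 16 0.0769 0.1   # acoustic scale used in the scripts in this thread
scaled_beam 16 0.075 0.1    # the rounded value mentioned in the reply
```

Both results come out close to 12, in line with the suggestion above. The effect on WER and decoding time would still need to be measured on real data.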
