Word timings of nnet3 decoding

sandeep cb

May 24, 2018, 12:03:19 AM
to kaldi-help

I am trying to get word timings for the decoded output of a chain model.
The output of nbest-to-ctm seems to be wrong: the word timings are not correct.
What am I doing wrong?

Here is the script I am running to get the word timings:

online2-wav-nnet3-latgen-faster --do-endpointing=false \
  --online=false \
  --config=conf/decode.config \
  --max-active=7000 --beam=15.0 --lattice-beam=6.0 \
  --mfcc-config=conf/mfcc_hires.conf \
  --feature-type=mfcc --frame-subsampling-factor=3 \
  --acoustic-scale=1.0 --word-symbol-table=data/decode_460/tdnn_sp/words.txt \
  --feature-type=mfcc \
  --ivector-extraction-config=data/decode_460/conf/ivector_extractor.conf \
  data/decode_460/tdnn_sp/final.mdl data/decode_460/tdnn_sp/HCLG.fst \
  "ark:echo utterance-id1 utterance-id1|" "scp:echo utterance-id1 /home/gnani/Downloads/SoundRecord-2018-05-08-12-07-02_8k.wav|" \
  ark:| lattice-1best ark:- ark: | \
  lattice-align-words data/lang/phones/word_boundary.int data/decode_460/tdnn_sp/final.mdl ark:- ark:- | \
  nbest-to-ctm --frame-shift=0.01 --print-silence=true ark:- - | \
  utils/int2sym.pl -f 5 data/decode_460/tdnn_sp/words.txt

The output I am getting is this:

LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:Collapse():nnet-utils.cc:1314) Added 1 components, removed 2
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:CompileLooped():nnet-compile-looped.cc:334) Spent 0.187792 seconds in looped compilation.
utterance-id1 WHICH COINCIDENTALLY PRETTY MUCH MATCHES THE WORDS DEFINITION
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:main():online2-wav-nnet3-latgen-faster.cc:286) Decoded utterance utterance-id1
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:Print():online-timing.cc:55) Timing stats: real-time factor for offline decoding was 0.304333 = 1.64477 seconds  / 5.4045 seconds.
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:main():online2-wav-nnet3-latgen-faster.cc:292) Decoded 1 utterances, 0 with errors.
LOG (online2-wav-nnet3-latgen-faster[5.4.122~3-08012]:main():online2-wav-nnet3-latgen-faster.cc:294) Overall likelihood per frame was 2.32917 per frame over 180 frames.
utterance-id1 1 0.000 0.380 <eps>
utterance-id1 1 0.380 0.110 WHICH
utterance-id1 1 0.490 0.270 COINCIDENTALLY
utterance-id1 1 0.760 0.110 PRETTY
utterance-id1 1 0.870 0.090 MUCH
utterance-id1 1 0.960 0.130 MATCHES
utterance-id1 1 1.090 0.040 THE
utterance-id1 1 1.130 0.160 WORDS
utterance-id1 1 1.290 0.030 <eps>
utterance-id1 1 1.320 0.250 DEFINITION
utterance-id1 1 1.570 0.230 <eps>
LOG (lattice-1best[5.4.122~3-08012]:main():lattice-1best.cc:92) Done converting 1 to best path, 0 had errors.
LOG (lattice-align-words[5.4.122~3-08012]:main():lattice-align-words.cc:125) Successfully aligned 1 lattices; 0 had errors.
LOG (nbest-to-ctm[5.4.122~3-08012]:main():nbest-to-ctm.cc:119) Converted 1 linear lattices to ctm format; 0 had errors.

The sox output of the file:

Input File     : '/home/gnani/Downloads/SoundRecord-2018-05-08-12-07-02_8k.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:00:05.40 = 43236 samples ~ 405.337 CDDA sectors
File Size      : 86.5k
Bit Rate       : 128k
Sample Encoding: 16-bit Signed Integer PCM




Daniel Povey

May 24, 2018, 12:39:40 AM
to kaldi-help
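From `--acoustic-scale=1.0` I deduce that this is a chain model.
The frame shift should be 0.03, i.e. `--frame-shift=0.03` for nbest-to-ctm: with `--frame-subsampling-factor=3`, each decoded frame covers three 10 ms feature frames.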
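
As a rough sanity check of that number, the arithmetic can be sketched in shell. This is only an illustration, assuming the feature frame shift is 10 ms; the variable names and the `rescale_ctm` helper are hypothetical and not part of Kaldi. The clean fix is still to rerun nbest-to-ctm with `--frame-shift=0.03`; the helper merely rescales a CTM that was already written with 0.01.

```shell
# Assumption: the MFCC features use a 10 ms frame shift.
base_shift=0.01
subsampling=3     # --frame-subsampling-factor from the decoding command
frames=180        # "over 180 frames" in the decoder log

# Frame shift of the decoder output, and the audio span the frames cover.
ctm_shift=$(awk -v b="$base_shift" -v s="$subsampling" 'BEGIN { printf "%.2f", b * s }')
span=$(awk -v f="$frames" -v s="$ctm_shift" 'BEGIN { printf "%.2f", f * s }')
echo "frame shift: $ctm_shift s, decoded span: $span s"

# Hypothetical helper: rescale a 5-field CTM written with --frame-shift=0.01.
# Field 3 is the start time and field 4 the duration, both in seconds.
rescale_ctm() {
  awk -v k="$subsampling" '{ printf "%s %s %.3f %.3f %s\n", $1, $2, $3 * k, $4 * k, $5 }'
}

printf '%s\n' 'utterance-id1 1 0.380 0.110 WHICH' | rescale_ctm
```

With these numbers the decoded span comes out at 5.40 s, which matches the 5.4045 s reported in the timing stats, whereas the CTM written with 0.01 ends at 1.570 + 0.230 = 1.800 s. The rescaled first word starts at 1.140 s and lasts 0.330 s.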

sandeep cb

May 24, 2018, 5:37:23 AM
to kaldi-help
Thanks Dan,
That worked pretty well.