CTM file with words, times and confidences for each n-best path

Haritz Arzelus

Oct 9, 2019, 10:45:42 AM

to kaldi-help

Hi all,

I want to run some experiments using CTM files that contain words, timecodes and confidence scores for each of the N-best paths. I can create such a file from the lattice with the script steps/get_ctm_conf.sh, but the resulting CTM file only contains what I need for the 1-best path. I'm also interested in the other paths (e.g. the top 100), but I'm not sure how to get them.

So far, I have tried modifying steps/get_ctm_conf.sh in two ways:

    1) I inserted lattice-to-nbest into the pipeline, with no other changes:
        lattice-prune ... | lattice-to-nbest --n=100 ... | lattice-align-words ... | lattice-align-words ... | lattice-to-ctm-conf ... | ...

    2) I modified the pipeline:
        lattice-prune ... | lattice-to-nbest --n=100 ... | nbest-to-ctm ... | ...

Approach 1) produces a well-formed CTM file, but something is wrong because the transcriptions are worse. For instance, the WER of the 1-best transcription from that CTM file is 6 points higher than the WER I get from the original CTM file, created with the unmodified steps/get_ctm_conf.sh.

Approach 2) produces a CTM file without confidence scores, and the transcriptions are worse too.

I checked the forum and found similar questions (https://groups.google.com/forum/#!searchin/kaldi-help/lattice-to-nbest%7Csort:date/kaldi-help/II24CNQYihc/ZcMqCBHJAAAJ) but I don't see clearly how this should be done.

Thank you for any help or advice.

Daniel Povey

Oct 9, 2019, 10:58:20 AM

to kaldi-help

> I want to run some experiments using CTM files that contain words, timecodes and confidence scores for each of the N-best paths. I can create such a file from the lattice with the script steps/get_ctm_conf.sh, but the resulting CTM file only contains what I need for the 1-best path. I'm also interested in the other paths (e.g. the top 100), but I'm not sure how to get them.

> So far, I have tried modifying steps/get_ctm_conf.sh in two ways:

> 1) I inserted lattice-to-nbest into the pipeline, with no other changes:
> lattice-prune ... | lattice-to-nbest --n=100 ... | lattice-align-words ... | lattice-align-words ... | lattice-to-ctm-conf ... | ...

> 2) I modified the pipeline:
> lattice-prune ... | lattice-to-nbest --n=100 ... | nbest-to-ctm ... | ...

> Approach 1) produces a well-formed CTM file, but something is wrong because the transcriptions are worse. For instance, the WER of the 1-best transcription from that CTM file is 6 points higher than the WER I get from the original CTM file, created with the unmodified steps/get_ctm_conf.sh.

Perhaps you forgot to set the acoustic scale correctly for lattice-to-nbest.
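
One way to read this, as a minimal sketch: in the modified pipeline, lattice-prune is told the scale but lattice-to-nbest is not, so it would rank the paths with its default scaling. The snippet below assumes lattice-to-nbest accepts --acoustic-scale (the reciprocal of the LM weight); the LM weight of 10 is an arbitrary example value. It only prints the stage it would add, so it runs without Kaldi installed.

```shell
#!/usr/bin/env bash
# Sketch only: prints the n-best stage instead of running it.
# lmwt=10 is an example value, not a recommendation.
lmwt=10
n=100

# lattice-prune is given --inv-acoustic-scale=LMWT elsewhere in the pipeline;
# for lattice-to-nbest we assume the plain --acoustic-scale option, i.e. 1/LMWT.
acwt=$(awk -v w="$lmwt" 'BEGIN { printf "%.4f", 1.0 / w }')

nbest_stage="lattice-to-nbest --acoustic-scale=$acwt --n=$n ark:- ark:-"
echo "$nbest_stage"
```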

> Approach 2) produces a CTM file without confidence scores, and the transcriptions are worse too.

You actually still need to use lattice-to-ctm-conf. You should use
the n-best output as the 2nd input of lattice-to-ctm-conf, but the 1st
input should be the original lattice.
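
A sketch of how this could be wired up, under assumptions that are not stated in the thread: that lattice-to-nbest names its outputs utt-1, utt-2, and so on; that nbest-to-linear can dump the word sequences as a text table; and that lattice-to-ctm-conf matches its second input to the lattice by utterance id, so the rank suffix has to be stripped and one rank handled per call. The helper name words_for_rank and all file names are hypothetical. Only the key handling is executable here; the Kaldi calls are shown as comments.

```shell
#!/usr/bin/env bash
# words_for_rank RANK
# Reads lines of the form "utt-RANK w1 w2 ..." (integer word ids) on stdin,
# keeps only the requested rank, and restores the original utterance id.
words_for_rank() {
  awk -v r="$1" '{
    i = match($1, /-[0-9]+$/)            # position of the trailing "-RANK"
    if (i == 0) next                     # not an n-best style key, skip it
    if (substr($1, i + 1) != r) next     # belongs to a different rank
    $1 = substr($1, 1, i - 1)            # strip the rank suffix from the key
    print
  }'
}

# Toy input standing in for the output of (assumed usage):
#   nbest-to-linear ark:nbest.lats ark:/dev/null ark,t:nbest_words.txt
rank2=$(printf '%s\n' 'spk1-utt1-1 7 8 9' 'spk1-utt1-2 7 8' 'spk1-utt2-1 4 5' \
          | words_for_rank 2)
echo "$rank2"

# Then, once per rank: original word-aligned lattice first, hypothesis second.
# --decode-mbr=false is assumed to keep the supplied word sequence unchanged.
#   lattice-to-ctm-conf --decode-mbr=false --inv-acoustic-scale=$lmwt \
#     ark:aligned.lats ark:rank2.int rank2.ctm
```

Running one call per rank keeps the second input keyed exactly like the lattice, at the cost of reading the lattices once for every rank.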

> I checked the forum and found similar questions (https://groups.google.com/forum/#!searchin/kaldi-help/lattice-to-nbest%7Csort:date/kaldi-help/II24CNQYihc/ZcMqCBHJAAAJ) but I don't see clearly how this should be done.

> Thank you for any help or advice.


Haritz Arzelus

Oct 9, 2019, 11:28:02 AM

to kaldi-help

I created a modified script to do 1), based on steps/get_ctm_conf.sh. The script contains a new line (the others are exactly the same):

...

min_lmwt=5
max_lmwt=20

...

if [ -f $lang/phones/word_boundary.int ]; then
    $cmd LMWT=$min_lmwt:$max_lmwt $dir/scoring/log/get_ctm.LMWT.log \
      mkdir -p $dir/score_LMWT/ '&&' \
      lattice-prune --inv-acoustic-scale=LMWT --beam=5 "ark:gunzip -c $dir/lat.*.gz|" ark:- \| \
      lattice-to-nbest --n=1000 ark:- ark:- \| \
      lattice-align-words $lang/phones/word_boundary.int $model ark:- ark:- \| \
      lattice-to-ctm-conf $frame_shift_opt --decode-mbr=true --inv-acoustic-scale=LMWT ark:- - \| \
      utils/int2sym.pl -f 5 $lang/words.txt \| \
      $filter_cmd '>' $dir/score_LMWT/$name.nbest.ctm || exit 1;
...

I call the script in the same way I call steps/get_ctm_conf.sh:

steps/get_ctm_conf_NBEST1000.sh --cmd "run.pl" \
--use-segments false \
../data/test \
../data_trigram/lang \
../exp/chain/tdnn_sp/decode_test

Is this pipeline the correct way to do what I want?

Thanks


Daniel Povey

Oct 9, 2019, 11:34:56 AM

to kaldi-help

That's not right. Read what I wrote more carefully.


Haritz Arzelus

Oct 11, 2019, 5:45:26 AM

to kaldi-help

I set the acoustic scale for lattice-to-nbest and it fixed the problem.

Thanks!


ml colab

Aug 19, 2020, 5:43:12 AM

to kaldi-help

Hi Dan, I have reference transcripts for my audio, and I want confidence scores for the words in those actual transcripts rather than for the predicted ones.

For example, I ran steps/conf/get_ctm_conf.sh and got something like this:

user_t1 1 0.58 0.47 GROINING 1.00
user_t1 1 1.05 0.36 SERF 0.47
user_t1 1 1.41 0.21 MY 1.00
user_t1 1 1.62 0.18 SCYLLA 0.55

But the actual transcript is "Good Morning Sir Myself", so the output should look like this:

user_t1 1 0.58 0.47 Good 0.80
user_t1 1 1.05 0.36 Morning 0.87
user_t1 1 1.41 0.21 Sir 0.79
user_t1 1 1.62 0.18 Myself 0.55

Is it possible to get this kind of result?

Daniel Povey

Aug 20, 2020, 12:53:27 AM

to kaldi-help

Looks like maybe you were running without a language model or with LM weight set to zero.

AFAIK there isn't a script that will get confidences on another transcript, although it's possible to do by using the <1best-rspecifier> option to lattice-to-ctm-conf. You'd have to get the alignments as lattices using steps/align_fmllr_lats.sh or steps/nnet3/align_lats.sh.
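
As a rough sketch of the <1best-rspecifier> route, not an existing Kaldi recipe: the reference words are mapped to integer ids with utils/sym2int.pl and passed as the second input, with --decode-mbr=false assumed to keep the supplied word sequence. All paths and the LM weight are placeholders, every reference word is assumed to be in words.txt, and the lattice archive is assumed to be word-aligned already; which lattices to use there is left open. The snippet only assembles and prints the command.

```shell
#!/usr/bin/env bash
# Sketch only: assembles and prints the command, nothing from Kaldi is run.
# Every path and value here is a placeholder.
lang=data/lang
text=data/test/text          # reference transcripts: "utt-id word1 word2 ..."
lats=aligned_lats.ark        # word-aligned lattices, keyed by the same utt-ids
lmwt=10                      # example LM weight

# Second input: reference words as integer ids (assumes no OOV words).
ref_rspec="ark:utils/sym2int.pl -f 2- $lang/words.txt $text |"

# First input is the lattice, second the reference, last the output CTM.
ctm_cmd="lattice-to-ctm-conf --decode-mbr=false --inv-acoustic-scale=$lmwt"
ctm_cmd="$ctm_cmd ark:$lats '$ref_rspec' ref.ctm"
echo "$ctm_cmd"
```

The word column of the resulting CTM would hold integer ids; the script earlier in this thread converts them back with utils/int2sym.pl -f 5.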
