Different results from lattice-align-words and lattice-mbr-decode

Oleksandr Chekmez

Aug 30, 2019, 9:13:22 AM
to kaldi-help
Dear community,

I used the following command to get recognized text with timings and confidence:

lattice-align-words exp/$model/graph/phones/word_boundary.int \
     exp/$model/final.mdl \
     ark:"gunzip -c exp/$model/decode_result_$input_dir/lat.1.gz |" ark:- | \
lattice-to-ctm-conf --decode-mbr=false --acoustic-scale=1.0 --lm-scale=1.0 ark:- - | \
utils/int2sym.pl -f 5 exp/$model/graph/words.txt >$input_dir/ctm_with_conf.txt


and the following command to get sausage:
lattice-mbr-decode --acoustic-scale=1.0 --lm-scale=1.0 --one-best-times=false \
 ark:"gunzip -c exp/$model/decode_result_$input_dir/lat.1.gz |" \
 ark,t:/dev/null ark:/dev/null \
 ark,t:|int2symBrackets.pl -f 1- exp/$model/graph/words.txt > $input_dir/sausage.txt

When I compare the results, I see differences that I don't understand:

ctm_with_conf.txt:
r36udm13pevo9e4scwqoxbt9 1 2.51 0.10 but 1.00 
r36udm13pevo9e4scwqoxbt9 1 2.61 0.12 both 1.00 
r36udm13pevo9e4scwqoxbt9 1 2.73 0.13 butter 0.24 
r36udm13pevo9e4scwqoxbt9 1 3.02 0.14 bowl 0.11 
r36udm13pevo9e4scwqoxbt9 1 3.38 0.20 both 1.00 
r36udm13pevo9e4scwqoxbt9 1 3.71 0.21 butterflies 0.99 
r36udm13pevo9e4scwqoxbt9 1 3.94 0.08 at 0.94 

sausage.txt:
but 1 
<eps> 1 
both 1 
<eps> 1 
butter 0.243563|bearer 0.1430227|better 0.1097486...
<eps> 0.9916875|it 0.004323407|you 0.001646348...
but 0.4604309|for 0.2363734|bowl 0.1035512....
<eps> 0.5185794|for 0.2392604|full 0.05286708...
both 0.9996883|for 0.0002321559|<eps> 7.743229e-05...
<eps> 0.999949|both 5.098683e-05 
butterflies 0.9853628|butterfly's 0.01124306|verifies 0.002124506...
<eps> 1 
at 0.9416133|<eps> 0.02277535|laugh 0.01777043...

I assume that the bins in sausage.txt whose top entry is <eps> are simply dropped by the lattice-align-words | lattice-to-ctm-conf pipeline. But why does that pipeline return the word "bowl" with confidence 0.11, while the sausage has the word "but" with confidence 0.46 at that position, and "bowl" is only the third option with confidence 0.10?
Both commands were run with the same acoustic-scale and lm-scale (1.0).

Thank you for your help! 

Daniel Povey

Aug 30, 2019, 2:36:43 PM
to kaldi-help
There are a couple of differences, apart from the format.

(1) lattice-to-ctm-conf is outputting the 1-best (Viterbi) path, not the minimum Bayes risk sequence, which would correspond to the most likely word in each bin, since you use --decode-mbr=false.

(2) lattice-mbr-decode is of course including the epsilon positions between words, since you have --one-best-times=false.

(3) The one-best path is different (because MBR vs. Viterbi), so the Levenshtein alignment of paths to that 1-best may be different; that explains why the posteriors are not exactly the same.
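
The two outputs can be put on a more comparable footing by taking the most likely entry of every sausage bin and dropping the bins won by <eps>; the result is the word sequence that the sausage itself implies. The following is a minimal sketch, not a Kaldi tool. It assumes the sausage has already been converted to the "word prob|word prob|..." layout shown in sausage.txt above, one bin per line, and the function name sausage_top1 is made up for illustration.

```shell
#!/usr/bin/env bash
# Print the most likely entry of each sausage bin, skipping bins whose
# most likely entry is <eps>.
# Assumed input (not a native Kaldi format): one bin per line, entries
# separated by "|", each entry written as "word prob".
sausage_top1() {
  awk -F'|' '
    {
      best_word = ""; best_prob = -1; best_str = ""
      for (i = 1; i <= NF; i++) {
        n = split($i, f, " ")          # f[1] = word, f[2] = probability
        if (n < 2) continue            # ignore empty or malformed entries
        if (f[2] + 0 > best_prob) {
          best_prob = f[2] + 0; best_word = f[1]; best_str = f[2]
        }
      }
      # Blank lines leave best_word empty and are skipped as well.
      if (best_word != "" && best_word != "<eps>") print best_word, best_str
    }' "$@"
}
```

Calling `sausage_top1 sausage.txt` (or piping the file in) prints one "word prob" line per non-epsilon bin, which can be compared by eye with the word and confidence columns of the CTM.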

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/a00b382e-6c74-430b-9f6f-45b00acdfa31%40googlegroups.com.

Aleksandr

Sep 16, 2019, 6:13:15 AM
to kaldi...@googlegroups.com
Dear Dan,

Thank you for pointing that out. I adjusted lattice-to-ctm-conf and now call it with --decode-mbr=true, but the results are still different:

lattice-align-words exp/$model/graph/phones/word_boundary.int \
     exp/$model/final.mdl \
     ark:"gunzip -c exp/$model/decode_result_$input_dir/lat.1.gz |" ark:- | \
lattice-to-ctm-conf --decode-mbr=true --acoustic-scale=1.0 --lm-scale=1.0 ark:- - | \
utils/int2sym.pl -f 5 exp/$model/graph/words.txt >$input_dir/ctm_with_conf.txt

ctm_with_conf.txt:
r36udm13pevo9e4scwqoxbt9 1 2.51 0.10 but 1.00 
r36udm13pevo9e4scwqoxbt9 1 2.61 0.12 both 1.00
r36udm13pevo9e4scwqoxbt9 1 2.88 0.07 but 0.44
r36udm13pevo9e4scwqoxbt9 1 3.02 0.14 for 0.48
r36udm13pevo9e4scwqoxbt9 1 3.38 0.20 both 1.00
r36udm13pevo9e4scwqoxbt9 1 3.71 0.21 butterflies 0.99 

sausage.txt: 
but 1
<eps> 1
both 1
<eps> 1
butter 0.2435969|bearer 0.1430226|better 0.1097463|buried 0.09421764...
<eps> 0.9916852|it 0.004323332|you 0.00164771|the 0.0008785275...
but 0.4604488|for 0.2363635|bowl 0.1035855|the 0.03903381|ball 0.02843666...
<eps> 0.5185506|for 0.2392767|full 0.05286919|paul 0.02364405...
both 0.9996886|for 0.0003079867|<eps> 2.564478e-06...
<eps> 0.999949|both 5.098998e-05
butterflies 0.9854357|butterfly's 0.01117846|verifies 0.002125125... 

Daniel Povey

Sep 16, 2019, 6:37:59 AM
to kaldi-help
Can you be more specific about what you expected to be the same, that
is different? Bear in mind that rounding may affect the results
(lattice-to-ctm-conf takes an option --num-digits or something, to
control that).

Aleksandr

Sep 16, 2019, 8:13:44 AM
to kaldi...@googlegroups.com
When I run lattice-to-ctm-conf, I get the recognized text with confidences, and after the word "both" I get the word "but" with confidence 0.44.

At the same time, when I use lattice-mbr-decode, I get the following set of candidate words for that position: "butter 0.2435969|bearer 0.1430226|better 0.1097463...". I expected this list to have the word "but" first, with the highest confidence.

Daniel Povey

Sep 16, 2019, 9:57:55 AM
to kaldi-help
I think you are comparing the bins, there is a "but" in the next bin.
Probably in that bin, the 1-best was just epsilon.

Aleksandr

Sep 17, 2019, 12:47:17 AM
to kaldi...@googlegroups.com
Here are my recognition results in a table, so that the bins from both outputs can be seen side by side:
BIN #  Text from lattice-to-ctm-conf  Text from lattice-mbr-decode
1   <eps> 1 
2 r36udm13pevo9e4scwqoxbt9 1 0.2400000 0.1399825 water 0.7058309  water 0.7058311|<eps> 0.1304389|lawyer 0.06727263…
3   <eps> 0.9943253|of 0.005190216|or 0.0004845159 
4 r36udm13pevo9e4scwqoxbt9 1 0.3815345 0.2683378 flies 0.9950237  flies 0.9950237|lies 0.004976265 
5   <eps> 1 
6 r36udm13pevo9e4scwqoxbt9 1 0.6500000 0.0735111 and 0.9410028  and 0.9410028|advanced 0.01638742|at 0.009151404|<eps> 0.007322763…
7   <eps> 1 
8 r36udm13pevo9e4scwqoxbt9 1 0.7235111 0.2064805 laws 0.2529200  laws 0.2529201|mars 0.2187698|mosques 0.1760647|loss 0.08817837…
9   <eps> 0.9988577|our 0.001142287 
10 r36udm13pevo9e4scwqoxbt9 1 0.9300991 0.0899001 are 0.8966429  are 0.896643|our 0.09158084|<eps> 0.004654972|hour 0.003700392…
11   <eps> 0.9376202|in 0.06237984 
12 r36udm13pevo9e4scwqoxbt9 1 1.0199993 0.2100008 insights 0.4904164  insights 0.4904158|insects 0.2674308|safe 0.08533115…
13   <eps> 0.9510205|safe 0.0363067|says 0.003859417|sex 0.003821309…
14 r36udm13pevo9e4scwqoxbt9 1 1.3500000 0.1396857 serve 0.6525384  serve 0.6525372|some 0.1198894|so 0.06815724|saw 0.06580546…
15   <eps> 0.9999602|some 3.977039e-05 
16 r36udm13pevo9e4scwqoxbt9 1 1.4896857 0.1803143 people 1.0000000  people 1 
17   <eps> 1 
18 r36udm13pevo9e4scwqoxbt9 1 1.6700000 0.1199950 get 1.0000000  get 1 
19   <eps> 1 
20 r36udm13pevo9e4scwqoxbt9 1 1.7900000 0.1200000 the 0.9982793  the 0.9982793|though 0.0006112218|their 0.0006033547|<eps> 0.0005061025 
21   <eps> 1 
22 r36udm13pevo9e4scwqoxbt9 1 1.9799999 0.2199133 confused 0.9847925  confused 0.9847925|fees 0.01042392|feast 0.004177736|fields 0.0006058776 
23   <eps> 1 
24 r36udm13pevo9e4scwqoxbt9 1 2.1999133 0.0600867 with 1.0000000  with 1 
25   <eps> 1 
26 r36udm13pevo9e4scwqoxbt9 1 2.2600000 0.0700000 each 1.0000000  each 1 
27   <eps> 1 
28 r36udm13pevo9e4scwqoxbt9 1 2.3299999 0.0900000 other 1.0000000  other 1 
29   <eps> 1 
30 r36udm13pevo9e4scwqoxbt9 1 2.5100000 0.1000000 but 1.0000000  but 1 
31   <eps> 1 
32 r36udm13pevo9e4scwqoxbt9 1 2.6099999 0.1202353 both 1.0000000  both 1 
33   <eps> 1 
34 r36udm13pevo9e4scwqoxbt9 1 2.8792298 0.0700555 but 0.4379970  butter 0.2435969|bearer 0.1430226|better 0.1097463|buried 0.09421764…
35   <eps> 0.9916852|it 0.004323332|you 0.00164771|the 0.0008785275…
36 r36udm13pevo9e4scwqoxbt9 1 3.0200000 0.1398901 for 0.4767899  but 0.4604488|for 0.2363635|bowl 0.1035855|the 0.03903381…
37   <eps> 0.5185506|for 0.2392767|full 0.05286919|paul 0.02364405….
38 r36udm13pevo9e4scwqoxbt9 1 3.3799996 0.2000000 both 1.0000000  both 0.9996886|for 0.0003079867|<eps> 2.564478e-06|also 5.54811e-07…
39   <eps> 0.999949|both 5.098998e-05 
40 r36udm13pevo9e4scwqoxbt9 1 3.7099998 0.2100000 butterflies 0.9854556  butterflies 0.9854357|butterfly's 0.01117846|verifies 0.002125125|<eps> 0.0009222387….
41   <eps> 1 
42 r36udm13pevo9e4scwqoxbt9 1 3.9399998 0.0800000 at 0.9418062  at 0.9418063|<eps> 0.05603137|butterfly's 0.001223754…
43   <eps> 1 
44 r36udm13pevo9e4scwqoxbt9 1 4.0200000 0.1500000 bath 0.9392448  bath 0.9392449|<eps> 0.02396945|laugh 0.01915774….
45   <eps> 1 
46 r36udm13pevo9e4scwqoxbt9 1 4.2700000 0.1007605 sir 0.9014515  sir 0.9014517|sow 0.03798443|<eps> 0.03339657….

As you can see, bins 1-33 and 38-46 match perfectly, but bins 34 and 36 are different: both the words and the confidences differ.
Am I doing something wrong? Is some parameter missing, or is this a possible situation?

Just in case, here are the commands I used again.
To get the recognized text with timings and confidences:
lattice-align-words exp/$model/graph/phones/word_boundary.int \
     exp/$model/final.mdl \
     ark:"gunzip -c exp/$model/decode_result_$input_dir/lat.1.gz |" ark:- | \
lattice-to-ctm-conf --decode-mbr=true --acoustic-scale=1.0 --lm-scale=1.0 ark:- - | \
utils/int2sym.pl -f 5 exp/$model/graph/words.txt >$input_dir/ctm_with_conf.txt

To get the "sausage":
lattice-mbr-decode --acoustic-scale=1.0 --lm-scale=1.0 --one-best-times=false \
 ark:"gunzip -c exp/$model/decode_result_$input_dir/lat.1.gz |" \
 ark,t:/dev/null ark:/dev/null \
 ark,t:|int2symBrackets.pl -f 1- exp/$model/graph/words.txt > $input_dir/sausage.txt
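
The side-by-side comparison in the table above can also be automated. The sketch below pairs the i-th CTM line with sausage bin 2i, which assumes that the bins alternate between epsilon slots and word slots, starting with an epsilon slot, as they do in the table. That pairing rule, the function name compare_ctm_sausage, and the "word prob|word prob|..." sausage layout are assumptions made for illustration, not part of Kaldi.

```shell
#!/usr/bin/env bash
# Compare each CTM word with the top entry of the sausage bin it is
# assumed to correspond to (CTM line i <-> sausage bin 2*i).
#   $1: sausage file, one bin per line, "word prob|word prob|..."
#   $2: CTM file, "utt channel start duration word confidence"
compare_ctm_sausage() {
  awk '
    FNR == NR {                          # first file: sausage bins
      if ($0 ~ /^[[:space:]]*$/) next    # skip blank lines
      nbins++
      n = split($0, entries, "|")
      best_prob = -1
      for (i = 1; i <= n; i++) {
        split(entries[i], f, " ")        # f[1] = word, f[2] = probability
        if (f[2] + 0 > best_prob) {
          best_prob = f[2] + 0; top[nbins] = f[1]; prob[nbins] = f[2]
        }
      }
      next
    }
    NF >= 6 {                            # second file: CTM lines
      bin = 2 * (++nwords)
      status = ($5 == top[bin]) ? "same" : "DIFF"
      # bin, CTM word, CTM confidence, sausage top word, its probability
      print bin, $5, $6, top[bin], prob[bin], status
    }' "$1" "$2"
}
```

Run as `compare_ctm_sausage sausage.txt ctm_with_conf.txt`. On the bins listed in the table, this pairing should report bins 34 and 36 as DIFF and the remaining word bins as the same.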


Daniel Povey

Sep 17, 2019, 8:52:44 AM
to kaldi-help
I'm not sure; it could be that there was a tie in the Levenshtein alignment algorithm and due to numerical roundoff, it took one path vs. the
other.  (I thought we had a mechanism to make this deterministic, but  I may have that wrong). If it's some kind of tie, I'd expect that the 
MBR objective for that sentence would be quite close for both versions.  (May be visible at high enough debug level.)
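
As a toy illustration of how such a tie can arise, the sketch below uses plain word-level Levenshtein distance between two fixed word sequences, which is much simpler than Kaldi's MBR computation over a lattice. It reports the minimum edit cost and how many distinct alignments reach it. The function name and the example sequences are made up for illustration.

```shell
#!/usr/bin/env bash
# Print "<edit distance> <number of optimal alignments>" for two
# space-separated word sequences (unit cost for substitution, insertion
# and deletion). This is a toy model, not Kaldi's MBR code.
count_optimal_alignments() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, " "); m = split(b, y, " ")
    # d[i,j]: edit distance between the first i words of a and the
    # first j words of b; c[i,j]: number of alignments with that cost.
    for (i = 0; i <= n; i++) { d[i,0] = i; c[i,0] = 1 }
    for (j = 0; j <= m; j++) { d[0,j] = j; c[0,j] = 1 }
    for (i = 1; i <= n; i++)
      for (j = 1; j <= m; j++) {
        sub_cost = d[i-1,j-1] + (x[i] != y[j])
        del_cost = d[i-1,j] + 1
        ins_cost = d[i,j-1] + 1
        best = sub_cost
        if (del_cost < best) best = del_cost
        if (ins_cost < best) best = ins_cost
        d[i,j] = best
        # Add up the counts of every predecessor that achieves the minimum.
        c[i,j] = (sub_cost == best) * c[i-1,j-1] \
               + (del_cost == best) * c[i-1,j] \
               + (ins_cost == best) * c[i,j-1]
      }
    print d[n,m], c[n,m]
  }'
}
```

For example, `count_optimal_alignments "butter bowl" "but for bowl"` prints `2 2`: the distance is 2, and it is reached both by aligning "butter" with "but" and by aligning "butter" with "for". When two alignments cost the same, which bin a word lands in depends on how the tie is broken.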

Dan



Amol Bole

Feb 27, 2020, 12:58:56 AM
to kaldi-help
Dear All,

I am also trying to get word-level confidences, but I have not been able to generate the lat.1.gz file for my single utterance.
Here is my code. Can you help me with this?

online2-wav-nnet3-latgen-faster --config=$dir/conf/online.conf --do-endpointing=false --frames-per-chunk=20 --extra-left-context-initial=0 --online=true --frame-subsampling-factor=3 --max-active=7000 --beam=15.0 --lattice-beam=6.0 --online=true --acoustic-scale=1.0 --word-symbol-table=$graph_dir/words.txt $dir/final.mdl $graph_dir/HCLG.fst ark:$wav_dir/spk2utt scp:$wav_dir/wav.scp ark,t:$wav_dir/trans.txt >$wav_dir/log.txt 2> $wav_dir/out.txt

lattice-align-words $graph_dir/phones/word_boundary.int \
     $dir/final.mdl \
     ark:"gunzip -c $dir/decode/lat.1.gz |" ark:- | \
lattice-to-ctm-conf --decode-mbr=false --acoustic-scale=1.0 --lm-scale=1.0 ark:- - | \
utils/int2sym.pl -f 5 $graph_dir/words.txt > $wav_dir/word_level_conf.txt

Can you help me generate lat.1.gz?

Jan Trmal

Feb 27, 2020, 11:48:03 AM
to kaldi-help
Hard to say when you keep the log to yourself :)
Look into the log file and the reason might be there...
y.
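
One thing that may be worth checking in the decoding command above, offered as an assumption that the thread does not confirm: the last positional argument of online2-wav-nnet3-latgen-faster is the lattice output, and the command writes it as text to $wav_dir/trans.txt, so nothing in it creates $dir/decode/lat.1.gz. A sketch of writing the lattices through gzip instead, reusing the options from the original command, could look like this. The default paths are placeholders, and the guard simply skips the decode where Kaldi is not on the PATH.

```shell
#!/usr/bin/env bash
# Placeholder defaults (hypothetical); set these to your own directories.
dir=${dir:-exp/model}
graph_dir=${graph_dir:-$dir/graph}
wav_dir=${wav_dir:-data/test}

# Lattice wspecifier: pipe the archive through gzip into the file that
# the lattice-align-words command later reads with gunzip.
lat_wspecifier="ark:|gzip -c > $dir/decode/lat.1.gz"

if command -v online2-wav-nnet3-latgen-faster >/dev/null 2>&1; then
  mkdir -p "$dir/decode"
  online2-wav-nnet3-latgen-faster --config="$dir/conf/online.conf" \
    --do-endpointing=false --frames-per-chunk=20 \
    --extra-left-context-initial=0 --online=true \
    --frame-subsampling-factor=3 --max-active=7000 --beam=15.0 \
    --lattice-beam=6.0 --acoustic-scale=1.0 \
    --word-symbol-table="$graph_dir/words.txt" \
    "$dir/final.mdl" "$graph_dir/HCLG.fst" \
    "ark:$wav_dir/spk2utt" "scp:$wav_dir/wav.scp" \
    "$lat_wspecifier" > "$wav_dir/log.txt" 2> "$wav_dir/out.txt"
else
  echo "online2-wav-nnet3-latgen-faster not found; skipping decode" >&2
fi
```

If the file is still missing after that, the log output is the place to look, as suggested above.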
