Zero duration words with mbr decode and print-silence=true


re...@speechmatics.com

Sep 7, 2022, 11:45:38 AM
to kaldi-help
Hi,

On some occasions, when I run a command such as:

lattice-to-ctm-conf --frame-shift=0.03 --decode-mbr=true --print-silence=true ark:zero-bug.lat out.ctm

I get words with zero duration in the output. I've attached a lattice that reproduces the issue.
With that lattice I get the following output:

zero-bug 1 0.00 0.01 0 1.00
zero-bug 1 0.01 2.49 19 0.99
zero-bug 1 2.50 0.24 0 0.52
zero-bug 1 2.74 0.88 99246 0.72
zero-bug 1 3.61 0.00 0 0.91
zero-bug 1 3.61 0.00 188695 0.73
zero-bug 1 3.64 0.27 0 1.00
zero-bug 1 3.91 0.17 94502 0.73
zero-bug 1 4.08 0.00 0 1.00
zero-bug 1 4.08 0.20 125521 0.71
zero-bug 1 4.28 0.00 0 0.99
zero-bug 1 4.28 0.13 154265 0.71
zero-bug 1 4.52 0.28 0 1.00
zero-bug 1 4.80 0.51 210161 1.00
zero-bug 1 5.31 0.00 0 1.00
zero-bug 1 5.31 0.24 77196 1.00
zero-bug 1 5.55 0.00 0 1.00
zero-bug 1 5.55 0.12 216084 1.00
zero-bug 1 5.67 0.00 0 1.00
zero-bug 1 5.67 0.60 195114 1.00
zero-bug 1 6.27 0.00 0 1.00
zero-bug 1 6.27 0.51 197104 1.00
zero-bug 1 6.78 0.00 0 1.00
zero-bug 1 6.78 0.57 87856 1.00
zero-bug 1 7.35 0.36 0 0.60
zero-bug 1 7.71 0.39 43190 1.00
zero-bug 1 8.10 0.00 0 1.00
zero-bug 1 8.10 0.51 79149 1.00
zero-bug 1 8.61 0.00 0 1.00
zero-bug 1 8.61 0.18 216084 1.00
zero-bug 1 8.79 0.00 0 1.00
zero-bug 1 8.79 0.45 123646 1.00
zero-bug 1 9.24 0.00 0 1.00
zero-bug 1 9.24 0.60 103807 1.00
zero-bug 1 9.84 0.00 0 1.00

In that example, word-id 188695 has zero duration.
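
For anyone who wants to check their own output for the same symptom, here is a minimal sketch that scans CTM text for zero-duration words. It assumes the six-column layout shown above (utterance, channel, start, duration, word-id, confidence) and that word-id 0 is the silence entry emitted by --print-silence=true; the helper names are hypothetical and not part of Kaldi.

```python
from typing import List, NamedTuple


class CtmEntry(NamedTuple):
    utt: str
    channel: str
    start: float
    duration: float
    word: str
    confidence: float


def parse_ctm(text: str) -> List[CtmEntry]:
    """Parse six-column CTM lines; blank or malformed lines are skipped."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue
        utt, channel, start, dur, word, conf = fields
        entries.append(
            CtmEntry(utt, channel, float(start), float(dur), word, float(conf))
        )
    return entries


def zero_duration_words(entries, silence_ids=("0",), eps=1e-6):
    """Return non-silence entries whose duration is (numerically) zero.

    Assumption: word-id "0" marks silence in this CTM.
    """
    return [e for e in entries if e.word not in silence_ids and e.duration <= eps]


# A few lines copied from the output above.
sample = """\
zero-bug 1 2.74 0.88 99246 0.72
zero-bug 1 3.61 0.00 0 0.91
zero-bug 1 3.61 0.00 188695 0.73
zero-bug 1 3.64 0.27 0 1.00
"""

bad = zero_duration_words(parse_ctm(sample))
for e in bad:
    print(e.utt, e.start, e.word)
```

This only detects the problem; it does not change how the times are computed.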

What's the best way to fix that?
zero-bug.lat

Rémi Francis

Sep 20, 2022, 9:19:20 AM
to kaldi...@googlegroups.com
Anyone?


nshm...@gmail.com

Sep 20, 2022, 12:42:53 PM
to kaldi-help
The whole confusion network / MBR thing doesn't play well with durations/timestamps. In a bucket of words the times are not precise and don't map directly to the actual signal, since the algorithm tries to align words to one another.

For that reason we (Vosk) moved away from MBR in the code that requires more precise times and confidences. It is better to stick with lattices or nbest with exact timestamps and compute confidence on top of that.
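
To illustrate what "compute confidence on top of nbest" could look like, here is a rough sketch, not Vosk's or Kaldi's actual code. It assumes each n-best hypothesis comes with a total log-score (higher is better; Kaldi costs would need to be negated) and a list of words with exact start times and positive durations. The confidence of a word in the best hypothesis is taken as the summed posterior of all hypotheses that contain the same word at an overlapping time. All names here are hypothetical.

```python
import math
from typing import List, Tuple

Word = Tuple[str, float, float]  # (word, start_seconds, duration_seconds)
Hypothesis = Tuple[float, List[Word]]  # (total log-score, words)


def hypothesis_posteriors(log_scores: List[float]) -> List[float]:
    """Softmax over hypothesis log-scores (shifted by the max for stability)."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def overlaps(a: Word, b: Word) -> bool:
    """True if the two time intervals intersect (durations assumed > 0)."""
    return a[1] < b[1] + b[2] and b[1] < a[1] + a[2]


def nbest_confidences(nbest: List[Hypothesis]) -> List[Tuple[Word, float]]:
    """Pair each word of the top-scoring hypothesis with a confidence."""
    posts = hypothesis_posteriors([score for score, _ in nbest])
    best_index = max(range(len(nbest)), key=lambda i: posts[i])
    result = []
    for w in nbest[best_index][1]:
        # Sum the posterior mass of hypotheses that agree on this word
        # at roughly the same position in time.
        conf = sum(
            p
            for p, (_, words) in zip(posts, nbest)
            if any(v[0] == w[0] and overlaps(v, w) for v in words)
        )
        result.append((w, conf))
    return result


# Toy 2-best list with made-up scores and words.
nbest = [
    (-1.0, [("hello", 0.0, 0.5), ("world", 0.5, 0.4)]),
    (-2.0, [("hello", 0.0, 0.5), ("word", 0.5, 0.4)]),
]
confidences = nbest_confidences(nbest)
for (word, start, dur), conf in confidences:
    print(f"{word} {start:.2f} {dur:.2f} {conf:.2f}")
```

Because the timestamps are taken from each hypothesis as-is rather than from aligned confusion-network bins, a word keeps the duration it had in the lattice path it came from.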
