lattice explosion

Arkadi

Sep 5, 2024, 3:02:06 PM

to kaldi-help

Hi,

I have a task to recognize people counting in English with a closed vocabulary (numbers 0-29) in a supermarket. I collected around 30 hours of field data and chose to fine-tune a pre-trained model due to time and resource limitations.

The fine-tuning worked well (2% WER compared to 26% WER with the original model), but one issue persists.

During lattice generation in training, I get numerous WARNINGS like: "Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes..." even after reducing the beam to 4.

I tried augmenting the data with noise, but saw no improvement.
I also verified that there is no data mismatch.

How can a small vocabulary cause such lattice growth, especially when the model achieves 2% WER?

Any ideas on the issue?

1060147127

Sep 5, 2024, 3:02:32 PM

to 'Arkadi' via kaldi-help

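This is an automatic vacation reply from QQ Mail.

Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.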

Daniel Povey

Sep 7, 2024, 9:00:06 AM

to kaldi...@googlegroups.com

Very large lattices are typically caused by situations where a single word is repeated an unclear or variable number of times.
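One way to look for such utterances is to scan the training transcripts for long runs of the same word. The sketch below assumes the standard Kaldi `text` layout (`<utt-id> <word1> <word2> ...`); the function names and the threshold of 5 are hypothetical choices for illustration, not part of Kaldi.

```python
from itertools import groupby


def longest_repeat_runs(text_lines):
    """Map each utterance ID to (word, run_length) for its longest run of one word.

    Each line is assumed to follow the Kaldi `text` layout:
    "<utt-id> <word1> <word2> ...".
    """
    runs = {}
    for line in text_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and utterances with no words
        utt_id, words = parts[0], parts[1:]
        # groupby collapses consecutive identical words into one group each
        word, length = max(
            ((w, sum(1 for _ in grp)) for w, grp in groupby(words)),
            key=lambda pair: pair[1],
        )
        runs[utt_id] = (word, length)
    return runs


def flag_long_runs(text_lines, threshold=5):
    """Return sorted utterance IDs whose longest run has at least `threshold` words.

    The default threshold is an arbitrary illustrative cutoff.
    """
    runs = longest_repeat_runs(text_lines)
    return sorted(utt for utt, (_, n) in runs.items() if n >= threshold)
```

To try it on a data directory, pass the open transcript file, e.g. `flag_long_runs(open("data/train/text"))`, where the path is an assumed location. Utterances it flags are candidates for closer inspection, not proof that they cause the warning.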

In many situations it's not really necessary to generate a lattice, and just a 1-best would be OK; you could choose a different decoder that does not generate lattices at all.

--
Go to http://kaldi-asr.org/forums.html to find out how to join the kaldi-help group
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/3e6d7a5a-425b-4955-9ee8-197fa9058484n%40googlegroups.com.

Arkadi

Sep 7, 2024, 1:32:13 PM

to kaldi-help

Thank you for your response, Dan.

The scenario you described is indeed reflected in my training set. That said, the steps/nnet3/chain/train.py command requires the --lat-dir parameter, so I’m wondering how the training can proceed without providing the lattices. Is it possible to run the training process without generating lattices?

One workaround I’ve considered is to allow partial results (by setting --allow-partial=true when calling nnet3-latgen-faster in align_lats.sh) to prevent the issue from escalating.

Daniel Povey

Sep 8, 2024, 4:16:02 AM

to kaldi...@googlegroups.com

If this is happening during training, it usually indicates a problem, e.g. a capitalization mismatch that caused all the words to be mapped to OOV.

Warnings should be printed in some of the logs about this. Either that or some of the utterances had a huge number of consecutive repeats of certain words.
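The capitalization case can also be checked directly against the transcripts. The following is a minimal sketch, assuming a Kaldi-style `words.txt` (`<word> <int-id>` per line) and a `text` file (`<utt-id> <word1> ...`); `load_vocab` and `oov_report` are hypothetical helper names, not Kaldi tools.

```python
from collections import Counter


def load_vocab(words_txt_lines):
    """Collect the word column from lines shaped like "<word> <int-id>"."""
    return {line.split()[0] for line in words_txt_lines if line.strip()}


def oov_report(text_lines, vocab):
    """Return (oov_rate, oov_counts, case_only_counts) for the transcripts.

    `case_only_counts` holds OOV words that would be in the vocabulary
    if letter case were ignored, which points to a capitalization mismatch.
    """
    vocab_lower = {w.lower() for w in vocab}
    total = 0
    oov = Counter()
    case_only = Counter()
    for line in text_lines:
        # the first field is the utterance ID, the rest are words
        for word in line.split()[1:]:
            total += 1
            if word not in vocab:
                oov[word] += 1
                if word.lower() in vocab_lower:
                    case_only[word] += 1
    rate = sum(oov.values()) / total if total else 0.0
    return rate, oov, case_only
```

A high OOV rate where most entries also appear in `case_only_counts` would be consistent with the capitalization problem described above. The file locations (for example `data/lang/words.txt` and `data/train/text`) are assumptions about a typical setup.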
