Re: [kaldi-help] tdnn3 model training problem

Daniel Povey

Jul 22, 2021, 9:20:22 AM

to kaldi...@googlegroups.com

I responded in another thread; that script seems to deal poorly with the case where you retrain the base system. We will fix it.

On Thursday, July 22, 2021, Sergio Ornaque <sergio.orna...@gmail.com> wrote:

I'm trying to run the script mini_librispeech -> s5 -> run.sh to train a model and familiarize myself with the script, but it fails on the last step while training the model using the script local/chain2/run_tdnn.sh

I'm getting the following error:
run.pl: job failed, log is in exp/chain2/tdnn1a_sp/den_fsts/log/make_den_fst.log

This is the error log 'make_den_fst.log':
Number of states and arcs in phone-LM FST is 6342 and 41194
Number of states and arcs in context-dependent LM FST is 6342 and 41194
ERROR TransitionModel::TupleToTransitionState, tuple not found. (incompatible tree and model?)

The only thing I've changed is 'queue.pl' to 'run.pl' in the cmd.sh file.

Any help is appreciated

--
Go to http://kaldi-asr.org/forums.html to find out how to join the kaldi-help group
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/fd2a0953-7d29-4a6b-ad40-e24c52b73f92n%40googlegroups.com.

Gwel DG

Dec 1, 2021, 7:53:12 AM

to kaldi-help

Hi,

I stumbled on the same problem while running the mini_librispeech recipe on my own tiny dataset, with the "--use-gpu no" option set (for the steps/chain2/train.sh script) and with --nj set to 1 everywhere.
Any progress on that front?

(Sorry, I couldn't find the other thread you mentioned)

Thanks !

Srikanth R Madikeri

Dec 1, 2021, 10:26:40 AM

to kaldi...@googlegroups.com

Hello,

Is the problem that you are not able to rerun the script, or that it doesn't run with run.pl?

Srikanth

Gwel DG

Dec 3, 2021, 4:55:14 PM

to kaldi-help

Hi,

The script runs smoothly up to the chain model training stage (the last stage in run.sh).
It fails in the "run_tdnn.sh" script (my copy, "run_tdnn_nogpu.sh", has the "--use-gpu no" option set) at stage 16:

local/chain2/run_tdnn_nogpu.sh: creating denominator FST

run.pl: job failed, log is in exp/chain2/tdnn1a_sp/den_fsts/log/make_den_fst.log

Here's the error in exp/chain2/tdnn1a_sp/den_fsts/log/make_den_fst.log:

LOG (chain-make-den-fst[5.5.990~1-6e03a]:CreateDenominatorFst():chain-den-graph.cc:306) Number of states and arcs in phone-LM FST is 3730 and 12551
LOG (chain-make-den-fst[5.5.990~1-6e03a]:CreateDenominatorFst():chain-den-graph.cc:335) Number of states and arcs in context-dependent LM FST is 3730 and 12551
ERROR (chain-make-den-fst[5.5.990~1-6e03a]:TupleToTransitionState():transition-model.cc:262) TransitionModel::TupleToTransitionState, tuple not found. (incompatible tree and model?)

It seems like the exact same problem Sergio Ornaque had at the beginning of this thread.

Thanks !

Daniel Povey

Dec 4, 2021, 11:35:11 PM

to kaldi-help

Usually this would be some issue where you re-ran an earlier stage, overwriting something a later stage was using, like alignments, without re-running some intermediate stage.

So look at file times and the "--stage" parameter/variable.
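The file-time check can be sketched roughly as follows. This is only an illustration, not part of Kaldi: the function name `find_stale_outputs` is made up, and you have to supply the list of stage outputs yourself, in the order the stages ran. It flags any later-stage output that is older than an earlier-stage one, which suggests the earlier stage was re-run afterwards.

```python
import os
import sys
from typing import List, Tuple


def find_stale_outputs(paths_in_stage_order: List[str]) -> List[Tuple[str, str]]:
    """Return (earlier, later) pairs where the later-stage output is older.

    paths_in_stage_order holds one output path per stage, in the order the
    stages ran. Paths that do not exist are skipped. Hypothetical helper,
    not a Kaldi tool.
    """
    existing = [p for p in paths_in_stage_order if os.path.exists(p)]
    stale = []
    for i, earlier in enumerate(existing):
        for later in existing[i + 1:]:
            # A later-stage file older than an earlier-stage file was
            # probably built from inputs that have since been overwritten.
            if os.path.getmtime(later) < os.path.getmtime(earlier):
                stale.append((earlier, later))
    return stale


if __name__ == "__main__":
    # Usage: python check_stale.py <stage-1 output> <stage-2 output> ...
    for earlier, later in find_stale_outputs(sys.argv[1:]):
        print(f"{later} is older than {earlier}; re-run the stages in between")
```

Pass individual files rather than directories where you can, since a directory's modification time only changes when entries are added or removed, not when a file inside it is rewritten.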

Gwel DG

Dec 14, 2021, 9:35:40 AM

to kaldi-help

Thank you Dan, that helped!

I could go one stage further after cleaning all the intermediate data manually. I thought the script already did that at stage 0, but then noticed that I had commented that line out and forgotten about it...
