I got stuck while training the TDNN: the final model combination step fails.
2021-02-26 06:28:45,178 [steps/nnet3/chain/e2e/train_e2e.py:462 - train - INFO ] Iter: 80/80 Jobs: 1 Epoch: 2.96/3.0 (98.8% complete) lr: 0.000003
2021-02-26 06:29:10,468 [steps/nnet3/chain/e2e/train_e2e.py:515 - train - INFO ] Doing final combination to produce final.mdl
2021-02-26 06:29:10,885 [steps/libs/nnet3/train/chain_objf/acoustic_model.py:571 - combine_models - INFO ] Combining set([64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 62, 63]) models.
run.pl: job failed, log is in exp/chain/e2e_tdnn_1a/log/combine.log
Traceback (most recent call last):
  File "steps/nnet3/chain/e2e/train_e2e.py", line 558, in main
    train(args, run_opts)
  File "steps/nnet3/chain/e2e/train_e2e.py", line 524, in train
    run_opts=run_opts)
  File "steps/libs/nnet3/train/chain_objf/acoustic_model.py", line 622, in combine_models
    scp_or_ark=scp_or_ark, egs_suffix=egs_suffix))
  File "steps/libs/common.py", line 158, in execute_command
    p.returncode, command))
Exception: Command exited with status 1:
run.pl --mem 4G --gpu 1 exp/chain/e2e_tdnn_1a/log/combine.log nnet3-chain-combine --max-objective-evaluations=30 --l2-regularize=0.0 --leaky-hmm-coefficient=0.1 --verbose=3 --use-gpu=wait exp/chain/e2e_tdnn_1a/den.fst exp/chain/e2e_tdnn_1a/81.mdl exp/chain/e2e_tdnn_1a/80.mdl exp/chain/e2e_tdnn_1a/79.mdl exp/chain/e2e_tdnn_1a/78.mdl exp/chain/e2e_tdnn_1a/77.mdl exp/chain/e2e_tdnn_1a/76.mdl exp/chain/e2e_tdnn_1a/75.mdl exp/chain/e2e_tdnn_1a/74.mdl exp/chain/e2e_tdnn_1a/73.mdl exp/chain/e2e_tdnn_1a/72.mdl exp/chain/e2e_tdnn_1a/71.mdl exp/chain/e2e_tdnn_1a/70.mdl