"Refusing to split data for number of speakers"

h75...@gmail.com

Jul 22, 2015, 8:46:37 PM
to kaldi-help
Hello, Dan.
I have finished the training step and obtained final.nnet.
But at the decoding step, I get this error message:
"Refusing to split data because number of speakers 25 is less than the number of output .scp files 30 at utils/split_scp.pl line 114, <I> line 8257."
I had already generated the tr90 and cv10 sets before training, so I don't understand why this problem still occurs.
Is it caused by an error in how the utt2spk data was split?
Thank you.

Jan Trmal

Jul 22, 2015, 8:53:07 PM
to kaldi...@googlegroups.com
You didn't provide enough information, but in general you cannot split a data directory into more parts than it has speakers.
So if you ran the decoding with --nj 30 and you have 25 speakers (you can count the lines of the spk2utt file), this is the error you will get.
y.
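
The check described above can be sketched in shell. This is only a minimal sketch and not part of Kaldi: `cap_nj` is a hypothetical helper, and the spk2utt file is generated on the fly so that the snippet runs anywhere. In a real setup you would point it at the spk2utt file of the data directory you want to decode.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): print the largest usable job count.
# $1 = path to a spk2utt file (one speaker per line), $2 = requested --nj value.
cap_nj() {
  local spk2utt=$1 requested=$2 nspk
  nspk=$(wc -l < "$spk2utt")   # one line per speaker
  nspk=$((nspk))               # normalize padding that some wc versions add
  if [ "$requested" -gt "$nspk" ]; then
    echo "$nspk"               # cannot split into more parts than speakers
  else
    echo "$requested"
  fi
}

# Demo on a throwaway spk2utt with 25 speakers, mirroring the error message.
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 25 ]; do
  echo "spk$i spk$i-utt1 spk$i-utt2"
  i=$((i + 1))
done > "$demo_dir/spk2utt"

cap_nj "$demo_dir/spk2utt" 30   # prints 25: only 25 speakers are available
cap_nj "$demo_dir/spk2utt" 20   # prints 20: the request already fits
rm -rf "$demo_dir"
```

The printed value could then be passed as the --nj argument of the decoding script.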
 

h75...@gmail.com

Jul 22, 2015, 10:54:36 PM
to kaldi-help, jtr...@gmail.com
Yenda, thanks for your kind help.
But now there is another problem: latgen-faster-mapped-parallel fails with the error: Expected token "<TransitionModel>", got instead "<eps>".
It seems to be related to an error in the words file and G.fst.

On Thursday, July 23, 2015 at 8:53:07 AM UTC+8, Yenda wrote:

Jan Trmal

Jul 22, 2015, 11:02:03 PM
to h75...@gmail.com, kaldi-help
Send the complete command line.
To me, it looks like you switched the order of arguments.
y.

h75...@gmail.com

Jul 22, 2015, 11:20:32 PM
to kaldi-help, jtr...@gmail.com
I have already finished the pre-training and training steps, so now I only run the decoding command:
# Decode step (the paths are OK.)
steps/nnet/decode.sh --nj 20 --cmd "$decode_cmd" --config conf/decode_dnn.config --acwt 0.2 \
  $graph-dir $test_data-dir $decode-dir/

# The error part in decode.log:
ERROR (latgen-faster-mapped-parallel:ExpectToken():io-funcs.cc:197) Expected token "<TransitionModel>", got instead "<eps>".
ERROR (latgen-faster-mapped-parallel:ExpectToken():io-funcs.cc:197) Expected token "<TransitionModel>", got instead "<eps>".

I traced the code in decode.sh; it calls "latgen-faster-mapped-parallel" here:

   latgen-faster-mapped$thread_string --min-active=$min_active --max-active=$max_active --max-mem=$max_mem --beam=$beam \
   --lattice-beam=$lattice_beam --acoustic-scale=$acwt --allow-partial=true --word-symbol-table=$graphdir/words.txt \
   $model $graphdir/HCLG.fst ark:- "ark:|gzip -c > $dir/lat.JOB.gz" || exit 1;
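
One way to read the error above: the binary expected a model file but was handed a file whose first token is `<eps>`. A text-format words.txt symbol table normally starts with the line `<eps> 0`, so it may help to look at the first token of whichever file ends up in the `$model` position. The following is only a minimal sketch: `first_token` is a hypothetical helper, not a Kaldi tool, and the words.txt here is a made-up three-line example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): print the first whitespace-delimited
# token of a file. Only meaningful for text-format files.
first_token() {
  awk 'NF { print $1; exit }' "$1"
}

# Made-up symbol table in the usual "<symbol> <id>" layout.
demo_dir=$(mktemp -d)
printf '<eps> 0\nhello 1\nworld 2\n' > "$demo_dir/words.txt"

first_token "$demo_dir/words.txt"   # prints <eps>
rm -rf "$demo_dir"
```

If the file passed as the model prints `<eps>` here, a symbol table was probably passed in place of the model.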

On Thursday, July 23, 2015 at 11:02:03 AM UTC+8, Yenda wrote:

Xingyu Na

Jul 22, 2015, 11:25:12 PM
to kaldi...@googlegroups.com, jtr...@gmail.com
Is it possible you are using incompatible versions of scripts and binaries?

X.

h75...@gmail.com

Jul 23, 2015, 4:26:35 AM
to kaldi-help, jtr...@gmail.com, asr.na...@gmail.com
I found that the problem is in the "$decode_cmd" part of the decode command.
Now I am trying to understand how it works.
Thank you all.
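
For context, in the standard Kaldi example recipes `$decode_cmd` is normally set in cmd.sh to a job launcher such as run.pl (local machine) or queue.pl (grid engine). An unset or empty value is easy to miss, so a guard like the one below can fail early. This is only a minimal sketch: `require_cmd_var` is a hypothetical helper, not part of Kaldi, and the value `run.pl` is set inline here instead of being sourced from cmd.sh.

```shell
#!/usr/bin/env bash
# Hypothetical guard (not part of Kaldi): fail early if a launcher variable
# such as decode_cmd is unset or empty.
# $1 = NAME of the variable to check (not its value).
require_cmd_var() {
  local name=$1 value
  eval "value=\${$name:-}"    # look up the variable called $name
  if [ -z "$value" ]; then
    echo "$name is unset or empty; define it (e.g. in cmd.sh) before decoding" >&2
    return 1
  fi
  echo "$name=$value"
}

# Typical local setting; recipes usually define this in cmd.sh.
decode_cmd="run.pl"
require_cmd_var decode_cmd    # prints decode_cmd=run.pl
```

Calling the guard right before steps/nnet/decode.sh would make a missing launcher visible before any decoding job is started.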

On Thursday, July 23, 2015 at 11:25:12 AM UTC+8, Xingyu Na wrote: