I was running the yesno example in ESPnet, but I got this error:
stage -1: Data Download
stage 0: Data preparation
Preparing train and test data
stage 1: Feature Generation
steps/make_fbank_pitch.sh --nj 1 --write_utt2num_frames true data/train_yesno exp/make_fbank/train_yesno fbank
steps/make_fbank_pitch.sh: moving data/train_yesno/feats.scp to data/train_yesno/.backup
utils/validate_data_dir.sh: WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
http://kaldi-asr.org/doc/data_prep.html for more information.
utils/validate_data_dir.sh: Successfully validated data-directory data/train_yesno
steps/make_fbank_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_fbank_pitch.sh: Succeeded creating filterbank and pitch features for train_yesno
fix_data_dir.sh: kept all 31 utterances.
fix_data_dir.sh: old files are kept in data/train_yesno/.backup
steps/compute_cmvn_stats.sh data/train_yesno exp/make_fbank/train_yesno fbank
Succeeded creating CMVN stats for train_yesno
steps/make_fbank_pitch.sh --nj 1 --write_utt2num_frames true data/test_yesno exp/make_fbank/test_yesno fbank
utils/validate_data_dir.sh: WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in
http://kaldi-asr.org/doc/data_prep.html for more information.
utils/validate_data_dir.sh: Error: in data/test_yesno, utterance-ids extracted from utt2spk and utt2dur file
utils/validate_data_dir.sh: differ, partial diff is:
--- /tmp/kaldi.aAyz/utts 2022-08-16 22:43:18.654183147 +0800
+++ /tmp/kaldi.aAyz/utts.utt2dur 2022-08-16 22:43:18.686182895 +0800
@@ -29,3 +29 @@
1_1_1_1_1_1_1_1
-README
-README~
...
[Lengths are /tmp/kaldi.aAyz/utts=31 versus /tmp/kaldi.aAyz/utts.utt2dur=29]
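
In the diff above, the IDs `README` and `README~` are present in utt2spk but missing from utt2dur. The same comparison can be reproduced outside of `utils/validate_data_dir.sh` with a short script. This is only a minimal sketch: it assumes the usual Kaldi data-directory layout, where every line of `utt2spk` and `utt2dur` starts with the utterance ID, and the helper names `read_utt_ids` and `diff_utt_ids` are hypothetical, not part of Kaldi or ESPnet.

```python
from pathlib import Path


def read_utt_ids(path):
    """Collect the first whitespace-separated field of every non-empty line.

    Assumption: the file follows the Kaldi convention "<utt-id> <value...>".
    """
    ids = set()
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            ids.add(fields[0])
    return ids


def diff_utt_ids(data_dir):
    """Return (only_in_utt2spk, only_in_utt2dur) as two sorted lists."""
    data_dir = Path(data_dir)
    spk_ids = read_utt_ids(data_dir / "utt2spk")
    dur_ids = read_utt_ids(data_dir / "utt2dur")
    return sorted(spk_ids - dur_ids), sorted(dur_ids - spk_ids)
```

Calling `diff_utt_ids("data/test_yesno")` from the recipe directory should list the IDs that appear in only one of the two files, which is the same information the validator prints as a partial diff.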