Hi,
I have 23 speakers in total: 12 for training and 11 for testing. My data/local/train.spk2utt file (attached) is in the right format, and data/local/spk2gender (attached) lists all speakers, i.e. both train and test. But when I reach this piece of code:
# Now make MFCC features.
# mfccdir should be some place with a largish disk where you
# want to store MFCC features.
mfccdir=${DATA_ROOT}/mfcc
for x in train test; do
  steps/make_mfcc.sh --cmd "$train_cmd" --nj $njobs \
    data/$x exp/make_mfcc/$x $mfccdir || exit 1;
  steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir || exit 1;
done
utils/validate_data_dir.sh gives me this error:
utils/validate_data_dir.sh: Error: in data/train, speaker lists extracted from spk2utt and spk2gender
utils/validate_data_dir.sh: differ, partial diff is:
1,12d0
< ahmed
< ayesha
< fatima
< iqra
< izza
...
< khatija
< laiba
< lozina
< mubarra
< muneeb
< shiza
[Lengths are /tmp/kaldi.2TxI/speakers=12 versus /tmp/kaldi.2TxI/speakers.spk2gender=0]
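Note that the error refers to data/train, while the files I described above are under data/local, and the reported lengths (12 versus 0) suggest that the speaker list read from spk2gender in data/train came out empty. To check this by hand, the comparison can be reproduced with something like the sketch below. It assumes the first whitespace-separated field of each line in spk2utt and spk2gender is the speaker ID; speaker_diff is a hypothetical helper name of my own, not part of Kaldi, and this is only an approximation of what the validator appears to do based on its message.

```shell
#!/usr/bin/env bash
# speaker_diff: hypothetical helper (not part of Kaldi).
# Prints the number of distinct speaker IDs in spk2utt and spk2gender
# of one data directory, then a diff of the two sorted ID lists.
# Returns 0 if the lists match, non-zero otherwise.
speaker_diff() {
  local dir=$1 tmp rc=0
  tmp=$(mktemp -d) || return 1

  # Assumption: the speaker ID is the first field on every line.
  awk '{print $1}' "$dir/spk2utt"    | sort -u > "$tmp/from_spk2utt"
  awk '{print $1}' "$dir/spk2gender" | sort -u > "$tmp/from_spk2gender"

  # awk 'END {print NR}' prints the line count (0 for an empty file).
  echo "spk2utt: $(awk 'END {print NR}' "$tmp/from_spk2utt") speakers," \
       "spk2gender: $(awk 'END {print NR}' "$tmp/from_spk2gender") speakers"

  # Lines starting with '<' exist only in spk2utt, '>' only in spk2gender.
  diff "$tmp/from_spk2utt" "$tmp/from_spk2gender" || rc=$?

  rm -rf "$tmp"
  return $rc
}
```

Calling `speaker_diff data/train` (and likewise `speaker_diff data/test`) from the recipe directory should show whether spk2gender in that directory is empty or simply lists different speakers than spk2utt.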
Any help is appreciated!
Thanks.