Hi all, I am trying to train an x-vector extractor for language identification based on the sre16/v2 recipe, and I get the following:
amontalvo@cen-is-amontalvo:/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID$ ./run_xvectorLID.sh
utils/data/get_utt2num_frames.sh: data/train/utt2num_frames already present!
steps/data/reverberate_data_dir.py --rir-set-parameters 0.5, /media/amontalvo/Storage/Databases/MUSAN/RIRS_NOISES/simulated_rirs/smallroom/rir_list --rir-set-parameters 0.5, /media/amontalvo/Storage/Databases/MUSAN/RIRS_NOISES/simulated_rirs/mediumroom/rir_list --speech-rvb-probability 1 --pointsource-noise-addition-probability 0 --isotropic-noise-addition-probability 0 --num-replications 1 --source-sampling-rate 8000 data/train data/train_reverb
Number of RIRs is 40000
utils/copy_data_dir.sh: copied data from data/train_reverb to data/train_reverb.new
utils/validate_data_dir.sh: Successfully validated data-directory data/train_reverb.new
Preparing data/musan...
In music directory, processed 645 files; 0 had missing wav data
In speech directory, processed 426 files; 0 had missing wav data
In noise directory, processed 930 files; 0 had missing wav data
utils/fix_data_dir.sh: file data/musan/utt2spk is not in sorted order or not unique, sorting it
utils/fix_data_dir.sh: file data/musan/wav.scp is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 2001 utterances.
fix_data_dir.sh: old files are kept in data/musan/.backup
utils/subset_data_dir.sh: reducing #utt from 2001 to 645
utils/subset_data_dir.sh: reducing #utt from 2001 to 426
utils/subset_data_dir.sh: reducing #utt from 2001 to 930
fix_data_dir.sh: kept all 645 utterances.
fix_data_dir.sh: old files are kept in data/musan_music/.backup
fix_data_dir.sh: kept all 426 utterances.
fix_data_dir.sh: old files are kept in data/musan_speech/.backup
fix_data_dir.sh: kept all 930 utterances.
fix_data_dir.sh: old files are kept in data/musan_noise/.backup
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file headers, using wav-to-duration
utils/data/get_utt2dur.sh: computed data/musan_speech/utt2dur
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file headers, using wav-to-duration
utils/data/get_utt2dur.sh: computed data/musan_speech/utt2dur
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file headers, using wav-to-duration
utils/data/get_utt2dur.sh: computed data/musan_noise/utt2dur
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file headers, using wav-to-duration
utils/data/get_utt2dur.sh: computed data/musan_music/utt2dur
steps/data/augment_data_dir.py --utt-suffix noise --fg-interval 1 --fg-snrs 15:10:5:0 --fg-noise-dir data/musan_noise data/train data/train_noise
steps/data/augment_data_dir.py --utt-suffix music --bg-snrs 15:10:8:5 --num-bg-noises 1 --bg-noise-dir data/musan_music data/train data/train_music
steps/data/augment_data_dir.py --utt-suffix babble --bg-snrs 20:17:15:13 --num-bg-noises 3:4:5:6:7 --bg-noise-dir data/musan_speech data/train data/train_babble
utils/combine_data.sh data/train_aug data/train_reverb. data/train_noise data/train_music data/train_babble
utils/combine_data.sh: combined utt2uniq
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: **not combining utt2lang as it does not exist everywhere**
utils/combine_data.sh [info]: not combining utt2dur as it does not exist
utils/combine_data.sh [info]: **not combining reco2dur as it does not exist everywhere**
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh [info]: not combining text as it does not exist
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh: combined vad.scp
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh [info]: not combining spk2gender as it does not exist
fix_data_dir.sh: kept all 925460 utterances.
fix_data_dir.sh: old files are kept in data/train_aug/.backup
utils/subset_data_dir.sh: reducing #utt from 925460 to 200000
fix_data_dir.sh: kept all 200000 utterances.
fix_data_dir.sh: old files are kept in data/train_aug_200k/.backup
steps/make_mfcc.sh --mfcc-config conf/mfcc.conf --nj 20 --cmd run.pl data/train_aug_200k exp/make_mfcc /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train_aug_200k
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for train_aug_200k
utils/combine_data.sh data/train_combined data/train_aug_200k data/train
utils/combine_data.sh: combined utt2uniq
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: **not combining utt2lang as it does not exist everywhere**
utils/combine_data.sh [info]: not combining utt2dur as it does not exist
utils/combine_data.sh [info]: **not combining reco2dur as it does not exist everywhere**
utils/combine_data.sh: combined feats.scp
utils/combine_data.sh [info]: not combining text as it does not exist
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh: combined vad.scp
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh [info]: not combining spk2gender as it does not exist
fix_data_dir.sh: kept all 431365 utterances.
fix_data_dir.sh: old files are kept in data/train_combined/.backup
local/nnet3/xvector/prepare_feats_for_egs.sh --nj 20 --cmd run.pl data/train_combined data/train_combined_no_sil exp/train_combined_no_sil
local/nnet3/xvector/prepare_feats_for_egs.sh: Succeeded creating xvector features for train_combined
fix_data_dir.sh: kept 431361 utterances out of 431365
fix_data_dir.sh: old files are kept in data/train_combined_no_sil/.backup
fix_data_dir.sh: kept all 233980 utterances.
fix_data_dir.sh: old files are kept in data/train_combined_no_sil/.backup
fix_data_dir.sh: kept all 233980 utterances.
fix_data_dir.sh: old files are kept in data/train_combined_no_sil/.backup
run_xvector.sh
local/nnet3/xvector/run_xvector.sh: Getting neural network training egs
sid/nnet3/xvector/get_egs.sh --cmd run.pl --nj 8 --stage 0 --frames-per-iter 1000000000 --frames-per-iter-diagnostic 100000 --min-frames-per-chunk 200 --max-frames-per-chunk 400 --num-diagnostic-archives 3 --num-repeats 50 data/train_combined_no_sil /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/egs
feat-to-dim scp:data/train_combined_no_sil/feats.scp -
sid/nnet3/xvector/get_egs.sh: Preparing train and validation lists
sid/nnet3/xvector/get_egs.sh: Producing 10 archives for training
sid/nnet3/xvector/get_egs.sh: Allocating training examples
sid/nnet3/xvector/get_egs.sh: Allocating training subset examples
sid/nnet3/xvector/get_egs.sh: Allocating validation examples
sid/nnet3/xvector/get_egs.sh: Generating training examples on disk
sid/nnet3/xvector/get_egs.sh: Generating training subset examples on disk
sid/nnet3/xvector/get_egs.sh: Generating validation examples on disk
sid/nnet3/xvector/get_egs.sh: Shuffling order of archives on disk
sid/nnet3/xvector/get_egs.sh: Finished preparing training examples
local/nnet3/xvector/run_xvector.sh: creating neural net configs using the xconfig parser
steps/nnet3/xconfig_to_configs.py --xconfig-file /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs/network.xconfig --config-dir /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs/
nnet3-init /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.config /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
LOG (nnet3-init[5.5]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
nnet3-info /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
nnet3-init /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.config /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
LOG (nnet3-init[5.5]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
nnet3-info /media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/configs//ref.raw
2019-10-04 23:13:53,281 [steps/nnet3/train_raw_dnn.py:34 - <module> - INFO ] Starting raw DNN trainer (train_raw_dnn.py)
steps/nnet3/train_raw_dnn.py --stage=-1 --cmd=run.pl --trainer.optimization.proportional-shrink 10 --trainer.optimization.momentum=0.5 --trainer.optimization.num-jobs-initial=3 --trainer.optimization.num-jobs-final=8 --trainer.optimization.initial-effective-lrate=0.001 --trainer.optimization.final-effective-lrate=0.0001 --trainer.optimization.minibatch-size=64 --trainer.srand=123 --trainer.max-param-change=2 --trainer.num-epochs=3 --trainer.dropout-schedule=0,0...@0.20,0...@0.50,0 --trainer.shuffle-buffer-size=1000 --egs.frames-per-eg=1 --egs.dir=/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/egs --cleanup.remove-egs false --cleanup.preserve-model-interval=10 --use-gpu=true --dir=/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a
['steps/nnet3/train_raw_dnn.py', '--stage=-1', '--cmd=run.pl', '--trainer.optimization.proportional-shrink', '10', '--trainer.optimization.momentum=0.5', '--trainer.optimization.num-jobs-initial=3', '--trainer.optimization.num-jobs-final=8', '--trainer.optimization.initial-effective-lrate=0.001', '--trainer.optimization.final-effective-lrate=0.0001', '--trainer.optimization.minibatch-size=64', '--trainer.srand=123', '--trainer.max-param-change=2', '--trainer.num-epochs=3', '--trainer.dropout-schedule=0,0...@0.20,0...@0.50,0', '--trainer.shuffle-buffer-size=1000', '--egs.frames-per-eg=1', '--egs.dir=/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/egs', '--cleanup.remove-egs', 'false', '--cleanup.preserve-model-interval=10', '--use-gpu=true', '--dir=/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a']
2019-10-04 23:13:53,308 [steps/nnet3/train_raw_dnn.py:187 - train - INFO ] Arguments for the experiment
{'backstitch_training_interval': 1,
'backstitch_training_scale': 0.0,
'cleanup': True,
'cmvn_opts': None,
'combine_sum_to_one_penalty': 0.0,
'command': 'run.pl',
'compute_average_posteriors': False,
'compute_per_dim_accuracy': False,
'dir': '/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a',
'do_final_combination': True,
'dropout_schedule': '0,0...@0.20,0...@0.50,0',
'egs_command': None,
'egs_dir': '/media/amontalvo/Voz_1TB/kaldi-master/egs/xVector_LID/xvector_nnet_1a/egs',
'egs_opts': None,
'egs_stage': 0,
'email': None,
'exit_stage': None,
'feat_dir': None,
'final_effective_lrate': 0.0001,
'frames_per_eg': 1,
'image_augmentation_opts': None,
'initial_effective_lrate': 0.001,
'input_model': None,
'max_lda_jobs': 10,
'max_models_combine': 20,
'max_objective_evaluations': 30,
'max_param_change': 2.0,
'minibatch_size': '64',
'momentum': 0.5,
'nj': 4,
'num_epochs': 3.0,
'num_jobs_compute_prior': 10,
'num_jobs_final': 8,
'num_jobs_initial': 3,
'online_ivector_dir': None,
'preserve_model_interval': 10,
'presoftmax_prior_scale_power': -0.25,
'prior_subset_size': 20000,
'proportional_shrink': 10.0,
'rand_prune': 4.0,
'remove_egs': False,
'reporting_interval': 0.1,
'samples_per_iter': 400000,
'shuffle_buffer_size': 1000,
'srand': 123,
'stage': -1,
'targets_scp': None,
'train_opts': [],
'use_dense_targets': True,
'use_gpu': 'yes'}
2019-10-04 23:13:53,309 [steps/nnet3/train_raw_dnn.py:307 - train - INFO ] Preparing the initial network.
Traceback (most recent call last):
  File "steps/nnet3/train_raw_dnn.py", line 491, in main
    train(args, run_opts)
  File "steps/nnet3/train_raw_dnn.py", line 326, in train
    args.num_jobs_final)
  File "steps/libs/nnet3/train/common.py", line 596, in get_model_combine_iters
    num_iters + 1))
TypeError: 'float' object cannot be interpreted as an integer
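For context, this looks like the usual Python 3 behaviour where "/" is true division (so the result is a float) and range() then refuses the float argument. A minimal sketch that reproduces the same message (the value 21 / 2 is just an illustrative stand-in, not the actual quantity computed in common.py):

```python
# Under Python 3, "/" always yields a float, even for whole-number results,
# and range() rejects float arguments with exactly this TypeError.
num_iters = 21 / 2                 # 10.5 -- a float (Python 2 would give 10)
try:
    list(range(num_iters + 1))     # same failure mode as common.py line 596
except TypeError as e:
    print(e)                       # 'float' object cannot be interpreted as an integer

# Casting to int before building the range avoids the error:
print(list(range(int(num_iters) + 1))[-1])   # 10
```

If that is what is happening here, running the scripts with Python 2, using "//" (floor division), or wrapping the offending value in int() in steps/libs/nnet3/train/common.py would presumably sidestep it.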
Thanks in advance for any hints.