Error while training GMM using bob speaker verification toolkit


Achintya Sarkar

Nov 6, 2013, 9:23:54 AM
to bob-...@googlegroups.com
Hello All,

I am trying to train a GMM with the Bob speaker verification tool, using the following command:
 bin/spkverif_gmm.py  -d config/database/2004.py -p config/preprocessing/energy.py -f config/features/lfcc_60.py  -t config/tools/ubm_gmm/ubm_gmm_512G.py -b ubm_gmm -z  --user-directory ./spkr --temp-directory ./TEMP

I get the following error:

<module 'tool_chain' from 'config/tools/ubm_gmm/ubm_gmm_512G.pyc'>

preprocess 1 wave from directory ./2004SPH to directory../TEMP/ubm_gmm/preprocessed
Input wave file./2004SPH/tabf.sph
No handlers could be found for logger "bob.c++"
After Energy-based VAD there are 21205 frames remaining over 30994
Training Projector ./TEMP/ubm_gmm/Projector.hdf5' using 1 training files:

Traceback (most recent call last):
  File "bin/spkverif_gmm.py", line 20, in <module>
    sys.exit(xbob.spkrec.script.spkverif_isv.main())
  File "./xbob/spkrec/script/spkverif_isv.py", line 492, in main
    speaker_verify(args)
  File "./xbob/spkrec/script/spkverif_isv.py", line 460, in speaker_verify
    executor.execute_tool_chain()
  File "./xbob/spkrec/script/spkverif_isv.py", line 85, in execute_tool_chain
    self.m_tool_chain.train_projector(self.m_tool, force = self.m_args.force)
  File "./xbob/spkrec/toolchain/ToolChain.py", line 263, in train_projector
    tool.train_projector(train_features, str(projector_file))
  File "./xbob.spkrec-master/eggs/facereclib-1.2.0-py2.7.egg/facereclib/tools/UBMGMM.py", line 190, in train_projector
    array = numpy.vstack(train_features)
  File "./python2.7/site-packages/numpy/core/shape_base.py", line 226, in vstack
    return _nx.concatenate(map(atleast_2d,tup),0)
ValueError: need at least one array to concatenate
------------------------------

The "./TEMP/ubm_gmm/preprocessed" folder is empty.
However, a "*.hdf5" file exists in the "./2004SPH" directory, and it contains "1/0".


I need help with this issue.

Thanks,
Achintya




Elie Khoury

Nov 6, 2013, 12:05:49 PM
to bob-...@googlegroups.com
Dear Achintya Sarkar,

Thanks for reporting this issue.

I tried the same example here with the same wav file from NIST SRE 2004.

bin/spkverif_gmm.py  -d config/database/nist_test.py -p config/preprocessing/energy.py -f config/features/lfcc_60.py  -t config/tools/ubm_gmm/ubm_gmm_512G.py -b ubm_gmm -z  --user-directory ./spkr --temp-directory ./TEMP/ --groups dev
preprocess 1 wave from directory WAV/ to directory ./TEMP/ubm_gmm/preprocessed
Input wave file: /idiap/temp/ekhoury/NIST_DATA/WAV/mix04/tabfA.sph

No handlers could be found for logger "bob.c++"
After Energy-based VAD there are 21205 frames remaining over 30994
extract 1 features from wav directory /idiap/temp/ekhoury/NIST_DATA/WAV/ to directory ./TEMP/ubm_gmm/features
Input wave file : /idiap/temp/ekhoury/NIST_DATA/WAV/mix04/tabfA.sph

Training Projector './TEMP/ubm_gmm/Projector.hdf5' using 1 training files:
bob.c++@2013-11-06 17:51:46,359 -- INFO: # KMeansTrainer:
bob.c++@2013-11-06 17:51:49,529 -- INFO: # Iteration 1: 52.576 -> 39.7798
bob.c++@2013-11-06 17:51:51,108 -- INFO: # Iteration 2: 39.7798 -> 38.0391
bob.c++@2013-11-06 17:51:52,688 -- INFO: # Iteration 3: 38.0391 -> 37.3474
bob.c++@2013-11-06 17:51:54,267 -- INFO: # Iteration 4: 37.3474 -> 36.9792
bob.c++@2013-11-06 17:51:55,869 -- INFO: # Iteration 5: 36.9792 -> 36.7836
bob.c++@2013-11-06 17:51:57,447 -- INFO: # Iteration 6: 36.7836 -> 36.6507
bob.c++@2013-11-06 17:51:59,025 -- INFO: # Iteration 7: 36.6507 -> 36.5472
bob.c++@2013-11-06 17:52:00,606 -- INFO: # Iteration 8: 36.5472 -> 36.4826
bob.c++@2013-11-06 17:52:02,184 -- INFO: # Iteration 9: 36.4826 -> 36.4373
bob.c++@2013-11-06 17:52:03,764 -- INFO: # Iteration 10: 36.4373 -> 36.4033
bob.c++@2013-11-06 17:52:03,764 -- INFO: # EM terminated: maximum number of iterations reached.
bob.c++@2013-11-06 17:52:05,387 -- INFO: # EMTrainer:
bob.c++@2013-11-06 17:52:24,068 -- INFO: # Iteration 1: -72.0365 -> -71.5065
bob.c++@2013-11-06 17:52:33,368 -- INFO: # Iteration 2: -71.5065 -> -71.2148
bob.c++@2013-11-06 17:52:42,661 -- INFO: # Iteration 3: -71.2148 -> -71.0308
bob.c++@2013-11-06 17:52:51,978 -- INFO: # Iteration 4: -71.0308 -> -70.9041
bob.c++@2013-11-06 17:53:01,292 -- INFO: # Iteration 5: -70.9041 -> -70.8094
bob.c++@2013-11-06 17:53:10,601 -- INFO: # Iteration 6: -70.8094 -> -70.742
bob.c++@2013-11-06 17:53:19,902 -- INFO: # Iteration 7: -70.742 -> -70.6914
bob.c++@2013-11-06 17:53:29,200 -- INFO: # Iteration 8: -70.6914 -> -70.65
bob.c++@2013-11-06 17:53:38,513 -- INFO: # Iteration 9: -70.65 -> -70.6154
bob.c++@2013-11-06 17:53:47,815 -- INFO: # Iteration 10: -70.6154 -> -70.5848
bob.c++@2013-11-06 17:53:47,815 -- INFO: # EM terminated: maximum number of iterations reached.


So the problem is that the feature extraction step (the "extract 1 features from wav directory ..." lines in the output above) is missing in your case. Could you please give more details? Which version of the spkrec tool are you using, and what does your configuration file for the database look like?
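
For reference, the ValueError at the end of the traceback is what numpy.vstack raises when it is handed an empty list, which is consistent with no feature files being available for training. The following is a minimal sketch, not the facereclib code: the helper name stack_training_features is hypothetical and only illustrates checking for an empty list before stacking, so that the error message points at the missing features.

```python
import numpy


def stack_training_features(train_features):
    """Stack per-file feature arrays into one (n_frames, n_dims) matrix.

    Hypothetical helper for illustration; it is not part of facereclib.
    """
    train_features = list(train_features)
    if not train_features:
        # numpy.vstack([]) would fail with the less helpful
        # "need at least one array to concatenate", so fail earlier and say why.
        raise ValueError(
            "no training features found; check that the feature "
            "extraction step wrote files to the features directory"
        )
    return numpy.vstack(train_features)


# Two files with 3 and 2 frames of 60-dimensional features.
stacked = stack_training_features([numpy.zeros((3, 60)), numpy.ones((2, 60))])
print(stacked.shape)  # (5, 60)
```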

Thanks,
Elie




--
-- You received this message because you are subscribed to the Google Groups bob-devel group. To post to this group, send email to bob-...@googlegroups.com. To unsubscribe from this group, send email to bob-devel+...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/bob-devel or directly the project website at http://idiap.github.com/bob/

Achintya Sarkar

Nov 6, 2013, 12:59:18 PM
to bob-...@googlegroups.com
Dear Elie Khoury,

Thank you very much for your quick response.
I am using "xbob.spkrec-master.zip" with the filelist database (xbob.db.verification.filelist).

I observed that the energy detector writes its output into the same folder as the ".sph" files.
After that, xbob/spkrec/script/spkverif_isv.py calls "extract_features" in xbob/spkrec/toolchain/ToolChain.py (via "self.m_tool_chain.extract_features").

In "extract_features" [xbob/spkrec/toolchain/ToolChain.py], the line "feature = extractor(wav_file, vad_file)" is not called, because the "*.hdf5" file generated by the energy detector is read as the "feature_file".
Even so, the line
print("extract %d features from wav directory %s to directory %s" %(len(index_range), self.m_file_selector.m_config.wav_input_dir, self.m_file_selector.m_config.features_dir))
prints
"extract 1 features from wav directory ./2004SPH to directory ./TEMP/lfcc/features"



If I force it to call "feature = extractor(wav_file, vad_file)", it generates the features and overwrites the ".hdf5" file generated by the energy detector (directory: ./2004SPH).
The directories given by the command-line options "--user-directory ./spkr --temp-directory ./TEMP" remain empty (without features).
Therefore the next step does not find any features and raises the error.
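
The behaviour described above can be reproduced in isolation. The sketch below is an assumption about the mechanism, not the actual ToolChain.py code: extract_if_missing is a hypothetical function that skips extraction when the target file already exists, and the file names are made up for illustration. If the VAD output and the feature file resolve to the same path, the existing VAD file makes the extraction step look as if it had already been done.

```python
import os
import tempfile


def extract_if_missing(feature_file, extractor, force=False):
    """Run extractor only if feature_file does not exist yet (or force is set).

    Hypothetical stand-in for a "skip existing files" check.
    Returns True if extraction ran, False if it was skipped.
    """
    if os.path.exists(feature_file) and not force:
        return False
    parent = os.path.dirname(feature_file)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(feature_file, "w") as handle:
        handle.write(extractor())
    return True


work_dir = tempfile.mkdtemp()
wav_dir = os.path.join(work_dir, "2004SPH")
os.makedirs(wav_dir)

# The energy detector has written its VAD labels next to the wave file.
vad_file = os.path.join(wav_dir, "tabf.hdf5")
with open(vad_file, "w") as handle:
    handle.write("vad labels")

# Case 1: the feature path collides with the VAD path, so extraction is skipped.
colliding = extract_if_missing(vad_file, lambda: "features")

# Case 2: a separate features directory avoids the collision.
feature_file = os.path.join(work_dir, "TEMP", "ubm_gmm", "features", "tabf.hdf5")
separate = extract_if_missing(feature_file, lambda: "features")

print(colliding, separate)  # False True
```

With force=True the first case would run, but it would overwrite the VAD file, which matches the overwrite observed in ./2004SPH.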



My config files are as follows:

1. config/database/2004.py
=======================

#!/usr/bin/env python
import xbob.db.verification.filelist

# 0/ The database to use
name = '2004'
db =  xbob.db.verification.filelist.Database('./protocols/2004/')
protocol = None

# directory where the wave files are stored
wav_input_dir = './2004SPH'
wav_input_ext = '.sph'

2. config/features/lfcc_60.py
=========================
#!/usr/bin/env python

import xbob.spkrec
import numpy

feature_extractor = xbob.spkrec.feature_extraction.Cepstral

# Cepstral parameters
win_length_ms = 20
win_shift_ms = 10
n_filters = 24
dct_norm = False
f_min = 0.0
f_max = 4000
delta_win = 2
mel_scale = False
withEnergy = True
withDelta = True
withDeltaDelta = True
withDeltaEnergy = True
withDeltaDeltaEnergy = True
n_ceps = 19 # 0-->18
pre_emphasis_coef = 0.95
energy_mask = n_ceps # 19
features_mask = numpy.arange(0,60) # Cepstral features + Energy + 1st and 2nd derivatives
normalizeFeatures = True # Normalization

3. config/preprocessing/energy.py
=============================
#!/usr/bin/env python

import xbob.spkrec

preprocessor = xbob.spkrec.preprocessing.Energy

# Cepstral parameters
win_length_ms = 20
win_shift_ms = 10

# VAD parameters
alpha = 2
max_iterations = 10
smoothing_window = 10 # This corresponds to 100ms
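
As a quick sanity check on config/features/lfcc_60.py, the expected number of coefficients per frame can be worked out from its flags. This is a sketch under the assumption that each enabled flag contributes in the usual way (19 cepstral coefficients plus one energy term, each with first and second derivatives); the function expected_dimension is hypothetical and not part of xbob.spkrec.

```python
def expected_dimension(n_ceps, with_energy, with_delta, with_delta_delta,
                       with_delta_energy, with_delta_delta_energy):
    """Count the coefficients per frame implied by the configuration flags.

    Assumption: the energy derivatives are only present when the energy
    itself and the corresponding derivative block are enabled.
    """
    static = n_ceps + (1 if with_energy else 0)
    delta = 0
    if with_delta:
        delta = n_ceps + (1 if with_energy and with_delta_energy else 0)
    delta_delta = 0
    if with_delta_delta:
        delta_delta = n_ceps + (1 if with_energy and with_delta_delta_energy else 0)
    return static + delta + delta_delta


# Values copied from lfcc_60.py: 20 static + 20 delta + 20 delta-delta.
dim = expected_dimension(
    n_ceps=19,
    with_energy=True,
    with_delta=True,
    with_delta_delta=True,
    with_delta_energy=True,
    with_delta_delta_energy=True,
)
print(dim)  # 60, the same length as features_mask = numpy.arange(0,60)
```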


Regards,
Achintya



