Using a custom timit database


Maria Garcia

Feb 23, 2015, 12:45:14 PM
to bob-...@googlegroups.com
Hi,
I am now using a customized timit database of my own, which I named /protocols/timit/3.
As a very simple test, I created a UBM from 12 files (listed in protocols/timit/3/norm/train_world.lst).
In /protocols/timit/3/dev I have:
$ wc -l *.lst
  1 for_models.lst
  2 for_probes.lst
  2 for_znorm.lst

$ cat for_models.lst
/drive2/audio/babgko 30002 30002
$ cat for_probes.lst
/drive2/audio/tdubeo 30002
/drive2/audio/tcpkph 30002

$ cat for_znorm.lst
/drive2/audio/tdubeo 30002
/drive2/audio/tcpkph 30002


In /protocols/timit/3/eval, I have:
$ cat for_models.lst
/drive2/audio/babgko 30002 30002
$ cat for_probes.lst
/drive2/audio/tdubeo 30002
/drive2/audio/tcpkph 30002
$ cat for_znorm.lst
/drive2/audio/tdubeo 30002
/drive2/audio/tcpkph 30002

My timit.py (config/database/timit.py) is very simple:
$ more config/database/timit.py
#!/usr/bin/env python
import xbob.db.verification.filelist
# 0/ The database to use
name = 'timit'
db = xbob.db.verification.filelist.Database('protocols/timit/3/')
protocol = None
# directory where the wave files are stored
wav_input_dir = "/drive2/audio/"
wav_input_ext = ".wav"  # default extension

However, when I run the script below, it crashes:
#!/bin/bash
echo "Perform a custom timit test"
# Source: https://pypi.python.org/pypi/bob.spear
# go
USRDIR=/home/user/bobspear117/bob.spear-1.1.7/tim1
TMPDIR=/home/user/bobspear117/bob.spear-1.1.7/tim1
# scores end up under $USRDIR
# corpora built by TestEngineBob.py
# using custom Timit.py corpora

./bin/spkverif_gmm.py -d config/database/timit.py -p config/preprocessing/energy.py \
 -f config/features/mfcc_60.py -t config/tools/ubm_gmm/ubm_gmm_256G.py -b ubgm -z \
 --user-directory $USRDIR  --temp-directory $TMPDIR

This is the output from running the script; it ends with a crash. What do I need to do to make it work?

Perform a custom timit test
No handlers could be found for logger "bob.c++"
file[config/database/timit.py]
function[database]
config: <module 'database' from 'config/database/timit.py'>
file[config/tools/ubm_gmm/ubm_gmm_256G.py]
function[tool_chain]
config: <module 'tool_chain' from 'config/tools/ubm_gmm/ubm_gmm_256G.pyc'>
file[config/preprocessing/energy.py]
function[preprocessor]
config: <module 'preprocessor' from 'config/preprocessing/energy.pyc'>
file[config/features/mfcc_60.py]
function[feature_extractor]
config: <module 'feature_extractor' from 'config/features/mfcc_60.pyc'>
preprocess 15 wave from directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1/ubm_gmm/preprocessed
Input wave file: /drive2/audio/ltcpnh.wav
After Energy-based VAD there are 652 frames remaining over 1043
Input wave file: /drive2/audio/meqdho.wav
After Energy-based VAD there are 8328 frames remaining over 30056
Input wave file: /drive2/audio/mljlxo.wav
After Energy-based VAD there are 14681 frames remaining over 30392
Input wave file: /drive2/audio/sidgio.wav
After Energy-based VAD there are 18641 frames remaining over 30629
Input wave file: /drive2/audio/sluiuo.wav
After Energy-based VAD there are 15394 frames remaining over 30168
Input wave file: /drive2/audio/sprxbo.wav
After Energy-based VAD there are 6929 frames remaining over 29995
Input wave file: /drive2/audio/teomrh.wav
After Energy-based VAD there are 16667 frames remaining over 30360
Input wave file: /drive2/audio/tgpcso.wav
After Energy-based VAD there are 22618 frames remaining over 30222
Input wave file: /drive2/audio/tsmwyh.wav
After Energy-based VAD there are 16765 frames remaining over 30216
Input wave file: /drive2/audio/txtdxo.wav
After Energy-based VAD there are 16303 frames remaining over 30425
extract 15 features from wav directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1/ubm_gmm/features
Training Projector '/home/user/bobspear117/bob.spear-1.1.7/tim1/ubm_gmm/Projector.hdf5' using 12 training files:
Traceback (most recent call last):
  File "./bin/spkverif_gmm.py", line 21, in <module>
    sys.exit(spear.script.spkverif_isv.main())
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 492, in main
    speaker_verify(args)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 460, in speaker_verify
    executor.execute_tool_chain()
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 85, in execute_tool_chain
    self.m_tool_chain.train_projector(self.m_tool, force = self.m_args.force)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/toolchain/ToolChain.py", line 257, in train_projector
    tool.train_projector(train_features, str(projector_file))
  File "/usr/lib/python2.7/site-packages/facereclib-1.2.3-py2.7.egg/facereclib/tools/UBMGMM.py", line 190, in train_projector
    array = numpy.vstack(train_features)
  File "/usr/lib64/python2.7/site-packages/numpy/core/shape_base.py", line 226, in vstack
    return _nx.concatenate(map(atleast_2d,tup),0)
ValueError: need at least one array to concatenate




Elie Khoury

Feb 23, 2015, 1:30:19 PM
to bob-...@googlegroups.com
Hello, 
I need more detail about the wave files you use to train your UBM.
It seems to me that the wave files listed in your train_world.lst do not contain any speech.
To be sure about your output, can you please remove your temporary directory
 /home/user/bobspear117/bob.spear-1.1.7/tim1/ubm_gmm/
and run the script again?
Thank you,
Elie


Maria Garcia

Feb 23, 2015, 3:59:15 PM
to bob-...@googlegroups.com
Hi,
Thank you for your help. My wav files are NIST files from the 2012 trials. They started as SPHERE files, which I converted to single-channel wave files.
I tried again as you suggested and got the same error. What does this sort of error mean?

When I ran sox on one of the UBM files in my definition, I got this:

$ sox --i /drive2/audio/teomrh.wav
Input File     : '/drive2/audio/teomrh.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 14-bit
Duration       : 00:05:03.61 = 2428880 samples ~ 22770.8 CDDA sectors
File Size      : 2.43M
Bit Rate       : 64.0k
Sample Encoding: 8-bit u-law

One of the 12 files is 10 seconds long, most are 5 minutes long, and 2 are 8 minutes long.

Maria Garcia

Feb 23, 2015, 4:41:37 PM
to bob-...@googlegroups.com
Hi,
I added some debug output to UBMGMM.py, and it now displays this:
preprocess 15 wave from directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/preprocessed
extract 15 features from wav directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/features
Training Projector '/home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/Projector.hdf5' using 12 training files:
  -> Training UBM model with 0 training files []
**Problem need at least one array to concatenate

Traceback (most recent call last):
  File "./bin/spkverif_gmm.py", line 21, in <module>
    sys.exit(spear.script.spkverif_isv.main())
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 492, in main
    speaker_verify(args)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 460, in speaker_verify
    executor.execute_tool_chain()
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 85, in execute_tool_chain
    self.m_tool_chain.train_projector(self.m_tool, force = self.m_args.force)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/toolchain/ToolChain.py", line 257, in train_projector
    tool.train_projector(train_features, str(projector_file))
  File "/usr/lib/python2.7/site-packages/facereclib-1.2.3-py2.7.egg/facereclib/tools/UBMGMM.py", line 196, in train_projector
    raise ValueError(msg)
ValueError: **Problem need at least one array to concatenate

So I can see that even though my data has a person speaking in it, I may need to adjust the configuration.

Where can I find documentation on how to adjust the feature extraction part? I am not certain which part does feature extraction. From the way I invoked the script, I am guessing it is one of the three configuration files I passed:
./bin/spkverif_gmm.py -d config/database/timit.py -p config/preprocessing/energy.py \
 -f config/features/mfcc_60.py -t config/tools/ubm_gmm/ubm_gmm_256G.py -b ubgm -z \
 --user-directory $USRDIR  --temp-directory $TMPDIR

namely one of:
 config/preprocessing/energy.py 
 config/features/mfcc_60.py 
 config/tools/ubm_gmm/ubm_gmm_256G.py

Where do I find documentation on each of these, and how do I go about learning how to change the parameters other than blindly guessing at values for each setting?

Thanks

Elie Khoury

Feb 24, 2015, 3:01:39 AM
to bob-...@googlegroups.com
Hello,
I guess the problem is that your wave files are not in the right place, so the system is skipping them.
All the wave files referenced in your lists (for_models.lst, for_probes.lst, for_znorm.lst, train_world.lst, etc.) should live in the directory
wav_input_dir = "/drive2/audio/"
that you defined in your config file, and the file paths inside those lists should be relative paths.

That is, suppose you have a file: 
/drive2/audio/tdubeo

that you want to use in your for_probes.lst.
Then your list should contain something like this:
tdubeo 30002

Best regards,
Elie

Maria Garcia

Feb 24, 2015, 10:17:19 AM
to bob-...@googlegroups.com

Thank you for your help; you have helped a lot so far. I did some digging, added some print statements, and found out that the files are being found but not being included: the shape of the features is wrong.
When I print the contents of a given feature vector, it looks like a bunch of ones and zeros. I'm including a dump below; the 'after load x is' lines are where I dump the vector.
preprocess 15 wave from directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/preprocessed
extract 15 features from wav directory /drive2/audio/ to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/features
Training Projector '/home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/Projector.hdf5' using 12 training files:
Loading training files ['/drive2/audio/bafrpo.hdf5', '/drive2/audio/ltcpnh.hdf5', '/drive2/audio/meqdho.hdf5', '/drive2/audio/mljlxo.hdf5', '/drive2/audio/sidgio.hdf5', '/drive2/audio/sluiuo.hdf5', '/drive2/audio/sprxbo.hdf5', '/drive2/audio/teomrh.hdf5', '/drive2/audio/tgpcso.hdf5', '/drive2/audio/tjkfbh.hdf5', '/drive2/audio/tsmwyh.hdf5', '/drive2/audio/txtdxo.hdf5']
after load x is [1 1 1 ..., 1 1 1]
after load x is [1 1 1 ..., 1 1 1]
after load x is [0 0 0 ..., 1 1 1]
after load x is [0 0 1 ..., 1 1 1]
after load x is [0 0 0 ..., 1 1 1]
after load x is [0 0 0 ..., 0 0 0]
after load x is [0 0 0 ..., 0 0 0]
after load x is [1 1 1 ..., 1 1 1]

after load x is [1 1 1 ..., 1 1 1]
after load x is [0 0 0 ..., 1 1 1]
after load x is [0 0 0 ..., 0 0 0]
after load x is [0 0 0 ..., 1 1 1]
projector_file /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/Projector.hdf5
train_features []

  -> Training UBM model with 0 training files []
Traceback (most recent call last):
  File "./bin/spkverif_gmm.py", line 21, in <module>
    sys.exit(spear.script.spkverif_isv.main())
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 492, in main
    speaker_verify(args)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 460, in speaker_verify
    executor.execute_tool_chain()
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/script/spkverif_isv.py", line 85, in execute_tool_chain
    self.m_tool_chain.train_projector(self.m_tool, force = self.m_args.force)
  File "/home/user/bobspear117/bob.spear-1.1.7/spear/toolchain/ToolChain.py", line 263, in train_projector
    tool.train_projector(train_features, str(projector_file))
  File "/usr/lib/python2.7/site-packages/facereclib-1.2.3-py2.7.egg/facereclib/tools/UBMGMM.py", line 192, in train_projector

    array = numpy.vstack(train_features)
  File "/usr/lib64/python2.7/site-packages/numpy/core/shape_base.py", line 226, in vstack
    return _nx.concatenate(map(atleast_2d,tup),0)
ValueError: need at least one array to concatenate


When I was looking in the code, I saw that a feature vector is only added to the train_features list under the following condition (I found this in spear/toolchain/ToolChain.py):

if x.shape[0] > 0 and len(x.shape) == 2:
    train_features.append(x)

My x appears to be the wrong shape, i.e. 1-D like [0 0 0 ..., 1 1 1], so nothing is added.
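
For reference, here is a quick way to inspect what is actually stored in one of those HDF5 files (just a sketch using h5py, not part of spear; the path is one example file from my training list, and it simply lists whatever datasets the file contains instead of assuming a particular key):

import h5py

def show(name, obj):
    # print every dataset in the file along with its shape and dtype
    if isinstance(obj, h5py.Dataset):
        print(name, obj.shape, obj.dtype)

with h5py.File("/drive2/audio/teomrh.hdf5", "r") as f:
    f.visititems(show)

A proper cepstral feature file should contain a 2-D dataset, something like (n_frames, 60) for mfcc_60, whereas a 1-D dataset of 0/1 values would be the VAD labels, which is exactly the shape that the check in ToolChain.py rejects.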


So I guess my question is: what do my wav files need to look like in order to be processed correctly? Is there a minimum specification? I started with .sph files, which I converted to mono-channel .wav.


Elie Khoury

Feb 26, 2015, 5:23:57 AM
to bob-...@googlegroups.com
Hello,
The version of Spear you’re using supports .wav as well as .sph.
But if you correctly converted the audio file using sox, there should be no issue generating the features.

1. Did you get any warning while creating the wave files? Something like "Warning: no speech found"?

2. Otherwise, it seems like the system is using the VAD outputs (in /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ubgm/preprocessed) instead of the Cepstral features.

Would it be possible to send me an example of your wave files in a private message, so I can have a look?

Maria Garcia

Mar 2, 2015, 4:01:41 PM
to bob-...@googlegroups.com
Hi,
I can send you a private copy of some wave files, but I think the problem may have been that the universal background model was too small. I read here about someone else who didn't have enough samples in their model, and I only had 12. I have now put about 500 files into my universal background model and I am not getting that crash anymore. However, I now get a different crash.

Using a modification of the timit approach, I invoke the run as follows:
#!/bin/bash
C8=`date +'%Y%m%d%H%M'`
nohup ./bin/spkverif_ivector.py -d config/database/timit.py  -p config/preprocessing/energy.py \
 -f config/features/mfcc_60.py -t config/tools/ivec/ivec_256g_t100_cosine.py -z -b ivectorcosine \
 --user-directory $USRDIR --temp-directory $TMPDIR --preprocessed-features-directory preprocessed \
 --features-directory features --projected-directory projected &> timit${C8}.out&

config/database/timit.py (it's customized for my data):
#!/usr/bin/env python
import xbob.db.verification.filelist
# 0/ The database to use
name = 'timit'
db = xbob.db.verification.filelist.Database('protocols/timit/33/')
protocol = None
# directory where the wave files are stored
wav_input_dir = "/flex25/wav/"

wav_input_ext = ".wav"  # default extension

Everything else in the config .py files is the same as in my earlier post.

In my norm directory (protocols/timit/33/norm/) I have:
  train_world.lst
  train_world_optional_1.lst
  train_world_optional_2.lst
All three files are identical. I followed this pattern because voxforge uses it; I couldn't really find any documentation on the purpose of the 'optional' files, why they are there, or how to use them.

When I run, I see this on the console (here is a snippet):


Enroler '/home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ivectorcosine/WhiteEnroler.hdf5' already exists.
project 575 projected_ivector i-vectors to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ivectorcosine/whitened_ivector using Whitening Enroler
project 575 whitened_ivector i-vectors to directory /home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ivectorcosine/lnorm_ivector
Training LDA Projector '/home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ivectorcosine/LDAProjector.hdf5' using 572 identities:
Skipping one client since the number of client files is only 1
Skipping one client since the number of client files is only 1
... (the 'Skipping' message occurs a total of 556 times)
Skipping one clien) = 4096
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2c} ---
+++ killed by SIGSEGV (core dumped) +++
error 139
====

It gets interesting. If I run voxforge out of the box, it works. So I compared config/database/voxforge.py to config/database/timit.py. There isn't much of a difference, except that one uses xbob.db.verification.filelist and the other uses xbob.db.voxforge.Database. Voxforge uses its own implementation of verification.filelist and perhaps other aspects as well. Voxforge installs the 'lookup' for its database as an egg in the site-packages area and expects to find the lists at that location. On my VM it is located here:
/usr/lib/python2.7/site-packages/xbob.db.voxforge-0.1.0-py2.7.egg/xbob/db/voxforge

In that directory I moved the voxforge lists aside and symlinked my own protocol in their place, so that the next voxforge run would use my custom data instead of the voxforge data:

mv lists lists.hold
ln -s /home/user/bobspear117/bob.spear-1.1.7/protocols/timit/33 lists

Then I did a voxforge run:


nohup ./bin/spkverif_ivector.py -d config/database/voxforge.py -p config/preprocessing/energy.py \
 -f config/features/mfcc_60.py   -t config/tools/ivec/ivec_256g_t100_cosine.py -z -b ivector_cosine \
 --user-directory $USRDIR --temp-directory $TMPDIR --groups eval   \
 --preprocessed-features-directory preprocessed \
 --features-directory features --projected-directory projected &> timit${C8}.out&


I ran the above, and after a long run it crashed with the same error as before, only this time there was no segfault.

The last lines on the console are:
Skipping one client since the number of client files is only 1
Skipping one client since the number of client files is only 1
I looked into the source code and found that the message appears to come from lda_read_data, which is called by lda_train_projector.

So my question is: what is causing this error?


About my files:

My UBM has 500 files of 6-10 seconds each.
In my eval directory I have:
$ wc -l *

 1 for_models.lst
 2 for_probes.lst
 2 for_znorm.lst
 5 total

In my dev directory I have:
$ wc -l *

 1 for_models.lst
 2 for_probes.lst
 2 for_znorm.lst
 5 total

For the purposes of my testing, the contents of dev and eval are identical.

My for_probes.lst and for_znorm.lst files are identical.
My for_models.lst has one file.
My for_probes.lst has 2 files (2 tests).


I ran it with strace and saw a bit more, but I don't know whether what I saw is actually a problem. I saw where it
stat("/home/user/bobspear117/bob.spear-1.1.7/tim1tmp/ivectorcosine/LDAProjector.hdf5", 0x7fff619a4eb0) = -1 ENOENT (No such file or directory)
then it tried to look at the timit/33/dev, then timit/33/eval, then
stat("/home/user/bobspear117/bob.spear-1.1.7/protocols/timit/33/norm/train_world.lst", {st_mode=S_IFREG|0666, st_size=8008, ...}) = 0
stat("/home/user/bobspear117/bob.spear-1.1.7/protocols/timit/33/norm/train_optional_world_1.lst", {st_mode=S_IFREG|0666, st_size=8008, ...}) = 0
stat("/home/user/bobspear117/bob.spear-1.1.7/protocols/timit/33/norm/train_optional_world_2.lst", {st_mode=S_IFREG|0666, st_size=8008, ...}) = 0

Elie Khoury

Mar 2, 2015, 10:33:53 PM
to bob-...@googlegroups.com
Hello,

Briefly, the error comes from the fact that your training data has only one speech utterance (i.e. one wave file) per speaker. That is why the LDA training module skips every training speaker, so the LDA training set ends up empty and the training crashes.
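
To make that concrete, here is a minimal sketch (not the actual spear code) of what happens when every client has a single session; the client names and feature shapes are made up for illustration:

import numpy

features_per_client = {
    "client1": [numpy.random.rand(100, 60)],  # a single session
    "client2": [numpy.random.rand(120, 60)],  # a single session
}

# clients with fewer than two files are skipped, as in the log above
usable = [feats for feats in features_per_client.values() if len(feats) > 1]
print(len(usable))  # 0, so the LDA trainer receives an empty training set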

For your information, training LDA, WCCN and PLDA requires several sessions per speaker.
If you cannot meet this condition (which looks odd to me), you can only do length normalization (LNorm) and apply the cosine distance.
To do so, just skip those steps (--skip-lda-projection, --skip-lda-train-projector, etc.; check the script's help output for the others)
and replace in your script (spear/spkverif_ivector.py):

cur_type = 'wccn_projected_ivector'
by:
cur_type = 'lnorm_ivector'
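
And if you can later collect several recordings per speaker, your world list would need several lines per client id, along these lines (the file names are made up, and I am assuming train_world.lst uses the same '<relative filename> <client id>' columns as the probe lists):

spk001_sess1 spk001
spk001_sess2 spk001
spk002_sess1 spk002
spk002_sess2 spk002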

Thanks,
Elie



Maria Garcia

Mar 3, 2015, 8:53:26 AM
to bob-...@googlegroups.com
Hi Elie,
I will give that a try. My training file is 5 minutes long, so I thought that would give it enough material to pick segments from for training.

I saw the cur_type variable in the script but didn't know what it was about. I will give that a try. So I need to do both: set the command-line parameters and change cur_type?

Thank you for your help
Maria

Elie Khoury

Mar 4, 2015, 5:31:48 AM
to bob-...@googlegroups.com
Hello,
Yes, please set both the command-line arguments and cur_type for this workaround.
Thanks,
Elie
