I'm not familiar with using TIMIT for speaker recognition, so I'm not sure how the evaluation is set up. It sounds like you might have only evaluation data and nothing to train your models with. Hopefully someone who has used TIMIT for this purpose can comment more. If you don't have any training data, you could try using the LibriSpeech corpus (look at the recipe in egs/librispeech for more info).
You need at least the following datasets:
+ Training data. This is used to train the UBM, the i-vector extractor, and the PLDA model. It should not overlap with the other datasets. In the sre10 recipe, it corresponds to the "train" and "sre" data. The "sre" data is just a subset of "train" used to train the PLDA model, but it doesn't have to be set up that way in general.
+ Enrollment data. This is the portion of the evaluation data for which you know the identity of the speaker in each recording. Using the models created in the previous step, i-vectors are extracted from this data. If you have multiple enrollment recordings per speaker, you might average their i-vectors to get speaker-level representations (see the sketch after this list). In the sre10 recipe, this dataset is called "sre10_train."
+ Test data. This is the rest of the evaluation data, and consists of recordings for which you don't know the identity of the speaker. Their i-vectors are compared (using the PLDA model or cosine distance) with the i-vectors created from the enrollment data. This dataset is called "sre10_test" in the recipe. The set of comparisons is defined by the "trials" file; an example of its format follows this list.
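
For reference, each line of the trials file names one comparison: an enrollment speaker, a test utterance, and a target/nontarget label saying whether the pair actually matches (the label is only needed to score the system afterwards). The IDs below are made up, but the layout matches what the sre recipes expect:

    spk0001 test-utt-0042 target
    spk0001 test-utt-0097 nontarget
    spk0002 test-utt-0042 nontarget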
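
To make the enrollment-averaging and scoring steps concrete, here is a minimal NumPy sketch. Everything in it is made up for illustration (the IDs, the 600-dimensional random i-vectors, the trials), and it scores with plain cosine similarity rather than the PLDA scoring the recipe actually uses; in Kaldi all of this happens inside the recipe's shell scripts and C++ binaries:

    import numpy as np

    # Stand-ins for i-vectors extracted by the recipe; in Kaldi these would
    # come from the ivector.scp/ark files under the experiment directory.
    enroll_ivectors = {
        "spk1-utt1": np.random.randn(600),
        "spk1-utt2": np.random.randn(600),
        "spk2-utt1": np.random.randn(600),
    }
    test_ivectors = {"test-utt1": np.random.randn(600)}

    # Maps each enrollment speaker to their utterances (like Kaldi's spk2utt).
    spk2utt = {"spk1": ["spk1-utt1", "spk1-utt2"], "spk2": ["spk2-utt1"]}

    def length_normalize(v):
        # Unit-length vectors make the dot product equal to cosine similarity.
        return v / np.linalg.norm(v)

    # Average each speaker's enrollment i-vectors into one speaker-level model.
    spk_models = {
        spk: length_normalize(np.mean([enroll_ivectors[u] for u in utts], axis=0))
        for spk, utts in spk2utt.items()
    }

    # Score each trial: enrollment speaker vs. test utterance.
    trials = [("spk1", "test-utt1"), ("spk2", "test-utt1")]
    for spk, utt in trials:
        score = np.dot(spk_models[spk], length_normalize(test_ivectors[utt]))
        print(spk, utt, score)

A higher score means the test recording is more likely to come from the enrolled speaker; thresholding the score gives the accept/reject decision.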