Minimum requirements for speaker recognition training:

Tamir Tapuhi

Apr 6, 2016, 1:54:57 AM
to kaldi-help
Hi all
I plan to work with the sre08 and sre10 recipes, and I see that they need a lot of LDC databases.

I have two questions:
  1. Can anyone tell me whether there is a minimal set of data required for these recipes? Buying all of those databases is too expensive.
  2. Where can I get the SRE 2010 database, and how much does it cost?
Here is the list of databases I saw in the scripts:
     Corpus              LDC Catalog No.
     SWBD2 Phase 2       LDC99S79
     SWBD2 Phase 3       LDC2002S06
     SWBD Cellular 1     LDC2001S13
     SWBD Cellular 2     LDC2004S07
     SRE2004             LDC2006S44
     SRE2005 Train       LDC2011S01
     SRE2005 Test        LDC2011S04
     SRE2006 Train       LDC2011S09
     SRE2006 Test 1      LDC2011S10
     SRE2006 Test 2      LDC2012S01
     SRE2008 Train       LDC2011S05
     SRE2008 Test        LDC2011S08
     Fisher speech       LDC2004S13, LDC2005S13 
     Fisher test         LDC2004T19, LDC2005T19   
     NIST SRE 2010 training set
     NIST SRE 2010 test set

Tamir

David Snyder

Apr 6, 2016, 10:55:27 AM
to kaldi-help
I'm not sure where you can get the NIST SRE 2010 evaluation. I think that it was just distributed to institutions participating in it, and it may not be publicly available. If no one here can provide a pointer to it, you could try contacting someone at the LDC, to see if they have plans to make it available.

If it's hard to get the SRE10 data, my suggestion is to focus on the NIST SRE 2008 evaluation for now. The data is available from the LDC.

We never came up with a minimal data list, and I don't think we ever compared results using different subsets of the available corpora. If I had to guess, I think Fisher might be the single most helpful dataset, followed by the older SRE datasets (e.g., SRE2003, 2005, etc.).

Tamir Tapuhi

Apr 10, 2016, 3:32:59 AM
to kaldi-help
Thank you very much for the quick response.

I have one more question. It looks like the Fisher data is being used to train a speaker-independent model; am I wrong? Is the speaker information from Fisher used anywhere in the process, or is only the speaker information from the SRE datasets used?

Thanks again,
Tamir

On Wednesday, April 6, 2016 at 17:55:27 UTC+3, David Snyder wrote:

David Snyder

Apr 10, 2016, 4:35:57 PM
to kaldi-help
Yes, that's true. In both the sre08 and sre10 examples, the training speaker labels are only taken into account during PLDA or LDA training.
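
To illustrate the split described above, here is a rough sketch on synthetic data. It is not the Kaldi recipe: PCA stands in for the unsupervised stages (the UBM and i-vector extractor), and the names `fit_pca` and `fit_lda` are hypothetical. The point is only that the first stage never sees speaker labels, while the LDA stage needs them.

```python
import numpy as np

def fit_pca(x, dim):
    """Unsupervised stage: takes features only, no speaker labels."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:dim].T

def fit_lda(x, labels, dim, ridge=1e-6):
    """Supervised stage: speaker labels define within/between scatter."""
    labels = np.asarray(labels)
    d = x.shape[1]
    global_mean = x.mean(axis=0)
    s_w = ridge * np.eye(d)   # within-speaker scatter (small ridge for stability)
    s_b = np.zeros((d, d))    # between-speaker scatter
    for spk in np.unique(labels):
        xs = x[labels == spk]
        mu = xs.mean(axis=0)
        s_w += (xs - mu).T @ (xs - mu)
        diff = (mu - global_mean)[:, None]
        s_b += len(xs) * (diff @ diff.T)
    # Whiten by the within-speaker scatter, then keep the directions
    # with the largest between-speaker variance.
    l_inv = np.linalg.inv(np.linalg.cholesky(s_w))
    evals, evecs = np.linalg.eigh(l_inv @ s_b @ l_inv.T)
    top = np.argsort(evals)[::-1][:dim]
    return l_inv.T @ evecs[:, top]

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(500, 20))      # stand-in for data used without labels
speakers = np.repeat(np.arange(10), 30)     # 10 speakers, 30 utterances each
labeled = rng.normal(size=(10, 20))[speakers] + 0.3 * rng.normal(size=(300, 20))

mean, basis = fit_pca(unlabeled, dim=8)     # no labels involved
embeddings = (labeled - mean) @ basis       # stand-in for i-vectors
lda = fit_lda(embeddings, speakers, dim=4)  # labels are used only here
projected = embeddings @ lda
print(projected.shape)
```

In this sketch, a corpus without usable speaker labels can still contribute to `fit_pca`, but it cannot be passed to `fit_lda`.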