regarding K means and GMM

abhipo

Jun 12, 2017, 2:49:00 AM

to bob-devel

Hi,

I have some questions regarding the training options listed below:

1. What is the meaning of these options? Are the intermediate results of the GMM and K-Means iterations stored somewhere, so that I can continue from a specific iteration using KMEANS_START_ITERATION and GMM_START_ITERATION?

2. If I set LIMIT_TRAINING_EXAMPLES to N, are only the top N utterances considered for GMM and K-Means training?

3. How do I train with the PLDA option? Do I have to supply male and female files separately?

[-l LIMIT_TRAINING_EXAMPLES]

[-K KMEANS_TRAINING_ITERATIONS]

[-k KMEANS_START_ITERATION]

[-M GMM_TRAINING_ITERATIONS]

[-m GMM_START_ITERATION]

Thanks and Regards,

Manuel Günther

Jun 16, 2017, 10:48:33 AM

to bob-devel

Hi,

1. Indeed, this is the meaning of these options. The intermediate files during K-Means and GMM training are stored, and you can start from the given iteration.
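To illustrate the idea of storing every iteration and restarting from one of them, here is a minimal sketch in plain Python. It is not the bob.bio.gmm implementation: the file naming (`kmeans_iter_NNN.json`), the `train` function and the convention that `start_iteration=k` loads the means saved after iteration k are all assumptions made for this example.

```python
import json
import os
import tempfile


def nearest(point, means):
    """Index of the mean closest to point (squared Euclidean distance)."""
    dists = [sum((p - m) ** 2 for p, m in zip(point, mean)) for mean in means]
    return dists.index(min(dists))


def kmeans_step(data, means):
    """One K-Means iteration: assign points, then recompute each mean."""
    clusters = [[] for _ in means]
    for point in data:
        clusters[nearest(point, means)].append(point)
    new_means = []
    for mean, members in zip(means, clusters):
        if members:
            new_means.append([sum(c) / len(members) for c in zip(*members)])
        else:
            new_means.append(list(mean))  # keep the old mean of an empty cluster
    return new_means


def checkpoint_path(directory, iteration):
    # Hypothetical naming scheme, not the one used by bob.bio.gmm.
    return os.path.join(directory, "kmeans_iter_%03d.json" % iteration)


def train(data, initial_means, iterations, directory, start_iteration=0):
    """Run K-Means and save the means after every iteration.

    With start_iteration > 0, the means saved after that iteration are
    loaded and training continues from there (assumed convention).
    """
    if start_iteration > 0:
        with open(checkpoint_path(directory, start_iteration)) as f:
            means = json.load(f)
    else:
        means = [list(m) for m in initial_means]
    for iteration in range(start_iteration + 1, iterations + 1):
        means = kmeans_step(data, means)
        with open(checkpoint_path(directory, iteration), "w") as f:
            json.dump(means, f)
    return means


data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
init = [[0.0, 0.0], [1.0, 0.0]]
workdir = tempfile.mkdtemp()

full = train(data, init, iterations=5, directory=workdir)
# Resume from the file written after iteration 2 and run iterations 3 to 5.
resumed = train(data, init, iterations=5, directory=workdir, start_iteration=2)
```

Because each iteration only depends on the means of the previous one, the resumed run ends with the same means as the uninterrupted run.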

2. Please read "train_gmm.py --help". There it is clearly written: "Limit the number of training examples used for KMeans initialization and the GMM initialization". This means that this only limits the training examples for the "initialization" steps, not for the rest of the training steps.
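The difference between limiting the initialization and limiting the whole training can be sketched as follows. This is a toy 1-D example in plain Python, not the bob.bio.gmm code: `select_subset`, `initialize_means` and `kmeans_step` are hypothetical names, and picking evenly spaced examples is an assumption, since the help text quoted above does not say how the subset is chosen.

```python
def select_subset(features, limit):
    """Return at most `limit` items, evenly spaced over `features`.

    The selection rule is an assumption made for this sketch.
    """
    if limit is None or limit >= len(features):
        return list(features)
    step = len(features) / float(limit)
    return [features[int(i * step)] for i in range(limit)]


def initialize_means(features, n_means, limit=None):
    """Initialization: starting means are picked from the limited subset only."""
    subset = select_subset(features, limit)
    return select_subset(subset, n_means), len(subset)


def kmeans_step(features, means):
    """A regular training step: every feature is used, whatever the limit is."""
    sums = [0.0] * len(means)
    counts = [0] * len(means)
    for x in features:
        k = min(range(len(means)), key=lambda j: (x - means[j]) ** 2)
        sums[k] += x
        counts[k] += 1
    new_means = [s / c if c else m for s, c, m in zip(sums, counts, means)]
    return new_means, sum(counts)


features = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]  # 1-D toy features
means, n_init = initialize_means(features, n_means=2, limit=4)
means, n_train = kmeans_step(features, means)
print("examples seen by initialization:", n_init)
print("examples seen by the training step:", n_train)
```

Here the initialization looks at 4 of the 8 examples, while the training step that follows still processes all 8.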

3. I don't think that we currently support training GMMs with PLDA. But I am not the GMM expert, so I cannot tell you more about that.

@Idiapers: I have seen that the bob.bio.gmm package is still largely undocumented; the pages http://pythonhosted.org/bob.bio.gmm/implementation.html and http://pythonhosted.org/bob.bio.gmm/parallel.html contain almost no information. I think we missed these during the Hackathon...