PLDA enroll under bob.kaldi


Chang Pan

Apr 13, 2018, 12:15:09 AM
to bob-devel
Hi,

I find it really strange that if I enroll one utterance for each of several speakers, the PLDA scores are very accurate for speaker identification.
However, if I enroll multiple utterances per speaker, the scores become meaningless.
I'm wondering whether the package only supports enrolling one utterance per speaker. The docstring of the plda_enroll() function, however, says:
 feats : numpy.ndarray
        A 2D numpy ndarray object containing iVectors (of a single speaker).
My code is:
    def enroll_helper(self, ivector):
        return bob.kaldi.plda_enroll(ivector, self.plda_parameter[1])

    def enroll_all(self, nProcess=8, save=True):
        pool = Pool(nProcess)
        self.enrolled = pool.map(self.enroll_helper, self.train_data)
        pool.close()
        pool.join()

Could anyone tell me what is going wrong here?
Any help or suggestions would be appreciated!

Milos

Apr 13, 2018, 1:50:58 AM
to bob-devel
Hi Chang,

I guess that the current implementation supports only one utterance per speaker. The function plda_enroll() is general and valid for multiple utterances as well; however, plda_score() assumes that only one utterance has been used for enrollment.

More specifically, bob.kaldi calls ivector-plda-scoring (the Kaldi binary) with default parameters. To support a variable number of utterances, this comment from ivector-plda-scoring is probably the relevant part:

For training examples, the input is the iVectors averaged over speakers; a separate archive containing the number of utterances per speaker may be optionally supplied using the --num-utts option; this affects the PLDA scoring (if not supplied, it defaults to 1 per speaker).
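
For illustration, such a per-speaker utterance-count archive is just a Kaldi text archive with one "<speaker-id> <count>" line per enrolled speaker. A minimal sketch (speaker names, counts and the file name are placeholders), which would then have to be handed to ivector-plda-scoring as --num-utts=ark:num_utts.ark (something bob.kaldi's plda_score() does not currently do by itself):

import io

# hypothetical writer for a num-utts archive in Kaldi text-ark format
num_utts = {"spk1": 3, "spk2": 5}  # placeholder utterance counts

with io.open("num_utts.ark", "w") as f:
    for spk, n in sorted(num_utts.items()):
        f.write(u"{} {}\n".format(spk, n))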


Best,
Milos

Sebastien Marcel

Apr 13, 2018, 4:56:23 AM
to bob-devel
bob.kaldi is using the PLDA implementation from Kaldi, I think, but since you are in the bob ecosystem you can also use the native PLDA implementation from bob itself, from the bob.learn.em package: https://www.idiap.ch/software/bob/docs/bob/bob.learn.em/stable/py_api.html

We use it for our face recognition experiments (http://pythonhosted.org/bob.bio.face/baselines.html#the-algorithms), where we have several examples per class.

Similarly, it is used for speaker recognition in bob.spear.
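
Roughly, the PLDA workflow there looks like the sketch below (untested; please double-check the exact class and method names in the API docs linked above, and note that the subspace dimensions and the training_data / enroll_ivectors / probe_ivector variables are just placeholders):

import numpy
import bob.learn.em

# training_data: a list with one 2D float64 array per training speaker
# (each row is an iVector); the subspace ranks below are placeholders
plda_base = bob.learn.em.PLDABase(600, 50, 50)   # (dim_d, dim_f, dim_g)
trainer = bob.learn.em.PLDATrainer()
bob.learn.em.train(trainer, plda_base, training_data, max_iterations=10)

# enrollment can take several iVectors of the same speaker at once
machine = bob.learn.em.PLDAMachine(plda_base)
trainer.enroll(machine, enroll_ivectors)         # enroll_ivectors: 2D float64 array

# scoring a probe iVector against the enrolled model
score = machine.compute_log_likelihood(probe_ivector)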



Amir Mohammadi

Apr 13, 2018, 8:26:39 AM
to bob-...@googlegroups.com


Chang Pan

Apr 13, 2018, 3:05:20 PM
to bob-devel
Thanks Marcel!
For the moment I will try to fix the bob.kaldi route, since most of my code is already written.
I'll try bob.learn.em later!

Chang Pan

Apr 13, 2018, 3:05:41 PM
to bob-devel
Thanks Amir!

Chang Pan

Apr 13, 2018, 3:12:46 PM
to bob-devel
Hi Milos,

Thank you for your advice. I think you're totally correct! I'm working on it.

Chang Pan

Apr 13, 2018, 6:45:27 PM
to bob-devel
Even after saving num_utts.ark from plda_enroll() and adding '--num-utts=ark:ivector_pkl/num_utts.ark' inside plda_score(), the scores are still really strange :(
Thanks anyway...

Chang Pan

Apr 13, 2018, 9:20:02 PM
to bob-devel

Hi Marcel,


I just tried bob.learn.em in my code. Even when I enroll only one utterance, the results are already strange... I'm pretty sure my iVectors are extracted correctly, since I've checked them with a cosine-similarity comparison.
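
By the cosine check I just mean something like the following, with a and b being placeholder 1D iVector arrays:

import numpy

# cosine similarity between two iVectors; same-speaker pairs should
# score close to 1, different-speaker pairs noticeably lower
score = numpy.dot(a, b) / (numpy.linalg.norm(a) * numpy.linalg.norm(b))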

Could you check my implementation? Any suggestion would be helpful.


Thank you




captain...@gmail.com

May 5, 2018, 10:59:27 AM
to bob-devel
Hi Chang,

I am doing speaker recognition with bob.kaldi as well, and I am a little confused about the iVector + PLDA training. Do you use WeChat? Would it be possible to connect there?

If so, my WeChat ID is: whiskywithrocks

Thank you.