Help! About i-vectors in Kaldi

汪子锐

Mar 22, 2016, 10:27:00 AM

to kaldi...@googlegroups.com

Dear all,

In steps/online/nnet2, train_ivector_extractor.sh is used to train the i-vector extractor. When the extractor is trained, is the total variability matrix T trained with every utterance treated as a different speaker, as in the paper "Front-End Factor Analysis for Speaker Verification"? Or can you point me to a paper that describes your i-vectors?

My second question: once we have T, we use extract_ivectors.sh, which needs utt2spk. Suppose a certain speaker has many utterances. Is an i-vector extracted from each utterance and then averaged, or is a single i-vector extracted from all of his utterances together?

Thanks very much!

Daniel Povey

Mar 22, 2016, 2:24:10 PM

to kaldi-help

> In steps/online/nnet2, train_ivector_extractor.sh is used to train the i-vector extractor. When the extractor is trained, is the total variability matrix T trained with every utterance treated as a different speaker, as in the paper "Front-End Factor Analysis for Speaker Verification"? Or can you point me to a paper that describes your i-vectors?

Every utterance is treated as a different speaker; it's just like normal iVectors.

> My second question: once we have T, we use extract_ivectors.sh, which needs utt2spk. Suppose a certain speaker has many utterances. Is an i-vector extracted from each utterance and then averaged, or is a single i-vector extracted from all of his utterances together?

That script extracts them per utterance, and then also computes per-speaker ones using this command:

if [ $stage -le 2 ]; then
  # Be careful here: the speaker-level iVectors are now length-normalized,
  # even if they are otherwise the same as the utterance-level ones.
  echo "$0: computing mean of iVectors for each speaker and length-normalizing"
  $cmd $dir/log/speaker_mean.log \
    ivector-normalize-length scp:$dir/ivector.scp ark:- \| \
    ivector-mean ark:$data/spk2utt ark:- ark:- ark,t:$dir/num_utts.ark \| \
    ivector-normalize-length ark:- ark,scp:$dir/spk_ivector.ark,$dir/spk_ivector.scp || exit 1;
fi

Dan
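
For readers who want to see what that pipeline does numerically, here is a minimal Python sketch of its three steps: length-normalize each utterance-level iVector, average them per speaker according to spk2utt, then length-normalize the mean. It is not Kaldi code. The function names and the toy data are hypothetical, and the choice to scale each vector to norm sqrt(dim) reflects my understanding of the default behaviour of ivector-normalize-length, so treat that as an assumption.

```python
import math


def normalize_length(vec):
    """Scale vec so that its Euclidean norm equals sqrt(len(vec)).

    Assumption: this mirrors the default target length of
    ivector-normalize-length; it is not taken from the Kaldi source.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # nothing sensible to scale
    scale = math.sqrt(len(vec)) / norm
    return [x * scale for x in vec]


def speaker_ivectors(utt_ivectors, spk2utt):
    """Normalize per utterance, average per speaker, normalize again."""
    normalized = {utt: normalize_length(v) for utt, v in utt_ivectors.items()}
    result = {}
    for spk, utts in spk2utt.items():
        vecs = [normalized[u] for u in utts]
        dim = len(vecs[0])
        mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        result[spk] = normalize_length(mean)
    return result


# Hypothetical toy data: 2-dimensional "iVectors" for three utterances.
utt_ivectors = {
    "spkA-utt1": [3.0, 4.0],
    "spkA-utt2": [4.0, 3.0],
    "spkB-utt1": [0.0, 2.0],
}
spk2utt = {"spkA": ["spkA-utt1", "spkA-utt2"], "spkB": ["spkB-utt1"]}

spk_ivectors = speaker_ivectors(utt_ivectors, spk2utt)
for spk, vec in spk_ivectors.items():
    print(spk, [round(x, 4) for x in vec])
```

Because of the final normalization, a speaker with a single utterance still gets a length-normalized vector, which is the point of the "Be careful here" comment in the script.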


汪子锐

Mar 22, 2016, 8:59:00 PM

to kaldi...@googlegroups.com

Thank you, Dan!

As you say, ivector-extract extracts an i-vector from every utterance, so what does its "--spk2utt" option mean? The help text says: "supply this option if you want ivectors to be output at the per-speaker level, estimated using stats accumulated from multiple utterances".

Isn't that the same as the average for each speaker?

Daniel Povey

Mar 22, 2016, 9:04:00 PM

to kaldi-help

That option would sum the stats over the speaker's utterances and compute the iVector from the summed stats. It gives a different answer, and I assume that it was not better than averaging the iVectors, or we'd be using that in the scripts.
Dan
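
To illustrate why the two approaches give different answers, here is a small Python sketch using the textbook i-vector point estimate in the simplest possible setting: one Gaussian, one feature dimension, and a one-dimensional i-vector with a standard normal prior. In that case the estimate is w = (t F / var) / (1 + N t^2 / var), where N is the frame count and F is the centered first-order statistic. The function name, the values of t and var, and the statistics are all made up for illustration, and Kaldi's own extractor may differ in detail from this formulation.

```python
def ivector_point_estimate(n_frames, first_order, t=1.0, var=1.0):
    """Posterior mean of a 1-D i-vector in a toy single-Gaussian model.

    n_frames:    zeroth-order statistic N (number of frames)
    first_order: centered first-order statistic F (sum of x - ubm_mean)
    t, var:      hypothetical scalar "T matrix" and Gaussian variance
    """
    precision = 1.0 + n_frames * t * t / var  # prior precision + data term
    linear = t * first_order / var
    return linear / precision


# Hypothetical statistics (N, F) for two utterances of the same speaker:
# one short utterance and one long utterance.
utt_stats = [(10.0, 8.0), (200.0, 40.0)]

# Approach 1: extract one i-vector per utterance, then average them.
per_utt = [ivector_point_estimate(n, f) for n, f in utt_stats]
averaged = sum(per_utt) / len(per_utt)

# Approach 2: sum the statistics first, then extract a single i-vector.
total_n = sum(n for n, _ in utt_stats)
total_f = sum(f for _, f in utt_stats)
pooled = ivector_point_estimate(total_n, total_f)

print("average of per-utterance estimates:", round(averaged, 4))
print("estimate from summed stats:        ", round(pooled, 4))
```

In this toy case the averaged estimate weights both utterances equally, while the pooled estimate is dominated by the longer utterance, so the two numbers disagree. Which one works better for a given task is an empirical question, as the reply above says.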

ZiRui W

Mar 23, 2016, 12:22:16 AM

to kaldi...@googlegroups.com

Thanks Dan.

Sorry to bother you again!

The last question:

When we train the i-vector extractor, is the covariance matrix also updated, or is it kept the same as in the UBM? I also noticed that the covariance matrix is first converted from a diagonal matrix into a full matrix. Why?

Daniel Povey

Mar 23, 2016, 1:51:47 AM

to kaldi-help

> When we train the i-vector extractor, is the covariance matrix also updated, or is it kept the same as in the UBM?

It is updated but it's not 100% clear that this is helpful. There is a flag in ivector-extractor-acc-stats (IIRC) which can turn this off.

> I also noticed that the covariance matrix is first converted from a diagonal matrix into a full matrix. Why?

To handle the general case when it is full.
Dan
