What is the significance of ivectors?

Jaskaran Singh Puri

Apr 23, 2019, 12:57:57 AM

to kaldi-help

Can someone explain conceptually, in a couple of lines, what exactly i-vectors are?

And what does it mean when we train i-vectors on our data + RIRS noise files? How is that different from nnet training?

David Snyder

Apr 23, 2019, 10:36:43 AM

to kaldi-help

An i-vector is a mapping from a variable-length speech segment to a fixed-dimensional representation that captures the long-term characteristics of the audio, such as the speaker characteristics or recording device. Originally this representation was developed for speaker recognition/verification, but it's now used in many areas of speech processing. In ASR it provides an additional input (along with the acoustic features, such as MFCCs) to the DNN acoustic models, which helps the DNN learn to be robust to speaker and channel variations.
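To make the "additional input" idea concrete, here is a rough sketch of how a fixed-size vector can be attached to every frame of an utterance, whatever its length. This is not Kaldi code and it does not extract real i-vectors: the function name `append_ivector`, the 40-dimensional frames, and the 100-dimensional vector are all assumptions chosen for illustration, and the values are random placeholders.

```python
import random
from typing import List

Frame = List[float]


def append_ivector(frames: List[Frame], ivector: List[float]) -> List[Frame]:
    """Concatenate the same fixed-size vector onto every frame.

    Hypothetical helper for illustration; the vector is simply repeated
    for each frame so the DNN sees it alongside the acoustic features.
    """
    return [frame + ivector for frame in frames]


def random_matrix(rows: int, cols: int, rng: random.Random) -> List[Frame]:
    """Placeholder data standing in for real MFCC frames."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

FEAT_DIM = 40      # assumed per-frame feature dimension
IVECTOR_DIM = 100  # assumed i-vector dimension

# Two utterances of different lengths (number of frames).
short_utt = random_matrix(120, FEAT_DIM, rng)
long_utt = random_matrix(950, FEAT_DIM, rng)

# Stand-ins for i-vectors: one fixed-size vector per utterance,
# regardless of how many frames the utterance has.
ivec_short = [rng.gauss(0.0, 1.0) for _ in range(IVECTOR_DIM)]
ivec_long = [rng.gauss(0.0, 1.0) for _ in range(IVECTOR_DIM)]

dnn_input_short = append_ivector(short_utt, ivec_short)
dnn_input_long = append_ivector(long_utt, ivec_long)

# The number of frames differs, but every frame has the same width.
print(len(dnn_input_short), len(dnn_input_short[0]))
print(len(dnn_input_long), len(dnn_input_long[0]))
```

The point of the sketch is only the shape of the data: the utterance length varies, the appended vector does not, so the DNN gets the same per-utterance summary next to every frame.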

And what does it mean when we train i-vectors on our data + RIRS noise files? How is that different from nnet training?

There's a separate system that extracts i-vectors from audio segments. This is usually trained on a clean subset of the data the ASR DNN is trained on.
