How can I use CMVN in decoding?
CMVN statistics are computed from the training data (using spk2utt etc.),
but at test time, we only have the audio file.
I found the explanation below in the documentation, but I don't understand how to use it:
The basic solution we use is to do "moving-window" cepstral mean normalization. We accumulate the mean over a moving window of, by default, 6 seconds (see the "--cmn-window" option to programs in online2bin/, which defaults to 600). The options class for this computation, OnlineCmvnOptions, also has extra configuration variables, speaker-frames (default: 600), and global-frames (default: 200). These specify how we make use of prior information from the same speaker, or a global average of the cepstra, to improve the estimate for the first few seconds of each utterance. The program apply-cmvn-online can apply this normalization as part of a training pipeline so that we can train on matched features.
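As I understand it, the core idea in that paragraph is: instead of a per-speaker mean computed offline, each frame is normalized by the mean of a sliding window of recent frames (600 frames, about 6 seconds, by default). Here is a minimal Python sketch of that idea, just to illustrate the concept; it is not Kaldi's actual implementation (which also blends in speaker and global statistics at the start of an utterance, controlled by speaker-frames and global-frames):

```python
def moving_window_cmn(frames, window=600):
    """Moving-window cepstral mean normalization (sketch).

    frames: list of cepstral vectors (each a list of floats), one per frame.
    window: number of recent frames to average over (Kaldi's --cmn-window
            defaults to 600, i.e. ~6 seconds at a 10 ms frame shift).
    Returns a new list of frames with the windowed mean subtracted.
    """
    out = []
    for t, frame in enumerate(frames):
        # Mean over the most recent `window` frames, including this one.
        start = max(0, t + 1 - window)
        context = frames[start:t + 1]
        dim = len(frame)
        mean = [sum(f[d] for f in context) / len(context) for d in range(dim)]
        out.append([frame[d] - mean[d] for d in range(dim)])
    return out

# Tiny 1-dimensional example with a 2-frame window:
feats = [[1.0], [3.0], [5.0]]
print(moving_window_cmn(feats, window=2))  # [[0.0], [1.0], [1.0]]
```

Note this needs no speaker information at test time, which is why it works when all you have is the audio; the real decoder-side computation lives in OnlineCmvn, and apply-cmvn-online applies the same transform to training features.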
Please tell me how to do this, and if there is an example, please give me the link.
Thanks.