Hi all,
I want to understand how the fMLLR transformation matrix is estimated. For comparison, consider MAP adaptation: we train the model parameters on the training data and then, to adapt to a new speaker, we use some portion of the test data (say 25%) to re-estimate the model parameters (i.e. weights, means and covariances).
I want to know which of the following is correct:
1. fMLLR is a linear transformation of the features; we estimate the transformation matrix from the training data and apply it unchanged to the test data, or
2. We estimate the transformation matrix from the training data and then use some portion of the test data to re-estimate it; the re-estimated matrix is then used for testing. (This looks similar to MAP adaptation.)
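
To be clear about what I mean by a linear transformation of features: as I understand it, fMLLR maps each feature vector x to A x + b, with one (A, b) pair per speaker. Below is a minimal Python sketch of only that application step, not of how A and b are estimated (which is what I am asking about). The function names and the toy numbers are my own, for illustration only, and are not taken from Kaldi.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_affine_transform(A: Matrix, b: Vector, x: Vector) -> Vector:
    """Return A x + b for a single feature frame x."""
    if len(A) != len(b) or any(len(row) != len(x) for row in A):
        raise ValueError("shape mismatch between A, b and x")
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    ]


def transform_utterance(A: Matrix, b: Vector, frames: List[Vector]) -> List[Vector]:
    """Apply the same per-speaker transform (A, b) to every frame."""
    return [apply_affine_transform(A, b, frame) for frame in frames]


# Toy 2-dimensional example (made-up values, not real features).
A = [[2.0, 0.0],
     [0.0, 0.5]]
b = [1.0, -1.0]
frames = [[1.0, 2.0],
          [0.0, 4.0]]

adapted = transform_utterance(A, b, frames)
print(adapted)
```

My question is only about which data A and b are estimated from at test time.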
(tri3b)
When we do LDA+MLLT+SAT training, we estimate the fMLLR transformation matrix from the training data using train_sat.sh.
Then, when I run decode_fmllr.sh, is any test speaker data (e.g. 25% of the test data) used to re-estimate the transformation matrix, or is the transformation matrix from the training data simply reused?
I went through the script and, as far as I can tell, nowhere do I specify how much of the test data should be used for adaptation.
I want to know what happens when I run:
steps/train_sat.sh
steps/decode_fmllr.sh
steps/align_fmllr.sh
-Suhas