Hello All,
I am having trouble understanding SAT and fMLLR. From the discussions in this topic so far, I know that "the practice is to estimate fMLLR matrices for the individual training speakers and train on the adapted features; this is called Speaker Adapted Training (SAT)". However, I still do not understand how the fMLLR matrices estimated for the individual training speakers are useful when testing on the test speakers.
Don't we need to adapt the SAT models using a portion of each test speaker's data? I do not understand how these speaker-dependent models are used for testing, since there is no single speaker-independent (SI) model to serve as a reference for adaptation. If adaptation is needed, which script does it, and how much data does it use?
More generally, I want to know what happens when I run:
steps/train_sat.sh
steps/decode_fmllr.sh
steps/align_fmllr.sh
I realize this question stems from gaps in my theoretical understanding of these topics. Any help in understanding the concept would be appreciated.
With regards,
Subash