Pronunciation dictionaries usually contain some words with alternative pronunciations, for example “present” or “read”:
PRESENT P ER0 Z EH1 N T
PRESENT P R EH1 Z AH0 N T
PRESENT P R IY0 Z EH1 N T
READ R EH1 D
READ R IY1 D
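To make the setup concrete, here is a minimal Python sketch that groups the entries above by word and lists the ones with more than one pronunciation. It is not MFA code and says nothing about how MFA selects a variant; `load_variants` and `LEXICON` are hypothetical names used only for illustration, and the sketch assumes the plain "WORD PHONE PHONE ..." format shown above.

```python
from collections import defaultdict

# The lexicon entries from the question, in "WORD PHONE PHONE ..." format.
LEXICON = """\
PRESENT P ER0 Z EH1 N T
PRESENT P R EH1 Z AH0 N T
PRESENT P R IY0 Z EH1 N T
READ R EH1 D
READ R IY1 D
"""


def load_variants(text):
    """Group pronunciations by word (hypothetical helper, not part of MFA)."""
    variants = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        variants[word].append(phones)
    return dict(variants)


variants = load_variants(LEXICON)

# Words with more than one pronunciation are the cases this question is about.
ambiguous = {w: p for w, p in variants.items() if len(p) > 1}
for word, prons in ambiguous.items():
    print(word, len(prons))
# prints:
# PRESENT 3
# READ 2
```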
When running the Montreal Forced Aligner (MFA) with a lexicon containing such cases, MFA picks one pronunciation per word occurrence from the dictionary. After running the aligner a few times, the choice appears to be neither random nor always the same.
So how does MFA choose one phonemization? Is there an additional model for this, or does MFA somehow pick the right pronunciation based on the acoustic model?