Thanks very much for the detailed description! I think I have a handle on how to implement this now.
I hadn't made the link to your efforts on multilingual training, but what you say about keeping cross-lingual homographs out of the word lists/lexicon FSTs for each speaker group makes perfect sense. My mistake about also unifying word IDs -- the corpus I'm using repeats a lot of prompts across speakers, which coincidentally produced identical word lists for my two speaker groups.
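For what it's worth, here's a minimal sketch of how I'm picturing the per-group word lists -- all names are my own illustration, not from any toolkit. The point is just that each group gets its own symbol table, so a cross-lingual homograph maps to independent word IDs per group instead of one shared ID:

```python
def build_symbol_table(words):
    """Assign consecutive integer IDs within one group's word list."""
    table = {"<eps>": 0}  # epsilon conventionally takes ID 0 in FSTs
    for w in sorted(set(words)):
        table[w] = len(table)
    return table

# Toy word lists for two speaker groups; "chat" and "pain" are
# homographs that exist in both (e.g. English vs. French).
english_words = ["chat", "pain", "hello"]
french_words = ["chat", "pain", "bonjour"]

eng_table = build_symbol_table(english_words)
fra_table = build_symbol_table(french_words)

# "chat" appears in both tables, but the IDs live in separate
# symbol tables, so the two lexicon FSTs never conflate the words.
print(eng_table["chat"], fra_table["chat"])
```

If I've understood correctly, the trap I fell into was noticing the tables happened to contain the same strings and concluding the IDs could be shared.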
It may be a while before I get round to experimenting with this properly, but I'll make sure to feed back anything that could make for useful guidance on model configuration :)
Thanks
Dan