Converting alignments between separate models


Joachim Fainberg

Aug 3, 2015, 9:21:19 AM
to kaldi-help
Hello,

Say I have two corpora, A and B. I'm trying to perform data augmentation by adding transformed speakers from B to the speakers in A. To do this I've trained a separate tri3b system on each corpus (with the same lang directory). Within a larger script I estimate fMLLR transforms from speakers in B to the speaker-dependent models of A speakers, using gmm-est-fmllr. However, because this involves two different models, I first need to convert the alignments from one model to the other using convert-ali. Finally, I combine the newly transformed B speakers with the original A data, align, and train a neural net.

What I've found is that if A and B are trained with the same parameters (numleaves and totgauss), this more or less works (-5% relative). However, after optimising the parameters for A on a held-out dev set and repeating the above, I get a huge number of warnings when aligning the combined set (1300 vs 100), and the resulting WER after neural net training is doubled. My suspicion is that this is because I'm abusing convert-ali; I'm not sure how it actually deals with the two models having different trees.

Am I using convert-ali wrongly? Would it be better to constrain the B models to the trees generated for A (or vice versa)? Is that possible?

Cheers,

Joachim

Joachim Fainberg

Aug 3, 2015, 10:04:54 AM
to kaldi-help
Sorry, I found the error. Because my new speakers' features are written to disk, I do the same with the unmodified A speakers, but I forgot to regenerate them for the new, optimised A model. Hence, the alignment step was using fMLLR features from a different model(!).

I'd still be interested to know whether using convert-ali in this way is appropriate. How does it convert between different trees?

Thank you so much.

Daniel Povey

Aug 3, 2015, 2:58:55 PM
to kaldi-help
convert-ali works by converting the alignments to phone sequences and getting the phonetic context from there, then using the other tree to convert each phone-in-context plus HMM-state to a pdf-id, and from that to a transition-id in the other model. It does require that the phone sets be matched (phones.txt identical, except for possibly more or fewer disambig symbols like #1, #2, #3); otherwise it would give nonsense results.
Dan
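
For readers who want to see the idea concretely, here is a minimal Python sketch of the conversion described above. It is a toy model, not Kaldi's implementation: ToyModel, the two tree functions, the "<eps>" boundary marker and the phone-boundary rule (a new phone starts when the phone changes or the HMM-state index drops) are all assumptions made up for illustration, and self-loop versus forward transitions are ignored. phone_table mimics the phones.txt check by dropping disambiguation symbols before comparing.

```python
PHONES = ["sil", "a", "b"]  # hypothetical phone inventory for the toy example


def phone_table(phones_txt_lines):
    """Parse phones.txt-style 'symbol id' lines, ignoring disambig symbols (#0, #1, ...)."""
    table = {}
    for line in phones_txt_lines:
        if not line.strip():
            continue
        sym, idx = line.split()
        if not sym.startswith("#"):
            table[sym] = int(idx)
    return table


class ToyModel:
    """Stand-in for a model + tree pair (not Kaldi's real data structures)."""

    def __init__(self, tree):
        self.tree = tree   # callable: (left, centre, right, hmm_state) -> pdf_id
        self._tid = {}     # (phone, hmm_state, pdf_id) -> transition-id
        self._info = []    # transition-id -> (phone, hmm_state, pdf_id)

    def transition_id(self, phone, state, pdf):
        key = (phone, state, pdf)
        if key not in self._tid:
            self._tid[key] = len(self._info)
            self._info.append(key)
        return self._tid[key]

    def info(self, tid):
        return self._info[tid]


def split_to_phones(model, ali):
    """Turn per-frame transition-ids into [(phone, [hmm_state per frame]), ...]."""
    segments, prev = [], None
    for tid in ali:
        phone, state, _ = model.info(tid)
        # Toy boundary rule: phone changes, or state index restarts.
        if prev is None or phone != prev[0] or state < prev[1]:
            segments.append((phone, []))
        segments[-1][1].append(state)
        prev = (phone, state)
    return segments


def align_from_segments(model, segments):
    """Look up each phone-in-context + HMM-state in the model's tree, frame by frame."""
    phones = [p for p, _ in segments]
    out = []
    for i, (phone, states) in enumerate(segments):
        left = phones[i - 1] if i > 0 else "<eps>"
        right = phones[i + 1] if i + 1 < len(phones) else "<eps>"
        for s in states:
            pdf = model.tree(left, phone, right, s)
            out.append(model.transition_id(phone, s, pdf))
    return out


def convert_ali(old, new, ali):
    """Old alignment -> phone sequence with states -> new tree -> new transition-ids."""
    return align_from_segments(new, split_to_phones(old, ali))


def tree_a(left, centre, right, state):
    # Context-independent toy tree: one pdf per (phone, state).
    return PHONES.index(centre) * 3 + state


def tree_b(left, centre, right, state):
    # Toy tree that additionally splits on whether the left context is silence/boundary.
    base = (PHONES.index(centre) * 3 + state) * 2
    return base + (1 if left in ("<eps>", "sil") else 0)
```

The point of the sketch is that the frame-level timing (which phone and which HMM state each frame belongs to) is carried over unchanged, while the pdf-ids are looked up afresh in the second tree. That is why trees with different numbers of leaves are not a problem in themselves, and also why a mismatched phone set would silently produce nonsense.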

