training LM using phone sequences


Hua wang

Aug 29, 2018, 9:04:32 AM8/29/18
to phnrec
Hi,

    I'm doing language recognition (LRE), and I already have some phone sequences decoded by the TIMIT phone recognizer, for example below (one line per utterance):

                   pau eh m er n z eh hh iy uw m uw hh ay iy z er n hh ih ng dh iy uw iy n v ow f ay n b iy ih dh iy n pau
                   pau pau s iy iy z ah v iy iy y ay iy ah v ow n dh ow z ay m ah n pau
                   pau m l iy n z ow l iy eh n t ey s eh n ah hh ey n sh er sh ah ow m n pau m ay y ae hh uw pau

    When training the LM models, I'm using the SRILM tools with no smoothing:

               srilmbin/ngram-count -order 3 -text MlfLang/${lang}.text -lm LangModel/gram_${lang}_3 -addsmooth 0

    But the EER is 25%, which is not good.
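
    For reference, switching from -addsmooth 0 to one of SRILM's standard discounting methods would only change that one flag, e.g. Witten-Bell (a sketch only, not something I have tried yet):

               srilmbin/ngram-count -order 3 -text MlfLang/${lang}.text -lm LangModel/gram_${lang}_3 -wbdiscount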
   
    So can someone give me any suggestions?

Thanks.
   
 

Petr Schwarz

Aug 29, 2018, 9:11:55 AM8/29/18
to phn...@googlegroups.com

Hi. Usually some n-grams are missing in some languages. You can train one global model on all languages and then adapt it to the particular languages, for example using language model interpolation. It helps a lot.

At least you should use only those n-grams that are correctly estimated for all languages in your scoring.
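
For example, with the SRILM tools you already use, the recipe could look roughly like this (the file names are placeholders, -wbdiscount is just one choice of smoothing, and the 0.5 interpolation weight should be tuned on held-out data):

    # global phone LM trained on the pooled training text of all languages
    srilmbin/ngram-count -order 3 -text MlfLang/all_languages.text -wbdiscount -lm LangModel/gram_global_3

    # per-language phone LM
    srilmbin/ngram-count -order 3 -text MlfLang/${lang}.text -wbdiscount -lm LangModel/gram_${lang}_3

    # statically interpolate the two; -lambda is the weight of the first (-lm) model
    srilmbin/ngram -order 3 -lm LangModel/gram_${lang}_3 -mix-lm LangModel/gram_global_3 -lambda 0.5 -write-lm LangModel/gram_${lang}_3_adapted

Scoring stays the same afterwards, e.g. ngram -lm LangModel/gram_${lang}_3_adapted -ppl test_utt.text per utterance, picking the language whose model gives the lowest perplexity.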

 

Petr   


Hua wang

Aug 31, 2018, 5:41:35 AM8/31/18
to phnrec
Hi Petr,

       Using your suggestion, the results really improved. Thanks.


On Wednesday, August 29, 2018 at 9:04:32 PM UTC+8, Hua wang wrote: