Adapting Language Model (G.fst)


Saman Mousazadeh

Jul 17, 2019, 7:46:57 AM
to kaldi-help
Hi all,
I want to adapt a simple language model (e.g. a 4-gram in ARPA format, estimated using pocolm).

In my scenario we have some "evergreen" data, such as "United States of America" (a 4-gram), and some trending data, such as "president Obama" (much lower probability than 4 years ago) and "president Trump" (much higher probability than 4 years ago).

Let's say I have learned the evergreen part and have a G.fst built from Wikipedia (or something else; any suggestion is truly welcome), which is a lot of data.
I can use Google News (or something else; any suggestion is truly welcome) to find what people are talking about right now, which is only a little data.
How can I update the G.fst?

Daniel Povey

Jul 17, 2019, 3:04:01 PM
to kaldi-help
You wouldn't attempt to combine at the G.fst level; you would combine at the ARPA level. If you run
git grep -w interpolate
you'll find some example scripts that use SRILM to interpolate ARPA-format models.
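
For readers unfamiliar with interpolation, here is a minimal Python sketch of the idea behind combining two models at the ARPA level: each n-gram's probability becomes a weighted sum of its probabilities under the two models. The n-gram values, the dictionary names, and the weight `lam` are all made up for illustration. This is not what SRILM does internally; a real tool also has to handle backoff weights and n-grams that appear in only one model.

```python
import math

def interpolate_log10(logp_general, logp_recent, lam):
    """Linearly interpolate two log10 probabilities.

    lam is the weight on the general ("evergreen") model and
    (1 - lam) is the weight on the recent ("trending") model.
    """
    p = lam * 10.0 ** logp_general + (1.0 - lam) * 10.0 ** logp_recent
    return math.log10(p)

# Hypothetical log10 probabilities (ARPA files store log10 values)
# for the same n-grams under each model.
general = {"president obama": -2.0, "president trump": -4.0}
recent = {"president obama": -4.0, "president trump": -2.0}

# Equal weighting, chosen arbitrarily for the illustration.
mixed = {
    ngram: interpolate_log10(general[ngram], recent[ngram], lam=0.5)
    for ngram in general
}

for ngram, logp in mixed.items():
    print(f"{ngram}\t{logp:.4f}")
```

In practice the weight would normally be tuned on held-out text that resembles what you expect to recognize, and the interpolated ARPA file would then be compiled into a new G.fst.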

