Adapting Language Model (G.fst)

Saman Mousazadeh

Jul 17, 2019, 7:46:57 AM

to kaldi-help

Hi all,

I want to adapt a simple language model (e.g. a 4-gram in ARPA format, estimated with pocolm).

In my scenario we have some "evergreen" data, such as "United States of America" (a 4-gram), and some trending data, such as "president Obama" (much lower probability than 4 years ago) and "president Trump" (much higher probability than 4 years ago).

Let's say I have learned the evergreen part and have a G.fst built from Wikipedia (or some other source; any suggestion is welcome). That is a lot of data.

I can use Google News (or some other source; any suggestion is welcome) to find out what people are talking about right now. That is only a little data.

How can I update the G.fst?

Daniel Povey

Jul 17, 2019, 3:04:01 PM

to kaldi-help

You wouldn't attempt to combine at the G.fst level; you would combine at the ARPA level. If you run

git grep -w interpolate

you'll find some example scripts that use SRILM to interpolate ARPA-format models.
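
For readers unfamiliar with interpolation, here is a minimal Python sketch of what linear interpolation does to a single n-gram probability. It is only an illustration, not what SRILM or the Kaldi scripts actually run: the function name, the weight `lam`, and the probability values are all made up for the example, and a real toolkit also has to handle backoff weights and n-grams that appear in only one model, which this sketch ignores. ARPA files store log10 probabilities, so the sketch converts to linear probabilities, mixes them, and converts back.

```python
import math


def interpolate_log10(logp_general, logp_recent, lam):
    """Linearly interpolate two log10 probabilities.

    lam is the weight on the general (large-corpus) model;
    (1 - lam) goes to the recent (trending) model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    # Mix in the linear domain, then return to log10 as stored in ARPA files.
    p = lam * 10.0 ** logp_general + (1.0 - lam) * 10.0 ** logp_recent
    return math.log10(p)


# Hypothetical values for P(word | "president"); not taken from any real model.
general = {"obama": math.log10(0.30), "trump": math.log10(0.05)}
recent = {"obama": math.log10(0.05), "trump": math.log10(0.40)}

# Weight 0.7 on the large general model, 0.3 on the small recent one.
mixed = {w: interpolate_log10(general[w], recent[w], lam=0.7) for w in general}

for word, logp in sorted(mixed.items()):
    print(f"{word}\t{10 ** logp:.3f}")
```

The weight is a tuning choice: a larger `lam` trusts the evergreen data more, a smaller one lets the trending data dominate. Once an interpolated ARPA file exists, G.fst would be rebuilt from it the same way it was built from the original ARPA file.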
