How do I build a language model for Tamil in Kaldi?


Kaushik Ramachandran

May 19, 2016, 3:15:39 AM
to kaldi-help
I'm trying to use Kaldi to build a speech recognition system for Tamil. Can anyone tell me the procedure for building a language model for a language such as Tamil? I have already built a phonetic dictionary for word-to-phone conversion.

Sunit Sivasankaran

May 19, 2016, 4:39:43 AM
to kaldi...@googlegroups.com
Given enough text data and computational resources, you can build an LM using the SRILM toolkit. An example script can be found in egs/babel; see egs/babel/s5/local/train_lms_srilm.sh

-Regards,
Sunit
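
Before running SRILM, it can help to check how well the lexicon covers the training text. The following is a minimal Python sketch, not taken from the babel recipe: it assumes a Kaldi-style lexicon (one `word phone1 phone2 ...` entry per line) and a plain-text corpus with one sentence per line. The helper names and file paths are hypothetical. The last function only assembles an argument list for SRILM's `ngram-count` with commonly used options; it does not run the tool.

```python
def vocab_from_lexicon(lexicon_lines):
    """Collect the word column of a Kaldi-style lexicon (word phone1 phone2 ...)."""
    vocab = set()
    for line in lexicon_lines:
        fields = line.split()
        if fields:  # skip empty lines
            vocab.add(fields[0])
    return vocab


def oov_rate(sentences, vocab):
    """Fraction of corpus tokens that have no entry in the lexicon."""
    tokens = [tok for sentence in sentences for tok in sentence.split()]
    if not tokens:
        return 0.0
    return sum(tok not in vocab for tok in tokens) / len(tokens)


def ngram_count_command(text_path, vocab_path, lm_path, order=3):
    """Argument list for SRILM's ngram-count (paths here are placeholders).

    -vocab limits the LM to the lexicon words, -unk maps everything else to
    <unk>, and -kndiscount -interpolate selects interpolated Kneser-Ney.
    """
    return [
        "ngram-count",
        "-order", str(order),
        "-text", text_path,
        "-vocab", vocab_path,
        "-unk",
        "-kndiscount", "-interpolate",
        "-lm", lm_path,
    ]
```

If the out-of-vocabulary rate is high, it is usually worth extending the lexicon before training, since every uncovered word ends up as `<unk>` in the model. The returned list can be passed to `subprocess.run` once SRILM is installed and on the `PATH`.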

akash kumawat

Jun 2, 2016, 11:06:44 PM
to kaldi-help
Can you tell me how to build a phonetic dictionary? Is it an editable file that we can edit directly to make the dictionary?

Kaushik Ramachandran

Jun 3, 2016, 12:51:16 AM
to kaldi-help
A phonetic dictionary is less an editable file than a program that maps each word to its phoneme expansion. You can build one in any scripting language, such as Python, by writing your own rules that define the morphology of your language.
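
As a rough illustration of such a rule-based approach, here is a minimal Python sketch that expands a Tamil word into phones and prints a line in the `word phone1 phone2 ...` layout that Kaldi lexicons use. The letter tables are deliberately partial and the phone symbols (`k`, `zh`, `aa`, ...) are assumptions chosen for this example, not a standard Tamil phone set. Context-dependent rules, such as the voicing of stops, are left out.

```python
import unicodedata

# Partial, illustrative tables; a real lexicon needs the full letter inventory.
CONSONANTS = {
    "\u0b95": "k",   # TAMIL LETTER KA
    "\u0ba4": "t",   # TAMIL LETTER TA
    "\u0baa": "p",   # TAMIL LETTER PA
    "\u0bae": "m",   # TAMIL LETTER MA
    "\u0bb2": "l",   # TAMIL LETTER LA
    "\u0bb4": "zh",  # TAMIL LETTER LLLA
}
INDEPENDENT_VOWELS = {
    "\u0b85": "a",   # TAMIL LETTER A
    "\u0b86": "aa",  # TAMIL LETTER AA
    "\u0b87": "i",   # TAMIL LETTER I
}
VOWEL_SIGNS = {
    "\u0bbe": "aa",  # TAMIL VOWEL SIGN AA
    "\u0bbf": "i",   # TAMIL VOWEL SIGN I
    "\u0bc1": "u",   # TAMIL VOWEL SIGN U
}
PULLI = "\u0bcd"     # TAMIL SIGN VIRAMA: suppresses the inherent vowel


def word_to_phones(word):
    """Expand one word into phones using the simple tables above."""
    word = unicodedata.normalize("NFC", word)
    phones = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
            if nxt == PULLI:            # bare consonant, no vowel follows
                i += 2
            elif nxt in VOWEL_SIGNS:    # vowel sign replaces the inherent 'a'
                phones.append(VOWEL_SIGNS[nxt])
                i += 2
            else:                       # inherent vowel 'a'
                phones.append("a")
                i += 1
        elif ch in INDEPENDENT_VOWELS:
            phones.append(INDEPENDENT_VOWELS[ch])
            i += 1
        else:
            raise ValueError(f"no rule for character U+{ord(ch):04X}")
    return phones


def lexicon_line(word):
    """Format one entry as 'word phone1 phone2 ...'."""
    return f"{word} {' '.join(word_to_phones(word))}"


if __name__ == "__main__":
    print(lexicon_line("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"))  # the word "Tamil"
```

Raising an error on unknown characters, rather than skipping them, makes gaps in the rule tables visible while the lexicon is being built.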