Phonetizer Free Download


Ellis Ruan

Jul 22, 2024, 9:39:04 AM
to stinocorfi

Hi Judy, You may, _phonetic_alphabet_article.pdf looks interesting.
I am wondering whether there is an online phonetizer for EPA, and whether I should also look into X-SAMPA and CXS to simplify the phonetic alphabet for SLA.

In this position paper, we argue for a role for overt phonetizers in second language learning. Phonetization, or letter-to-sound conversion, is often used simply as a module of text-to-speech synthesis (TTS) or to create pronunciations for dictionaries. Based on evidence of the overwhelming influence of orthographic input on second language pronunciation and word learning, we argue that on their own (or coupled with TTS), phonetizers can be effective support tools for two broad groups: (1) language learners and instructors, and (2) non-specialized users. We address the issues involved and give the example of a multilingual phonetizer under development in New Caledonia, a special-status collectivity of France in the South Pacific. Encountering words and names in one of the almost 30 languages of the indigenous Kanak people of New Caledonia is an everyday experience, for example on class lists, on road signs and in news articles. Pronouncing these words is often a challenge, since each of the languages has its own phonology and its own orthography. We discuss the motivation behind the phonetizer, challenges in its development, and potential applications, many of which are common to other endangered, vulnerable or under-resourced languages.
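To make "letter-to-sound conversion" concrete, here is a minimal sketch of a rule-based phonetizer that uses greedy longest-match over a grapheme table. It is not the New Caledonian phonetizer from the abstract: the rule table `TOY_RULES` and the function name `phonetize` are made up for illustration and do not describe the orthography of any real language.

```python
# Toy grapheme-to-phoneme table. ASSUMPTION: these rules are invented for
# illustration only and are not the orthography of any real language.
TOY_RULES = {
    "ch": "ʃ",
    "ou": "u",
    "a": "a",
    "t": "t",
    "l": "l",
    "p": "p",
}


def phonetize(word, rules=TOY_RULES):
    """Convert a word to a phonetic string by greedy longest-match."""
    word = word.lower()
    max_len = max(len(grapheme) for grapheme in rules)
    output = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first so "ch" wins over a single letter.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                output.append(rules[chunk])
                i += size
                break
        else:
            # Unknown letter: mark it visibly instead of guessing a sound.
            output.append("?")
            i += 1
    return "".join(output)


print(phonetize("chou"))
print(phonetize("Patou"))
```

Because each language has its own orthography, a multilingual tool along these lines would presumably need one rule table per language; real systems also have to handle context-dependent rules, which this sketch ignores.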



Download Zip: https://bytlly.com/2zDLRH



We present an approach to correcting noisy User Generated Content (UGC) in French, aiming to produce a preprocessing pipeline that improves Machine Translation for this kind of non-canonical corpus. To do so, we have implemented a character-based neural phonetizer that produces IPA pronunciations of words. In this way, we intend to correct the grammar, vocabulary and accentuation errors often present in noisy UGC corpora. Our method leverages the fact that some errors are due to confusion between words with similar pronunciation, which can be corrected by using a phonetic look-up table to produce normalization candidates. These potential corrections are then encoded in a lattice and ranked using a language model to output the most probable corrected phrase. Compared to using other phonetizers, our method boosts the performance of a transformer-based machine translation system on UGC.
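The pipeline described above (phonetic look-up, candidate lattice, language-model ranking) can be sketched roughly as follows. This is only an illustration under strong assumptions: a small hand-written pronunciation dictionary stands in for the neural phonetizer, a toy bigram table stands in for the language model, and all names and scores (`PRONUNCIATIONS`, `BIGRAM_LOGPROB`, `normalize`, and so on) are hypothetical rather than taken from the paper.

```python
from itertools import product

# ASSUMPTION: toy pronunciation dictionary standing in for the neural
# phonetizer. Words with the same value are treated as homophones.
PRONUNCIATIONS = {"sa": "sa", "ça": "sa", "a": "a", "à": "a", "va": "va"}

# ASSUMPTION: toy bigram log-probabilities standing in for a real language
# model. The numbers are invented for illustration.
BIGRAM_LOGPROB = {
    ("<s>", "ça"): -1.0,
    ("<s>", "sa"): -2.0,
    ("ça", "va"): -0.5,
    ("sa", "va"): -6.0,
}
UNSEEN_LOGPROB = -10.0


def build_lookup(pronunciations):
    """Invert word -> pronunciation into pronunciation -> list of words."""
    table = {}
    for word, pron in pronunciations.items():
        table.setdefault(pron, []).append(word)
    return table


LOOKUP = build_lookup(PRONUNCIATIONS)


def candidates(token):
    """Return all known words that sound like the token (or the token itself)."""
    pron = PRONUNCIATIONS.get(token)
    if pron is None:
        return [token]
    return LOOKUP[pron]


def score(path):
    """Sum bigram log-probabilities along one path through the lattice."""
    total = 0.0
    previous = "<s>"
    for word in path:
        total += BIGRAM_LOGPROB.get((previous, word), UNSEEN_LOGPROB)
        previous = word
    return total


def normalize(sentence):
    """Build a linear lattice of candidates and keep the best-scoring path."""
    lattice = [candidates(tok) for tok in sentence.lower().split()]
    best = max(product(*lattice), key=score)
    return " ".join(best)


print(normalize("sa va"))
```

Enumerating every path with `itertools.product` is only workable for short phrases; a realistic implementation would more likely use dynamic programming (for example Viterbi search) over the lattice.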

The complementary material includes specifications for creating transcripts using the phonetic alphabet Mexbet and for converting Mexbet output to the International Phonetic Alphabet and X-SAMPA. An automatic phonetizer for Mexbet, written in Python 2.7, for creating pronouncing dictionaries is provided as well.
