Can I use my own dictionary without building a dictionary?


Shantanu Nath

Feb 11, 2020, 2:01:52 AM
to Nematus Support
Dear Sir,

I am trying to use Nematus for English-to-Bangla machine translation, but I don't have a dictionary. Can I use my own dictionary, or do I have to use build_dictionary.py?
If I use it, what should the parameter be? I mean, do I have to pass it the corpus?

Philip Williams

Feb 11, 2020, 5:15:51 AM
to Shantanu Nath, Nematus Support
Hi,

build_dictionary.py is used to create the JSON files that specify the model's vocabulary. Typically, you would give it the preprocessed version of your corpus file(s) as an argument. If you want your model to use separate source and target vocabularies, you run it twice, once with the source corpus and once with the target corpus. If you want a shared vocabulary, you can concatenate the source and target corpora and run it on the combined file. See this script for an example of how preprocessing can be done using scripts from the Moses toolkit and subword-nmt - the build_dictionary.py step comes right at the end:
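
Independently of that example script, here is a minimal Python sketch of what a dictionary-building step of this kind does, in case it helps to see the idea: count the tokens in a preprocessed corpus and write a token-to-id JSON file. It is not the actual build_dictionary.py. The function names, the reserved entries (`<EOS>`, `<GO>`, `<UNK>`) and their ids are assumptions for illustration, so check them against the build_dictionary.py in your Nematus version before relying on a hand-made file.

```python
import json
from collections import Counter


def build_vocab(lines, specials=("<EOS>", "<GO>", "<UNK>")):
    """Map each whitespace-separated token to an integer id, most frequent first.

    The `specials` tuple is an assumption for illustration; the reserved
    entries expected by a given Nematus version may differ.
    """
    counts = Counter(tok for line in lines for tok in line.split())
    # Descending frequency, ties broken alphabetically so the output is deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok in ordered:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def write_vocab(corpus_path, json_path):
    """Read a preprocessed corpus file and write its vocabulary as JSON."""
    with open(corpus_path, encoding="utf-8") as f:
        vocab = build_vocab(f)
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-Latin text such as Bangla readable in the file.
        json.dump(vocab, f, ensure_ascii=False, indent=2)
    return vocab
```

With hypothetical file names, separate vocabularies would be two calls, e.g. `write_vocab("train.bpe.en", "train.bpe.en.json")` and `write_vocab("train.bpe.bn", "train.bpe.bn.json")`. For a shared vocabulary, concatenate the two corpus files first and call it once on the combined file.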


Best wishes,
Phil


