You may want to look for existing, closely related work done on Arabic, Mandarin, Kanji, Hindi, Telugu and similar languages, which could be adapted or used to jump-start yours.
Some statistical machine translation systems, such as Moses, use ICU and Unicode mappings, but there is nothing stopping you from extending ANTLR to handle international languages.
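To give a rough idea of what working with Unicode mappings can look like, here is a minimal sketch using only Python's standard `unicodedata` module. It is not how Moses, ICU or ANTLR do it; the function names `script_of` and `tokenize` are made up for illustration, and guessing the script from the first word of the Unicode character name is a simplifying assumption, not a real script database lookup.

```python
import unicodedata

def script_of(ch):
    """Rough script label: first word of the Unicode character name (an approximation)."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None

def tokenize(text):
    """Split text into (script, word) pairs, breaking on non-letters and script changes."""
    text = unicodedata.normalize("NFC", text)  # canonical composition first
    tokens = []
    current, current_script = "", None
    for ch in text:
        # Combining marks (categories Mn, Mc, Me) stay attached to the word in progress.
        if current and unicodedata.category(ch).startswith("M"):
            current += ch
            continue
        script = script_of(ch)
        if script is not None and script == current_script:
            current += ch
        else:
            if current:
                tokens.append((current_script, current))
            # Start a new word on a letter; otherwise drop the separator.
            current, current_script = (ch, script) if script else ("", None)
    if current:
        tokens.append((current_script, current))
    return tokens

if __name__ == "__main__":
    for script, word in tokenize("abc مرحبا 中文 हिंदी తెలుగు"):
        print(script, word)
```

A lexer rule in a grammar would do the same kind of grouping, just driven by character classes instead of a hand-written loop.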
God bless!
Regards