Digital Classicist seminar: Applications of translation alignment in digital environments
Gabriel Bodard
Jun 22, 2021, 9:00:08 AM
to Antiquist
Digital Classicist London Seminar

Friday June 25, 2021, 17:00 (UK time/UTC+1)

Chiara Palladino (Furman) & Tariq Yousef (Leipzig)
We want to learn all languages!

Live at https://youtu.be/R2Ms6yAMZss

In this seminar, we will introduce the topic of translation technologies, with particular regard to text and translation alignment, one of the most important and complex tasks of NLP. Then, we will present Ugarit (http://ugarit.ialigner.com/),
a web-based tool for manual and automatic alignment of parallel corpora. Conceived as a Citizen Science project to collect training data for the implementation of statistical machine translation of Ancient Greek, Persian, and English, Ugarit has now become one of the most widely used digital environments for the manual alignment of texts in underrepresented and historical languages. Currently, Ugarit hosts corpora in 43 languages, and has been widely used in scholarly projects for the study of Armenian, Persian, Arabic, Ancient Greek, Latin, Portuguese, and Egyptian. It has also been successfully applied in language teaching to facilitate a direct approach to original texts through the scaffolding provided by systematic comparison with translations.

While most translation technologies are limited to modern, widely indexed languages like English, Ugarit introduces a new way of working with languages that is based on manual alignment between parallel texts: with the systematic support of translations in a known language, users can create datasets of aligned pairs to support their own language learning, to study the reception of a particular text, or to provide alignments for other readers. Moreover, users contribute training data for the implementation of statistical machine translation for underrepresented languages, which has been tested for Persian and Ancient Greek. The database of Ugarit can also be visualized and queried to investigate relationships across languages that have not been directly aligned: using the underlying graph database, we can visualize connections between words in two different languages through a third language with which both have been aligned, which serves as a bridge, and investigate broader phenomena such as word frequency across languages and common tendencies in translations.
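
For readers unfamiliar with the idea of a bridge language, here is a minimal sketch in Python. It is not Ugarit's implementation or data model: the word pairs, the function names (index_by_pivot, bridge) and the choice of English as the bridge are all assumptions made for illustration, and a plain dictionary stands in for the graph database. It only shows how two languages that were never aligned with each other can be connected through alignments each of them shares with a third.

```python
from collections import defaultdict

# Hypothetical toy data (not taken from Ugarit): manually aligned word
# pairs of the form (source word, English word), English being the bridge.
GREEK_ENGLISH = [("λόγος", "word"), ("λόγος", "speech"), ("θεός", "god")]
LATIN_ENGLISH = [("verbum", "word"), ("sermo", "speech"), ("deus", "god")]


def index_by_pivot(pairs):
    """Map each bridge-language word to the set of words aligned with it."""
    index = defaultdict(set)
    for word, pivot in pairs:
        index[pivot].add(word)
    return index


def bridge(pairs_a, pairs_b):
    """Link words of languages A and B that share a bridge-language word.

    Returns a dict {(word_a, word_b): sorted list of shared bridge words}.
    """
    by_pivot_a = index_by_pivot(pairs_a)
    by_pivot_b = index_by_pivot(pairs_b)
    links = defaultdict(set)
    # Only bridge words aligned on both sides can connect the two languages.
    for pivot in by_pivot_a.keys() & by_pivot_b.keys():
        for word_a in by_pivot_a[pivot]:
            for word_b in by_pivot_b[pivot]:
                links[(word_a, word_b)].add(pivot)
    return {pair: sorted(pivots) for pair, pivots in links.items()}


links = bridge(GREEK_ENGLISH, LATIN_ENGLISH)
for (greek, latin), pivots in sorted(links.items()):
    print(greek, "<->", latin, "via", ", ".join(pivots))
```

In this toy example λόγος is linked to both verbum and sermo, each through a different English word, which is the kind of indirect connection the abstract describes. Counting how many bridge words support each link would be one simple way to rank such connections.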
ALL WELCOME
==
Dr Gabriel BODARD (he/him)
Reader in Digital Classics
Institute of Classical Studies / Digital Humanities Research Hub
University of London
Senate House
Malet Street
London WC1E 7HU