Digital Classicist seminar: Applications of translation alignment in digital environments


Gabriel Bodard


Jun 22, 2021, 9:00:08 AM

to Antiquist

Digital Classicist London Seminar
Friday June 25, 2021, 17:00 (UK time/UTC+1)
Chiara Palladino (Furman) & Tariq Yousef (Leipzig)
We want to learn all languages!
Live at https://youtu.be/R2Ms6yAMZss
In this seminar, we will introduce the topic of translation technologies, with particular regard to text and translation alignment, one of the most important and complex tasks of NLP. Then, we will present Ugarit (http://ugarit.ialigner.com/), a web-based tool for manual and automatic alignment of parallel corpora. Conceived as a Citizen Science project to collect training data for the implementation of statistical machine translation of Ancient Greek, Persian, and English, Ugarit has now become one of the most used digital environments for the manual alignment of texts in underrepresented and historical languages. Currently, Ugarit hosts corpora in 43 languages, and has been widely used in scholarly projects for the study of Armenian, Persian, Arabic, Ancient Greek, Latin, Portuguese, and Egyptian. It has also been successfully applied in language teaching to facilitate a direct approach to original texts through the scaffolding provided by the systematic comparison with translations.
While most translation technologies are limited to the coverage of modern, widely indexed languages like English, Ugarit introduces a new way of working with languages that is based on manual alignment between parallel texts: with the systematic support of translations in a known language, users can create datasets of aligned pairs to support their own language learning, study the reception of a particular text, or provide alignments for other readers. Moreover, users contribute training data for the implementation of statistical machine translation for underrepresented languages, which has been tested for Persian and Ancient Greek. The database of Ugarit can also be visualized and queried to investigate relationships across languages that have not been directly aligned: using the underlying graph database, we can visualize connections between words in two different languages through a third, bridge language with which both have been aligned, and investigate broader phenomena such as word frequency across languages and common tendencies in translations.
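For readers unfamiliar with the bridge-language idea, here is a minimal sketch of how it might work. It is not Ugarit's implementation: the announcement does not describe the graph database schema or query interface, so plain Python dictionaries stand in for the graph. The function name `bridge_pairs` and the transliterated word pairs are hypothetical toy data, not drawn from Ugarit.

```python
from collections import defaultdict


def bridge_pairs(src_to_pivot, tgt_to_pivot):
    """Link source and target words that were aligned to the same pivot word.

    Both arguments are iterables of (word, pivot_word) alignment pairs.
    Returns a dict mapping (source_word, target_word) to the set of pivot
    words that connect them.
    """
    # Index the target-side alignments by pivot word.
    pivot_to_tgt = defaultdict(set)
    for tgt_word, pivot_word in tgt_to_pivot:
        pivot_to_tgt[pivot_word].add(tgt_word)

    # Walk the source-side alignments and follow each pivot word across.
    links = defaultdict(set)
    for src_word, pivot_word in src_to_pivot:
        for tgt_word in pivot_to_tgt.get(pivot_word, ()):
            links[(src_word, tgt_word)].add(pivot_word)
    return dict(links)


# Toy, made-up alignments (transliterated), with English as the bridge.
greek_english = [("logos", "word"), ("logos", "speech"), ("theos", "god")]
persian_english = [("sokhan", "word"), ("sokhan", "speech"), ("khoda", "god")]

for (greek, persian), via in sorted(bridge_pairs(greek_english, persian_english).items()):
    print(greek, "<->", persian, "via", sorted(via))
```

The number of shared pivot words per pair gives a rough indication of how strongly two words are connected. A real system would presumably also weight links by alignment frequency.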

ALL WELCOME

==
Dr Gabriel BODARD (he/him)
Reader in Digital Classics

Institute of Classical Studies / Digital Humanities Research Hub

University of London
Senate House
Malet Street
London WC1E 7HU

E: Gabriel...@sas.ac.uk
T: +44 (0)20 78628752

Especially at the moment, I may email at odd hours of the day and night/days of the week. I do not ever expect a reply outside of your working hours.
