Machine-Readable Texts for Egyptologists (seminar)

Valeria Vitale

Jul 9, 2021, 10:26:04 AM

to anti...@googlegroups.com

This afternoon's Digital Classicist London seminar is streamed live at https://youtu.be/K-y2MO_WWZc

Heidi Jauhiainen (University of Helsinki), Machine-Readable Texts for Egyptologists

Friday July 9, 2021, 17:00 (UK time/UTC+1)

To study texts with digital methods, one needs them in machine-readable form. Assyriology has freely downloadable corpora of machine-readable texts, such as the Open Richly Annotated Cuneiform Corpus, but the lack of similar corpora hinders the digital study of ancient Egyptian texts. A transliterated text in digital format, for example as a text or TEI file, is machine-readable. Producing transliterated texts manually is time-consuming, and hence there have been experiments in producing them automatically. However, automated transliteration in turn requires machine-readable hieroglyphic texts as input. There is a tradition in Egyptology of using encoding to represent hieroglyphic texts so that information on the signs themselves and their positions relative to one another is preserved. Various types of encoding have been used when publishing texts in books, but those machine-readable texts are not openly available. Such encoded texts could be produced by OCRing hieroglyphic texts, but this approach requires a large number of texts in the same handwriting to train the method.

In this paper, I present Machine-Readable Texts for Egyptologists, a three-year project that started at the beginning of 2021. The aim is to produce a large number of manually encoded hieroglyphic texts and then to develop an iterative process and methods for automatically transliterating them. During the process, the automatically transliterated texts will be validated and, if necessary, corrected, and then used to make the method more accurate. Both the encoded texts and their transliterations will eventually be offered for free download.