Linked Open Data People: Data Gathering Sprint


Jun Ogawa

Jun 25, 2025, 11:45:39 AM
to Ancient People
Dear colleagues,

On Tuesday July 29 at 12:00 BST, the Pelagios Network’s People Activity will run a Sample Data Gathering Sprint. All are welcome to join us in looking at examples of prosopographical, onomastic or other person-related linked open data, and ingesting a small sample of various datasets into the LOD-People repository for comparison and experimentation moving forward.


We particularly welcome you to attend if you have person-data in your project, or have worked with other openly licensed person-data that you could share with us. We also have some suggested sources of data that individual participants can help track down. This data will be used purely for discussing formats and standards, and for trying out any interchange format proposed in the future. The session will last one hour and be held via Zoom.


Attached for your reference is a list of datasets and databases we have collected so far.


If you are not able to join in person, please send any contributions, suggestions, or sample datasets to the Ancient-People discussion list.


To register for the sprint, please go to <https://ics.sas.ac.uk/events/linked-open-data-people-data-gathering-sprint> and fill out the booking form. You will be sent the Zoom link by email.


Best regards,


Jun and Gabby


Margherita Fantoli

Jul 3, 2025, 12:07:43 PM
to ancient...@googlegroups.com
Dear all,
I am writing to this mailing list to get some feedback and start a conversation on several points that my colleagues and I have been discussing in recent months. I am an assistant professor of Digital Humanities, and a classicist in my soul, and hence I carry out or supervise some projects where computational methods are applied to ancient texts. In one in particular, we want to advance the state of automatic named entity disambiguation when it comes to people mentioned in Ancient Greek and Latin non-documentary texts: our corpus can be considered a ‘literary’ one, in which we also include scientific literature. We do not focus on ‘fictional’ texts such as novels, but the annotation does include mythological entities (one of the most discussed choices in past meetings).
What we needed was a “knowledge base” of people against which we could match the mentions detected in our corpus. The time coverage is pretty wide (from Homer to the 4th century CE), and ideally the same resource would work for Latin and Ancient Greek sources, because we could not take on the very long work of mapping overlaps between several resources. We work in close collaboration with the Trismegistos team, which pointed out the availability of the Pauly-Wissowa on Wikisource, which is, in our eyes, a goldmine: it is quite comprehensive; the list of all the entries is available (Register); it has a disambiguation system for the entries (name + number); it provides, when in the public domain, a textual description of the people; and, when the article cannot be published yet, we still have a very short description (the Kurztext). It can be found here: https://de.wikisource.org/wiki/Paulys_Realencyclop%C3%A4die_der_classischen_Altertumswissenschaft
The TM team worked on isolating people within the RE and interlinked the resource to their own database (WIP for certain aspects, but a tool for querying the real and for downloading the list of all identifiers of people can be found here: https://www.trismegistos.org/real/). That said, we are also confronted with some limitations:
  • Not all the cultural spaces are equally well covered, for instance Christianity is not as complete as we could hope for
  • The Kurztext is really kurz and for computational purposes it doesn’t always help
  • The text is in German, which is only a minor issue, but many models we deal with are mostly tailored to English
  • There is no structured information attached to the people, nor systematic interlinking between the entries
The team behind the Wikisource Pauly-Wissowa has also occasionally worked on linking the entries to the Wikidata records of their subjects. This is mainly done for the already published entries (and can be done very effectively by using a Wikidata property), but is also possible in principle for those that are only listed in the Register (though with more tweaking). I have systematically checked one letter (B) and came to the conclusion that roughly half of the RE people also have a Wikidata entry. The current mapping, however, covers far fewer and is unevenly distributed: it is much better for the first letters of the alphabet, where more articles can be fully published because their copyright has expired.
At the other end, in collaboration with Camillo Pellizzari di San Girolamo, who, I should stress, has done most of the work, we developed a Wikidata query to try to spot ‘ancient’ people on Wikidata, using different strategies (identifiers in databases, prosopographic information, etc.). We land on around 30,000 people, with some noise of course. However, in the process we noticed that the completeness of the results could be enhanced by minor additions to the Wikidata entries.
In short, thinking of our concrete needs I wanted to ask the following questions:
  • Are there projects/people who are currently working on ‘Wikidata people’ or linking their records to Wikidata? (I know of some, but would love to hear of more.)
  • Does anyone have experience with using selected entries of Wikidata as a reference knowledge base for further processing, and would be willing to share it? I am especially concerned about the instability of the data, the fact that there isn’t a single “property” that brings together all the relevant records, etc. The handling of uncertain identifications is also a question mark.
  • Would anyone be interested in joining forces for the integration of the people of the Pauly-Wissowa into Wikidata? That would already represent a huge amount of information.
For any feedback, you can contact me at: margherit...@kuleuven.be

Thank you in advance,
Kind regards,
Margherita






Emilie Page-Perron

Jul 3, 2025, 1:13:21 PM
to ancient...@googlegroups.com
Dear Margherita,
Just in case this would be of interest, please see https://zenodo.org/records/15752988. They are using the Paulys Realencyclopädie der classischen Altertumswissenschaft.
All best,
Émilie

Margherita Fantoli

Jul 3, 2025, 1:26:08 PM
to ancient...@googlegroups.com
Thank you, Émilie... these are my PhD students indeed, so you spotted us :-D
Evelien will join the upcoming meeting!

Emilie Page-Perron

Jul 3, 2025, 1:28:09 PM
to ancient...@googlegroups.com
Amazing! Sorry for my ignorance :-)
Émilie

Shaw, Ryan

Jul 4, 2025, 6:32:31 AM
to ancient...@googlegroups.com

> On Jul 3, 2025, at 12:07 PM, Margherita Fantoli <margheri...@gmail.com> wrote:
>
> Has anyone experience with using selected entries of Wikidata as a reference knowledge base for further processing, and has relevant experience to share about this? I am especially concerned about instability of the data …

PeriodO uses Wikidata to build the gazetteers we use to indicate spatial coverage of periods. Because, as you point out, the data can be unstable, we do not query Wikidata live but create our own local subgraph of Wikidata.

https://github.com/periodo/periodo-places#readme
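To make the idea of a local copy concrete, here is a minimal sketch in Python. It is not PeriodO's actual pipeline (the README above describes that); it only illustrates freezing a selection of records, together with the date they were retrieved, so that later processing never depends on the live data. The function names, the file name and the placeholder records are all assumptions made for illustration.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_snapshot(records, path, retrieved=None):
    """Freeze the selected records in a local JSON file, with the retrieval date."""
    snapshot = {
        "retrieved": (retrieved or date.today()).isoformat(),
        # Index by identifier so later lookups do not depend on list order.
        "records": {record["id"]: record for record in records},
    }
    Path(path).write_text(
        json.dumps(snapshot, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return snapshot


def load_snapshot(path):
    """Read the frozen copy back; no network access is involved."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Placeholder records standing in for an export made once from Wikidata.
example_records = [
    {"id": "Q-EXAMPLE-1", "label": "placeholder person A"},
    {"id": "Q-EXAMPLE-2", "label": "placeholder person B"},
]

with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = Path(tmp) / "people_snapshot.json"
    save_snapshot(example_records, snapshot_path, retrieved=date(2025, 7, 4))
    frozen = load_snapshot(snapshot_path)

print(frozen["retrieved"], sorted(frozen["records"]))
```

Recording the retrieval date alongside the records makes it possible to say exactly which state of the data a given analysis was run against.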

> ... there isn’t a single “property” that brings together all the relevant records etc. The handling of uncertain identifications is also a question mark.

You could create an “ancient person” class and make all your ancient people instances of that class. While creating new WD properties is hard, anyone can create a new class.

Also maybe of interest:

https://www.wikidata.org/wiki/Property:P8069

https://www.wikidata.org/wiki/Property:P2972

https://www.wikidata.org/wiki/Property:P2460
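As a rough illustration of how such identifier properties might be used for selection, the sketch below assembles a SPARQL query for items that are instances of human (Q5) and carry at least one of the three properties linked above. The function name `build_identifier_query` and the default `limit` are assumptions made for illustration; the query is only built as a string here, not sent to the Wikidata Query Service, and it is not the query Margherita and Camillo developed.

```python
from typing import Iterable


def build_identifier_query(property_ids: Iterable[str], limit: int = 100) -> str:
    """Build a SPARQL query for humans having at least one of the given properties."""
    props = list(property_ids)
    if not props:
        raise ValueError("at least one property ID is required")
    for pid in props:
        # Accept only IDs of the form P<digits>, to keep the query well-formed.
        if not (pid.startswith("P") and pid[1:].isdigit()):
            raise ValueError(f"not a Wikidata property ID: {pid!r}")
    # One UNION branch per identifier property.
    branches = " UNION ".join(f"{{ ?person wdt:{pid} ?id . }}" for pid in props)
    return (
        "SELECT DISTINCT ?person WHERE {\n"
        "  ?person wdt:P31 wd:Q5 .\n"
        f"  {branches}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


query = build_identifier_query(["P8069", "P2972", "P2460"])
print(query)
```

Running the result once and storing the output locally would fit the snapshot approach described earlier in this message.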

Cheers,
Ryan

Charlotte Roueche

Jul 9, 2025, 4:57:54 AM
to ancient...@googlegroups.com

The list includes the Connecting Archives project. This covers the BIAA list (which you have) and a parallel one for the BILNAS archives

https://slsgazetteer.org/person/

We produced a model, which has not yet been applied in full. Example:

https://slsgazetteer.org/person/18/

Name

URI

Dates of attestations in the Archive

Variant names – if used in the Archive

External links

Links to the data in the Archive
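
For comparison with other formats, the six fields above could be sketched as a simple record type. This is only an illustration in Python: the class name, the attribute names and every value except the example URI are placeholders, not the project's actual schema or the real content of that record.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PersonRecord:
    """One person, following the six fields listed above."""

    name: str
    uri: str
    attestation_dates: List[str] = field(default_factory=list)  # dates of attestations in the Archive
    variant_names: List[str] = field(default_factory=list)      # only if used in the Archive
    external_links: List[str] = field(default_factory=list)
    archive_links: List[str] = field(default_factory=list)      # links to the data in the Archive

    def all_names(self) -> List[str]:
        """Preferred name first, then any variants that differ from it."""
        return [self.name] + [v for v in self.variant_names if v != self.name]


# Placeholder values; only the URI is taken from the example above.
record = PersonRecord(
    name="Placeholder Name",
    uri="https://slsgazetteer.org/person/18/",
    variant_names=["P. Name", "Placeholder Name"],
)

print(record.all_names())
print(sorted(asdict(record)))
```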

 

Charlotte

-----------------------------------------------------

Professor Charlotte Roueché

charlott...@kcl.ac.uk

http://orcid.org/0000-0002-3606-2049

Telephone: +44 7842 756384

 

 
