6th Summer Datathon on Linguistic Linked Open Data (SD-LLOD’26)
===============================================================
The 6th Summer Datathon on Linguistic Linked Open Data (SD-LLOD’26) will be held from August 30th to September 4th, 2026, at Villa Cagnola (Gazzada Schianno, Italy, near Milan). This edition of the datathon is jointly organised by the GOBLIN COST Action (as a GOBLIN training school) and the European Master in Linguistic Data Science (EMLDS).
The event has no registration fee and offers more than fifteen travel grants for participants.
In natural language processing, linguistics, and neighbouring fields, Linguistic Linked Open Data (LLOD) describes a method and an interdisciplinary community concerned with creating, sharing and (re-)using language resources in accordance with Linked Data principles. LLOD is currently the principal way to build linguistic knowledge graphs on the Web.
Since 2015, a unique series of datathons/training schools has been organised to provide training in LLOD technologies and to further expand the field with new data and applications. This edition, at Villa Cagnola, is the sixth in the series.
Topics and outcomes
===================
During the datathon, sessions will be organised to cover topics such as:
Ontologies, Linked Data and Knowledge Graphs
The Lexicon Model for Ontologies (OntoLex-Lemon)
Integrating documents, annotations and NLP tools with Linked Data and RDF
Knowledge Graph embeddings and language resources
Neural approaches for linguistic data
During the datathon, participants will be able to:
Generate their own Linguistic Linked Data from existing data sources, using visual tools like VocBench and community standards like OntoLex-Lemon.
Apply semantic technologies (linked data, knowledge graphs, RDF, SPARQL) to the field of language resources and learn about their benefits and applications for specific use cases, particularly those involving multilingual and/or multimodal aspects.
Explore the potential use of embeddings, machine learning, and deep learning techniques in combination with Linguistic Linked Data.
The programme of the summer datathon will contain three types of sessions:
Seminars to explain theoretical aspects and discuss selected topics.
Hands-on sessions to introduce the foundations of each topic, method, and technique, which participants will apply directly through practical assignments.
Datathon sessions, in which participants will work in groups of 3-5 on mini-projects, applying what they have learned to generate and/or use Linguistic Linked Data.
Registration
============
The datathon is financially supported by the GOBLIN COST Action and the EMLDS master's programme. It has no registration fee, but participants are expected to cover the cost of their meals and accommodation at the residence.
More than fifteen travel grants for attendees will be provided by the GOBLIN COST Action (covering accommodation, meals and travel expenses). We particularly encourage applications from young researchers and innovators, and from researchers from Inclusiveness Target Countries (ITCs) [1]. Find out whether you are eligible to attend the datathon as a COST participant via the Am I eligible? COST tool.
For more about the registration process, consult the datathon website https://datathon2026.fcsh.unl.pt/ or directly fill out the registration form here.
Important dates (tentative)
===========================
Registration opens: 10 April 2026
Registration closes: 22 May 2026
Notification: 29 May 2026
Datathon: 30 August 2026 to 4 September 2026
Organisers
==========
Jorge Gracia (University of Zaragoza, Spain)
Christian Chiarcos (University of Augsburg, Germany)
Milan Dojchinovski (TIB and InfAI/DBpedia Association, Germany and CTU in Prague, Czech Republic)
Local organiser
===============
Marco Passarotti (Università Cattolica del Sacro Cuore, Milan, Italy)
Academic Advisory Board
=======================
Rute Costa (Universidade Nova de Lisboa, Portugal)
Jorge Gracia (University of Zaragoza, Spain)
Marco Passarotti (Università Cattolica del Sacro Cuore, Milan, Italy)
Tutors and lecturers
====================
Mehwish Alam (Institut Polytechnique de Paris, France)
Christian Fäth (University of Augsburg, Germany)
Katerina Gkirtzou (Athena Research Center, Greece)
Francesco Mambrini (Università Cattolica del Sacro Cuore, Italy)
Blerina Spahiu (Università degli Studi di Milano-Bicocca, Italy)
Armando Stellato (University of Rome Tor Vergata, Italy)
Andon Tchechmedjiev (IMT École des Mines d’Alès, France)
-----
[1] See https://www.cost.eu/about/strategy/excellence-and-inclusiveness/ for the current list of ITCs.