CFP: DEARING 2026 - 3rd International Workshop on Data-Centric Artificial Intelligence at ECML PKDD 2026
Workshop date: September 11, 2026.
Artificial Intelligence (AI) has historically relied on two components: data and
algorithms. However, the conventional model-centric AI paradigm has
prioritized algorithms, often treating data as a static
entity. Typically, data is initially collected, pre-processed, and
held fixed, with a significant portion of development efforts dedicated
to optimizing learned models. This conventional approach has led to the
creation of increasingly intricate and opaque models, necessitating
substantial training data. In contrast, the emerging data-centric AI
paradigm is dedicated to systematically and algorithmically generating
optimal data to feed Machine Learning (ML) models. The primary objective
of data-centric AI approaches is to consistently enhance data quality,
thereby achieving a level of model accuracy that was previously
considered unattainable through model-centric techniques alone. This
workshop aims to explore the transformative impact of recent
advancements in the data-centric AI paradigm on the future of AI and ML.
It serves as a platform for in-depth discussions and the exchange of
scientific contributions, recent achievements, and open challenges.
Workshop topics:
We welcome submissions that explore the opportunities, perspectives, and
research directions within the data-centric AI paradigm. Potential
topics include, but are not limited to:
High-quality data preparation:
- Data cleaning, denoising, and interpolation
- Novel feature engineering pipelines
- Label errors and Confident Learning (CL)
- Selecting features and/or instances
- Performing outlier detection and removal
- Ensuring label consensus
- Producing consistent and low-noise training data
- Extracting smart data from raw data
- Creating training datasets for small data problems
- Handling rare classes and explaining important class coverage in big data problems
- Incorporating human feedback into training datasets
- Generating high-quality synthetic data
- Combining multi-view, multi-source, multi-objective datasets
Data-centric ML and Deep Learning approaches:
- Active learning to identify the most valuable examples to label
- Core-set learning to handle big data
- Semi-supervised learning, few-shot learning, weak supervision, and confident learning to exploit a limited number of labels or handle label noise
- Transfer learning and self-supervised learning algorithms to learn rich data representations when labels are scarce
- Concept drift detection and management
- Adversarial learning to improve robustness and resilience
Responsible and Ethical AI:
- Ensuring fairness, ethics, and diversity, and mitigating bias
- Green AI design and evaluation
- Scalable and reliable training
- Privacy-preserving and secure learning
- Reproducibility of AI
Data benchmark creation:
- Creating licensed datasets based on public resources
- Creating high-quality data from low-quality resources
Data-centric Explainable AI:
- Novel XAI methods to identify possible data issues in the learning stage
- XAI methods to generate features for machine learning problems
Applications of novel data-centric AI solutions:
- Healthcare and Medical Applications: Ensuring data diversity and generating realistic patient data without exposing sensitive information
- Autonomous Vehicles and Smart Cities: Simulating representative scenarios for software testing
- Cybersecurity and Fraud Detection: Detecting, exploring, or generating rare/edge cases and patterns for machine learning robustness
- Manufacturing and Industrial Applications: Ensuring coverage in equipment failures for stress testing
- Facial Recognition and Biometrics: Increasing diversity in images to reduce bias
- Legal and Military Applications: Fostering data quality for fair and explainable systems
Website
https://dearing-workshop.github.io/#/landing
Submission Guidelines
The CMT submission portal is open. Direct link:
https://cmt3.research.microsoft.com/ECMLPKDDWT2026/Track/28/Submission/Create
Workshop paper submission deadline
June 5, 2026
Camera ready submission
July 10, 2026
Accepted Formats
Regular research papers with 12 to 16 pages including references.
Short research papers of at most 6 pages including references, aimed at
fostering discussion and collaboration (e.g. outlining new research
ideas).
Requirements
PDF submission via Microsoft CMT.
English language, conference template required.
Springer LNCS style. Templates:
https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines
Organizers
Donato Malerba - University of Bari Aldo Moro, Italy
Vincenzo Pasquadibisceglie - University of Bari Aldo Moro, Italy
Mara Sangiovanni - University Federico II of Naples, Italy
Miriam Seoane Santos - University of Porto, Portugal
Ricardo Cardoso Pereira - University of Coimbra, Portugal