CFP: DEARING 2026 - 3rd International Workshop on Data-Centric Artificial Intelligence at ECML PKDD 2026

vincenzo pasquadibisceglie

May 7, 2026, 9:08:05 AM
to AIxIA mailing list

Workshop date: September 11, 2026.

Artificial Intelligence (AI) relies on two components: data and algorithms. The conventional model-centric AI paradigm has historically prioritized algorithms, often treating data as a static entity. Typically, data is collected, pre-processed, and then held fixed, while most development effort goes into optimizing the learned models. This approach has led to increasingly intricate and opaque models that require substantial amounts of training data. In contrast, the emerging data-centric AI paradigm focuses on systematically and algorithmically generating optimal data to feed Machine Learning (ML) models. The primary objective of data-centric AI approaches is to consistently enhance data quality, thereby achieving a level of model accuracy that was previously considered unattainable through model-centric techniques alone. This workshop aims to explore the transformative impact of recent advances in the data-centric AI paradigm on the future of AI and ML. It serves as a platform for in-depth discussion and the exchange of scientific contributions, recent achievements, and open challenges.

Workshop topics:
We welcome submissions that explore the opportunities, perspectives, and research directions within the data-centric AI paradigm. Potential topics include, but are not limited to:

High-quality data preparation:
- Data cleaning, denoising, and interpolation
- Novel feature engineering pipelines
- Label errors and Confident Learning (CL)
- Selecting features and/or instances
- Performing outlier detection and removal
- Ensuring label consensus
- Producing consistent and low-noise training data
- Extracting smart data from raw data
- Creating training datasets for small data problems
- Handling rare classes and explaining important class coverage in big data problems
- Incorporating human feedback into training datasets
- Generating high-quality synthetic data
- Combining multi-view, multi-source, multi-objective datasets

Data-centric ML and Deep Learning approaches:
- Active learning to identify the most valuable examples to label
- Core-set learning to handle big data
- Semi-supervised learning, few-shot learning, weak supervision, and confident learning to make the most of limited labels or to handle label noise
- Transfer learning and self-supervised learning algorithms to obtain rich data representations when labels are scarce
- Concept drift detection and management
- Adversarial learning to improve robustness and resilience

Responsible and Ethical AI:
- Ensuring fairness, ethics, and diversity, and mitigating bias
- Green AI design and evaluation
- Scalable and reliable training
- Privacy-preserving and secure learning
- Reproducibility of AI

Data benchmark creation:
- Creating licensed datasets based on public resources
- Creating high-quality data from low-quality resources

Data-centric Explainable AI:
- Novel XAI methods to identify possible data issues in the learning stage
- XAI methods to generate features for machine learning problems

Applications of novel data-centric AI solutions:
- Healthcare and Medical Applications: Ensuring data diversity and generating realistic patient data without exposing sensitive information
- Autonomous Vehicles and Smart Cities: Simulating representative scenarios for software testing
- Cybersecurity and Fraud Detection: Detecting, exploring, or generating rare/edge cases and patterns for machine learning robustness
- Manufacturing and Industrial Applications: Ensuring coverage in equipment failures for stress testing
- Facial Recognition and Biometrics: Increasing diversity in images to reduce bias
- Legal and Military Applications: Fostering data quality for fair and explainable systems

Website
https://dearing-workshop.github.io/#/landing

Submission Guidelines
The CMT submission portal is open. Direct link: https://cmt3.research.microsoft.com/ECMLPKDDWT2026/Track/28/Submission/Create

Workshop paper submission deadline
05-06-2026 (DD-MM-YYYY, i.e. June 5, 2026)

Camera ready submission
10-07-2026 (DD-MM-YYYY, i.e. July 10, 2026)

Accepted Formats
Regular research papers of 12 to 16 pages including references.
Short research papers of at most 6 pages including references, aimed at fostering discussion and collaboration (e.g., outlining new research ideas).

Requirements
PDF submission via Microsoft CMT.
English language, conference template required.
Springer LNCS style. Templates: https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines

Organizers
Donato Malerba - University of Bari Aldo Moro, Italy
Vincenzo Pasquadibisceglie - University of Bari Aldo Moro, Italy
Mara Sangiovanni - University Federico II of Naples, Italy
Miriam Seoane Santos - University of Porto, Portugal
Ricardo Cardoso Pereira - University of Coimbra, Portugal

