CALL FOR PAPERS
ACM Journal of Data and Information Quality (JDIQ)
Special Issue on Quality of Synthetic Data
** SUBMISSION DEADLINE: March 3, 2026 **
Online CFP:
https://dl.acm.org/pb-assets/static_journal_pages/jdiq/pdf/ACM_JDIQ_SI_Synthetic_Data-1761382323477.pdf
================================================================================
GUEST EDITORS:
Andrea Maurino, University of Milano-Bicocca, Italy
andrea....@unimib.it
Fabian Panse, University of Augsburg, Germany
fabian...@uni-a.de
Paolo Missier, University of Birmingham, UK
p.missier@bham.ac.uk
================================================================================
OVERVIEW:
Synthetic Data Generation (SDG) methods estimate the probability distribution
of their training data and then sample from the learned distributions. The
resulting synthetic datasets emulate information found in the actual data while
maintaining the ability to draw valid statistical inferences. These methods
have been motivated by the need to mitigate privacy and confidentiality
concerns, for instance, in health care settings, but have also been routinely
used in Machine Learning and AI settings as a form of oversampling, to provide
new training data points when real data is insufficient, costly to label, or
biased. Closely associated with SDG is Data Augmentation (DA), a strategy that
applies across diverse data modalities, where multiple variations of existing
data points are generated by applying modality-specific transformation
operators to them.
Foundational SDG and DA techniques have matured significantly over the past few
years. However, gaps remain in understanding and systematically evaluating,
ensuring, and enhancing the quality of synthetic data, ranging from general
properties such as fidelity that can be measured using adversarial techniques,
to more domain-, type-, and application-specific quality measures. This special
issue invites contributions that develop innovative quality metrics, report on
relevant case studies, or propose novel SDG and DA methods that include quality
guarantees by design. We also welcome research that bridges the gap between
theoretical quality measures and practical deployment requirements, addresses
scalability challenges in quality assessment, or establishes new benchmarks for
domain-specific synthetic data evaluation.
================================================================================
TOPICS OF INTEREST:
Topics of interest include, but are not limited to:
* Quality-aware SDG methods for different data types, domains, and applications
* Evaluation of SDG for multimodal data (images, audio, video, text) with
cross-mode semantic consistency constraints
* Evaluation of semantics-aware SDG for textual data that requires
  out-of-vocabulary generalisation capabilities
* Evaluation of synthetic multi-table data with cross-table dependencies
* Interactive quality assessment tools and human-in-the-loop SDG
* Adversarial robustness or quality degradation analysis in synthetic datasets
* Quality-aware SDG techniques that optimize for specific utility measures for
downstream task performance as well as domain-specific requirements
* Evaluation methods for synthetic data utility under data repurposing across
ML tasks
* Quality assessment frameworks for synthetic data that integrate
privacy-preserving guarantees, privacy-utility trade-offs, or metrics for
quantifying residual privacy risks
* Methods for tracing SDG/DA steps, quantifying the influence of source data
on synthetic outputs, and supporting auditability, reproducibility, and
regulatory compliance in quality assessment
* Domain-specific quality assessment frameworks. Examples include clinical
validity and regulatory compliance for synthetic healthcare data, ensuring
financial synthetic data preserves market dynamics and risk characteristics,
and more
* Standardization efforts and benchmark development for cross-domain synthetic
data quality evaluation
================================================================================
EXPECTED CONTRIBUTIONS:
We welcome the following types of research contributions:
* SURVEY PAPERS: Should present a coherent review of scientific work related
  to the quality of synthetic data, together with interesting future
  research directions in the field (up to 23 pages).
* TECHNICAL PAPERS: Should present novel research contributions on the topics
above, clearly describing the progress from the state of the art and
providing evidence for the benefits of the contributions (up to 23 pages).
* EXPERIENCE PAPERS: Should detail recent applications of data quality
techniques in practice and industry, providing pertinent application
scenario(s), lessons learned, and open problems (up to 10 pages).
* RESOURCE PAPERS: Should present a new resource, such as a dataset or tool,
or an appealing compilation of multiple datasets (up to 10 pages).
================================================================================
IMPORTANT DATES:
Submission deadline: March 3, 2026
First-round review decisions: May 30, 2026
Deadline for revision submissions: July 31, 2026
Notification of final decisions: September 30, 2026
Camera-ready manuscript: October 30, 2026
Tentative publication: December 2026
================================================================================
SUBMISSION INFORMATION:
JDIQ welcomes manuscripts that extend prior published work, provided they
contain at least 30% new material, and that the significant new contributions
are clearly identified in the introduction.
Submission guidelines with LaTeX (preferred) or Word templates are available at:
https://dl.acm.org/journal/jdiq/author-guidelines#subm
Please submit the paper by selecting as the type of submission:
"SI: Syndata-quality"
For questions and further information, please contact:
Andrea Maurino - andrea....@unimib.it
================================================================================