Call for papers 4th Workshop on Scholarly Document Processing - SDP@ACL 2024

Tirthankar Ghosal

Feb 20, 2024, 12:00:08 PM
to Ghosal, Tirthankar

** Call for Research Papers **

Scholarly literature is the chief means by which scientists and academics document and communicate their results and is therefore critical to the advancement of knowledge and improvement of human well-being. At the same time, this literature poses challenges to NLP uncommon in other genres, such as specialized language and high background knowledge requirements, long documents and strong structural conventions, multimodal presentation, citation relationships among documents, an emphasis on rational argumentation, and the frequent availability of detailed metadata. These challenges necessitate the development of NLP methods and resources optimized for this domain. The Scholarly Document Processing (SDP) workshop provides a venue for discussing these challenges, bringing together stakeholders from different communities including computational linguistics, machine learning, text mining, information retrieval, digital libraries, scientometrics and others, to develop methods, tasks, and resources in support of these goals.

This workshop builds on the success of prior workshops: the 1st, 2nd, and 3rd SDP workshops held at EMNLP 2020, NAACL 2021, and COLING 2022, and the 1st and 2nd SciNLP workshops held at AKBC 2020 and 2021. In addition to having broad appeal within the NLP community, we hope the SDP workshop will attract researchers from other relevant fields including meta-science, scientometrics, data mining, information retrieval, and digital libraries, bringing together these disparate communities within ACL.


X (Twitter):

Topics of Interest

We invite submissions from all communities demonstrating usage of and challenges associated with natural language processing, information retrieval, and data mining of scholarly and scientific documents. Relevant topics include (but are not limited to): 

  • Large Language Models (LLMs) for Science

  • Representation learning and language modeling

  • Information extraction and NER

  • Document understanding

  • Summarization and generation

  • Question-answering

  • Discourse modeling/argumentation mining

  • Network analysis

  • Bibliometrics, scientometrics, and altmetrics

  • Reproducibility and research integrity, including new challenges posed by generative AI

  • Peer review tools, principles and technology

  • Metadata and indexing

  • Inclusion of datasets and computational resources

  • Research infrastructures and digital libraries

  • Increasing the representation in scholarly work of disadvantaged populations

  • LLM-based interfaces to consume/produce scholarly documents


** Submission Information **


Authors are invited to submit full and short papers with unpublished, original work. Submissions will be subject to a double-blind peer-review process. Accepted papers will be presented by the authors at the workshop either as a talk or a poster. All accepted papers will be published in the workshop proceedings (proceedings from previous years can be found here:


The submissions must be in PDF format and anonymized for review. All submissions must be written in English and follow the ACL 2024 formatting requirements: 

Long paper submissions: up to 8 pages of content, plus unlimited references.

Short paper submissions: up to 4 pages of content, plus unlimited references.

Submission Website: Papers must be submitted through OpenReview: <>


Final versions of accepted papers will be allowed 1 additional page of content so that reviewer comments can be taken into account.


** Important Dates (Main Research Track) ** 

Paper submission deadline: May 17 (Friday), 2024

Notification of acceptance: June 17 (Monday), 2024

Camera-ready paper due: July 1 (Monday), 2024

Workshop date: August 16, 2024

** SDP 2024 Keynote Speakers **

We are excited to have several keynote speakers at SDP 2024. 

  1. Iryna Gurevych, Professor at Technical University Darmstadt and head of the UKP Lab, Germany.

  2. Anna Rogers, Assistant Professor, University of Copenhagen, Denmark.

  3. Heng Ji, Professor, University of Illinois at Urbana-Champaign, USA.

  4. Doug Downey, Associate Professor at Northwestern University and Research Manager at Allen Institute for AI, USA.

** SDP 2024 Shared Tasks **

SDP 2024 will host two exciting shared tasks. More information about both shared tasks is provided on the workshop website:

DAGPap24: Detecting automatically generated scientific papers

A major consequence of the ubiquity of generative AI is that it has become very easy to generate fake scientific papers. This can erode public trust in science and attack its foundations: are we standing on the shoulders of robots? The Detecting Automatically Generated Papers (DAGPap) competition aims to encourage the development of robust, reliable systems for detecting AI-generated scientific text, utilizing a diverse dataset and varied machine learning models across a number of scientific domains.

Organizers: Savvas Chamezopoulos, Yury Kashnitsky, Drahomira Herrmannova, Anita de Waard (Elsevier), Domenic Rosati (Scite)

Context24: Contextualizing Scientific Figures and Tables

When making sense of results across many research papers on a topic, figures or tables of key results from the papers can serve as effective, information-dense summaries that can be compared/contrasted and synthesized with other results. However, to understand the results, key elements (e.g., measures, sample) need to be contextualized with associated methodological details, which are typically dispersed throughout the text, often far from the figure/table and from each other. In this shared task, we are interested in contextualizing scientific figures and tables, i.e., automatically retrieving and ranking snippets from the paper that are most needed to interpret their results, with the goal of making figures/tables more self-contained. 

Organizers: Joel Chan, Matthew Akamatsu

** Organizing Committee **

Tirthankar Ghosal, Oak Ridge National Laboratory, USA

Philipp Mayr, GESIS – Leibniz Institute for the Social Sciences, Germany

Aakanksha Naik, Allen Institute for AI, USA

Shannon Shen, Massachusetts Institute of Technology, USA

Amanpreet Singh, Allen Institute for AI, USA

Anita de Waard, Elsevier, Netherlands

Orion Weller, Johns Hopkins University, USA

Yanxia Qin, National University of Singapore, Singapore

Yoonjoo Lee, Korea Advanced Institute of Science & Technology, South Korea



Tirthankar Ghosal


National Center for Computational Sciences (NCCS)

Oak Ridge National Laboratory, United States


Tirthankar Ghosal

Apr 2, 2024, 8:51:50 AM
to Ghosal, Tirthankar