2nd International Workshop on Machine Vision and NLP for Document Analysis (VINALDO)
https://sites.google.com/view/vinaldo-workshop-icdar-2024/home
As part of the 18th International Conference on Document Analysis and Recognition
(ICDAR 2024)
August 30 - September 4, 2024, Athens, Greece
Context
Document understanding is an essential task in various application areas such as invoice data extraction, subject review, medical prescription analysis, etc., and holds significant commercial potential. Several approaches have been proposed in the literature, but limited dataset availability and data privacy constraints challenge them. Considering the problem of information extraction from documents, different aspects must be taken into account, such as (1) document classification, (2) text localization, (3) OCR (Optical Character Recognition), (4) table extraction, and (5) key information detection.
In this context, machine vision and, more precisely, deep learning models for image processing are attractive methods. Several models for document analysis have been developed for text box detection, text extraction, table extraction, etc. Different kinds of deep learning approaches, such as graph neural networks (GNNs), are used to tackle these tasks. In addition, the text extracted from documents can be represented using different embeddings based on recent NLP approaches such as Transformers. Understanding spatial relationships is also critical to the quality of text extraction in applications such as invoice analysis, where the aim is to capture the structural connections between keywords (invoice number, date, amounts) and the main value (the desired information). An effective approach therefore requires a combination of visual (spatial) and textual information.
Objective
Following the success of VINALDO 2023, the second edition of the VINALDO workshop encourages the description of novel problems or applications for document analysis in the area of information retrieval that have emerged in recent years. We also want to highlight a particular topic, namely “Multi-view and Multimodal approaches”. Because the VINALDO workshop aims to combine visual and textual information for document analysis, multi-view and multimodal methods offer an important advantage in dealing with different types of data. We therefore encourage works that combine machine vision and NLP through multi-view and/or multimodal approaches. Finally, we also encourage works that combine NLP and computer vision methods and develop new document datasets for novel applications.
The VINALDO workshop aims to provide a forum for experts from industry, science, and academia to exchange ideas and discuss ongoing research in Computer Vision and NLP for scanned document analysis.
Topics of interest
This workshop invites submissions of high-quality work related to, but not limited to, the topics below:
Multi-view document representation
Multi-view algorithms for document clustering
Multimodal document classification
Multimodal deep networks
Multi-view models for document ranking
Document retrieval using multi-view document representation
Document structure and layout learning
OCR-based methods
Semi-supervised methods for document analysis
Dynamic graph analysis
Information Retrieval and Extraction from documents
Knowledge graph for semantic document analysis
Semantic understanding of document content
Entity and link prediction in graphs
Merging ontologies with graph-based methods using NLP techniques
Cleansing and image enhancement techniques for scanned documents
Font text recognition in scanned documents
Table identification and extraction from scanned documents
Handwriting detection and recognition in documents
Signature detection and verification in documents
Visual document structure understanding
Visual Question Answering
Invoice analysis
Scanned document classification
Scanned document summarization
Scanned document translation
Graph-based approaches for a spatial component in a scanned document
Graph representation learning for NLP
The workshop is open to original papers of a theoretical or practical nature. Papers should be formatted according to the LNCS instructions for authors. VINALDO 2024 will follow a double-blind review process. Authors should not include their names and affiliations anywhere in the manuscript. Authors should also ensure that their identity is not revealed indirectly: previous work should be cited in the third person, and acknowledgments should be omitted until the camera-ready version. Papers have to be submitted via the workshop's EasyChair submission page.
We welcome the following types of contributions:
Full research papers (12-15 pages): finished or consolidated R&D work that falls within one of the workshop topics.
Short papers (6-8 pages): ongoing work with relevant preliminary results, open to discussion.
At least one author of each accepted paper must register for the workshop in order to present the paper. For further instructions, please refer to the ICDAR 2024 page.
Important dates
Submission Deadline: March 20, 2024, at 11:59pm Pacific Time
Decisions Announced: April 29, 2024, at 11:59pm Pacific Time
Camera Ready Deadline: May 10, 2024, at 11:59pm Pacific Time
Workshop: To be announced
Workshop Chairs
Rim Hantach, Engie, France
Rafika Boutalbi, Aix-Marseille University, France
Deadline Extension
Submission Deadline: April 1, 2024 (extended from March 20, 2024, at 11:59pm Pacific Time)
Decisions Announced: April 29, 2024, at 11:59pm Pacific Time
Camera Ready Deadline: May 10, 2024, at 11:59pm Pacific Time
Workshop: To be announced