We are seeking a postdoc to join the Vision, Language and Reading group (http://vlr.cvc.uab.es/) at the Computer Vision Center (CVC), in Barcelona, Spain.
The position is initially for 3 years and linked to the European project "European Large Open Multi-Modal Foundation Models For Robust Generalization On Arbitrary Data Streams" (ELLIOT, https://www.elliot-ai.eu/). The project targets the development of the next generation of open Multimodal Foundation Models and their further adaptation to specific downstream tasks.
The successful candidate is expected to participate in large-scale training efforts, research on fine-tuning methods, and applications to the specific use case of Document Understanding.
More information and application: https://www.cvc.uab.es/blog/2025/10/02/postdoc-position-in-multimodal-foundation-models-for-document-understanding/
Application Deadline: Open until a suitable candidate is hired. Applications are reviewed regularly.
Contact: Dimosthenis Karatzas (di...@cvc.uab.es) or Ernest Valveny (ern...@cvc.uab.es)