Call for Papers: 1st Workshop on Human–Scene Interaction (HSI)
Submission deadline: July 7, 2026
Location: ECCV 2026, Malmö, Sweden
Website: hsi-workshop.com
Overview
We invite submissions to the First Workshop on Human–Scene Interaction (HSI) at ECCV 2026. This workshop focuses on modelling and generating human motion and behaviour grounded in the surrounding scene. The goal is to bring together research from computer vision, graphics, robotics, and multimodal learning to advance scene-aware embodied agents.
We welcome both archival and non-archival submissions.
Invited Speakers
- Umar Iqbal – NVIDIA DAIR Lab
- Taku Komura – University of Hong Kong
- Zhengyi Luo – NVIDIA GEAR Lab
- Gerard Pons-Moll – University of Tübingen
Topics of interest
Topics include, but are not limited to:
- Scene-conditioned human motion generation
- Human–scene and human–object interaction modelling
- Referring expression understanding and grounding in 3D scenes
- Language understanding and grounded communication for embodied agents
- Vision-language-motion alignment and grounding
- Vision-language-action (VLA) models for embodied agents
- Multimodal learning for motion and interaction
- Datasets, benchmarks, and evaluation for interaction
- Affordance learning and scene understanding
- Physically based simulation of interaction
- Applications in robotics, animation, AR/VR, and embodied communication
- Technical reports accompanying challenge submissions
Submission guidelines
Archival submissions
- Must present original, unpublished work
- Will undergo peer review
- Accepted papers will be published in the ECCV 2026 Workshop Proceedings
- Papers must follow the ECCV formatting guidelines
Non-archival submissions
- May include previously published work, work under review, or ongoing research
- Intended for presentation only (poster or oral) and will not be included in the proceedings
- Ideal for sharing recent results, demos, or position papers
Submission links: To be announced
Important dates
- July 7, 2026 – Submission deadline
- July 31, 2026 – Notification
- September 2026 – Workshop
Challenge
The workshop also hosts a challenge on scene-aware referential gesture generation. Given speech, a 3D target coordinate, and a virtual scene, participants must generate full-body referential gestures that indicate the correct object among scene distractors. For details on the task, data, evaluation protocol, and baselines, see:
hsi-workshop.com/challenge
Contact
hsi-wo...@googlegroups.com
Organizers
- Jonas Beskow – KTH Royal Institute of Technology, Sweden
- Rishabh Dabral – Max Planck Institute for Informatics, Germany
- Anna Deichler – KTH Royal Institute of Technology, Sweden
- Fethiye Irmak Doğan – University of Cambridge, United Kingdom
- Anindita Ghosh – Max Planck Institute for Informatics, Germany