Post-doc position: Large AI foundation models for reinforcement learning of robotics tasks

Fabien Moutarde

Jan 15, 2026, 7:22:17 PM
to Machine Learning News

Laboratory: Centre for Robotics, Mines Paris - PSL
Location: 60 Bd St Michel 75006 Paris
Duration: 9 months
Starting Date: as soon as possible (and before summer 2026)
Contact: fabien.moutarde(at)minesparis.psl.eu

Subject: Following the development of Large Language Models (LLMs), whose power has been demonstrated by ChatGPT and its competitors, we have recently seen the emergence of large AI models of the Vision-Language-Action (VLA) type, specifically designed for robotic applications. The aim of this project is to explore the use of such pre-trained VLA models for reinforcement learning of tasks by manipulator robots.

The adoption and use of collaborative robots, particularly in small and medium-sized enterprises (SMEs) that lack the internal skills and expertise to configure and, above all, to program them, would be greatly facilitated by tools enabling manipulator robots to perform a given function reliably and with high adaptability, without explicit programming but rather through automated task-learning algorithms. Reinforcement learning (RL) has proven its relevance in this context for "teaching" fairly basic tasks (reach, pick, place, push, etc.) to manipulator robots [Bujalance and Moutarde, 2023][Bujalance Martin, 2024]. For teaching more complex tasks (such as typical assembly operations in manufacturing), however, a hierarchical RL approach seems necessary; yet designing a general and comprehensive dictionary of basic manipulation tasks "by hand" proves delicate and tedious, all the more so when it comes to building fairly generic task hierarchies that combine and sequence these atomic tasks.

This therefore constitutes a technological barrier to overcome before machine learning can replace explicit programming of robotic tasks in practice. Recently, following the Large Language Models (LLMs) whose power ChatGPT and its competitors have demonstrated, large AI models of the Vision-Language-Action (VLA) type, dedicated to robotic applications, have emerged. Introduced by DeepMind [Zitkovich et al., 2023], their principle is to fine-tune state-of-the-art VLMs jointly on robotic trajectory data and internet-scale vision-language tasks. Several other teams have since followed this VLA path: to take three-dimensional aspects into account [Zhen et al., 2024], to apply VLAs to manipulator robots [Gbagbe et al., 2024], or to propose an open-source variant [Kim et al., 2024]. This new type of generative model therefore appears to hold enormous potential in robotics for planning, control, and task-learning algorithms.
In this context, the aim of the project is to explore the use of such pre-trained VLA models (in particular OpenVLA [Kim et al., 2024]) for hierarchical reinforcement learning of tasks by manipulator robots.
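As an illustration of this starting point, here is a minimal sketch of querying a pre-trained OpenVLA checkpoint for a single robot action through its Hugging Face interface; the camera frame, the instruction, and the "bridge_orig" unnormalization key are placeholder assumptions to be adapted to the actual robot setup.

import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load the released 7B OpenVLA checkpoint (sketch; assumes a CUDA GPU).
processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda:0")

# Placeholder camera frame and instruction; replace with the robot's camera
# stream and the task at hand.
image = Image.open("camera_frame.png")
prompt = "In: What action should the robot take to pick up the red cube?\nOut:"

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
# Predicts a 7-DoF end-effector action (position/rotation deltas + gripper),
# de-normalized with the statistics of the chosen training data mix.
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)

Such low-level action predictions could then serve as the atomic skills that a hierarchical RL layer combines and sequences.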

Required Skills
**********************
The candidate should ideally have skills in robotic manipulation as well as in Reinforcement Learning. They must also be familiar with the ROS environment and with a simulation tool such as RLBench (https://sites.google.com/view/rlbench), Panda-Gym (https://panda-gym.readthedocs.io/en/latest/), or equivalent.
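For reference, a baseline sparse-reward experiment in Panda-Gym can be set up in a few lines with Stable-Baselines3; the environment name and hyperparameters below are indicative only, not prescribed by the project.

import gymnasium as gym
import panda_gym  # registers the Panda manipulation environments
from stable_baselines3 import SAC, HerReplayBuffer

# Goal-conditioned reaching task with sparse rewards and dict observations.
env = gym.make("PandaReach-v3")

# SAC with hindsight experience replay, a common baseline for sparse rewards.
model = SAC(
    "MultiInputPolicy",
    env,
    replay_buffer_class=HerReplayBuffer,
    verbose=1,
)
model.learn(total_timesteps=20_000)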

References
***********
Bujalance Martin, J. (2024). Apprentissage profond par renforcement et démonstrations, pour le comportement de robots manipulateurs [Deep reinforcement learning and demonstrations, for the behavior of manipulator robots]. PhD thesis, Université PSL, prepared at Mines Paris.
Bujalance, J., & Moutarde, F. (2023). Reward Relabelling for Combined Reinforcement and Imitation Learning on Sparse-Reward Tasks. In 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023), May 2023, London, United Kingdom.
Gbagbe, K. F., Cabrera, M. A., Alabbas, A., Alyunes, O., Lykov, A., & Tsetserukou, D. (2024). Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations. arXiv preprint arXiv:2405.06039.
Kim, M. J., Pertsch, K., Karamcheti, S., Xiao, T., Balakrishna, A., Nair, S., ... & Finn, C. (2024). OpenVLA: An Open-Source Vision-Language-Action Model. arXiv preprint arXiv:2406.09246.
Zhen, H., Qiu, X., Chen, P., Yang, J., Yan, X., Du, Y., ... & Gan, C. (2024). 3D-VLA: A 3D Vision-Language-Action Generative World Model. arXiv preprint arXiv:2403.09631.
Zitkovich, B., Yu, T., Xu, S., Xu, P., Xiao, T., Xia, F., ... & Han, K. (2023). RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. In Conference on Robot Learning (pp. 2165-2183). PMLR.