Learning Robotic Locomotion: From Simulation to Real World | 9am April 23, 2024


Grigory Bronevetsky

Apr 21, 2024, 1:32:41 AM
to ta...@modelingtalks.org

Modeling Talks

Learning Robotic Locomotion: From Simulation to Real World
Jie Tan, Google DeepMind

Tues, Apr 23 | 9:00 am PT

Meet | YouTube Stream


Hi all,


The presentation will be via Meet and all questions will be addressed there. If you cannot attend live, the event will be recorded and can be found afterward at
https://sites.google.com/modelingtalks.org/entry/learning-robotic-locomotion-from-simulation-to-real-world


Abstract:
Deep reinforcement learning holds the promise of automatically training highly performant robot controllers. Traditionally, these learning-based techniques require a huge amount of data, which is unsafe and time-consuming to collect on real robots. To mitigate these problems, simulation provides a safe, fast, and scalable data collection alternative. However, controllers trained in simulated environments often underperform in real-world applications due to the "sim-to-real gap". This talk will introduce the concept of simulation in robotics, explore the causes of the sim-to-real gap, and offer a high-level overview of the evolution of sim-to-real research in robotics. I will focus on three pivotal studies that aim to bridge this gap. Additionally, I will discuss the outstanding challenges and future research directions of robotics simulation and sim-to-real transfer.


Bio:

Jie Tan is a Senior Staff Research Scientist and Tech Lead Manager on the robotics team at Google DeepMind. His research focuses on applying foundation models and deep reinforcement learning methods to robots, with interests spanning locomotion, navigation, manipulation, simulation, and sim-to-real transfer. In addition to his role at Google, Jie serves as an Adjunct Associate Professor at the Georgia Institute of Technology and teaches AI and robotics courses at Stanford University.


More information on previous and future talks: https://sites.google.com/modelingtalks.org/entry/home

Grigory Bronevetsky

Apr 24, 2024, 4:43:32 PM
to Talks, Grigory Bronevetsky
Video Recording: https://youtu.be/hkLpUiSEwdw

Slides: https://docs.google.com/presentation/d/1nqDlZaKeAHjvkRrRhYDJ6JTmIBMUI7yuGr0098PjTTo/edit?usp=sharing&resourcekey=0-7wn1Sov37112ijcLHsgxRA

Summary

  • Focus: Learning Robot Locomotion from Simulation

    • Traditional robots: fixed, restricted in factories

    • New vision: robots interacting with people in everyday life

    • Challenge: human environment is very unstructured and dynamic

  • DeepMind Approach: use deep reinforcement learning to control robots

    • Previously very successful at playing games in virtual environments (Go, StarCraft)

  • Locomotion:

    • Enables us to explore the Earth

    • Many applications: healthcare, delivery, etc.

  • Simulation:

    • Many technologies are available for modeling the locomotion of realistic bodies in simulation (e.g. from the computer graphics community)

    • Advantages of simulation:

      • Much faster and more scalable than real-world experiments

      • Safe

    • But not fully accurate (performance in simulation does not fully translate to real-world performance)

    • This is the Sim-To-Real problem

  • Approaches

    • Fast simulation in GPUs

    • Imitation learning from animals

    • Simulation from small amounts of video

  • Simulation for robots

    • Physics: robot body, collision detection, contact solver, numerical integrator

    • Sensor simulation

    • Actuator simulation

    • Robot control API

    • Scene creation and management: robots, objects, humans

  • Robot training method

    • Policy trained with deep reinforcement learning (PPO) interacting with a physics simulation (PyBullet)

    • Deployed on real robot

    • Reinforcement learning:

      • Agent (algorithm that makes decisions)

      • Environment (robot body, world, etc.)

      • The agent observes the environment, takes an action that affects it, receives a reward (immediately or later in time), and adjusts its policy to maximize reward

    • Formulation (see sketch 1 after the summary):

      • Observations: joint angles, roll, pitch

      • Actions: desired motor angles

      • Reward: maximize movement towards a given goal, minimize energy expenditure

      • Early termination: robot falls

      • Policy: 2-layer neural network

    • Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning, CoRL 2021

  • Improving robot performance in reality despite the sim-to-real gap

    • Sim-to-Real: Learning Agile Locomotion For Quadruped Robots, Robotics: Science and Systems (RSS) 2018

    • Challenge: a policy trained in simulation doesn’t directly produce good results in reality

      • Unmodeled dynamics (e.g. robot body softer than expected)

      • Wrong simulation parameters (e.g. incomplete CAD file)

      • Inaccurate contact models

      • Communication latency

      • Actuator dynamics

      • Stochastic real environment (e.g. roughness of floor/carpet)

      • Numerical accuracy

    • Overcoming the gap

      • Analyze each component of the system to identify the best parameters

      • Actuator model is the most critical gap

        • Traditional: analytical models

        • Neural network models of actuators

      • Domain Randomization (see sketch 2 after the summary)

        • Sample physical parameters from some distribution

        • Train robot in simulation across all those parameters

        • The learned policy is much more robust to different realities, even though it has no way to measure which world/environment it is operating in

          • Actions are more conservative/robust

          • Responses to events are more diverse

          • Robot’s peak performance is worse

          • But much more consistent across different scenarios

  • Reduce Sim-to-Real gap: Automatic System Identification

    • Simulation-Based Design of Dynamic Controllers for Humanoid Balancing, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016

    • Technique:

      • Physics simulation based on initial physical parameters (not carefully tuned, so not that accurate)

      • Controller learning: a policy is trained against this simulation

      • When the trained policy doesn’t work in reality, the loop is closed: real-robot data is used to adjust the physics model and improve its accuracy (automatic system identification)

      • Using key parameters as indicators of sim-to-real discrepancy

        • PD gains

        • Center of mass

    • CMA-ES (Covariance Matrix Adaptation Evolution Strategy; see sketch 3 after the summary):

      • Given a set of parameter samples

      • Fit a gaussian distribution

      • Evaluate the quality of all samples and remove the low-performing ones

      • Fit another gaussian distribution on survivors

      • Generate more samples from new gaussian distribution

      • Repeat until sample population reaches high quality

    • Key lessons

      • Converges in 2 iterations, using about 12 s of real robot data

      • Can overfit the physical parameters, producing unphysical values that do not transfer to other tasks

      • Need to select a subset of physical parameters

  • Learning by Imitating Animals

    • Learning Agile Robotic Locomotion Skills by Imitating Animals, Robotics: Science and Systems (RSS), 2020

    • Approach:

      • Take motion capture data of animals

      • Translate motion to robot bodies

      • Adjust policy training to add a new optimization target: minimize the distance between the robot’s motion and the reference motion (see sketch 4 after the summary)

      • The robot learns to walk like the animal while adapting the motion to its own, different body

    • Bridging the Sim-To-Real gap via Domain Adaptation (see sketch 5 after the summary)

      • Randomly sample physical parameters

      • Take all physical parameters (>100 dim space) and map them to a low-dimensional space (10-20 dim) using an auto-encoder

      • Put encoded description of physical parameters as input to robot’s policy

      • Real-world runs

        • May not know the physical parameters

        • Use optimization to find the choice of parameters that maximizes reward

        • Follow-on work: retune dynamically so the policy is responsive to rapid changes to physics (e.g. wind)

  • Open Sim-To-Real problems

    • Complex dynamics (soft objects, fluids)

    • Realistic rendering (e.g. capturing what visual sensors will actually see)

    • Scalable creation of diverse scenes (e.g. furniture, nature, messy room)

    • Modeling humans and human behaviors/reactions to robot actions
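
Illustrative sketches of a few techniques from the summary follow. These are my own minimal examples, not the speaker's code; any constants, function names, or parameter ranges not mentioned in the talk are assumptions.

Sketch 1, for the "Robot training method" formulation: a 2-layer policy maps joint angles, roll, and pitch to desired motor angles; the reward trades forward progress against energy; the episode terminates early on a fall. The physics step is a hypothetical stand-in for PyBullet, and the PPO update that would consume these rollouts is omitted.

import numpy as np

NUM_MOTORS = 8  # assumption: quadruped with two actuated joints per leg

def mlp_policy(params, obs):
    # 2-layer neural network: observation -> desired motor angles in [-1, 1]
    w1, b1, w2, b2 = params
    hidden = np.tanh(obs @ w1 + b1)
    return np.tanh(hidden @ w2 + b2)

def reward(forward_progress, motor_torques, motor_velocities, dt):
    # Reward: move toward the goal, penalize energy expenditure
    energy = np.sum(np.abs(motor_torques * motor_velocities)) * dt
    return forward_progress - 0.01 * energy

def fallen(roll, pitch):
    # Early termination: end the episode when the robot tips over
    return abs(roll) > 0.8 or abs(pitch) > 0.8

def step_simulation(state, action):
    # Hypothetical stand-in for one physics step (PyBullet in the real setup)
    joint_angles = 0.9 * state["joint_angles"] + 0.1 * action
    return {"joint_angles": joint_angles,
            "roll": state["roll"], "pitch": state["pitch"],
            "base_x": state["base_x"] + 0.001 * float(np.mean(action))}

def rollout(params, steps=1000, dt=0.01):
    state = {"joint_angles": np.zeros(NUM_MOTORS), "roll": 0.0, "pitch": 0.0, "base_x": 0.0}
    episode_return = 0.0
    for _ in range(steps):
        # Observations: joint angles plus body roll and pitch
        obs = np.concatenate([state["joint_angles"], [state["roll"], state["pitch"]]])
        action = mlp_policy(params, obs)  # actions: desired motor angles
        nxt = step_simulation(state, action)
        velocities = (nxt["joint_angles"] - state["joint_angles"]) / dt
        # Action magnitude used as a crude torque proxy in this sketch
        episode_return += reward(nxt["base_x"] - state["base_x"], action, velocities, dt)
        if fallen(nxt["roll"], nxt["pitch"]):
            break
        state = nxt
    return episode_return

rng = np.random.default_rng(0)
obs_dim, hidden_dim = NUM_MOTORS + 2, 64
params = (rng.normal(0, 0.1, (obs_dim, hidden_dim)), np.zeros(hidden_dim),
          rng.normal(0, 0.1, (hidden_dim, NUM_MOTORS)), np.zeros(NUM_MOTORS))
print("episode return:", rollout(params))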
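
Sketch 2, for domain randomization: each episode samples physical parameters from a distribution and trains the policy in that sampled world, without the policy observing which sample was drawn. The parameter names and ranges are assumptions, and train_episode is a hypothetical stand-in for one simulated rollout plus a learner update.

import numpy as np

rng = np.random.default_rng(0)

def sample_physical_parameters():
    # Draw a new "world" for this episode
    return {
        "mass_scale":     rng.uniform(0.8, 1.2),   # body mass multiplier
        "friction":       rng.uniform(0.5, 1.25),  # foot-ground friction
        "motor_strength": rng.uniform(0.8, 1.2),   # actuator torque scale
        "latency_s":      rng.uniform(0.0, 0.04),  # control latency in seconds
    }

def train_episode(policy_params, physical_params):
    # Hypothetical stand-in: configure the simulator with physical_params,
    # roll out the policy, and apply a reinforcement-learning update.
    pass

def train(policy_params, num_episodes=10_000):
    for _ in range(num_episodes):
        physical_params = sample_physical_parameters()
        train_episode(policy_params, physical_params)  # policy never sees the sample
    return policy_params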
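
Sketch 3, for the automatic system identification loop: the summary describes the sample / evaluate / drop-low-performers / refit core, which is what this sketch implements; full CMA-ES additionally adapts the step size and uses weighted recombination. The discrepancy function is a hypothetical stand-in for how badly a simulator configured with candidate parameters (e.g. PD gains, center of mass) reproduces the short real-robot recording.

import numpy as np

rng = np.random.default_rng(0)

def sim_to_real_discrepancy(params):
    # Hypothetical stand-in: lower is better. In the real setup this would
    # measure how badly the simulator, configured with these parameters,
    # reproduces trajectories recorded on the real robot.
    true_params = np.array([1.2, -0.3, 0.05])    # placeholder "real" values
    return float(np.sum((params - true_params) ** 2))

def identify(dim=3, pop=32, elite_frac=0.25, iters=10):
    mean, cov = np.zeros(dim), np.eye(dim)       # initial Gaussian over parameters
    for _ in range(iters):
        samples = rng.multivariate_normal(mean, cov, size=pop)        # sample parameters
        scores = np.array([sim_to_real_discrepancy(s) for s in samples])
        survivors = samples[np.argsort(scores)[: int(pop * elite_frac)]]  # drop low performers
        mean = survivors.mean(axis=0)                                  # refit Gaussian on survivors
        cov = np.cov(survivors, rowvar=False) + 1e-6 * np.eye(dim)
    return mean

print("identified parameters:", identify())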
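
Sketch 4, for learning by imitating animals: the training objective gains a term rewarding closeness to the retargeted reference motion at each time step. The weights and the exponential shaping are assumptions.

import numpy as np

def imitation_reward(robot_joint_angles, reference_joint_angles, task_reward,
                     w_imitate=0.7, w_task=0.3):
    # Combined objective: track the retargeted animal reference motion while
    # still achieving the original task reward (e.g. forward progress).
    pose_error = float(np.sum((robot_joint_angles - reference_joint_angles) ** 2))
    imitation_term = np.exp(-2.0 * pose_error)   # 1 when poses match, -> 0 otherwise
    return w_imitate * imitation_term + w_task * task_reward

# Example: a pose close to the reference scores higher than one far from it
ref = np.zeros(8)
print(imitation_reward(ref + 0.05, ref, task_reward=1.0),
      imitation_reward(ref + 0.5, ref, task_reward=1.0))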
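
Sketch 5, for domain adaptation: in simulation the sampled physical parameters are compressed to a low-dimensional latent that the policy receives as an extra input; on the real robot, where the parameters are unknown, the latent itself is optimized to maximize reward. The encoder here is a fixed random projection, the real-robot return is a placeholder, and the optimizer is plain random search, all hypothetical stand-ins.

import numpy as np

rng = np.random.default_rng(0)
NUM_PHYS_PARAMS = 120   # assumption: >100 randomized physical parameters
LATENT_DIM = 10         # low-dimensional latent (10-20 dims per the talk)
PROJECTION = rng.normal(size=(NUM_PHYS_PARAMS, LATENT_DIM))

def encode(physical_params):
    # Hypothetical stand-in for the trained encoder (an auto-encoder in the talk)
    return physical_params @ PROJECTION

def real_world_return(latent):
    # Hypothetical stand-in: run the latent-conditioned policy on the real
    # robot and return the episode reward. Placeholder objective here.
    return -float(np.sum(latent ** 2))

def adapt_latent(iters=50, pop=16, sigma=0.3):
    # Deployment-time search over the latent, since the true physical
    # parameters of the real world are unknown.
    best = np.zeros(LATENT_DIM)
    best_return = real_world_return(best)
    for _ in range(iters):
        candidates = best + sigma * rng.normal(size=(pop, LATENT_DIM))
        returns = np.array([real_world_return(c) for c in candidates])
        if returns.max() > best_return:
            best, best_return = candidates[returns.argmax()], returns.max()
    return best

# Training-time usage: the policy would receive encode(sampled_params) as an
# extra input; at deployment, adapt_latent() replaces that input.
print("adapted latent:", adapt_latent())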
