Introducing Meta-World-V2!

Avnish Narayan

Jun 25, 2021, 3:51:38 PM6/25/21
to Meta-World Announcements

Hello Meta-World Users!

[Attached animation: MT10.gif]

We’re excited to announce the official launch of Meta-World V2! Many of you have been using V2 since we soft-launched it on the Meta-World repo in March of this year. You can now see the official benchmarking results for Meta-World V2 in our updated arXiv submission. We hope that access to the latest benchmarking results makes it easier to compare your own results against ours.

Thank you so much for your patience with, and feedback about, the Meta-World codebase. Consistently benchmarking meta-RL and multi-task RL (MTRL) is a huge technical challenge, and we’ve spent the past year redesigning the benchmark from its core.

Major Changes

  • Redesigned reward functions for all 50 environments. This effort has made the environments robustly solvable in a reasonable time frame (2–5 million timesteps) across multiple random seeds.
    [Attached plot: success_rate_comparison_metaworld.png]

Overall Effect

When using Meta-World V1, a difficult question we found ourselves facing was: “Is my meta/multi-task RL algorithm failing because the individual environments are difficult, or is it failing due to fundamental challenges in meta/multi-task RL?”

By redesigning the environments’ reward functions to make them robustly solvable, we’ve made it easy to measure the component of performance attributable to fundamental challenges in multi-task RL. See the attachments for plots that show an increase in performance, decrease in rise-time, and reduction in the variance of the performance from Meta-World V1 to Meta-World V2 on our MT-10 and ML-10 benchmarks.


A Thank You To Our Users

We couldn’t have done this without your conscientious bug reports and questions. We’ve iterated on this benchmark many times over the past year, and are confident that we’re delivering a tool that can help to accelerate your robot-learning research.

We’ll shortly be releasing a technical report detailing the techniques we used to design Meta-World V2, with full explanations and results of how we benchmarked its performance. Be on the lookout!

Lastly, we’re in the process of cleaning up the baselines that we used for getting results for Meta-World-V2. You can keep track of our progress in PR #2287 in our sister repository, Garage.

If you have any questions at all, please join our Meta-World Slack community by filling out this Google Form.


Happy benchmarking!

-Avnish, Hayden, Adithya, Ryan (the Meta-World Team).


Tips and Tricks for Using Meta-World

  • Update your code to the new API and use the most recent version of Meta-World from GitHub. Meta-World has not yet established a regular release cadence, so you need to install the latest version available on GitHub to benefit from recent improvements to reproducibility and usability.
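For reference, a development version can typically be installed straight from GitHub with pip. This is a sketch, assuming the repository lives at rlworkgroup/metaworld; adjust the URL if the project has moved:

```shell
# Install the latest development version of Meta-World directly from GitHub.
# (Repository path assumed here; check the project's README for the canonical URL.)
pip install git+https://github.com/rlworkgroup/metaworld.git
```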

  • Always run multiple seeds (preferably 5-10, per Colas et al.) for your experiments, and report a confidence interval computed across those seeds. Performance in reinforcement learning is highly seed-sensitive, and this is especially true in difficult environments such as those in Meta-World, where exploration is key to performance.
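    As a minimal sketch of this tip, the snippet below aggregates hypothetical per-seed final success rates into a mean and confidence half-width using only the Python standard library. It uses a normal approximation for the critical value; for the small seed counts typical in RL, a Student-t interval would be slightly wider.

    ```python
    from statistics import mean, stdev, NormalDist

    def seed_confidence_interval(results, confidence=0.95):
        """Mean and confidence half-width of per-seed results.

        Uses a normal approximation for the critical value; a Student-t
        interval is more conservative for very few seeds.
        """
        n = len(results)
        m = mean(results)
        # Standard error of the mean across seeds.
        se = stdev(results) / n ** 0.5
        z = NormalDist().inv_cdf(0.5 + confidence / 2)
        return m, z * se

    # Hypothetical final success rates from 5 seeds of the same experiment:
    rates = [0.82, 0.76, 0.91, 0.68, 0.85]
    m, hw = seed_confidence_interval(rates)
    print(f"success rate = {m:.3f} +/- {hw:.3f}")  # → success rate = 0.804 +/- 0.077
    ```

    Reporting the interval, not just the mean, makes it clear whether a gap between two algorithms is larger than the seed-to-seed noise.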

  • Re-run your baseline experiments rather than relying on curves from publications. As careful as members of the RL software community are about not creating regressions, RL software is still in its infancy, and performance regressions are common in both environment and algorithm implementations. Changes to upstream dependencies (e.g. numerical libraries like PyTorch and TensorFlow, and physics engines like MuJoCo) can induce significant latent changes to the performance curves you will observe.

  • Prefer well-tested, off-the-shelf algorithm implementations over custom implementations for running baseline experiments. Using well-tested and broadly shared implementations benefits the research community by establishing shared performance standards, and shields you from review criticisms implying that you used a half-hearted baseline implementation. RL algorithms are difficult to fully reproduce, and meta/MTRL algorithms are doubly difficult to re-implement. Of course, remember to always cite the implementations you use.

Join the Community for Support

If you have any questions about how to use Meta-World, we would love to help! Please join our Slack community by filling out this Google Form.


Attachments: MAML_v1_vs_v2_ml10.png, MTSAC_v1_vs_v2_mt10.png, window_old_new.gif