Research update: AAMAS 2022 and MARL frameworks

Ram Rachum

May 30, 2022, 9:13:18 AM

to ram-rachum-res...@googlegroups.com

Hi everyone!

Here's my research update for June.

Retrospective on goals for May

In the May update I outlined a list of goals. Here's my update on these goals:

Experiment with existing RL frameworks: ✅ Done and ongoing
I'm getting more comfortable with Gym and Stable Baselines 3. I'm slowly experimenting with different setups. One thing that I need to wrap my head around is the concept of "environment". I always hated it, because I think that state in a game/MDP should be seen as immutable, while an environment is a mutable object. But since this abstraction is used in so many RL libraries, I'll have to learn how to play nice with it.
I made this little experiment using Gym and Stable Baselines 3. It shows how a learning agent can learn to cooperate when facing a hardcoded opponent who's playing a reciprocating policy. This result isn't innovative, but it was fun to program. The code is shorter than I thought it would be. You can run it locally if you'd like by following the instructions in that link.
Gym is focused on single-agent reinforcement learning, while I'm interested in multi-agent. Learning a few multi-agent frameworks is a goal for June. (More details in the bottom section.)
Get more involved with CAIF: ✅ Done and ongoing
I'm feeling good about CAIF. I've listened to all of their talks on YouTube. I'm attending their seminar series regularly, and I started making contact with some of the people involved. It'll be a few months before I'm ready to ask them for a grant or give a talk, but I'm happy with the progress I'm making.
I'm learning a lot from the seminar series. A couple of weeks ago Gillian Hadfield gave a talk where she discussed, among other things, her position that third-party enforcement of norms is a better model for cooperation than second-party enforcement. In simple terms, instead of "I'm not playing with you anymore because you were mean to me" we have "We're not playing with you anymore because we all saw you being mean to someone". I was resistant to this idea at first, but now I agree that it has advantages.
I'll keep attending the seminar series. This is great for me. I'm also sometimes talking to people I met there, which is a great way to make connections.
Get interviewed for an Israeli podcast: ↷ Postponed
I knew I jinxed it by talking about it :( The guy running the podcast was dealing with his sick kid on the day I was supposed to be interviewed, so we postponed to July. I hope that by July I'll have more impressive things to talk about in the interview.
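
To make the "environment" point from the first goal concrete, here is a minimal sketch in plain Python. It does not import Gym; it only imitates the classic `reset`/`step` shape, and it is not the code from the experiment mentioned above. The payoff numbers are the usual textbook prisoner's dilemma values, and the names `State`, `transition`, `ReciprocatorEnv` and `run_episode` are made up for illustration. The pure `transition` function treats state as immutable, while the environment class wraps it in a mutable object:

```python
from typing import NamedTuple

COOPERATE, DEFECT = 0, 1

# Assumed textbook payoffs: (agent move, opponent move) -> agent reward.
PAYOFF = {
    (COOPERATE, COOPERATE): 3,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 5,
    (DEFECT, DEFECT): 1,
}


class State(NamedTuple):
    """Immutable game state (hypothetical name)."""
    round_index: int
    last_agent_action: int  # the reciprocating opponent will copy this


def transition(state, action):
    """Pure step: returns a new State and a reward, mutating nothing."""
    opponent_action = state.last_agent_action  # reciprocate the previous move
    reward = PAYOFF[(action, opponent_action)]
    return State(state.round_index + 1, action), reward


class ReciprocatorEnv:
    """Gym-style wrapper: the same logic, but held in a mutable object."""

    def __init__(self, n_rounds=10):
        self.n_rounds = n_rounds
        self.state = State(0, COOPERATE)

    def reset(self):
        # The opponent opens by cooperating.
        self.state = State(0, COOPERATE)
        return self.state.last_agent_action  # observation: opponent's next move

    def step(self, action):
        self.state, reward = transition(self.state, action)
        done = self.state.round_index >= self.n_rounds
        return self.state.last_agent_action, reward, done, {}


def run_episode(env, policy):
    """Play one episode and return the agent's total reward."""
    observation = env.reset()
    total, done = 0, False
    while not done:
        observation, reward, done, _ = env.step(policy(observation))
        total += reward
    return total
```

Against this reciprocating opponent, always cooperating earns more over ten rounds than always defecting, which is the kind of incentive a learning agent can pick up on.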

Stuff I've done beyond the goals

It's amazing how much you can accomplish when you're procrastinating on your top priorities 😇 Here's some more stuff I've done:

Wikipedia articles

I continued improving the Multi-Agent Reinforcement Learning article I wrote for Wikipedia. Besides contributing to the broader community, this is a good opportunity for me to start talking with more researchers. I got input from Lewis, Joel and Dorsa. I added more sections and more citations. It's now at a respectable 6 pages! I do wish I could find more CC-licensed pictures and videos for it.

I also wrote stub articles for two important RL terms: Self-play and Proximal Policy Optimization (PPO).

My first academic conference

Now that I'm a researcher, I need to understand how academic conferences work. I attended AAMAS 2022. This conference is considered more diverse in topics than ICLR or NeurIPS, so most of the talks were too far away from my research direction. I had to pick and choose which sessions to attend. I did learn about the way that researchers present their work, talk to each other and ask questions.

Link shortener

I set up a dedicated link shortener for my research. Now I can make a short link for anything related to my research. For example, r.rachum.com will take you to the research homepage, r.rachum.com/talk-video will take you to the YouTube video of my talk, r.rachum.com/announce will take you to this mailing list, etc.

My goals for June

Experiment with PettingZoo and RLlib.
Now that I have a better understanding of Gym and Stable Baselines 3, I'd like to dive into the popular frameworks used for multi-agent reinforcement learning. Looking online, I found these frameworks: PettingZoo, RLlib, Tianshou, OpenSpiel and a few others.
I think that PettingZoo and RLlib are the most popular ones, so I'll try these first. I looked at the documentation for both, and there are lots of complicated concepts to understand. This is difficult for me, because admittedly I have a bit of not-invented-here (NIH) syndrome. But I'll have to figure it out anyway.
Start meeting more researchers regularly.
I started to make connections with more researchers, and I'm scheduling regular video meetings with them. This is good because I can bounce ideas off them, ask questions and get their input. Some of these researchers are at DeepMind, some at European universities and some at Israeli universities. Most of them work on MARL, some on MAS (multi-agent systems) and some in a different subfield of computer science.
Some of our conversations are about the subject matter of my research, but some are just about research life. I need to understand academic politics and how things work in the research world.
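
As a rough illustration of what the first June goal involves: PettingZoo's parallel API and RLlib's multi-agent environments both pass dictionaries keyed by agent, rather than a single action and a single reward. The sketch below imitates that convention in plain Python. It imports neither library, and the class `TwoPlayerDilemma`, the agent names, the helper `play` and the payoff numbers are all assumptions made up for the example:

```python
COOPERATE, DEFECT = 0, 1

# Assumed textbook prisoner's dilemma payoffs: (my move, their move) -> my reward.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}


class TwoPlayerDilemma:
    """Hypothetical two-agent environment with dict-keyed inputs and outputs."""

    agents = ("player_0", "player_1")

    def __init__(self, n_rounds=5):
        self.n_rounds = n_rounds
        self.round_index = 0

    def reset(self):
        self.round_index = 0
        # Each agent observes the other's previous move; None before round one.
        return {agent: None for agent in self.agents}

    def step(self, actions):
        a, b = self.agents
        rewards = {
            a: PAYOFF[(actions[a], actions[b])],
            b: PAYOFF[(actions[b], actions[a])],
        }
        observations = {a: actions[b], b: actions[a]}
        self.round_index += 1
        done = self.round_index >= self.n_rounds
        dones = {agent: done for agent in self.agents}
        return observations, rewards, dones


def tit_for_tat(observation):
    """Cooperate first, then copy the other player's last move."""
    return COOPERATE if observation is None else observation


def always_defect(observation):
    return DEFECT


def play(env, policies):
    """Run one episode; policies maps agent name -> policy function."""
    observations = env.reset()
    totals = {agent: 0 for agent in env.agents}
    done = False
    while not done:
        actions = {
            agent: policies[agent](observations[agent]) for agent in env.agents
        }
        observations, rewards, dones = env.step(actions)
        for agent in env.agents:
            totals[agent] += rewards[agent]
        done = all(dones.values())
    return totals
```

Calling `play(TwoPlayerDilemma(), {"player_0": tit_for_tat, "player_1": tit_for_tat})` runs two reciprocating players against each other; swapping one policy for `always_defect` shows how the per-agent totals diverge.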