Here's my research update for August.
Retrospective on goals for July
In the July update I outlined a list of goals. Here's my update on these goals:
Prepare two talks for EuroPython and give them: 😒 So-so
I put a lot of effort into making these two talks as fun as possible, and I practiced them many times. I gave the 30-minute PySnooper talk and people liked it, but I don't care about promoting PySnooper anymore. I gave the 5-minute research talk and people weren't very excited about it. Only one person approached me after the talk, and no one signed up for this mailing list. I think I did a good job on the talk... Maybe the audience wasn't right, or maybe people were tired because the lightning talk session was at the end of the day, after a keynote that ran 15 minutes over.
Have fun in Dublin: 😒 So-so
I got sick halfway through my ten days in Dublin. So I had fun in the first five days, but the last five days were meh. I was coughing and sneezing all the time, and my throat was sore. I didn't have Covid, but I still couldn't really meet people, so it wasn't a great time. At least I had half a vacation.
Prepare my work on Fruit Slots for presenting at WAHT and present it: ✅ Done
I'm very happy about this one. I presented a series of experiments in implicit communication at the Ad-Hoc Teamwork Workshop at IJCAI. I was worried because it was my first time publishing and presenting in an academic setting. The talk went smoothly. I got a few questions in the Q&A session, and I started corresponding with two researchers I met at the workshop.
One of the reviewers suggested that I open-source the code, so I did.
Stuff I've done beyond the goals
Angeliki's survey: "Emergent Multi-Agent Communication in the Deep Learning Era"
Five months ago I met Angeliki Lazaridou, a researcher at DeepMind. She's done work on communication between MARL agents, which is close to my interests. We had a nice chat. She has since left MARL and now works on NLP. Back in 2020 she wrote a survey of emergent communication in MARL agents, and I finally had time to read it.
There are some great examples there:
Havrylov and Titov (2017) allowed the sender to emit strings of symbols of variable length [...] The resulting emergent language developed a prefix-based hierarchical scheme to encode meaning into multiple-symbol sequences. For example, the “word” for pizza was 5261 2250 5211, where 5261 refers to food, 2250 to baked food, and 5211 to pizzas.
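To make the prefix idea concrete, here's a toy sketch of how a prefix-based hierarchy can map multi-symbol sequences to meanings. The tree below is my own made-up illustration built around the paper's pizza example, not the actual emergent language the agents learned:

```python
# Toy prefix-based hierarchical code, inspired by the pizza example from
# Havrylov and Titov (2017). Each successive symbol narrows the meaning,
# like walking down a path in a tree.
PREFIX_TREE = {
    "5261": {                      # food
        "_meaning": "food",
        "2250": {                  # baked food
            "_meaning": "baked food",
            "5211": {"_meaning": "pizza"},
        },
    },
}

def decode(message: str) -> str:
    """Walk the prefix tree symbol by symbol; return the deepest meaning reached."""
    node = PREFIX_TREE
    meaning = "unknown"
    for symbol in message.split():
        if symbol not in node:
            break
        node = node[symbol]
        meaning = node["_meaning"]
    return meaning

print(decode("5261"))            # food
print(decode("5261 2250"))       # baked food
print(decode("5261 2250 5211"))  # pizza
```

The nice property of a scheme like this is that a partial message still carries partial meaning: truncating "pizza" to its first two symbols degrades gracefully to "baked food" rather than to gibberish.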
Making art with Dall-E 2
This isn't really related to my research, but it's fun and it's AI so I'm gonna share it. Here's some art I created with Dall-E 2.
My goals for August
Attend HAAISS, learn about Human-Aligned AI, meet new people.
I flew out to Prague to participate in the Human-Aligned AI Summer School, which will start tomorrow. "Human-aligned AI" or "AI alignment" means figuring out how to get the AI systems that we make to not be evil, basically. In the short term it means things such as ensuring that AI systems don't violate our freedom or privacy, or reinforce discriminatory beliefs. In the long term it might mean "when the singularity comes, make sure that a superintelligent AI would want to keep humans alive rather than destroy us."
While it's not strictly my field of research, it's a possible application of my work in MARL, so I think it'll be worthwhile for me to attend. I'll also get to meet a lot of relevant people and make important connections.
From HAAISS's website:
We will meet for four intensive days of discussions, workshops, and talks covering latest trends in AI alignment research and broader framings of AI alignment research. The school is focused on teaching and exploring approaches and frameworks, less on presentation of the latest research results.
To prepare for HAAISS, I read / listened to:
Have fun in Prague.
Oh man, I'm spending way more money than I intended on flights and hotels. Whenever I'm in a city I've never been in before, I can't help but spend a week checking it out and meeting people. At least Prague is the cheapest of the bunch, only 88 Euro per night in a hotel.
If you're in Prague and want to hang out, hit me up. I'll be here until August 10th.
Learn RLlib and figure out multiple brains.
When I worked on the Fruit Slots experiments I used the PettingZoo framework. Meanwhile my friend Errol has been experimenting with the competing RLlib framework. He's raving about it, saying the code is so elegant and works just right. Of course I had to back the wrong horse... Now I've got major FOMO and I think I should learn RLlib. Even if it's not as good, it'll be good to be able to compare it to PZ. I'm going to watch this video to figure out the basics of RLlib.
I hope that this will let me run MARL experiments in which each agent has its own brain, rather than one shared brain like in PettingZoo. This should give me more interesting results for Fruit Slots, and for any other multi-agent experiments I run.
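To pin down what "its own brain" means before I touch RLlib: here's a toy sketch in plain Python (no RLlib or PettingZoo) of the difference between all agents sharing one policy and each agent having an independent one. In RLlib this roughly corresponds to defining multiple entries in the multi-agent `policies` config and routing agents to them with a `policy_mapping_fn`, though the exact API may differ by version, so treat the RLlib names as assumptions:

```python
import random

class Policy:
    """A stand-in 'brain': maps observations to actions and counts its updates."""
    def __init__(self, seed: int):
        self.rng = random.Random(seed)
        self.updates = 0

    def act(self, obs):
        return self.rng.choice(["left", "right", "signal"])

    def update(self, experience):
        self.updates += 1

agents = ["agent_0", "agent_1", "agent_2"]

# One shared brain: every agent ID maps to the same Policy object, so
# experience from any agent updates the single shared set of weights.
shared = Policy(seed=0)
shared_mapping = {agent_id: shared for agent_id in agents}

# Multiple brains: each agent ID maps to its own Policy object, so agents
# can specialize, e.g. a sender and a receiver in a communication game
# can learn genuinely different behaviors.
independent_mapping = {agent_id: Policy(seed=i)
                       for i, agent_id in enumerate(agents)}

# One training step's worth of updates from each agent:
for agent_id in agents:
    shared_mapping[agent_id].update(experience=None)
    independent_mapping[agent_id].update(experience=None)

print(shared.updates)                          # 3: all experience hits one brain
print(independent_mapping["agent_0"].updates)  # 1: each brain sees only its own
```

The mapping-dict pattern is the whole trick: training infrastructure looks up "which policy does this agent's experience train?" and shared vs. independent brains is just a question of whether the dict values are one object or many.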
That's it for now. See you in September!