Hi everyone! Here are some updates about my progress.
I've been running some heavy experiments recently, and they've been taking a while, so I decided it was time to buy my own GPU. I bought an NVIDIA RTX 5080. It's on the high end of the consumer category, and it set me back $1.7K.
I benchmarked the team formation experiment I sent you in my last update, and this GPU runs the experiment 2.12 times faster than my water-cooled CPU.
It's nice to have my own hardware that I can always run stuff on without having to connect to a cloud account.
In my last update I shared that while I managed to get team formation behavior in a tailored environment, I wasn't able to get team formation, or even reciprocity, in a gridworld environment. I removed parts from this experiment one by one until I found the culprit: my algorithm fails to learn reciprocity when its network has hidden layers of neurons. This is a problem because powerful deep learning systems require many hidden layers.
I tried to debug the problem a little, but then I remembered that I had recently heard of a new opponent shaping (OS) algorithm called AdAlign, and a variant called Proximal Advantage Alignment (PAA). (Paper, GitHub, OpenReview.) The authors, Juan Duque and Milad Aghajohari, claim that this algorithm is much more elegant and efficient than existing OS algorithms. One clear advantage is that it does not require a second derivative. Their paper was accepted as an Oral at the ICLR 2025 conference, which is the highest form of acceptance.
I'm toying with the idea of using AdAlign/PAA for my work. I've tried to implement it and got some results, but I'm still trying to find out whether my implementation is correct. Juan and Milad are helping me figure it out.
My CHAI internship starts in a month and a half. I'm still working on getting a visa, which is complicated because I have to deal with multiple offices that need to interact with each other. I still have to figure out housing, subletting my Tel Aviv apartment, and many other logistical things. The CHAI people have added me to their system, so now I'm perusing their many onboarding documents that explain how to work with their systems.
That's it, see you next month. Ram.