Research update: Agents that make friends

Ram Rachum

Jan 30, 2025, 11:05:57 AM

to ram-rachum-res...@googlegroups.com

Hi everyone!

Small updates:

Earlier this month I presented my dominance hierarchies research at the IDSAI conference. Had a good time.
I got a little freelance job. This is very good, because I haven't made money in a long time. I still need more, especially since between May and September I'll be interning at CHAI and will not be able to work.

Now let's get to the more interesting stuff. Since my goal is to get agents to form teams, I've been working on metrics for team formation. I defined two such metrics with partial success. But then I also developed a visualization for such teams, and I think it does a better job than the metrics right now. The goal of these metrics and the visualization is to detect when agents form teams, i.e. multiple mutually exclusive subsets of agents, such that agents cooperate within their teams but defect against agents who don't belong to their team.
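To make the team definition above concrete, here is a minimal sketch of one possible score for a candidate partition. It is not either of the two metrics mentioned above; the function name `team_separation`, the `corates` dictionary layout, and the "mean within-team rate minus mean cross-team rate" formula are all assumptions made purely for illustration.

```python
from itertools import combinations
from statistics import mean


def team_separation(corates, teams):
    """Score how well a candidate partition matches the definition of teams.

    corates: dict mapping frozenset({a, b}) to a cooperation rate in [0, 1]
             (hypothetical layout, assumed for this sketch).
    teams:   list of disjoint sets of agent ids.

    Returns mean within-team rate minus mean cross-team rate, so 1.0 means
    perfect teams and values near 0 mean no team structure.
    """
    team_of = {agent: index for index, team in enumerate(teams) for agent in team}
    within, across = [], []
    for a, b in combinations(sorted(team_of), 2):
        rate = corates[frozenset((a, b))]
        if team_of[a] == team_of[b]:
            within.append(rate)
        else:
            across.append(rate)
    if not within or not across:
        raise ValueError("need at least one within-team and one cross-team pair")
    return mean(within) - mean(across)


# Usage: six agents split by parity, cooperating only inside their team.
example_teams = [{0, 2, 4}, {1, 3, 5}]
example_corates = {
    frozenset((a, b)): (1.0 if a % 2 == b % 2 else 0.0)
    for a, b in combinations(range(6), 2)
}
example_score = team_separation(example_corates, example_teams)
```

A score like this only evaluates a partition that is already given. Finding the partition in the first place is a separate problem that this sketch does not address.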

I figured that a good warm-up exercise would be to create an environment in which teams are "forced", i.e. the rules of the environment encourage team formation explicitly. This would not be interesting as research, but it's a good test run before doing more meaningful work.

Here is such an artificial environment: The agents are playing pairwise iterated prisoner's dilemma. There are six agents, and in each turn, they're matched to a random agent and play 50 rounds of prisoner's dilemma with each other. Crucially, the agents don't know who they are matched with.

Now, this game would usually result in most or all of the agents learning to reciprocate and cooperate, thanks to the opponent shaping algorithm I use. However, to get the agents to form teams, I add a twist. The agents are numbered from 0 to 5, inclusive. If an odd-numbered agent happens to get paired with an even-numbered agent, and these two agents cooperate, then they do not get a cooperation reward, and they both get a signal in their observation that indicates that their cooperation reward was rescinded. This naturally gets the agents to cooperate only within their teams, i.e. even-numbered agents cooperate among themselves, as do odd-numbered agents, but with no cross-cooperation between the teams:

That's a cool visualization, isn't it? The nodes represent agents and the edges represent the cooperation rates (corates) between each two agents. A green edge means that the agents cooperate highly, while a red edge means they mostly defect. Agents that cooperate highly are also placed closer to each other as part of the visualization.
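As a rough sketch of the quantities behind such a graph, the code below computes a cooperation rate per pair from an interaction log and maps it to an edge color and a target edge length. The log format, the definition of the rate as the fraction of cooperative actions, the linear red-to-green blend, and the length range are all assumptions for illustration, not the actual visualization code.

```python
from collections import defaultdict


def cooperation_rates(log):
    """Compute a cooperation rate for every pair of agents.

    log: iterable of (agent_a, agent_b, a_cooperated, b_cooperated) tuples,
         one per round (hypothetical format).
    Returns a dict mapping frozenset({a, b}) to the fraction of actions in
    that pairing that were cooperative.
    """
    cooperative = defaultdict(int)
    total = defaultdict(int)
    for agent_a, agent_b, a_cooperated, b_cooperated in log:
        pair = frozenset((agent_a, agent_b))
        cooperative[pair] += int(a_cooperated) + int(b_cooperated)
        total[pair] += 2
    return {pair: cooperative[pair] / total[pair] for pair in total}


def edge_color(rate):
    """Blend linearly from red (rate 0) to green (rate 1) as a hex color."""
    rate = min(1.0, max(0.0, rate))
    red = round(255 * (1 - rate))
    green = round(255 * rate)
    return f"#{red:02x}{green:02x}00"


def edge_length(rate, shortest=1.0, longest=4.0):
    """Higher cooperation gives a shorter target distance between nodes."""
    return longest - (longest - shortest) * rate


# Usage: two rounds between agents 0 and 2, with one defection by agent 2.
example_log = [(0, 2, True, True), (0, 2, True, False)]
example_rates = cooperation_rates(example_log)
```

The target lengths could then be handed to any force-directed layout routine so that highly cooperative agents end up near each other.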

Now, this is a pretty artificial scenario, so it's only useful as a warm-up, and to ensure that my visualization method and metrics are functioning well, which they are.
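For readers who want to see the forced-team rule spelled out, here is a minimal sketch of a single round and of the random matching. The payoff values (5, 3, 1, 0), the rescinded payoff of 0, the pairing scheme, and all function names are assumptions for illustration; the post does not specify them, and the learning algorithm itself is left out.

```python
import random

# Classic prisoner's dilemma payoffs, keyed by (a_cooperates, b_cooperates).
# These specific numbers are an assumption.
PAYOFFS = {
    (True, True): (3, 3),
    (True, False): (0, 5),
    (False, True): (5, 0),
    (False, False): (1, 1),
}
RESCINDED_REWARD = 0.0  # assumed payoff when cross-team cooperation is rescinded


def same_team(agent_a, agent_b):
    """Teams are defined by parity: even-numbered vs. odd-numbered agents."""
    return agent_a % 2 == agent_b % 2


def play_round(agent_a, agent_b, a_cooperates, b_cooperates):
    """Return (reward_a, reward_b, rescinded) for one round.

    `rescinded` is the signal both agents would see in their observation.
    """
    reward_a, reward_b = PAYOFFS[(a_cooperates, b_cooperates)]
    rescinded = a_cooperates and b_cooperates and not same_team(agent_a, agent_b)
    if rescinded:
        reward_a = reward_b = RESCINDED_REWARD
    return reward_a, reward_b, rescinded


def random_pairs(agents, rng):
    """Shuffle the agents and pair them off (assumes an even number of agents)."""
    shuffled = list(agents)
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled) - 1, 2)]


# Usage: one random matching of the six agents.
example_pairs = random_pairs(range(6), random.Random(0))
```

Under these assumptions, mutual cooperation pays off only between agents of the same parity, which is what pushes the population toward the two teams.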

I also came up with a less artificial environment in which such teams form. It's still too artificial, but better than the previous environment. I won't reveal its rules, but I'm going to show you the output:

The agents form two teams: One team is 3, 4, 6 and 7, and the other team is 2 and 5.

Even better, I made this visualization into a video! It shows how this population of agents evolves over time. Some agents join teams and some leave.

The next task for me is to come up with an environment that is less artificial, and get the same team formation behavior happening there.