Research update: Bar-Ilan University, tug of war


Ram Rachum

Jan 1, 2023, 10:49:09 AM
to ram-rachum-res...@googlegroups.com

Hi everyone!

Here's my research update for this month.

Retrospective on goals for last month

In last month's update I outlined a list of goals. Here's my update on these goals:

  1. Read up on rebellion and disobedience in AI: ✅ Done

    I read the three RaD-AI papers I mentioned in my last update. The most interesting question to me is the one raised by Arnold et al. (2021), and also discussed in Mirsky et al. (2021): How do we define disobedience? We understand disobedience on a "you know it when you see it" basis, but it's hard to pin down what it means so that we can get definitive results from experiments.

    Disobedience is a lack of obedience, fair enough. But we don't have a definition for obedience. We can set up a game where Agent A tells Agent B to do an action; we can run a few experiments that show that sometimes Agent B does the action and sometimes it doesn't. This is pretty much what Milli et al. (2017) did. But we only know that Agent A communicated the desired action to Agent B. When we say a word like "obedience", or even "assistance", we imply a social contract between the two agents. We imply some kind of relationship, friendship, expectation, or whatever you'd call it. We're still in the dark about how to define any of these concepts.

    I hope that my RaD-AI experiment can shed some light on that. More details below.

  2. Make a MARL experiment that shows a tug-of-war: ✅ Done

    Here's a recap of this goal: Reuth discussed the scenario of a blind person trying to walk into traffic, and that person's seeing-eye dog stopping them. The person and the dog then have a sort of fight, a tug-of-war in which each of them pushes in the opposite direction. The interesting thing about this fight is that both players are actually on the same side. They don't fight because they have a conflict of interest; they fight because they see the world differently. Ultimately they both want what's best for the group.

    I want to create the same scenario between AI agents. This means I need a fully cooperative game where two agents are having a tug-of-war with each other. I've been working hard on this. I started with a simple game, and then kept changing the rules to make it easier to work with.

    Here is the list of game rules that I currently use:

    • There are two agents: Agent A and Agent B.

    • The game is fully cooperative, meaning that the reward for Agent A is always equal to the reward for Agent B.

    • Each episode of the game is made of exactly 40 turns.

    • In the game, Agent A and Agent B "fight" each other in a series of skirmishes. There's only one skirmish happening at a time, and as soon as one skirmish ends, a new one begins.

    • A skirmish can last anywhere between 1 turn and 40 turns. (I'm starting to remind myself of this.)

    • In each skirmish, each of the agents needs to choose between a left reward and a right reward. This choice is the only action they take in the game. If and when both agents choose the same reward, the skirmish is over and each agent gets the reward that they both chose.

      • Trivial example: A new skirmish starts, both agents immediately choose the left reward, and the skirmish ends with a length of one turn. Both agents get the left reward and a new skirmish begins.

      • Non-trivial example: On the first turn Agent A chooses left while Agent B chooses right. On the second turn Agent A chooses left again while Agent B chooses right again. On the third turn, both agents choose right. Both agents get the right reward and a new skirmish begins.

    • On every turn where the agents don't agree, they both get a reward of zero, which is always less than even the smallest reward.

      • On one hand, both agents want to choose the higher reward, but on the other hand they want to just agree so they can move on to the next skirmish.

    • Ugly but necessary rule: If both agents change their choice at the same time, we pick one of the rewards at random and give it to both of them. Just because it's too awkward otherwise (they would have swapped sides and still be disagreeing).

    • On each skirmish, new random numbers are used for the rewards. Each reward will be a random number between 0 and 10.

    • Agents have a rough idea of the size of each reward. This means that each agent has an estimate for each reward, which is somewhat higher or somewhat lower than the actual reward for that side. (Determined randomly.)

    • At the beginning of each game, we determine a random handicap for each agent between 1 and 10. The handicap determines how accurate that agent's estimates will be. For example, if an agent has a low handicap like 1.2, it gets estimates that are on average 1.2 points away from the actual reward; if an agent has a high handicap like 7.8, it gets estimates that are on average 7.8 points away from the actual reward.

      • This means that the lower an agent's handicap is, the higher the chance that its answer to the question "which reward is higher, left or right?" is going to be correct.

    • Each agent's observation (what it sees) consists of the full history of moves, estimates and rewards seen in the episode so far, as well as the agent's own handicap. Crucially, each agent doesn't know the other agent's handicap.

    I know, these rules are pretty complicated :( I hope that after I have good results, I can make them more intuitive. Or maybe a video demonstration will make them easier. In the meantime, here is a text dump of an episode of this game.

    The idea of the game is that both agents want to get the higher reward, but they have to make a judgement call on whether to choose the reward that they think is higher, or to yield to the other agent's choice. If the agents knew for a fact which one of them has the lower handicap (and thus the more accurate estimates), they might learn to just always yield to the agent with the lower handicap. But each agent only knows its own handicap. I'm hoping that they will be able to gauge the other agent's handicap based on its behavior, and especially based on their disagreements.
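    To make the rules more concrete, here's a rough Python sketch of one episode. This is not my actual experiment code (the real agents are trained policies whose observations include the whole episode history); the placeholder policy and the noise model for the estimates are assumptions I made just for this sketch:

```python
import random

EPISODE_LENGTH = 40   # each episode is exactly 40 turns
MAX_REWARD = 10       # rewards are drawn uniformly between 0 and 10
SIDES = ('left', 'right')


class Skirmish:
    """One skirmish: two hidden rewards plus each agent's noisy estimates."""

    def __init__(self, handicaps):
        self.rewards = {side: random.uniform(0, MAX_REWARD) for side in SIDES}
        # Each agent gets estimates that are roughly `handicap` points off on
        # average. (Just one possible noise model, not necessarily the one I use.)
        self.estimates = {
            agent: {side: self.rewards[side] + random.gauss(0, handicap)
                    for side in SIDES}
            for agent, handicap in handicaps.items()
        }


def placeholder_policy(agent, skirmish, other_last_choice):
    """A stand-in for the learned policy: mostly follow your own estimates,
    but sometimes yield to the other agent's last choice after a disagreement."""
    own_pick = max(SIDES, key=lambda side: skirmish.estimates[agent][side])
    if other_last_choice and other_last_choice != own_pick and random.random() < 0.3:
        return other_last_choice
    return own_pick


def play_episode():
    handicaps = {'A': random.uniform(1, 10), 'B': random.uniform(1, 10)}
    skirmish = Skirmish(handicaps)
    last_choices = {'A': None, 'B': None}
    total_reward = 0.0

    for turn in range(EPISODE_LENGTH):
        choices = {
            'A': placeholder_policy('A', skirmish, last_choices['B']),
            'B': placeholder_policy('B', skirmish, last_choices['A']),
        }
        both_switched = (None not in last_choices.values()
                         and all(choices[a] != last_choices[a] for a in choices))

        if choices['A'] == choices['B']:
            agreed_side = choices['A']
        elif both_switched:
            # The "ugly but necessary" rule: both agents switched at the same
            # time, so we resolve the skirmish by picking a side at random.
            agreed_side = random.choice(SIDES)
        else:
            agreed_side = None  # disagreement: both agents get zero this turn

        if agreed_side is None:
            last_choices = choices
        else:
            total_reward += skirmish.rewards[agreed_side]  # same reward for both
            skirmish = Skirmish(handicaps)                 # a new skirmish begins
            last_choices = {'A': None, 'B': None}

    return handicaps, total_reward


if __name__ == '__main__':
    handicaps, total_reward = play_episode()
    print(f'handicaps: {handicaps}')
    print(f'total shared reward over {EPISODE_LENGTH} turns: {total_reward:.1f}')
```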

    Sidenote: One interesting thing I noticed is that the agents immediately found a simple strategy for this game that gets a great score but makes the entire game uninteresting for my research. I added a simple modification to the rules to make that simple strategy not work, so they won't learn it. Can you guess what that strategy is? Feel free to reply by email.

    I'll continue talking about this game in the section for the goals for next month.

Stuff I've done beyond the monthly goals

I got a Visiting Researcher position at Bar-Ilan University!

I'm really excited about this. I've been working with Reuth Mirsky for a while now, and we decided that I'll join her lab as a Visiting Researcher.

It's amusing for me to have the title of Visiting Researcher. This kind of position is meant for professors who have a home university and are temporarily spending a year abroad doing research at a different university. It isn't a paid position, and there won't be any duties I'll have to perform or any restrictions on the direction of my research. This is perfect for me, because my freedom is critical to me, and the main thing I need right now is the validation that I'm a real researcher. I'm hoping that now, when I present my research to people, and especially when I do fundraising, the fact that I'm with Bar-Ilan University will give me more credibility than presenting myself as an independent researcher.

It's also somewhat amusing for me to have a title usually given to professors while I don't even have a bachelor's degree. I can't wait for the first conversation I'll have with university people where they'll ask "oh, you're a visiting researcher here, where did you do your PhD?" and I'll say "I can tell you where I got my high-school diploma, but that's about it 😊"

Our RaD-AI Workshop was accepted to AAMAS 2023!

Last month I told you about the workshop I'm organizing with Reuth about Rebellion and Disobedience in AI. I'm happy to report that our workshop was accepted to the AAMAS 2023 conference in London. There are still a few things to finalize, like making sure we can get enough people to sign up for the workshop, but it looks like we're going to hold it in London on either the 29th or the 30th of May, 2023. I'm always happy for a little vacay :)

This also means that I'm going to get valuable experience in the logistics of organizing a conference. We're a total of six organizers, and my hunch is that I'll be responsible for marketing. Speaking of marketing... If you'd like to present your research at the RaD-AI workshop, please submit here by January 20th.

Also, if you're in London or you intend to be there around that time, feel free to reach out!

Starting to work with AWS

This isn't an official goal yet, but I'm slowly working on running experiments on AWS instances. So far I've run experiments on a physical Linux machine that I bought, which is very convenient. But in case I want access to a stronger GPU, or the ability to run experiments on multiple computers, cloud instances would be a good solution.

When I was at AISIC 2022, I met Zohar Jackson, a fellow software engineer. He was interested in my research, and he helped me set up my AWS account and workflows. Thank you, Zohar!

My goals for this month

  1. Learn better tools for quickly exploring tabular data.

    Whenever I run my experiments, like the RaD-AI experiment I did last month, my code outputs the results into a big CSV file. (Example.) Each row is a new training iteration, and the different columns have different information about the performance of the agents, for example their average reward. This means that on the first row you see the behavior of the agents before any training, so the reward is usually low, and then as you scroll down the file, the agents gradually learn more and more, and the score rises accordingly.
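    To make the format concrete, here's roughly how one of these files could be skimmed in Python. The filename and the column names are made up for illustration (my real files now have about 15 columns):

```python
import pandas as pd

# Hypothetical filename and column names, just for illustration.
df = pd.read_csv('results.csv')

# One row per training iteration, so comparing the first and last rows
# shows how far the agents have come since the start of training.
print(df[['iteration', 'mean_reward']].head())
print(df[['iteration', 'mean_reward']].tail())

# A quick summary of every numeric column.
print(df.describe())
```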

    Until now I've inspected these files by just opening them in my favorite text editor (EditPad Pro). I knew this wasn't the right tool for the job, but back when I had only 2-3 columns, I figured it was good enough for the time being. I gradually added more useful metrics of the agents' behavior, and when I got to 4-5 columns, the text editor interface was starting to get annoying. I've recently reached 15 columns and decided I have to put my experiments on hold while I find a better workflow.

    I know it's a cliché for engineers, but I have very high standards for the tools that I use, and I go to great lengths to make sure I find tools that fit my workflow. I know I can load a CSV file into Excel or Jupyter and run some crazy analytics on it. That's nice, but what I really need is a tool that I can use with very little friction. When I do a 4-hour session working on my research, I could be looking at 50-100 CSV files with different schemas. I ain't dragging 50 files into Excel or Jupyter. I don't want to touch the mouse, and I prefer to not even leave the shell. I just want to tap a few keys on the keyboard and see a nice table.

    I found a tool called VisiData which looks like it delivers exactly what I'm looking for. I did the excellent tutorial by Jeremy Singer-Vine, who also answered a few of my questions and joined this mailing list. (Hi Jeremy!) This month, I'll try to incorporate this tool into my workflow.

    Also: When I was searching for the right tool, I was going through a few different options that people suggested... Which inspired me to create this meme.

  2. Get the agents in my RaD-AI experiment to show interesting behaviors.

    Last month I iterated through a series of RaD-AI experiments, culminating in the experiment that I described at the top of this update. My goal now is to use this experiment to show some interesting behaviors.

    I said above that I want the agents to gauge each other's handicaps based on their behavior. Here are a few more (personified) thoughts that I would like the agents to have:

    • "The other agent is disagreeing with me, maybe it means he's seeing something I'm not?"

    • "Whenever I disagree with the other agent and eventually win, I find that the reward is smaller than my estimate of it. Maybe this means that the other agent has a lower handicap to me, and I should be listening to it more?"

    • "The other agent is disagreeing with me, even though it already learned that my estimates are usually better and it should usually just choose what I choose. If the other agent disagrees despite this knowledge, it's possible that the gap between the rewards that the other agent is currently seeing is big enough to outweigh the fact that I have a lower handicap."

    • "We are at the beginning of the episode, I should lower my threshold for disagreeing with the other agent, not because I think my handicap is lower, but because this is a good way to get more mutual information about each other's handicaps, so we could use that information in the later part of the episode."

    It will be really cool if I can get the agents to behave according to these thoughts. Some of these are really ambitious, so I'm not sure it'll work. But I'll start with the simple ones and see how far up the list I can get.
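    To make the second thought above concrete, here's a tiny sketch of the kind of inference I'd like an agent to end up making. None of this is implemented anywhere; the threshold rule and the numbers are placeholders I made up:

```python
def should_yield(my_estimate_gap, estimation_errors_when_i_won, threshold=1.0):
    """Decide whether to yield to the other agent's choice in a disagreement.

    my_estimate_gap: how much bigger my preferred side looks to me than the
        other side, according to my own (noisy) estimates.
    estimation_errors_when_i_won: for past skirmishes where I won the
        disagreement, (actual reward - my estimate of it). Consistently
        negative values hint that my estimates run high, i.e. the other
        agent may have the lower handicap.
    """
    if not estimation_errors_when_i_won:
        return False  # no evidence yet, so stick with my own estimates
    mean_error = sum(estimation_errors_when_i_won) / len(estimation_errors_when_i_won)
    # Placeholder rule: yield when my track record of over-estimating
    # outweighs how much better my preferred side currently looks to me.
    return -mean_error > my_estimate_gap * threshold


# Example: I've over-estimated by ~3 points on average in skirmishes I won,
# and right now my preferred side only looks ~1.5 points better to me.
print(should_yield(1.5, [-2.5, -4.0, -2.5]))  # -> True
```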

That's it for now. See you next month!

Ram.
