Research update: CAIF Summer School, dominance in mice

Ram Rachum

Aug 3, 2023, 10:23:15 AM
to ram-rachum-res...@googlegroups.com

Hi everyone!

Retrospective on goals for last month

In last month's update I outlined a list of goals. Here's my update on these goals:

  1. Make progress on the dominance hierarchies paper: 🔄 Ongoing

    I've gotten a few rounds of feedback from Reuth, Yonatan and ChatGPT. Now I need to process all of it and produce a new revision of the dominance hierarchies paper. I may change the punchline of the paper to drawing parallels between dominance hierarchies in biological life and those in RL agents, rather than the current punchline of meme-like transmission of dominance hierarchies.

    I was looking for data about dominance hierarchies in animals. I talked with Eli Strauss, a leading researcher in the field of dominance hierarchies. He pointed me to DomArchive, a collection of data aggregated from a century of dominance hierarchy research: 436 dominance-hierarchy-related papers, starting with the seminal paper from 1922 and ending with papers from 2020.

    I wrote some code to extract the data from R into Python, and used my trusty friend VisiData to explore it. The best data I found came from a 2017 paper by Cait M. Williamson called "Social context-dependent relationships between mouse dominance rank and plasma hormone levels".
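
    Here's a minimal sketch of that kind of extraction, assuming the DomArchive data ships as an .rda file that the pyreadr library can read (the filename below is hypothetical):

        # Load the R data objects into pandas DataFrames, then dump them
        # to CSV files, which are easy to explore in VisiData.
        import pyreadr

        result = pyreadr.read_r('dom_archive.rda')  # hypothetical filename
        for name, df in result.items():  # R object names -> DataFrames
            df.to_csv(f'{name}.csv', index=False)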

    I also found a very relevant paper from 2021 called "The evolution of social dominance through reinforcement learning". It was written by Olof Leimar, a very experienced researcher in the fields of evolution and animal behavior. As far as I know, it's the only paper that connects reinforcement learning with dominance hierarchies, so it's definitely getting a mention in my paper. One key difference between my paper and his is that he uses reinforcement learning to understand dominance hierarchies better, while I'm attempting to use dominance hierarchies to make reinforcement learning agents perform better.
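
    To give a feel for that connection, below is a toy sketch of my own (not Leimar's actual model): two independent Q-learning agents repeatedly play a hawk-dove contest, and small early asymmetries get reinforced until one agent consistently escalates and the other consistently defers, which is a minimal dominance relationship.

        # Toy hawk-dove contest between two stateless Q-learners.
        # 'H' escalates the contest, 'D' defers.
        import random

        V, C = 2.0, 4.0        # resource value, cost of losing a fight
        ALPHA, EPS = 0.1, 0.1  # learning rate, exploration rate

        def payoff(me, other):
            """Hawk-dove payoff to the first player."""
            if me == 'H':
                return (V - C) / 2 if other == 'H' else V
            return 0.0 if other == 'H' else V / 2

        q = [{'H': 0.0, 'D': 0.0}, {'H': 0.0, 'D': 0.0}]

        def act(i):
            if random.random() < EPS:
                return random.choice('HD')
            return max(q[i], key=q[i].get)

        for _ in range(5000):
            a0, a1 = act(0), act(1)
            q[0][a0] += ALPHA * (payoff(a0, a1) - q[0][a0])
            q[1][a1] += ALPHA * (payoff(a1, a0) - q[1][a1])

        print(q)  # usually one agent ends up preferring 'H', the other 'D'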

  2. Attend the CAIF Summer School, present a poster on dominance hierarchies: ✅ Done

    I'm writing this on the plane back home from the CAIF Summer School. This was a great event. We had some top-tier AI researchers giving talks about their latest research. My two favorite talks were Joel Leibo's and Vince Conitzer's.

    Joel gave a newly prepared talk titled "A theory of appropriateness". With the advent of publicly accessible LLMs, there is a lot of effort to ensure that the text they produce is "appropriate". A major challenge is that our current understanding of what "appropriate" means is on a "we know it when we see it" basis. We know that if ChatGPT gives people instructions for making napalm, or says racially insensitive things, that's considered inappropriate, but we don't have a good working theory of what makes a message appropriate. Joel is building such a theory, taking care to treat appropriateness as a dynamic, responsive concept rather than a fixed standard. I believe that the material from this talk will find its way into a paper soon.

    Vince gave a talk about open-source game theory, which theoretically allows agents in social dilemmas to reach better social welfare by conditioning their altruistic actions on the code of the other agent. I've seen this talk before, and while I'm skeptical about the practical applications of this theory, I admire Vince's presentation style and always enjoy hearing him speak.
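
    For context, here's a minimal sketch of the classic "program equilibrium" idea that open-source game theory builds on (my own toy illustration, not code from Vince's talk): each player submits a program that receives the other program's source code, and cooperating only when the opponent's source matches your own makes mutual cooperation stable in a one-shot Prisoner's Dilemma.

        # Each player submits a program that reads the opponent's source
        # and outputs 'C' (cooperate) or 'D' (defect).
        import inspect

        def clique_bot(opponent_source: str) -> str:
            """Cooperate only if the opponent runs this exact program."""
            my_source = inspect.getsource(clique_bot)
            return 'C' if opponent_source == my_source else 'D'

        # Both players submit clique_bot, so each sees identical source:
        source = inspect.getsource(clique_bot)
        print(clique_bot(source), clique_bot(source))        # -> C C
        print(clique_bot("def defect_bot(_): return 'D'"))   # -> D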

    I presented my poster, and people were interested and appreciative. I met, for the first time, a few researchers whose names I've been seeing over and over in papers: Noam Brown, Ed Hughes, Michael Dennis, Jesse Clifton and more. It's important for me that people know I'm working on dominance hierarchies. In a way, it's a brand that I've adopted for myself; I hope that a year from now, there'll be multiple people working on dominance hierarchies and more appreciation of their importance, and then people will remember me as "the dominance hierarchies guy".

Stuff I've done beyond the monthly goals

AI Safety and saving the world

There's something in my approach to my research that changed sometime in the last four months, and I think I should mention it.

I've been talking about AI Safety a lot in the last year. The reason I got into AI Safety is that I thought it was the most likely path to funding my research, which appears to have been a correct intuition, given that I'm now funded by ALTER. However, my personal motivation for my research isn't AI Safety. I was drawn to my research because it explores social development. Social dynamics are very interesting to me because I've had lots of problems adjusting socially and emotionally since I was a child, and that's something that has taken a lot of my attention and focus, through all kinds of personal challenges that I won't get into right now, up to the present day.

When it appeared that AI Safety was the best way to get funding for my research, I adjusted my research for that purpose, effectively doing marketing to show how it can be applied to the field's goals. I do believe that AI Safety is a huge concern, and that there's a very real possibility that in the next five years, AI will reach superintelligence and possibly destroy humanity, which could mean directly killing all people on the planet, but might also mean more indirect, devastating changes to our society. (Most AI experts project a longer timeline than mine.) I don't worry about it too much, not because it's not real, but in the same way I don't worry about climate change or the Iranian nuclear threat. Global disasters have been looming over all of us continuously since before I was born, and I'm not going to waste my time or get stressed out about them.

I've been involved in AI Safety for a year now; during this year, I repeated and polished my sales pitch so many times that I am now 100% sold on it. In other words, I strongly believe that social behavior of AI agents is the research direction that's most likely to lead to the creation of safe AI systems, which could determine whether humanity will be destroyed or not.

I also believe, with a lesser degree of confidence, that the direction of research I'm working on now, which starts with dominance hierarchies, is the best path for preventing that from happening. Whether I'll do it successfully and in time, I don't know.

The upshot is this: I feel like my day job is saving the world.

This is pretty special. I never thought that one day I would be in this position.

The interesting thing is that if anyone were to tell me that they're working on saving the world, I would assume something about what kind of person they are. If I were feeling generous, I might think they're very noble; if I were feeling nasty, I might think they have delusions of grandeur. But for me, the fact that I'm working on saving the world is pure coincidence.

My goals for this month

  1. Finish the dominance hierarchies paper.

    Now that the CAIF Summer School is out of the way, I have no more excuses. August is the month in which I finish this paper.

    I decided I won't try for AAAI, because the deadline is too close. AAMAS is a possibility.

    One question I'm debating is whether I want the paper to also be about my long-term plan for AI Explainability and AI Corrigibility, or just about the dominance hierarchies. On one hand I want to include the plan, because it gives strong motivation for dominance hierarchies and makes it clear why they're so important. On the other hand, it might make the paper longer, more controversial, and less relevant for AI conferences. This also re-raises the question of whether I'm better off writing a paper for a conference or just for arXiv... I need to think about that.

Bonus meme.

That's it for now. See you next month!

Ram.
