Research update: War

Ram Rachum

Mar 5, 2026, 2:40:53 PM
to ram-rachum-rese...@googlegroups.com

Hi everyone,

Short update this time.

It's wartime again. I heard someone say today that it started on Saturday, which is five days ago. I couldn't believe it. It feels like we've been in it for two weeks. My family and I are safe, so I'm fortunate that the war is mostly an inconvenience for me. About 10-20 times a day, we hear a loud warning sound on our phones, which tells us that a siren will sound in 2-8 minutes. The siren comes from large speakers whose locations I've never figured out, and also from our phones again. The siren means that we have one and a half minutes to make it to a protected space. Some people have a reinforced room in their apartment. My apartment building has a bomb shelter, so I go there. I take my laptop and lots of supplies, and try to do my work from there. (I'm typing this from the shelter right now.)

Today is March 5th, which is the official deadline for the RLC 2026 conference I'm working towards. Fortunately, the Program Committee was kind enough to offer an extension to people affected by the war. I now have an extra week to finish the paper, and boy do I need it.

Bad news: I couldn't get my method to work. I tried to use Experience Breakdown to explain agent behavior in a driving simulation. In almost all of my attempts, the output that it produced didn't have any explanatory power. I see two factors that cause the problem:

  1. Training optimizations such as PPO clipping and optimizer momentum cause the theta update vector to point in a different direction than the training gradient. I had assumed the difference would be just random noise, but I've seen runs where these optimizations added a consistent bias over hundreds of epochs. This means that sometimes steerage is reported as negative even though the behavior is rising.

  2. Weird similarity. Often I see that the experiences with high steerage seem to have little to do with the behavior. This can be called weight-sharing. It makes the results non-explanatory.
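
To make the first factor concrete, here is a minimal sketch in plain Python. It is not the Experience Breakdown computation. It only shows how heavy-ball momentum can make the update that is actually applied point against the current gradient. The gradient sequence, the learning rate, the momentum coefficient and the function names are all hypothetical, chosen for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def momentum_updates(gradients, lr=0.1, beta=0.9):
    """Return the update applied at each step under heavy-ball momentum.

    Gradient ascent is assumed: the update is lr * velocity, where
    velocity accumulates past gradients with decay beta.
    """
    velocity = [0.0] * len(gradients[0])
    updates = []
    for g in gradients:
        velocity = [beta * v + gi for v, gi in zip(velocity, g)]
        updates.append([lr * v for v in velocity])
    return updates


# Hypothetical gradients: ten steps along +x, then the gradient flips to -x.
gradients = [[1.0, 0.0]] * 10 + [[-1.0, 0.0]]
updates = momentum_updates(gradients)

# Cosine similarity between the last gradient and the update applied at
# that step. The accumulated velocity still points along +x, so the
# update opposes the current gradient.
alignment = cosine(gradients[-1], updates[-1])
print(alignment)
```

In this toy run the last update still moves along +x while the last gradient points along -x. Any score that assumes the update follows the current gradient would get the sign wrong at that step.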

I'm not sure whether these problems are solvable. But that's something for me to think about after the deadline. Right now I'm planning to make the paper mostly about BXRL, which is the problem formulation, and about HighJax, which lets researchers define and measure behaviors.


Back to work,

Ram.
