Difference between listed probabilities and reward fractions


Alivia Bechler

Oct 28, 2025, 11:41:28 AM
to FEDforum
Hello!

I’ve been working with a statistician on analyzing the FED3 data, and he pointed out some interesting patterns in how restless bandit probabilities often do not match the fraction of actually rewarded trials. I’ve attached a file summarizing the proportion of rewarded trials across all 15 mice for each unique probability combination (probability_legitimacy_summary.xlsx). For instance, when the combination is 0.9 and 0.3, the fractions of rewarded trials are 0.73 and 0.53, respectively. Yet when it’s 0.9 and 0.9, the fractions are 0.87 and 0.87.

I’ve observed a similar pattern in data collected with a different restless bandit script (probability_legitimacy_summary_Ephys.xlsx), which makes me wonder if this effect could be related to the device itself. While this shouldn't impact my analysis too heavily, it's something that I am keen to understand. Do you happen to know why this occurs?

Warmly,
Alivia Bechler

P.S. I included the python script I used to analyze this issue. If you need more information from me to understand the situation, please let me know! I'd be happy to help.

probability_legitimacy_summary.xlsx
check_prob_ab.py
probability_legitimacy_summary_Ephys.xlsx

Lex

Oct 28, 2025, 12:37:10 PM
to Alivia Bechler, FEDforum
Hi Alivia,
Thanks for writing about this, and for your careful characterization of this issue!  Can you provide a little more information?
1) What code are you running on the FED3?  Is it the Bandit example code in the FED3 library?  https://github.com/KravitzLabDevices/FED3_library/tree/main/examples/1_Programs/Bandit
2) How many trials do you collect data on at each probability?  The values are randomly sampled so I could see how the percentages may not match if there is not a sufficient number of trials at each percentage.

Talk soon,
-Lex

--
You received this message because you are subscribed to the Google Groups "FEDforum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fedforum+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/fedforum/f4c2a147-01c3-4bec-b679-877a8b36fa23n%40googlegroups.com.

Thomas Huff

Oct 28, 2025, 2:00:25 PM
to FEDforum
Hi Alivia and Lex,

I may have noticed something similar over the summer, when I was trying out the 2-armed bandit protocol as a probabilistic reversal task (in my case, setting the arms to 90% and 10%, with reversals every 20 pellets). During testing, I too noticed some deviations between expected and actual reward outcomes -- as well as another phenomenon in which low-probability events sometimes "clustered" 2-3 times in a row. 

One potential issue I discovered is that Arduino's random number generator is apparently only pseudo-random, and needs to be "seeded" with a random value (e.g., a value from the clock) to produce numbers that are truly random. Otherwise, it will tend to generate the same sequences over and over again. See here for a short explanation and some methods of seeding: https://www.reddit.com/r/arduino/comments/rfxmu2/random_is_generating_the_same_sequence_of_random/ 

In my case though, even seeding the random() function didn't fix the issue completely. It improved the task's adherence to the probabilities I'd set, but I continued to see some deviations beyond what probability would predict. I intended to post an inquiry here on the forum and look into the issue in greater detail, but it's still on my to-do list. My running hypothesis was that perhaps Arduino's random() function just doesn't produce truly random values -- even when seeded. But I'd be curious to know if anyone else had similar issues and if they found a fix. 

Cheers,

Tom







Lex

Oct 28, 2025, 8:54:24 PM
to Thomas Huff, FEDforum
Thanks Thomas!  You bring up a couple of interesting points that I think I can lend clarity to.

1) Computers - including Arduinos - are deterministic machines, meaning that if you give them the same instructions and starting conditions they’ll produce the same results every time. So if you call the Arduino random function with the same seed you’ll get the same sequence of numbers every run. For example, if you flash this example Arduino code:
 
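void setup() {
  Serial.begin(9600);
  // No randomSeed() call, so the generator starts from its default state.
  for (int i = 0; i < 5; i++) {
    Serial.println(random(10));  // integer from 0 to 9
  }
}

void loop() {}  // required by Arduino, intentionally empty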

This will keep printing the same 5 numbers every time you reset the Arduino. If you want to change the random behavior you can set a random "seed" with the function randomSeed(). Each "seed" will produce a new sequence of deterministic numbers. If you don't set the seed (we don't set one in the Bandit example code), Arduino will default to randomSeed(1).  To simulate something *more* random people often seed the randomizer based on a variable event such as the time that the first poke happens. This still follows deterministic rules but the seed itself will change every run of the device.
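The same determinism can be illustrated off-device. This is a minimal sketch that assumes Python's `random` module as a stand-in for Arduino's generator; the two use different algorithms, so the numbers will not match what a FED3 prints, and the helper name `draw_sequence` is made up for this illustration.

```python
import random

def draw_sequence(seed, n=5, upper=10):
    """Return n pseudo-random integers in [0, upper) from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator, so global state is untouched
    return [rng.randrange(upper) for _ in range(n)]

same_seed_a = draw_sequence(1)
same_seed_b = draw_sequence(1)
other_seed = draw_sequence(2)

print(same_seed_a == same_seed_b)  # identical seed, identical sequence
print(same_seed_a, other_seed)
```

Seeding from something that varies between runs (such as the time of the first poke) changes which sequence is used, not the fact that each sequence is deterministic.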

2) You observed that the example bandit code can generate "low-probability events sometimes "clustered" 2-3 times in a row". This is expected behavior, as we don't prohibit this from happening in the code.  Pulling two 20% probability events in a row has a 4% chance of occurring, not that crazy.  Pulling 3 low prob events in a row has a 0.8% chance, which should be rare but not impossible.  It would be possible to modify the randomization logic to prohibit this, making it pseudo-random.
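The arithmetic behind those figures can be checked with a short sketch. It assumes independent trials and uses Python's `random` module rather than the FED3 itself; both function names are hypothetical.

```python
import random

def run_probability(p, k):
    """Probability that k independent trials in a row all produce an event of probability p."""
    return p ** k

def simulate_run_rate(p, k, n_windows=200_000, seed=0):
    """Estimate the same quantity by drawing n_windows independent blocks of k trials."""
    rng = random.Random(seed)
    hits = sum(
        all(rng.random() < p for _ in range(k))
        for _ in range(n_windows)
    )
    return hits / n_windows

for k in (2, 3):
    print(k, run_probability(0.2, k), simulate_run_rate(0.2, k))
```

These are probabilities for one specific block of trials. A session with hundreds of trials contains hundreds of possible starting points for such a run, so seeing a few clusters over a session is expected.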

3) "My running hypothesis was that perhaps Arduino's random() function just doesn't produce truly random values".  Minus the caveat on computers being deterministic in point #1, Arduino generates random distributions of numbers as you would expect. You can test this yourself by having a FED3 pull a bunch of numbers from 0 to 9 and then plotting the distribution.

void setup() { Serial.begin(9600); }
void loop() {
  Serial.println(random(10));  // stream integers from 0 to 9 over serial
}

I just flashed that on a FED3 and ran it for a few seconds, generating ~10K numbers - here is their distribution, which is pretty uniform - if I ran it longer it would flatten out completely.  So, Arduino random is not broken :) 

image.png
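A rough way to put a number on "pretty uniform" is a chi-square statistic on the counts. The sketch below generates its own draws with Python's `random` module as an assumption; with real data, the counts would instead come from the values logged over serial. The function names are hypothetical.

```python
import random
from collections import Counter

def digit_counts(n_draws=10_000, seed=1):
    """Count how often each value 0-9 appears in n_draws uniform draws."""
    rng = random.Random(seed)
    return Counter(rng.randrange(10) for _ in range(n_draws))

def chi_square_uniform(counts, n_bins=10):
    """Pearson chi-square statistic against equal expected counts in every bin."""
    total = sum(counts.values())
    expected = total / n_bins
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(n_bins))

counts = digit_counts()
for value in range(10):
    print(value, counts[value])
print("chi-square:", round(chi_square_uniform(counts), 2))
```

With 10 bins there are 9 degrees of freedom, so a statistic well above about 16.9 (the 5% critical value) would be a reason to look more closely at the generator.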

4) So what's going on with the probabilities on FED3 bandit? I think I have an explanation which I'll write up in another post to just keep this thread manageable :)  

-Lex

Lex

Oct 28, 2025, 9:19:21 PM
to Thomas Huff, Alivia Bechler, FEDforum
On to why the FED3 Bandit task might not produce exactly the probabilities that are set!

1) In the FED3 Bandit task example, here is the logic that decides if a left poke should be rewarded or not:

image.png

The code block is entered following a left poke (line 74), and in line 78 it asks if the result of random(100) returns a value less than the value of prob_left.  If prob_left is set to 80, this will evaluate to true when the random(100) function returns any number from 0-79, which will occur 80% of the time.  In these cases, FED3 will play a conditioned stimulus and drop a pellet.  This is the whole randomizing logic in this Bandit example task, there is nothing else that influences it (the same logic is also repeated for right pokes).
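The comparison described above can be mimicked in a few lines. This is a sketch of the logic only, not the library code: `poke_is_rewarded` and `rewarded_fraction` are hypothetical names, and Python's `randrange(100)` stands in for Arduino's `random(100)`.

```python
import random

def poke_is_rewarded(prob_percent, rng):
    """Mirror the comparison described above: reward when a draw from 0-99 is below prob_percent."""
    return rng.randrange(100) < prob_percent

def rewarded_fraction(prob_percent, n_pokes, seed=0):
    """Fraction of n_pokes simulated pokes that end in a pellet."""
    rng = random.Random(seed)
    rewards = sum(poke_is_rewarded(prob_percent, rng) for _ in range(n_pokes))
    return rewards / n_pokes

print(rewarded_fraction(80, 100_000))
print(rewarded_fraction(20, 100_000))
```

Because the draw is an integer from 0 to 99 and the test is a strict "less than", a setting of 0 never rewards and a setting of 100 always rewards.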

I pulled ~40 files from our lab where we ran this 80-20 bandit task for 72 hours and plotted the average probability of a pellet following an 80% probability (High) poke vs. a 20% probability (Low) poke.  It looks as we'd expect from an 80-20 task:

image.png

However, you can see that individual mice are not always receiving pellets at the 80-20% split.  This is due to the variance in random sequences, and the low number of trials even when the task was run for 72 hours (we get ~500-800 trials in these sessions). The variance from expected probabilities will likely be larger if you run fewer trials.  It may be possible to force the FED3 to correct for biased sampling, but the example code does not do that.
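How much spread to expect from sampling alone can be estimated with the binomial standard error, sqrt(p(1-p)/n). The sketch below assumes independent trials at a fixed probability and uses the normal approximation; the trial counts are illustrative, not taken from any file in this thread.

```python
import math

def binomial_standard_error(p, n_trials):
    """Standard error of the observed reward fraction over n_trials at true probability p."""
    return math.sqrt(p * (1 - p) / n_trials)

def approx_95_interval(p, n_trials):
    """Normal-approximation range expected to contain about 95% of observed fractions."""
    half_width = 1.96 * binomial_standard_error(p, n_trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)

for n in (100, 300, 600, 800):
    low, high = approx_95_interval(0.8, n)
    print(f"n={n:>4}: {low:.3f} to {high:.3f}")
```

An observed fraction that sits many standard errors away from the set probability is unlikely to be sampling noise, so it points toward the task logic or the analysis rather than the random number generator.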

2) As another possibility, there may be a semantic issue going on with the expectations of the task and what it means to be an 80-20 bandit.  The 80:20 probabilities (or any probabilities you set) are setting the *probability of a reward following each poke*.  But this does not mean that 80% of rewards will come from the high-probability pokes.  As an example, if a mouse poked Left 100% of the time it would receive a pellet on 80% of these Left pokes on average, but would receive 100% of its pellets from Left pokes and 0% from Right pokes.  So if you asked the question "what percentage of pellets were delivered following Left pokes?", you won't get 80%. Alivia, is it possible that this explains the discrepancies you are seeing?
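The two quantities can be computed side by side to see which one an analysis is actually reporting. This is a sketch with a made-up data layout: a list of `(side, rewarded)` tuples and a hypothetical `summarize` helper, not the FED3 file format.

```python
def summarize(pokes):
    """pokes: list of (side, rewarded) tuples, with side 'L' or 'R' and rewarded a bool."""
    summary = {}
    total_pellets = sum(1 for _, rewarded in pokes if rewarded)
    for side in ("L", "R"):
        outcomes = [rewarded for s, rewarded in pokes if s == side]
        pellets = sum(outcomes)
        summary[side] = {
            # P(pellet | poke on this side): the quantity the task probability sets
            "reward_given_poke": pellets / len(outcomes) if outcomes else None,
            # Share of all pellets that came from this side: depends on the animal's choices
            "share_of_pellets": pellets / total_pellets if total_pellets else None,
        }
    return summary

# The example from the text: every poke is on the left, and 8 of 10 are rewarded.
all_left = [("L", True)] * 8 + [("L", False)] * 2
print(summarize(all_left))
```

Only `reward_given_poke` should be compared against the set probabilities; `share_of_pellets` reflects where the mouse chose to poke.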

Please share more details, data, or ideas.  I think this is a really cool and useful task, I want to iron these issues out and make sure everything is working as expected!  -Lex



 

Alivia Bechler

Oct 29, 2025, 12:17:18 PM
to Lex, Thomas Huff, FEDforum
Hi Lex and Thomas!

Responding to your first email, Lex: I used a different bandit code from another lab that I am working with (I have the code folder attached, "Restless Bandit Code"). Also, in the Excel files depicting the probabilities (probability_legitimacy_summary.xlsx), the left-most column gives the number of trials that have experienced each probability combination. This ranges from ~200-18000 trials.

While I am somewhat familiar with the code this lab wrote for the bandit task, I do not fully understand how it works. I believe they used the same process you described with the random(100) function, but I could be wrong about that.

I have also attached how I calculated the fraction of rewards (a python script) and an example file of what it reads. The example file comes from another script that reads the raw FED3 file and identifies the trials/variables I want to look into. I included those scripts as well and my raw FED3 data to show you the pathway I used. Here are the steps I completed to get this analysis:
(It's important to note that the code I used puts the probabilities into a separate file, the walkLog, rather than into the FED file directly. We are currently in communication with the lab to understand why they decided to create a walkLog.)
  1. 01_combine_alldata_walklogs_ab.py  --> combines walkLog probabilities to FED data
  2. 02_identify_trials_ab.py  --> selects variables of interest
  3. 03_compute_variables_ab.py  --> further analyzes these variables
  4. Now I have 15 CSVs (15 mice) like the example in the folder (1_withWalk_trials_withComputedVars.csv)
  5. check_prob_ab.py  --> collects each trial associated with a unique probability combination and calculates the fraction of trials rewarded for left/right pokes within that combination.
  6. This gives me the Excel file (probability_legitimacy_summary.xlsx), to which I added a couple of columns that give the difference between the probability and the fraction rewarded.
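For readers without the attachments, the grouping in step 5 can be sketched roughly as follows. This is not the attached check_prob_ab.py, only an illustration: the keys `prob_left`, `prob_right`, `side`, and `rewarded` are hypothetical names, and the four example rows are made up.

```python
from collections import defaultdict

def fraction_rewarded_by_combination(trials):
    """Group trials by (prob_left, prob_right) and return the rewarded fraction per side.

    trials: iterable of dicts with hypothetical keys
    'prob_left', 'prob_right', 'side' ('L' or 'R'), and 'rewarded' (0 or 1).
    """
    tallies = defaultdict(lambda: {"L": [0, 0], "R": [0, 0]})  # side -> [rewarded, total]
    for t in trials:
        combo = (t["prob_left"], t["prob_right"])
        tally = tallies[combo][t["side"]]
        tally[0] += int(t["rewarded"])
        tally[1] += 1
    result = {}
    for combo, sides in tallies.items():
        result[combo] = {
            side: {"n_trials": total, "fraction_rewarded": (hits / total if total else None)}
            for side, (hits, total) in sides.items()
        }
    return result

# Made-up rows, for illustration only.
example = [
    {"prob_left": 0.9, "prob_right": 0.3, "side": "L", "rewarded": 1},
    {"prob_left": 0.9, "prob_right": 0.3, "side": "L", "rewarded": 0},
    {"prob_left": 0.9, "prob_right": 0.3, "side": "R", "rewarded": 0},
    {"prob_left": 0.9, "prob_right": 0.9, "side": "R", "rewarded": 1},
]
for combo, sides in fraction_rewarded_by_combination(example).items():
    print(combo, sides)
```

Keeping the per-side trial count next to each fraction makes it possible to judge whether a gap between the set probability and the observed fraction is larger than sampling error would allow.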

Let me know if you need any other insights from me on how I ran the bandit and/or analyzed it!
Best,
Alivia

Lex Kravitz

Oct 29, 2025, 5:58:04 PM
to FEDforum
Hi Alivia,
Unfortunately, as this looks like a custom fork of the FED3 library and Bandit task, it's beyond my expertise to dig in and advise on how it works.

I do think from this conversation you should be able to convince yourself that the random function in the FED3 is randomizing reward probabilities correctly, so there's either a discrepancy in how the task logic chooses probabilities vs. what you expect it to do, or there is a semantic issue on what a "High probability" poke means, as I brought up in point 2 above:
"As another possibility, there may be a semantic issue going on with the expectations of the task and what it means to be an 80-20 bandit.  The 80:20 probabilities (or any probabilities you set) are setting the *probability of a reward following each poke*.  But this does not mean that 80% of rewards will come from the high-probability pokes.  As an example, if a mouse poked Left 100% of the time it would receive a pellet on 80% of these Left pokes on average, but would receive 100% of its pellets from Left pokes and 0% from Right pokes."

Sorry I can't be more help than that, good luck with this!
-Lex

Alivia Bechler

Oct 30, 2025, 12:41:34 PM
to Lex Kravitz, FEDforum
No worries, Lex!

I appreciate your help and engagement with this issue. I hope you have a great rest of your week!

Alivia

You received this message because you are subscribed to a topic in the Google Groups "FEDforum" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/fedforum/nPl_BeT0HvI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to fedforum+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/fedforum/f8f7e786-1b32-48d9-b9f7-4627aa242c18n%40googlegroups.com.