Hello!
I'm a postdoc in Lex's lab, and have been working closely on the development of this implementation of the bandit task using FED3. I hope I can help answer some of your questions. I am writing documentation for this task, which may also address some of your questions:
Disclaimer: The documentation is still in development. Some parts of it may be outdated or incomplete.
Regarding your questions, Taaseen:
- Ideally, how long should we do these experiments? Previously when I worked with some programs that I could run long term (e.g. Closed Economy), I would leave the FEDs in the cages, check daily and then collect the data after approximately a week or two. Would it be fairly similar with the bandit task?
- The length of the experiments really depends on your scientific question. But you can certainly leave the FEDs in the cage and check daily that the mice are getting enough food, along with a few other things (battery levels, that the hopper still has enough pellets for the next day, and that the device isn't jammed). We have run the bandit task in a closed-economy setting continuously for several months with no issues.
- If I were to keep the FEDs in the cages for a long period (e.g. 1-2 weeks), and it would be their only source of food, would that be alright? That's how I did it for Closed Economy.
- Short answer: yes! But see above :)
- Do the mice need to be 'pre-trained' for the bandit task in any way?
- Not really! If I plan to do a more sophisticated version of the task (like more than two probabilities), then I train them on deterministic reversal first, which really is a special case of the bandit task where the only probability options are 100% and 0%. This seems to help them learn how to reverse, so they know that when one of the pokes "stops working" they can test the other side. I usually train them on this for 1-2 weeks (max).
- Has anyone tried using alternative criterions to change the probability, e.g. instead of 30 using 60 pellets or more (or even less than 30)? Wondering which might be a good number to use depending on the length of my experiments.
- We have tested different numbers of pellets as the criterion for probability changes. We've found that the best criterion is 20 or 30 pellets. 20 seems to work totally fine, since mice appear not to care about what happened 25 pellets ago, for instance. The issue with using more pellets is that you get fewer probability changes, and we have usually observed the strongest phenotypes around the probability switches. If it is a really large number (let's say 100), mice might also go through extinction, thinking that the poke has stopped working completely. That said, we have not rigorously quantified this; it is more like anecdotal data.
- Speaking of pellet numbers, is there a reason why we change the probabilities by logging pellet numbers? I think 30 pellets is low enough for it to be fine, but how would timer probability switches work? The probability switching after every set time period?
- Great question! No strong rationale, really; it just seems to work well. For instance, I believe there are some groups using a slightly different FED implementation where probabilities change slightly after every poke (restless bandit). In the documentation there is also a brief explanation of how one would go about changing the criterion for probability changes. The bandit code is written to, hopefully, make these kinds of customizations easy.
- I may be misunderstanding the code, but from what I gathered, "fed3.allowBlockRepeat = false" means that the probabilities aren't repeated between two blocks in a row. I've previously had some problems with a few of my FEDs which restart automatically for some reason (I couldn't tell why, I saw one happen in front of me as I stood next to it), and it has previously caused some minor problems to my left-right sequence experiments. That makes me a bit worried because if I were to use the bandit task code, and my FED were to restart, it may coincidentally grant a mouse identical probabilities for two consecutive blocks.
- I see! That definitely could happen, although I only rarely see FEDs restarting spontaneously. Particularly if you run them for long periods of time, I don't think a single restart would be that terrible, because the mice would get many blocks of different probabilities overall. If the FEDs are restarting very often, then there is probably something wrong with the device and that's a different story. But in general, I have had no issues with this.
- I also noticed the line of code regarding the timeout mechanics, and that it resets the timeout if the mouse pokes during it again. I think that's a good design choice to help identify persevering tendencies in mice, but I was wondering if this line of code could be a problem if a mouse continues to poke far too frequently for the timer to be reset for too long? Would a longer timeout with no resets work better by any chance (in which case, it may actually be better to log pokes during this phase even if not 'valid pokes').
- Great point! I have seen it be an issue, especially if mice sneak bedding or nestlets into the poke, such that the timeout keeps resetting over and over. That being said, regarding mouse behavior itself, some mice do persist quite a bit, but after a few minutes (at the longest) they stop. It is actually cool to see that throughout training, the number of pokes during timeout (which are logged in the csv) is reduced. I've considered that a measure of learning, in a way. I've found that 10 seconds with resetting activated and with white noise tends to work best. We've tried other variations, such as 30 seconds without resetting, or 5 seconds after every poke regardless of whether they get a reward or not, and I've seen no evidence that these improve behavior or learning; if anything, they can be slightly worse. But! I haven't done a rigorous analysis of how different parameter settings affect learning/performance, so these are more like anecdotal observations. One more thing! You can set the "reset" parameter to true or false, so you can decide what works best for you.
Regarding your question, Zane:
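To make the resetting behavior concrete, here is a minimal Python sketch of the timeout logic described above. It is not the FED3 code itself: the function name `run_timeout` and its parameters (`poke_times`, `timeout_s`, `reset`) are hypothetical, chosen only for illustration. The timeout starts at t = 0, and each poke that lands before it ends is counted and, if resetting is on, restarts the clock.

```python
def run_timeout(poke_times, timeout_s=10.0, reset=True):
    """Return (time the timeout ends, number of pokes made during it).

    The timeout starts at t = 0. poke_times are in seconds.
    All names here are hypothetical, not part of the FED3 library.
    """
    end = timeout_s
    pokes_during_timeout = 0
    for t in sorted(poke_times):
        if t >= end:
            # The timeout is already over; this poke is a normal poke.
            break
        pokes_during_timeout += 1
        if reset:
            # A poke during the timeout restarts the clock.
            end = t + timeout_s
    return end, pokes_during_timeout


# A mouse that pokes twice early on, then waits.
print(run_timeout([3, 8, 25]))               # (18.0, 2)
print(run_timeout([3, 8, 25], reset=False))  # (10.0, 2)

# A jammed poke (e.g. bedding) triggering every 5 s keeps extending it.
jammed = [5 * i for i in range(1, 101)]
print(run_timeout(jammed))                   # (510.0, 100)
```

The last case shows the failure mode mentioned above: with resetting on, anything that triggers the poke more often than once per timeout period keeps the device locked out until it stops.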
- In terms of pellets to switch, do you think a shorter criterion is better? If doing experiments for 4 days, have you seen that mice adapt to the bandit task better with a short criterion?
- I think if I were to do my experiments again, I'd use 20 pellets instead of 30. In general, I think there is no difference between the two, but with 20 you get more probability changes, which is usually where the learning phenotype is strongest. It is possible that even fewer pellets than that could work, but I'd be concerned that mice would just start poking randomly, assuming they will eventually get a pellet. Although the timeout helps to avoid this, I've still seen many mice just "decide" that they don't want to learn the task and start poking randomly. There are some large-scale (in terms of the number of mice) experiments going on right now that use 20 pellets as the criterion, and it seems to be going really well.
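As a rough illustration of the trade-off between the pellet criterion and the number of probability changes, here is a small Python sketch. It is only a toy model, not the bandit code that runs on the FED: `simulate_blocks`, `pellets_to_switch`, `allow_block_repeat`, the 80/20 probability options, and the 600-pellet total are all assumptions made up for the example. The `allow_block_repeat` flag mimics the idea of not repeating the same probability in two consecutive blocks.

```python
import random


def simulate_blocks(n_pellets, pellets_to_switch, probs=(80, 20),
                    allow_block_repeat=False, seed=0):
    """Return the left-poke reward probability (%) of each block.

    A new block starts every `pellets_to_switch` pellets. All names and
    default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    blocks = [rng.choice(probs)]
    for _ in range(n_pellets // pellets_to_switch):
        options = list(probs)
        if not allow_block_repeat:
            # Exclude the probability used in the block that just ended.
            options = [p for p in options if p != blocks[-1]] or list(probs)
        blocks.append(rng.choice(options))
    return blocks


# Number of probability changes for the same (arbitrary) 600 pellets.
for criterion in (20, 30, 100):
    switches = len(simulate_blocks(600, criterion)) - 1
    print(criterion, switches)
# 20 30
# 30 20
# 100 6
```

For the same number of pellets, a criterion of 20 gives 30 switches, 30 gives 20, and 100 gives only 6, which is the reason a smaller criterion yields more data around the switches.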
I hope this helps. Feel free to reach out if you have any further questions!
Best,
Alex Legaria