Jean-Gabriel,
This is an extremely common pain point, and tackling it in a more general way is fairly high on my infinitely long list of things I'd like to do with my very finite time. At the end of the day, labscript was initially designed for a BEC experiment where each shot took multiple seconds, so per-shot transition delays weren't a serious concern and design choices were made under that assumption. You are likely correct that the transition functions for your devices are the limiting factor now. Even if everything else were perfect (which it likely isn't), I strongly suspect the ultimate limit on cycle times is the serialized access to the shot h5 file, which lives on disk. While it has been a while since I've looked into the problem, the times you are seeing are fairly close to what I would expect as optimal, and it appears you are already doing many of the things I would normally suggest. Johannes has provided some pretty good resources for more involved ways to tackle the problem. There are also a number of historical threads of people tackling this problem (here or here, for instance) that are useful reading to get a feel for the wider problem.
All that said, I would highly encourage you to look into less involved solutions first. One day I'll formalize this advice in the docs, but for now I'll continue to workshop it here on the list-serve. Organized in roughly priority order:
- Invest in your control computer(s). I find it interesting that people will spend hundreds of thousands on experiment hardware/lasers/synths and then cheap out on the control computer. You are going to use that computer for >8 hrs/day for years; it is OK to spend $3-5k on good hardware. I don't have hard numbers for specs since there are a ton of confounding factors in practice, but the following are good places to focus. In general, latency matters more than raw processing power once you meet a certain base threshold. I also highly recommend not buying the absolute latest and greatest, so that drivers and support are mature and your system is more robust.
- The fastest storage you can reasonably obtain (ideally a modern NVMe SSD). Given that serialized access to the shot file is a common bottleneck, being able to read from and write to disk as fast as possible is very important.
- Lots of RAM, with the fastest clock speed your motherboard supports. Labscript relies heavily on having many simultaneous Python threads; keeping all of them in fast memory is important.
- Decent graphics card. BLACS cycle times include updates to the GUI within the hot loop (for now at least). Ensuring those updates are not limited by how quickly you can draw to the screen is important.
- Obviously, a fast multi-core processor with decent-sized caches. Be wary of going too far here: having too many cores can actually cause subtle slowdowns in software that tries to aggressively parallelize (like certain numpy backends).
- Labscript best practices
- No remote shot storage. Keep your shots local on the computer running BLACS; the added latency of serialized access to a remote shot file can be quite large.
- Ensure devices don't open the shot file read/write if they will only read. Opening read-only allows multiple simultaneous readers, breaking part of the serialization bottleneck.
- Ensure devices don't hold the shot file open longer than they need to. A common problem is a device that takes a long time to read its data off: if you keep the shot file open during a one-second-long data transfer, nothing else can transition during that time either. Best to open the file to get the necessary info, close it, read the data off the device, then re-open the file to save the data (the first sketch after this list shows this pattern, along with the read-only point above).
- Optimize your script to limit reprogramming and excessively large instruction counts. Novatechs are a common culprit here: specifying dense ramps can lead to thousands of instructions, and the serial comms interface on the device is very slow. Determine whether you really need that many instructions (second sketch after this list).
- Leverage the smart cache carefully. Even for a slow-to-program device, if nothing changes between shots the smart cache will prevent reprogramming entirely. If you have a particularly slow device, reprogramming only the individual instructions that changed, rather than the entire table, may be warranted (third sketch after this list).
- Avoid slow comms devices. Basic serial interfaces, overloaded GPIB buses, USB 1.1/2.0 devices, or slow Ethernet devices can all slow things down. Use the fastest interface available, or even find a different device with a better interface.
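To make the shot-file points concrete, here is a minimal sketch of the open/close pattern inside a device worker's transition_to_manual. The file layout, the stashed self.h5_file attribute, and download_traces() are hypothetical stand-ins for your device's specifics:

```python
import labscript_utils.h5_lock  # labscript's h5 file locking; import before h5py
import h5py

# Inside your device's worker class (self.h5_file assumed stashed during
# transition_to_buffered; download_traces() is a hypothetical slow readout):
def transition_to_manual(self):
    # Open read-only just long enough to grab what we need, so other
    # devices can still read the shot file concurrently.
    with h5py.File(self.h5_file, 'r') as f:
        n_samples = f[f'devices/{self.device_name}'].attrs['n_samples']

    # The slow part: pull data off the instrument with the file CLOSED,
    # so everything else can keep transitioning in the meantime.
    traces = self.download_traces(n_samples)  # may take ~1 s

    # Re-open read/write only for the brief moment needed to save results.
    with h5py.File(self.h5_file, 'r+') as f:
        f.require_group('data').create_dataset(self.device_name, data=traces)
    return True
```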
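On the instruction-count point, here is the same 0.5 s sweep specified at two sample rates using labscript's standard ramp() call (channel name and values are hypothetical). The coarse version compiles to 100x fewer instructions for the slow serial link to swallow:

```python
# The same frequency sweep, specified two ways (use one or the other):
dds0.frequency.ramp(t, duration=0.5, initial=80e6, final=85e6,
                    samplerate=10e3)  # compiles to ~5000 instructions
dds0.frequency.ramp(t, duration=0.5, initial=80e6, final=85e6,
                    samplerate=100)   # compiles to ~50 instructions
```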
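And on the smart-cache point, a minimal sketch of per-row reprogramming, assuming a plain 2D numpy instruction table, a smart_cache dict that persists between shots, and a hypothetical program_row() wrapper around the device's slow serial protocol:

```python
import numpy as np

# Inside your device's worker class:
def program_table(self, table, fresh):
    # self.smart_cache is assumed to be a dict that persists between shots,
    # holding the table programmed on the previous shot.
    cached = self.smart_cache.get('table')
    if fresh or cached is None or cached.shape != table.shape:
        rows = range(len(table))  # no usable cache: program everything
    else:
        # Only the rows that actually differ from the previous shot.
        rows = np.flatnonzero((table != cached).any(axis=1))
    for i in rows:
        self.program_row(i, table[i])  # hypothetical slow serial write
    self.smart_cache['table'] = table.copy()
```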
- Labscript hacks
- Try to move some of your experiment into the programming time. A common case is performing your MOT reload between shots, where precise timing and dynamic control aren't necessary (see the first sketch after this list).
- Do multiple experiments in a single shot. This distributes the programming penalty over more data points, at the expense of a more complicated script and analysis (second sketch below).
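For the MOT-reload case, the usual trick is to end each shot in the loading configuration so the MOT refills during the inter-shot dead time. A sketch, with hypothetical channel names and values:

```python
# At the end of the experiment script, return to MOT-loading conditions
# before calling stop(), so loading happens during programming/readout time.
t += 1e-3
mot_shutter.open(t)
mot_coils.constant(t, 1.2)       # loading-gradient coil current
mot_detuning.constant(t, -18e6)  # loading detuning
stop(t + 1e-3)
```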
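For multiple experiments per shot, the script-side change is essentially just a loop; your analysis then has to slice the repetitions back apart. A sketch, with a hypothetical do_sequence() helper containing one repetition of your sequence:

```python
start()
t = 0
for detuning in [-2e6, 0.0, 2e6]:  # three data points in one shot
    t = do_sequence(t, detuning)   # hypothetical: returns its end time
    t += mot_reload_time           # reload between repetitions
stop(t)
```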
- Optimized dependencies
- Use a modern Python. CPython has seen real interpreter speedups starting with 3.11, so moving to a later version can provide modest, essentially free gains.
- Use optimized backends. Not all of the software labscript relies on is Python; make sure you aren't being limited by a compiled dependency. For instance, labscript uses BLAS/LAPACK via numpy/scipy, and a BLAS/LAPACK build optimized for your architecture can provide some benefit. By default, the numpy from conda's defaults channel is built against the highly optimized MKL, which tends to be faster for heavy computational loads on Intel CPUs (a quick way to check what you have is shown below).
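You can check which backend your numpy is linked against from any interpreter; look for MKL or OpenBLAS in the output:

```python
import numpy as np
np.show_config()  # prints the BLAS/LAPACK libraries numpy was built against
```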
- Carefully profile
- Once the above have been implemented and/or have failed to help, you need to think about proper optimization, and that requires proper profiling (i.e., more than the times BLACS flashes in the GUI). For very slow things, checking timestamps in the logs may provide a hint, but realistically a full profiling solution will be necessary. Setting up a common recipe for this is on my todo list, but I would recommend looking into a statistical profiler (like Scalene), as it requires less tampering and overhead; a simple first-pass timing wrapper is sketched below. In any case, it is very hard to fix what you can't quantitatively measure, so you really should do this before moving on to the next step, so you can focus your efforts where they'll give the most bang for your time.
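Before committing to a full profiler, a simple timing wrapper around a suspect worker method can already tell you a lot. A minimal sketch using only the standard library:

```python
import functools
import logging
import time

def timed(func):
    """Log the wall-clock duration of each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            logging.info('%s took %.3f s', func.__name__,
                         time.perf_counter() - t0)
    return wrapper

# e.g. in a device worker class:
#     transition_to_buffered = timed(transition_to_buffered)
```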
- Tune labscript internals for your use case
- Labscript is designed to be quite general, and we are generally loath to implement an optimization in the mainline repos that limits flexibility. That said, most experiments don't need labscript's full flexibility and can make local customizations that trade unneeded features for greater speed (i.e., the kinds of things Johannes linked). Be wary of deviating too far from mainline without good reason: once your code diverges, merging updates from mainline can get pretty painful.
In any case, if you (or others) make progress on identifying bottlenecks or improving speed, please do post something to the list-serve. That experience is invaluable to the community, but it can be very hard to find when it is buried deep in random GitHub repos, theses, and lab notes.
-David