Time overhead of Shots in BLACS


Johannes Schabbauer

Oct 13, 2022, 5:08:25 AM
to the labscript suite
Hello labscript community,

For the past month I have been working on implementing the experiment control for our new group in Vienna. We want to use an ADwin, which has both internal clocks and programmable output/input modules. I wrote the code for the new device and am so far very happy with how it works in labscript. If anyone is interested, I can share my ADwin implementation when I'm done.

There is just one point I'm worried about right now, which is the time it takes BLACS to execute a shot. Currently the ADwin is the only device connected to BLACS, and the execution time from one shot to the next queued one is about 700 ms. I verified with a standalone Python script that the bare time to send the output/input data between the PC and the hardware and to execute the shot is below 100 ms. In the BLACS log file I've seen that the time between one worker function finishing and the next one starting is about 100 ms. For example, here it takes about 150 ms from the signal that transition_to_buffered has finished to start_run (the master pseudoclock is the same and only device):

2022-10-12 15:42:51,230 DEBUG BLACS.ADwin_main_worker.worker: ADwin called transition_to_buffered.
2022-10-12 15:42:51,233 DEBUG BLACS.ADwin.mainloop: Waiting for worker to acknowledge job request
2022-10-12 15:42:51,233 DEBUG BLACS.ADwin.mainloop: Worker reported job started, waiting for completion
2022-10-12 15:42:51,243 DEBUG BLACS.ADwin_main_worker.worker: Job complete
2022-10-12 15:42:51,243 DEBUG BLACS.ADwin_main_worker.worker: Waiting for next job request
2022-10-12 15:42:51,243 DEBUG BLACS.ADwin.mainloop: Job completed
2022-10-12 15:42:51,243 DEBUG BLACS.ADwin.mainloop: returning worker results to function _transition_to_buffered
2022-10-12 15:42:51,324 DEBUG BLACS.ADwin.mainloop: Finalising function
2022-10-12 15:42:51,324 DEBUG BLACS.queue_manager.thread: ADwin finished transitioning to buffered mode
2022-10-12 15:42:51,339 DEBUG BLACS.ADwin.mainloop: Waiting for next event
2022-10-12 15:42:51,339 DEBUG BLACS.queue_manager.thread: About to start the master pseudoclock
2022-10-12 15:42:51,339 DEBUG BLACS.ADwin.state_queue: requesting next item in queue with mode 8
2022-10-12 15:42:51,339 DEBUG BLACS.ADwin.state_queue: New state queued up. Allowed modes: 8, queue state indefinitely: True, delete stale states: False, function: _start_run
2022-10-12 15:42:51,339 DEBUG BLACS.ADwin.state_queue: Current items in the state queue: [[0, 1214, 8, True, False, [<function ADwinProIITab.start_run at 0x000001B7A95723A0>, [(<queue.Queue object at 0x000001B7A9624DC0>,), {}]]]]
2022-10-12 15:42:51,339 DEBUG BLACS.ADwin.state_queue: Current items in the state queue: [[0, 1214, 8, True, False, [<function ADwinProIITab.start_run at 0x000001B7A95723A0>, [(<queue.Queue object at 0x000001B7A9624DC0>,), {}]]]]
2022-10-12 15:42:51,408 DEBUG BLACS.ADwin.state_queue: Re-initialsing self._get_blocking_queue
2022-10-12 15:42:51,408 DEBUG BLACS.ADwin.state_queue: iterating over states in queue
2022-10-12 15:42:51,408 DEBUG BLACS.ADwin.state_queue: requested state found in queue
2022-10-12 15:42:51,408 DEBUG BLACS.ADwin.mainloop: Processing event _start_run
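In case it helps anyone make the same comparison: the gaps can be computed directly from the log timestamps. Here is a small standalone script (not part of BLACS; the two lines below are just copied from the log above) that measures the ~80 ms pause before "Finalising function":

```python
from datetime import datetime

# Two consecutive lines copied from the BLACS log above;
# only the timestamps matter for this measurement.
log_lines = [
    "2022-10-12 15:42:51,243 DEBUG BLACS.ADwin.mainloop: returning worker results to function _transition_to_buffered",
    "2022-10-12 15:42:51,324 DEBUG BLACS.ADwin.mainloop: Finalising function",
]

def parse_ts(line):
    # Python's logging module formats timestamps as "YYYY-MM-DD HH:MM:SS,mmm"
    # by default, so the first 23 characters hold the full timestamp.
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

# Millisecond gaps between consecutive log entries.
deltas_ms = [
    round((parse_ts(b) - parse_ts(a)).total_seconds() * 1000, 3)
    for a, b in zip(log_lines, log_lines[1:])
]
print(deltas_ms)  # prints [81.0]
```

Pasting a whole log excerpt into `log_lines` gives the per-step breakdown of where the shot time goes.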

Now I'm not sure if this time difference is caused by my implementation of the BLACS tab/worker, or if it is internal overhead of the BLACS experiment queue.

Can someone who has been using labscript for longer tell me how the shot execution time in BLACS compares to the bare hardware-communication time?
If you could send me the BLACS log output from one shot, that would also help me a lot.

Best wishes,
Johannes


Johannes Schabbauer

Nov 8, 2022, 11:56:22 AM
to the labscript suite
I just thought I should give a quick update on what I found after my last post: most of the ~700 ms needed for a single shot was due to the computer having a very weak GPU. This caused a lag when updating the BLACS GUI before and during the shot. Checking the timing on other computers with the same script gave 90-150 ms per shot. It's quite interesting that this can make such a difference, and that the experiment queue depends on updating the GUI.

As far as I understand, there is very little room to make the shot-to-shot performance in BLACS even faster than that.

Best wishes,
Johannes

dihm....@gmail.com

Nov 10, 2022, 2:59:56 PM
to the labscript suite
Johannes,

Thanks for providing an update on this. I'll admit I'm surprised that BLACS can cycle that fast at all!

I'm also surprised the BLACS GUI updates are actually blocking. My understanding was that GUI updates are handled in a separate thread with non-blocking calls specifically to avoid this problem. Did you manage to track down a specific offending operation? I'd be interested in taking a look at it at least briefly.

-David

Johannes Schabbauer

Nov 11, 2022, 9:31:24 AM
to the labscript suite
Hi David,

I have now found that the main cause of the time consumption during each shot is calling the setter functions for the mode and state labels. Just by commenting out the two calls to _update_state_labels in the base tab class, the time went down from ~800 ms to a bit more than 100 ms. I still expect that if I buy another GPU there might be almost no difference anymore, but it could still be worth updating the GUI label just once at the end of the shot (if an error happens, it should still be visible in which step the program stopped).
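For what it's worth, the "update the label once at the end" idea can be sketched independently of BLACS's actual API. Everything below is illustrative: `set_label_text` stands in for the expensive Qt label setter, and `StateLabelUpdater` is a hypothetical helper, not BLACS code.

```python
class StateLabelUpdater:
    """Coalesce per-step state-label updates into one GUI write per shot.

    set_label_text stands in for the (hypothetical) expensive GUI call;
    in BLACS this would be the Qt label setter that turned out to be slow.
    """

    def __init__(self, set_label_text):
        self._set_label_text = set_label_text
        self._last_state = None

    def record(self, state):
        # Cheap: just remember the most recent state, no GUI work yet.
        self._last_state = state

    def flush(self):
        # One GUI update per shot. If an error aborts the shot, flushing in
        # the error handler would still show which step the program stopped at.
        if self._last_state is not None:
            self._set_label_text(self._last_state)

calls = []  # records every "GUI" update for demonstration
updater = StateLabelUpdater(calls.append)
for state in ["transition_to_buffered", "running", "transition_to_manual"]:
    updater.record(state)
updater.flush()
print(calls)  # prints ['transition_to_manual']: one GUI call instead of three
```

On a machine where each label write is slow, this turns N GUI round-trips per shot into one.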

Best wishes, Johannes

Lars Pause

Nov 17, 2022, 2:22:51 AM
to the labscript suite
Hi David,
hi Johannes,
I just saw your discussion, and we recently had a BSc student look into the shot-to-shot flow of BLACS, so maybe this adds some helpful information.
He found that the single Qt main thread is a bottleneck when running shots fast / reprogramming NI cards fast. Here is the extract from his thesis and the corresponding figure 2.3.

[...] A sequence diagram for the execution of a single experiment is shown in Fig. 2.3.
The execution starts in the experiment queue as it selects the next shot that should be executed.
This is followed by a transition to buffered that is sent to the device tabs. They should forward
the instruction to the device workers; however, this forwarding is performed in the Qt
main thread, a thread that Qt creates to control the user interface (UI). This is a potential
bottleneck, as all tabs have to perform their communication through the single Qt main
thread, which works asynchronously but hinders fully parallel programming of the devices. After
transitioning to buffered, each device sends a done message, which is propagated back to the
experiment queue. Once all devices are transitioned to buffered, a start signal is sent to the
main pseudoclock. This is sufficient, as all devices should be in a buffered mode and start
execution automatically once the trigger of the pseudoclock is received.

The pseudoclock device tab then starts to check the status of the pseudoclock at regular intervals.
Once the pseudoclock reports finished, we know that the shot has finished
executing. This is sent to the experiment queue so that all devices can be transitioned to manual
mode.
The transition to manual mode is implemented similarly to the transition to buffered mode:
the command is sent to the device tabs, which pass it through the Qt main thread to the
devices. Once all devices have transitioned to manual, i.e. written their acquired
data to the h5 file and stopped the buffered task, they respond with a done message to the
experiment queue. This marks the end of one shot.
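The cycle described in the extract can be condensed into a sketch. All names below (`Device`, `run_shot`, etc.) are illustrative placeholders for this discussion, not BLACS's actual API, and the hardware interaction is stubbed out:

```python
import time

class Device:
    """Stand-in for a BLACS device tab/worker pair (illustrative only)."""

    def __init__(self, name, is_master_pseudoclock=False):
        self.name = name
        self.is_master_pseudoclock = is_master_pseudoclock
        self._running = False

    def transition_to_buffered(self, shot_file):
        pass  # program the hardware from the shot's h5 file

    def start_run(self):
        self._running = True  # only the master pseudoclock is started directly

    def status_finished(self):
        # Poll the hardware; this stub pretends the shot ends immediately.
        finished = self._running
        self._running = False
        return finished

    def transition_to_manual(self):
        pass  # save acquired data to the h5 file, stop the buffered task

def run_shot(devices, shot_file):
    # 1. Fan out transition_to_buffered to every device tab. In BLACS this
    #    fan-out is serialized through the single Qt main thread.
    for dev in devices:
        dev.transition_to_buffered(shot_file)
    # 2. Start only the master pseudoclock; the other devices are armed
    #    and start on its hardware trigger.
    master = next(d for d in devices if d.is_master_pseudoclock)
    master.start_run()
    # 3. Poll the pseudoclock status at regular intervals until done.
    while not master.status_finished():
        time.sleep(0.01)
    # 4. Fan out transition_to_manual, again through the Qt main thread.
    for dev in devices:
        dev.transition_to_manual()
    return "shot done"

devices = [Device("ADwin", is_master_pseudoclock=True), Device("NI_card")]
print(run_shot(devices, "shot_0.h5"))  # prints "shot done"
```

The two fan-out loops are where the single-threaded Qt bottleneck from the thesis shows up: each pass over `devices` is sequential, so per-device overhead adds up rather than overlapping.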

Best regards,
Lars
[Attachment: Execution_Flow_Blacs.JPG]

Ben Olsen

Dec 29, 2022, 4:18:16 AM
to the labscript suite
Hi Johannes,

I'm also working on setting up an experiment using an ADwin system (Pro II). Would you be comfortable sharing your implementation? I'd be happy to test some things out before I connect anything critical to the ADwin outputs.

Thanks,
Ben