How to restrict a task class to a single running instance at a time?


Samuel Lampa

Aug 9, 2022, 5:30:45 AM
to Luigi
I couldn't find any information about this:

Is there a way to ensure that only one instance of a given task class in a workflow, say DownloadData(), runs at a time?

My use case is that I'm extracting information from a somewhat resource-constrained database. I want the tasks that access that database to run one at a time, while all other tasks in the workflow can still run in parallel.

I saw the max_batch_size parameter [1] , but IIUC, it does something else (more related to bundling multiple tasks depending on each other into one batch)?

Best
Samuel

Lars Albertsson

Aug 9, 2022, 9:31:38 AM
to Samuel Lampa, Luigi


Samuel Lampa

Aug 9, 2022, 11:37:06 AM
to Luigi
Thanks Lars! I was looking at that one, but unless I'm missing something, it would require the "download" tasks to have unique access to all workers. That is less efficient than it needs to be: running e.g. one "download" task alongside a few downstream tasks would be no problem, since it is only the "download" tasks whose parallelism I want to limit.

Samuel

Lars Albertsson

Aug 9, 2022, 2:15:03 PM
to Samuel Lampa, Luigi
I don't understand what "require unique access to all workers" means. 

If e.g. max_download is set to 1 in the luigid conf (the default value), and your task has its `resources` attribute set to `{"max_download": 1}`, you will get mutual exclusion between the workers' download tasks. At most one download task will run at a time, while tasks that do not declare that resource are unaffected. Is that what you wanted to achieve?

The batch* parameters are not related. (And not very useful in a post-Hadoop world, TBH.)

Samuel Lampa

Aug 9, 2022, 2:52:41 PM
to Luigi
Ah, I see now! I realize I had a totally confused understanding of how they work. Gotta try that out. Thank you!

Best
Samuel