I kicked off two sets of tasks that had overlapping dependencies and was surprised when the second set just stopped running rather than waiting for the other tasks to finish. If I wait a bit and kick it off again, everything works fine.
Three main questions here:
1. What does "not granted run permission" really mean?
2. Is it okay to have two tasks with different names that require the same set of upstreams?
3. How do I avoid this problem in the future? (e.g., should I run a loop that just retries when it gets "not granted run permission"?)
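To make question 3 concrete, this is roughly the kind of retry loop I have in mind. It is only a sketch: `run_with_retry`, `run_once`, `max_attempts`, and `delay_seconds` are names I made up for illustration, and nothing here is part of Luigi itself.

```python
import time
from typing import Callable


def run_with_retry(run_once: Callable[[], bool],
                   max_attempts: int = 5,
                   delay_seconds: float = 60.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Call run_once until it returns True or the attempts run out.

    run_once is a hypothetical callable that launches the whole flow and
    returns True only if everything completed (for example, no task was
    left "not granted run permission").
    Returns the number of attempts used; raises RuntimeError otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        if run_once():
            return attempt
        # Back off before the next attempt, but not after the last one.
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Here `run_once` could wrap something like `subprocess.run(["luigi", "SomeTask", "--workers=40"]).returncode == 0`. That assumes Luigi is configured to exit non-zero when tasks are not run (I believe the `[retcode]` config section is meant for this, but I have not verified it), since otherwise the loop cannot tell a partial run from a complete one.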
---
More detail:
I have a long-running ETL flow that I want to kick off, with dependencies shared between different tasks. I was running ~40 workers on each of two different machines: WorkerGroupA was supposed to run the big ETL flow, while WorkerGroupB (on the other machine) was running related tasks that are not actually in the same tree. ("WorkerGroup" is what I'm calling a single "luigi SomeTask --workers=40" invocation.)
(The task flow is as follows: one task clusters the data, and then I need to generate some files based on the population in each cluster. It's possible to have two "GenerateFilesForCluster" tasks that have different input files but the same upstream requirements. WorkerGroupA would then perform other work later on.)
When I tried to kick off WorkerGroupA, it ran 40 tasks, but then stopped with the "This progress looks :| because there were tasks that were not granted run permission by the scheduler" message.
My expectation was that WorkerGroupA would keep polling until WorkerGroupB finished its tasks, picking up tasks where it could help. Is it a problem for two tasks to share dependencies when their parents are in different trees?
Happy to provide more information, and my apologies if this is too vague or should be posted elsewhere. (The full summary message is at the end.) I tried googling a bit and found this: https://groups.google.com/forum/#!topic/luigi-user/YH6pxBngKDw . However, that issue seemed to be about using only a single worker, whereas I'm using multiple.
Thank you for your help!
Jeff
---
* 679 present dependencies were encountered:
- 150 Task
...
* 358 ran successfully:
- 9 Task
- 5 Task
...
* 1679 were left pending, among these:
* 1 were missing external dependencies:
- 1 Task(...)
* 40 were being run by another worker:
- 40 Task(...)
* 1635 had missing external dependencies:
- 1 Task(...)
- 326 Task(...) ...
- 326 Task(...)
- 326 Task(...)
- 326 Task(...)
...
* 1 had dependencies that were being run by other worker:
- 1 **TaskA**(...)
* 2 was not granted run permission by the scheduler:
- 1 **TaskB**(...) (upstream of TaskA)
- 1 **TaskC**(...) (requires TaskB but not TaskA)
The other workers were:
- Worker(salt=863995201, workers=40, host=oak1-prd-hpc-n018, username=production, pid=3943266) ran 40 tasks
This progress looks :| because there were tasks that were not granted run permission by the scheduler
--
You received this message because you are subscribed to the Google Groups "Luigi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to luigi-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.