Concurrent execution fails with 'no job found'

24 views
Skip to first unread message

Manuel Pasieka

unread,
Aug 1, 2022, 5:21:28 AM8/1/22
to Luigi
Hello Luigi Community,

I have a data processing pipeline, that uses celery queue to trigger Luigi Tasks (using Luigi 3.0.3) as soon as there is new data. That works fine for a single celery worker running a single process. As soon as I start enabling concurrency on that worker and the concurrent execution of the pipeline, my Luigi Tasks fail.

The pipeline consist of 5 steps, and when running two concurrent executions, I can see in the logs that they workers either fail at the 2 or 3 step (there seems to be some race conditions, they don’t always fail at the same step).

Running a test with two new input datasets with two Luigi Workers that a triggered by Celery with concurrency=2, in the logs I can see things like:




```

[2022-07-26 08:41:16,974: WARNING/ForkPoolWorker-1] DEBUG: Asking scheduler for work...

[2022-07-26 08:41:16,974: DEBUG/ForkPoolWorker-1] Asking scheduler for work...

2022-07-26 08:41:16,983 tornado.access[16105] INFO: 200 POST /api/get_work (127.0.0.1) 7.41ms

[2022-07-26 08:41:16,984: WARNING/ForkPoolWorker-1] DEBUG: Pending tasks: 1

[2022-07-26 08:41:16,984: DEBUG/ForkPoolWorker-1] Pending tasks: 1

[2022-07-26 08:41:16,985: WARNING/ForkPoolWorker-1] INFO: [pid 16155] Worker Worker(salt=808887536, workers=1, host=ip-172-31-33-38, username=ubuntu, pid=16155) running RunDataPipeline(tenant=test, source_storage_type=minio, preprocessor=hana, batch_ts=01032016, when=2022-07-26T084058)

[2022-07-26 08:41:16,985: INFO/ForkPoolWorker-1] [pid 16155] Worker Worker(salt=808887536, workers=1, host=ip-172-31-33-38, username=ubuntu, pid=16155) running RunDataPipeline(tenant=test, source_storage_type=minio, preprocessor=hana, batch_ts=01032016, when=2022-07-26T084058)

[2022-07-26 08:41:16,989: WARNING/ForkPoolWorker-1] Cannot access hana.main.01032016, no job found
```



```

[2022-07-26 08:41:17,576: WARNING/ForkPoolWorker-2] DEBUG: Asking scheduler for work...

[2022-07-26 08:41:17,576: DEBUG/ForkPoolWorker-2] Asking scheduler for work...

2022-07-26 08:41:17,584 tornado.access[16105] INFO: 200 POST /api/get_work (127.0.0.1) 6.15ms

[2022-07-26 08:41:17,585: WARNING/ForkPoolWorker-2] DEBUG: Pending tasks: 1

[2022-07-26 08:41:17,585: DEBUG/ForkPoolWorker-2] Pending tasks: 1

[2022-07-26 08:41:17,585: WARNING/ForkPoolWorker-2] INFO: [pid 16159] Worker Worker(salt=013570457, workers=1, host=ip-172-31-33-38, username=ubuntu, pid=16159) running RunDataPipeline(tenant=test, source_storage_type=minio, preprocessor=hana, batch_ts=01032017, when=2022-07-26T084058)

[2022-07-26 08:41:17,585: INFO/ForkPoolWorker-2] [pid 16159] Worker Worker(salt=013570457, workers=1, host=ip-172-31-33-38, username=ubuntu, pid=16159) running RunDataPipeline(tenant=test, source_storage_type=minio, preprocessor=hana, batch_ts=01032017, when=2022-07-26T084058)

[2022-07-26 08:41:17,588: WARNING/ForkPoolWorker-2] Cannot access hana.main.01032017, no job found
```

What does the warning "Cannot access XXXX, no job found" mean?

Any tips on how to debug the issue?
Thank you,
Manuel

Lars Albertsson

unread,
Aug 16, 2022, 5:34:21 PM8/16/22
to Manuel Pasieka, Luigi
I cannot find the string "no job found" in the luigi source. Could it come from some other component? Celery or hana?

--
You received this message because you are subscribed to the Google Groups "Luigi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to luigi-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/luigi-user/cae12834-d5b6-4e71-89ec-5c0d2d9c872an%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages