Reloading Plug-ins?

Craig Kimerer

Sep 9, 2015, 11:38:29 AM

to Airflow

Hello,

I spent some time debugging an issue I ran into while deploying new DAGs to an already running scheduler. The new DAGs were picked up as expected, but whenever I deployed a DAG that depended on a new custom plugin, I had to restart the scheduler before the plugin was actually imported. Is this behavior I should rely on when deploying (if anything in plugins has changed, restart the scheduler), or is it something I should file an issue about?

Thanks in advance.

Maxime Beauchemin

Sep 9, 2015, 5:50:03 PM

to Airflow

New DAGs should get picked up, but there's no built-in mechanism as of now to pick up new or changed plugins.

Until then, we've been running this hack to auto-restart the scheduler every N runs; it addresses this issue and other corner cases:

while echo "Running"; do
    airflow scheduler -n 5
    echo "Scheduler crashed with exit code $?. Respawning.." >&2
    date >> /tmp/airflow_scheduler_errrors.txt
    sleep 1
done

We should absolutely improve the scheduler to be 100% resilient and reflect all changes, but in the short term this works well.
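For the plug-in case specifically, a deploy script could restart the scheduler only when the plugins folder has actually changed. The following is a minimal sketch of that idea, not something Airflow provides: the function names `plugins_fingerprint` and `plugins_changed`, the state-file argument, and the directory layout are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: fingerprint a plugins folder so a deploy script can tell whether
# the scheduler needs a restart. The function names and the state file are
# hypothetical helpers, not Airflow features.

# Print one checksum summarizing every file under the given directory.
plugins_fingerprint() {
    local dir="$1"
    # Sort the file list so the result does not depend on traversal order,
    # checksum each file, then checksum the combined listing.
    (cd "$dir" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 cksum) \
        | cksum | cut -d ' ' -f 1
}

# Return 0 if the fingerprint differs from the one recorded in the state
# file (and record the new one); return 1 if nothing changed.
plugins_changed() {
    local dir="$1" state="$2" current previous=""
    current="$(plugins_fingerprint "$dir")"
    if [ -f "$state" ]; then
        previous="$(cat "$state")"
    fi
    if [ "$current" != "$previous" ]; then
        printf '%s\n' "$current" > "$state"
        return 0
    fi
    return 1
}
```

A deploy step might then call something like `if plugins_changed "$AIRFLOW_HOME/plugins" /var/tmp/plugins.cksum; then ...restart the scheduler...; fi`, where both paths are placeholders and the restart command depends on how the scheduler is supervised. The state file should live outside the plugins folder so that writing it does not change the fingerprint.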

Craig Kimerer

Sep 9, 2015, 6:29:27 PM

to Airflow

Thanks for the reply. Can you shed any light on the other corner cases I might hit by not restarting the scheduler? I am just trying to figure out the right path forward for running Airflow in a production environment.

Thanks.

Maxime Beauchemin

Sep 9, 2015, 7:16:46 PM

to Airflow

Some of these bugs have been addressed over time, but I remember that there were issues around cases like:

* Moving a dag_id from one file to another: the DAG may stop getting scheduled, or the old version might get scheduled

* Deleting a pipeline file: the DAG may linger and still appear in production until the scheduler gets restarted

* Very rare networking or transient errors may not be caught in a try block, and the scheduler can just stop if you are not using runit, sv, upstart, or some other daemon service that ensures the process stays up

Since we've had the hack to restart the scheduler periodically, we haven't seen any of these, but they may still occur. I'll open a GitHub issue to keep track of it.
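If no supervisor such as runit or upstart is available, the respawn loop can at least record why each run ended, which makes the rare unexpected stops easier to tell apart from planned restarts. This is only a sketch under assumptions: `respawn` is a hypothetical helper, and the log format and run limit are choices made here for illustration.

```shell
#!/usr/bin/env bash
# respawn LOGFILE MAX_RUNS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND up to MAX_RUNS times, appending a
# timestamp and the command's exit code to LOGFILE after every exit.
respawn() {
    local log="$1" max_runs="$2" runs=0 status
    shift 2
    while [ "$runs" -lt "$max_runs" ]; do
        status=0
        # Capture the exit code without aborting if the command fails.
        "$@" || status=$?
        runs=$((runs + 1))
        printf '%s exit=%s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$status" >> "$log"
        # Brief pause so a command that fails immediately does not spin.
        sleep 1
    done
}
```

It could be invoked as `respawn /tmp/scheduler_exits.log 1000 airflow scheduler -n 5`, where the log path and the limit of 1000 runs are placeholders. A proper process supervisor remains the more robust option, since this loop itself is not restarted if it dies.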
