I investigated the multiple task instances angle further. In my previous post I mentioned that I had queried the DB (specifically the table task_instances) and didn't see any duplicate entries. However, when I check the jobs table for entries around the time our task failed, I see the following:
+------+-----------------+---------+--------------+----------------------------+----------------------------+----------------------------+----------------+------------------------------------+----------+
| id   | dag_id          | state   | job_type     | start_date                 | end_date                   | latest_heartbeat           | executor_class | hostname                           | unixname |
+------+-----------------+---------+--------------+----------------------------+----------------------------+----------------------------+----------------+------------------------------------+----------+
| 1403 | production-poki | success | LocalTaskJob | 2019-03-27 05:16:25.286102 | 2019-03-27 05:35:42.650791 | 2019-03-27 05:16:25.286120 | CeleryExecutor | airflow-worker-5d74ddb497-29f4c    | airflow  |
| 1402 | production-poki | success | LocalTaskJob | 2019-03-27 05:12:46.252704 | 2019-03-27 06:21:22.380016 | 2019-03-27 05:12:46.252720 | CeleryExecutor | airflow-worker-5d74ddb497-ck2wr    | airflow  |
| 1401 | production-poki | success | LocalTaskJob | 2019-03-27 05:10:08.055841 | 2019-03-27 05:28:15.305873 | 2019-03-27 05:10:08.055857 | CeleryExecutor | airflow-worker-5d74ddb497-ck2wr    | airflow  |
+------+-----------------+---------+--------------+----------------------------+----------------------------+----------------------------+----------------+------------------------------------+----------+

Even though the table doesn't include the task_id, these entries may still be relevant, as not many tasks were running or scheduled at that moment.
Additionally, I see three scheduler processes running at the moment:
$ ps aux | grep scheduler
airflow 1 0.0 0.0 19780 3176 ? Ss 13:09 0:00 /bin/bash /var/local/airflow.sh scheduler
airflow 35 6.0 1.8 225152 71332 ? S 13:09 0:04 /usr/bin/python /usr/local/bin/airflow scheduler -r 600
airflow 257 18.0 1.8 239492 68592 ? R 13:10 0:00 /usr/bin/python /usr/local/bin/airflow scheduler -r 600
airflow 258 18.0 1.8 239492 68400 ? R 13:10 0:00 /usr/bin/python /usr/local/bin/airflow scheduler -r 600
Is this to be expected?
Best,
Wouter