"Broken Dag" errors in UI


Terrence Szymanski

May 29, 2018, 9:31:18 PM
to cloud-composer-discuss
I have a setup where several dags (e.g. "dags/dag1.py", "dags/dag2.py") import from other python files in the dags folder (e.g. "dags/settings.py" and others). I'm getting error messages in the web UI like:

Broken DAG: [/home/airflow/gcs/dags/dag1.py] No module named settings

The dags aren't loaded in the UI properly -- I can see them but can't run them and there is an error message like "This DAG seems to be existing only locally. The master scheduler doesn't seem to be aware of its existence."

If I run "list_dags", all the dags are imported fine with no errors, and I can run the same code locally in Airflow with no issues.

In the past, in Composer we had errors like this that would come and go kind of unpredictably, but now I can't seem to get things to work at all. Is it safe to use import statements (e.g. "from settings import foo") for Python modules we copy into the dags folder? It's strange that this usually works for us, but sometimes doesn't.

Terry


Cheng Liu

Jun 9, 2018, 1:18:55 AM
to cloud-composer-discuss
Hi Terry,

Just a hypothesis: do we need an __init__.py file under dags in order for all the dags to load settings.py properly? Otherwise, could you try moving settings.py to dags/lib/settings.py and loading the module via import lib.settings?
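
A minimal sketch of the layout suggested above, assuming the dags folder itself ends up on sys.path (which is how Airflow normally resolves imports from DAG files). The temporary directory and the contents of settings.py are made up for illustration; only the names dags, lib and settings come from this thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_layout(root: Path) -> Path:
    """Create dags/lib/__init__.py and dags/lib/settings.py; return the dags folder."""
    dags = root / "dags"
    lib = dags / "lib"
    lib.mkdir(parents=True)
    (lib / "__init__.py").write_text("")  # marks lib as a regular package
    (lib / "settings.py").write_text("foo = 'bar'\n")  # hypothetical contents
    return dags


def import_settings(dags: Path):
    """Import lib.settings with the dags folder on sys.path, as a DAG file would."""
    sys.path.insert(0, str(dags))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("lib.settings")
    finally:
        sys.path.remove(str(dags))


with tempfile.TemporaryDirectory() as tmp:
    settings = import_settings(build_layout(Path(tmp)))
    print(settings.foo)
```

In a DAG file the equivalent import would be `from lib.settings import foo`. This only shows that the package layout is importable; it does not address files being out of sync between Composer components.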

Terrence Szymanski

Jun 13, 2018, 10:27:42 PM
to cloud-composer-discuss
Thanks for the suggestion. Actually we previously had a structure similar to that but moved everything out of subfolders because that caused fewer issues.

The code works most of the time; the errors only occur some of the time, which makes it hard to debug.

Terry

Trevor Edwards

Jun 19, 2018, 6:03:01 PM
to cloud-composer-discuss
Hi,

If you are importing user-defined modules, I recommend zipping the modules in subfolders (each with an __init__.py).
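
A rough sketch of what zipping the modules could look like, assuming the archive is placed on sys.path so Python's built-in zip import can find the package. The package name dag_helpers, the archive name and the file contents are hypothetical; a real packaged DAG would also carry the DAG files at the root of the archive.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_zip(zip_path: Path) -> None:
    """Write a package with an __init__.py into a zip archive."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("dag_helpers/__init__.py", "")  # subfolder needs __init__.py
        zf.writestr("dag_helpers/settings.py", "foo = 'bar'\n")  # hypothetical contents


# mkdtemp is used so the archive stays on disk while the module is loaded.
zip_path = Path(tempfile.mkdtemp()) / "my_dags.zip"
build_zip(zip_path)

# Putting the archive itself on sys.path lets zipimport resolve the package.
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()
helpers = importlib.import_module("dag_helpers.settings")
print(helpers.foo)
```

The point of the zip is that the DAG and the modules it imports travel as one file, so they cannot be picked up at different versions.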

hec...@iihnordic.com

Jun 28, 2018, 6:39:53 AM
to cloud-composer-discuss
I have had a similar problem that I just posted on Stack Overflow.

I got around the issue by renaming the folder that contains the functions (in your case settings.py) and refactoring the imports in the dags. It's not a great long-term solution though.

Héctor

talha.kha...@gmail.com

Jan 17, 2019, 1:46:42 AM
to cloud-composer-discuss
I had the same problem with Airflow. This usually happens when you update the code with new variable references, such as importing a new variable from a Constants file. For some reason, Airflow doesn't kill some of the child processes it creates to run tasks, and these old "ghost" child processes point to the older version of your code (an older version of the Constants file in our case), which doesn't have the new variable reference. So your updated code has the correct variable reference, but the old child processes don't.

Here's how to fix it:
1. Make sure no Airflow DAGs/tasks are running
2. Stop the Airflow worker/webserver/scheduler
3. Search for all Airflow ghost child processes using `ps aux | grep airflow`
4. Kill all processes that contain the `airflow` keyword using `pkill -f my_pattern`
5. Start the Airflow processes

The Broken DAG errors should then disappear.
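
For anyone who wants to check step 3 from a script before killing anything, here is a small sketch that picks the matching PIDs out of `ps aux` style output. The function name and the sample process table are made up for illustration; it assumes the usual 11-column `ps aux` layout with the command in the last column.

```python
from typing import List


def airflow_pids(ps_output: str, keyword: str = "airflow") -> List[int]:
    """Return PIDs whose command line contains keyword, skipping the grep itself."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        fields = line.split(None, 10)  # the 11th field is the full command line
        if len(fields) < 11:
            continue
        command = fields[10]
        if keyword in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


# Made-up sample output, not taken from a real machine.
sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
airflow 101 0.0 0.1 1000 500 ? S 10:00 0:01 airflow scheduler
airflow 202 0.0 0.1 1000 500 ? S 10:00 0:00 /usr/bin/python /usr/local/bin/airflow run dag1 task1
root 303 0.0 0.0 100 50 pts/0 S+ 10:05 0:00 grep airflow
root 404 0.0 0.0 100 50 ? Ss 09:00 0:00 sshd
"""
print(airflow_pids(sample))  # [101, 202]
```

If the list is not empty after stopping Airflow in step 2, those are the leftover processes that step 4 is meant to kill.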

talha.kha...@gmail.com

Jan 17, 2019, 1:49:55 AM
to cloud-composer-discuss
Following up on my response above: the command in step 4 should be `pkill -f airflow`.

Victor Hansjons Vegeborn

Jul 16, 2020, 6:40:00 PM
to cloud-composer-discuss
Is there a better long-term fix for this issue yet? I am currently experiencing this phenomenon and there is basically no info on this issue anywhere...

Anmol Mourya

Jul 23, 2020, 3:20:36 AM
to cloud-composer-discuss
Yes, just adding __init__.py under dags solved my issue.

sgus...@butterflynetinc.com

Jul 30, 2020, 9:52:59 AM
to cloud-composer-discuss
That will not be solved until GCP fixes the file sync from the bucket to the webserver.