Cloud Composer can't render dynamic DAG in webserver UI: "DAG seems to be missing"


Lucas Magalhães

Jan 12, 2021, 10:13:39 AM
to cloud-composer-discuss
Hi everyone,
I'm trying to make a DAG that has two operators that are created dynamically, depending on the number of "pipelines" that a JSON config file has. This file is stored in the variable dag_datafusion_args. Then I have a standard BashOperator, and at the end I have a task called success that sends a message to Slack saying that the DAG is done. The other two tasks, which are PythonOperators, are generated dynamically and run in parallel.

I'm using Composer. When I put the DAG in the bucket it appears in the webserver UI, but when I click to see the DAG the following message appears: 'DAG "dag_lucas4" seems to be missing.' If I test the tasks directly via the CLI on the Kubernetes cluster, they work! But I can't seem to make the DAG appear in the web UI.

As some people suggested on Stack Overflow, I tried restarting the webserver by installing a Python package; I tried three times but without success. Does anyone know what it could be?


dag.py
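[For context, a minimal sketch of the pattern described above. This is not the attached dag.py; `return_datafusion_config_file` is the function name mentioned later in the thread, everything else (config keys, task names, wiring) is illustrative, and Airflow 1.10.x import paths are assumed.]

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator      # Airflow 1.10.x path
from airflow.operators.python_operator import PythonOperator  # Airflow 1.10.x path


def return_datafusion_config_file(env):
    # Placeholder: in the real DAG this reads a JSON config file for the given
    # environment; a static dict is returned here so the sketch stands alone.
    return {'pipelines': ['pipeline_a', 'pipeline_b']}


def run_pipeline(pipeline_name):
    print('running pipeline {}'.format(pipeline_name))


def notify_success():
    print('DAG finished')  # stand-in for the Slack notification


# Runs at parse time, i.e. every time the scheduler *and* the webserver parse this file
dag_datafusion_args = return_datafusion_config_file('med')

with DAG('dag_lucas4', start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    start = BashOperator(task_id='start', bash_command='echo start')
    success = PythonOperator(task_id='success', python_callable=notify_success)

    # One PythonOperator per pipeline, all running in parallel between start and success
    for pipeline in dag_datafusion_args.get('pipelines', []):
        run = PythonOperator(
            task_id='run_{}'.format(pipeline),
            python_callable=run_pipeline,
            op_kwargs={'pipeline_name': pipeline},
        )
        start >> run >> success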

rodrigo....@wizeline.com

Jan 12, 2021, 10:52:40 AM
to cloud-composer-discuss
Could you share the function `return_datafusion_config_file('med')`, please?

Shiv Shankar

Jan 12, 2021, 11:48:28 AM
to rodrigo....@wizeline.com, cloud-composer-discuss

Hi Lucas,

The most likely reason for your issue is the Composer environment architecture. The Airflow webserver in Cloud Composer runs in the tenant project, while the workers and scheduler run in the customer project. The tenant project is simply a Google-managed environment for some of the Airflow components. So the webserver UI doesn't have complete access to your project's resources, because it doesn't run inside your project's environment.

Most likely the JSON file you are trying to read to dynamically generate the DAG's tasks, or anything else in the DAG file that tries to access a resource that is not reachable from outside your project, is the exact cause of this issue. Check the Cloud Composer Airflow webserver logs in Cloud Logging in the GCP Console; the exact error should be there.
I had the same issue: we generate a DAG's tasks dynamically by reading a Cloud SQL database table. We solved it by deploying a self-managed Airflow webserver.
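[For reference, a minimal defensive sketch, not the self-managed-webserver route above: wrap the top-level config read so that a failure cannot break DAG parsing in an environment that cannot reach the resource, such as the tenant-project webserver. The path, fallback and function name below are illustrative.]

import json
import logging


def load_pipeline_config(path):
    """Read the JSON config, but never let a failure break DAG parsing."""
    try:
        with open(path) as config_file:
            return json.load(config_file)
    except Exception as exc:  # e.g. the file is not reachable from the webserver
        logging.warning('Falling back to an empty config, could not read %s: %s', path, exc)
        return {'pipelines': []}


# Illustrative path: the data/ folder of the environment bucket is mounted on the
# workers, but it is not synced to the managed webserver.
dag_datafusion_args = load_pipeline_config('/home/airflow/gcs/data/datafusion_config.json')

[With this, the webserver renders the DAG without the dynamic tasks instead of showing "seems to be missing", and the real error lands in the logs.]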

Hope it helps!!

Thanks
Shiv



Rodrigo Chaparro Plata Hernández

Jan 12, 2021, 12:16:42 PM
to Shiv Shankar, cloud-composer-discuss
Sounds like a permission issue, as Shiv mentioned. Instead of reading the JSON from the bucket, you can try reading it from Airflow Variables:

Admin -> Variables

Create one called `datafusion_config` and put in the JSON content, e.g.:
{
    "key1": "value1",
    "key2": "value2"
}

etc...

and you can retrieve the values as follows in your DAG

from airflow.models import Variable

# Returns the deserialized JSON dict, or the default {} if the variable is not set
dag_vars = Variable.get("datafusion_config",
                        default_var={},
                        deserialize_json=True)

key1 = dag_vars.get('key1')
key2 = dag_vars.get('key2')

and so on

Note that if your JSON object is not well-formed, you will get an error while loading the DAG.
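[To tie this back to the dynamic tasks, the deserialized dict can drive the task-generation loop directly. A 'pipelines' key and a run_pipeline callable are assumed here, matching the sketch earlier in the thread; dag and dag_vars are the objects defined above.]

from airflow.operators.python_operator import PythonOperator  # Airflow 1.10.x path

# dag and run_pipeline as in the earlier sketch, dag_vars as above
for pipeline in dag_vars.get('pipelines', []):
    PythonOperator(
        task_id='run_{}'.format(pipeline),
        python_callable=run_pipeline,
        op_kwargs={'pipeline_name': pipeline},
        dag=dag,
    )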

----

Rodrigo Chaparro Plata Hernández | WIZELINE

Senior Data Engineer

rodrigo....@wizeline.com | +52 459 103 05 81

Amado Nervo #2200, Int 6, Jardines del Sol, 45050 Zapopan, Jalisco


Lucas Ferreira

Jan 12, 2021, 12:52:37 PM
to Rodrigo Chaparro Plata Hernández, Shiv Shankar, cloud-composer-discuss
Thank you all. It is definitely due to that function that imports the JSON config file: I wrote the entire JSON string inside the DAG file and it works. I will try to do what Rodrigo said and create an Airflow Variable.

Mateusz Henc

Jan 12, 2021, 3:12:21 PM
to Lucas Ferreira, Rodrigo Chaparro Plata Hernández, Shiv Shankar, cloud-composer-discuss
Hi,
You may also use DAG serialization.

It will stop the Airflow webserver UI from parsing the DAG files itself; with serialization enabled, the webserver reads the serialized DAGs from the Airflow database instead.

Best regards,
Mateusz Henc

