Airflow webserver down, but processes still running?


Imran Hassanali

Aug 19, 2018, 2:19:16 PM
to cloud-compo...@googlegroups.com
Hello,

I get the following 502 error when trying to access the Airflow UI. However, it looks like my DAGs are still running, since I am getting emails related to those runs. Does anyone know how I can debug what is going on with the UI, or reset it somehow? Note that it was working fine on Friday and there haven't been any code/config changes since.

Thanks!


Error: Server Error

The server encountered a temporary error and could not complete your request.

Please try again in 30 seconds.

--
Imran Hassanali | im...@essential.com

Feng Lu

Aug 19, 2018, 7:08:59 PM
to im...@essential.com, cloud-composer-discuss
Could you PM me your Airflow webserver URL? 

--
You received this message because you are subscribed to the Google Groups "cloud-composer-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cloud-composer-di...@googlegroups.com.
To post to this group, send email to cloud-compo...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloud-composer-discuss/CACtOZjpzjnhgqCGzP4k1rFTPBc_x35DFJvhFeMojWvLtt6eQXA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Feng Lu

Aug 21, 2018, 4:59:52 PM
to im...@essential.com, cloud-composer-discuss
It turns out that gs://{your-dag-bucket}/airflow.cfg (generated and needed by the Composer service) was accidentally removed, which halts the Airflow webserver init process.
(The Airflow web UI needs access to this shared airflow.cfg file to make sure configs are consistent across all Airflow services.)

Please try making a dummy Airflow config update; that should bring back the airflow.cfg file and restore the webserver for you.
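A sketch of what such a dummy config update might look like with gcloud; the environment name and location are placeholders, and the touched setting (`webserver-dag_orientation`, set to its usual default) is just an arbitrary harmless key to force Composer to rewrite the environment config:

```shell
# Hypothetical no-op-ish update that makes Composer regenerate
# gs://{your-dag-bucket}/airflow.cfg; substitute your own environment/location.
gcloud composer environments update my-environment \
    --location us-central1 \
    --update-airflow-configs=webserver-dag_orientation=LR
```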

Imran Hassanali

Aug 21, 2018, 5:44:56 PM
to Feng Lu, cloud-compo...@googlegroups.com
Success! I am back up.  

Thanks! I think I inadvertently deleted it when setting up a git hook to gsutil rsync my code to the DAG bucket, and didn't realize the file was needed or that it had been removed.

Appreciate the help
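For anyone wiring up a similar hook, a hedged sketch of a safer sync (bucket name and local paths are placeholders): `gsutil rsync -d` deletes remote objects that are missing locally, which is what can wipe the Composer-managed airflow.cfg when syncing against the bucket root. Syncing only the dags/ prefix, or excluding airflow.cfg, avoids that:

```shell
# Hypothetical post-commit hook body; bucket name and paths are placeholders.
# Option 1: sync only the dags/ prefix so bucket-root files like airflow.cfg
# are never touched by the delete pass.
gsutil -m rsync -r -d ./dags gs://your-composer-bucket/dags

# Option 2: if you must sync the bucket root, exclude the managed config file.
# -x takes a Python regex matched against the object path.
gsutil -m rsync -r -d -x '^airflow\.cfg$' ./repo gs://your-composer-bucket
```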
--
Imran Hassanali | im...@essential.com

Feng Lu

Aug 21, 2018, 7:00:09 PM
to im...@essential.com, cloud-composer-discuss
Great to know!
Stefano Giostra

May 20, 2019, 4:43:10 AM
to cloud-composer-discuss
Hi Feng,
I have the same issue, but in my project the airflow.cfg file is in its place.
Can you help us?



Feng Lu

May 20, 2019, 8:07:57 PM
to Stefano Giostra, cloud-composer-discuss
Sure, are you able to find anything interesting in the Airflow webserver logs?


Stefano Giostra

May 22, 2019, 3:40:28 AM
to cloud-composer-discuss
Hi Feng,
No, everything seems OK in the logs. Sometimes we see a message about a timeout of the scheduler, but we can't understand why:

{
  insertId: "4g09oof4yu842"
  logName: "projects/......./logs/airflow-webserver"
  receiveTimestamp: "2019-05-22T07:36:45.242743062Z"
  resource: {…}
  textPayload: "Timeout: 120 "
  timestamp: "2019-05-22T07:36:42.719983110Z"
}

Stefano Giostra

May 27, 2019, 9:06:40 AM
to cloud-composer-discuss
Hi Feng,
do you have any ideas for my issue?
Thanks

Feng Lu

May 29, 2019, 2:02:53 AM
to Stefano Giostra, cloud-composer-discuss
Yes, I think the bug is caused by the Airflow webserver blocking while waiting for all DAGs to be loaded (before serving any web request, code).
We addressed this problem by introducing asynchronous Airflow DAG loading, which is available in 1.7.1 and should reach all regions by 5/31.
Please upgrade your current environment to composer-1.7.1-airflow-1.10.2 when it becomes available in your Composer region.

That being said, it normally shouldn't take Airflow double-digit seconds to finish parsing all DAGs, so you may also want to make sure:
1. Parsing DAG code doesn't require input from external systems.
2. Only DAG files are placed inside the dags/ directory.


Stefano Giostra

May 29, 2019, 4:08:35 AM
to Feng Lu, cloud-composer-discuss

Hi Feng,

Do you know if this version also solves the issue with schedule_interval?

I've noticed that when I use cron syntax for the schedule interval, Airflow ignores the schedule.

For example:

DAG            Schedule      Owner    Last Run
assicurazione  00 00 * * *   airflow  2019-05-22 15:54
cougar         05 00 * * *   airflow  2019-05-23 11:42
insight        15 00 * * *   airflow  2019-05-20 10:35

The last runs were not today at 00:00, 00:05, and 00:15.

 

Thanks

 

Stefano G.
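A hedged aside on the question above: in Airflow 1.x, the "Last Run" column shows the execution_date of the most recent run, which for a cron schedule is the start of the interval the run covers, not the wall-clock time the run fired (and manually triggered runs carry their trigger time). That alone can make cron schedules look "ignored". A toy illustration for a daily "00 00 * * *" schedule:

```python
import datetime

def execution_date_for_daily_midnight_run(fire_time: datetime.datetime) -> datetime.datetime:
    """For schedule '00 00 * * *', the run fired at midnight covers the
    previous day, so Airflow 1.x stamps it with that earlier date."""
    return fire_time - datetime.timedelta(days=1)

fired = datetime.datetime(2019, 5, 29, 0, 0)
print(execution_date_for_daily_midnight_run(fired))  # 2019-05-28 00:00:00
```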

From: Feng Lu <fen...@google.com>
Sent: Wednesday, May 29, 2019 08:03
To: Stefano Giostra <sgio...@bitbang.com>
Cc: cloud-composer-discuss <cloud-compo...@googlegroups.com>
Subject: Re: Airflow webserver down, but processes still running?

Amir Amangeldi

Jul 3, 2019, 2:01:35 PM
to cloud-composer-discuss
Hi Feng,

Thank you for the instructions. I set the configurations as specified, but immediately ran into webserver errors after the environment updated.
All DAGs became greyed out in the UI. These are the errors that I see in the webserver logs:
[2019-07-03 17:48:45,682] {dagbag_loader.py:145} WARNING - Dagbag loader sender errors.

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/www/dagbag_loader.py", line 130, in _send_dagbag
    copy.deepcopy(v[x])) for x in new_keys} if k == 'dags' else {
  File "/usr/local/lib/airflow/airflow/www/dagbag_loader.py", line 130, in <dictcomp>
    copy.deepcopy(v[x])) for x in new_keys} if k == 'dags' else {
  File "/opt/python3.6/lib/python3.6/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/usr/local/lib/airflow/airflow/models.py", line 4151, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/usr/local/lib/airflow/airflow/models.py", line 2874, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/opt/python3.6/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/opt/python3.6/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/opt/python3.6/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/opt/python3.6/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/opt/python3.6/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/opt/python3.6/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/opt/python3.6/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/opt/python3.6/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.RLock objects


Looks like the scheduler and the workers have no issues, just the webserver.
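The root of the traceback above is its final line: copy.deepcopy eventually reaches an object holding a thread lock, and locks can't be pickled or copied. A minimal stdlib-only reproduction; the DagLike class is hypothetical, standing in for a DAG object that carries, for example, a client or logger with an internal lock:

```python
import copy
import threading

class DagLike:
    """Stand-in for an object graph that embeds a non-copyable lock."""
    def __init__(self):
        self.dag_id = "example"
        self.lock = threading.RLock()  # deepcopy will choke on this attribute

obj = DagLike()
try:
    copy.deepcopy(obj)
    outcome = "copied"
except TypeError as exc:
    # Mirrors the thread's error: deepcopy falls back to pickle machinery,
    # which refuses _thread.RLock objects.
    outcome = f"TypeError: {exc}"

print(outcome)
```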
We're using composer-1.7.1-airflow-1.10.2 with the following environment configurations:

core:
  max_active_runs_per_dag = 1
  dags_are_paused_at_creation = True
scheduler:
  min_file_process_interval = 60
  catchup_by_default = False
webserver:
  collect_dags_interval = 30
  worker_refresh_interval = 3600
  dagbag_sync_interval = 10
  async_dagbag_loader = True

Please advise.
Thank you,

Amir Amangeldi


EVERQUOTE  |  Senior Software Engineer


Diego Serrano

Jun 26, 2020, 6:00:04 AM
to cloud-composer-discuss
Hi Feng,

I had the exact same issue described in this whole thread, and I managed to get Airflow's UI working after removing all the DAGs from Storage and upgrading the environment.

However, I still have the Timeout issue described above, which shows three of my DAGs as broken with the message:

Broken DAG: [/home/airflow/gcs/dags/my_dag.py] Timeout

And the airflow_monitoring DAG has the state:

This DAG seems to be existing only locally. The master scheduler doesn't seem to be aware of its existence.

I only have DAGs in the dags/ directory (although they are separated into subfolders), and the failing DAGs do not require input from external systems.

I deleted them from the dags/ folder, but the environment is still stuck with those error messages.

Do you know a fix for this?

Thanks
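A hedged aside on the error above: "Broken DAG: [...] Timeout" usually means the file exceeded Airflow's per-file parse limit (core.dagbag_import_timeout, 30 seconds by default in Airflow 1.10). If the slow import is legitimate, one possible mitigation is raising that limit via a Composer config override; the environment name and location below are placeholders:

```shell
# Hypothetical override raising the per-file DAG parse timeout to 120 seconds.
gcloud composer environments update my-environment \
    --location us-central1 \
    --update-airflow-configs=core-dagbag_import_timeout=120
```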