Airflow-sqlproxy CrashLoopBackOff


ev...@seeraerospace.com

Apr 19, 2019, 10:46:06 AM
to cloud-composer-discuss
Has anyone else hit an issue on Cloud Composer where the airflow-sqlproxy workload unexpectedly starts showing the "CrashLoopBackOff" status? When I pull logs from the pod using `kubectl`, all I see is the following:

```
2019/04/19 14:39:24 the default Compute Engine service account is not configured with sufficient permissions to access the Cloud SQL API from this VM. Please create a new VM with Cloud SQL access (scope) enabled under "Identity and API access". Alternatively, create a new "service account key" and specify it using the -credential_file parameter
```

This started happening unexpectedly yesterday around 18:30 UTC. We're on composer-1.6.1-airflow-1.10.1, and until then our environment had been working fine since it was stood up on 2019-04-06. Looking at the Stackdriver logs, I don't see anything particularly useful.

Any suggestions as to what can be done?

Thanks,
Evan

ev...@seeraerospace.com

Apr 19, 2019, 10:50:27 AM
to cloud-composer-discuss
Also, to clarify how this is affecting us: the scheduler no longer works because it can't open a MySQL connection. We see this in the logs for the airflow-scheduler pod:

sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (MySQLdb._exceptions.OperationalError) (2003, "Can't connect to MySQL server on 'airflow-sqlproxy-service.default.svc.cluster.local' (110)")

Feng Lu

Apr 21, 2019, 7:58:33 PM
to ev...@seeraerospace.com, cloud-composer-discuss
What happens if you restart the airflow-sqlproxy pod, e.g. `kubectl delete pod {your-airflow-sqlproxy-pod-name}`?
If that fixes the problem, this is very likely a cloud-sqlproxy bug.
We have seen a similar sqlproxy issue from other customers (not on 1.6.1, though) and will enhance sqlproxy liveness checking in the next release.
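
A minimal sketch of that restart, for anyone who wants to script it. It assumes `kubectl` is already pointed at the environment's GKE cluster and that the proxy runs in the `default` namespace (the service name in the scheduler error above suggests so, but check your own cluster). The function names `crashlooping_pods` and `restart_sqlproxy` are made up for illustration.

```shell
#!/bin/sh
# Print the names of pods whose STATUS column reads CrashLoopBackOff.
# Reads the default table printed by `kubectl get pods` on stdin
# (columns: NAME READY STATUS RESTARTS AGE); the header row is skipped.
crashlooping_pods() {
    awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Delete every crash-looping airflow-sqlproxy pod so that it gets re-created.
# The namespace defaults to "default"; that is an assumption, pass another
# one as the first argument if your sqlproxy runs elsewhere.
restart_sqlproxy() {
    namespace="${1:-default}"
    kubectl get pods --namespace "$namespace" \
        | crashlooping_pods \
        | grep '^airflow-sqlproxy' \
        | while read -r pod; do
              kubectl delete pod "$pod" --namespace "$namespace"
          done
}
```

Calling `restart_sqlproxy` with no arguments targets the `default` namespace. If the re-created pod logs the same permissions message, a restart alone will not help and the node's access scopes are the more likely cause.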


ar...@myndyou.com

Apr 22, 2019, 6:04:23 AM
to cloud-composer-discuss
We are having the same issue. I created a new cluster, but now the composer-fluentd process shows the same "CrashLoopBackOff" status.

Rob Schoenbeck

Jun 14, 2019, 1:45:40 PM
to cloud-composer-discuss
Have the same issue. Did anyone ever find a solution? Deleting the sqlproxy pod just produces the same error message when it is re-created.

Evan Wang

Jun 14, 2019, 6:56:21 PM
to cloud-composer-discuss
It's been a little while, but if I remember correctly our issue was related to adding additional node pools on GKE for our KubernetesPodOperators.

It turned out that we needed to specify the `compute-rw`, `storage-ro`, and `cloud-platform` access scopes when creating the pools -- hopefully that helps
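
A rough sketch of checking and applying that fix with `gcloud`, in case it saves someone time. The cluster, zone, and pool names below are placeholders, not values from this thread, and the helper function names are made up. The scope list is the one recalled above; whether all three are strictly required is not confirmed here.

```shell
#!/bin/sh
# Placeholder names: replace with your own cluster, zone, and node pool.
CLUSTER="my-composer-gke-cluster"
ZONE="us-central1-a"
POOL="pod-operator-pool"

# Succeed if the OAuth scope list on stdin contains a scope that grants
# access to the Cloud SQL Admin API (cloud-platform or sqlservice.admin).
has_cloud_sql_scope() {
    grep -q -E 'auth/(cloud-platform|sqlservice\.admin)'
}

# Check the scopes an existing node pool was created with.
check_pool_scopes() {
    gcloud container node-pools describe "$POOL" \
        --cluster "$CLUSTER" --zone "$ZONE" \
        --format="value(config.oauthScopes)" \
        | has_cloud_sql_scope
}

# Create a node pool with the access scopes mentioned above.
create_pool_with_scopes() {
    gcloud container node-pools create "$POOL" \
        --cluster "$CLUSTER" --zone "$ZONE" \
        --scopes=compute-rw,storage-ro,cloud-platform
}
```

If `check_pool_scopes` fails for a pool and the sqlproxy pod is scheduled onto one of its nodes, that would be consistent with the "not configured with sufficient permissions" message in the first post.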

Rob Schoenbeck

Jun 14, 2019, 6:59:17 PM
to cloud-composer-discuss
Thanks for the reply -- took the opportunity to upgrade our composer environment instead, but will remember that for next time

Sree Iyer

Jun 24, 2021, 3:25:28 PM
to cloud-composer-discuss
Hi,

I'm getting the same error, and I am on the latest version, composer-1.16.7-airflow-1.10.15. I'm not sure how to resolve it.

I have spent hours trying to debug this error, and the documentation is not very helpful.

If anyone here has fixed this error, please let me know.

Thanks,
Sree