Is servicemanagement service broken?!


Alex Van Boxel

Apr 25, 2018, 5:42:47 AM
to Google Cloud Endpoints
All new Kubernetes pods that are being created with Endpoints (all of them) are now crash-looping because we get an unauthorized error. Although our production is not down yet, if this continues it will be!

Has something changed?! We're experiencing this across different projects. The endpoints use a dedicated service account with the correct roles:

Cloud Trace Agent
Service Controller

ERROR:Fetching service config failed (status code 403, reason Forbidden, url https://servicemanagement.googleapis.com/v1/services/XXXX/config?configId=2018-04-05r1)

It's been working for months. Nothing has changed on the service account, and the cluster is the same...

li...@vedra.io

Apr 25, 2018, 6:49:19 AM
to Google Cloud Endpoints
This error is also occurring on our cluster/project. Nothing has changed on our cluster.

It started when we were using the default service account, but it still persists after I moved to a dedicated service account. It occurs both with managed rollout and when specifying a version.

Alex Van Boxel

Apr 25, 2018, 7:23:19 AM
to Google Cloud Endpoints
OK, this confirms that it's an issue on Google's side. I have opened a P1 ticket; it's a dangerous and blocking situation. On our dev cluster people can't deploy new features to test, and we can't deploy to production. And if a scale-up occurs, the complete site can go down.

It seems like a new single point of failure.

Alex Van Boxel

Apr 25, 2018, 11:27:26 AM
to Google Cloud Endpoints
Just updated the ESP to 1.17.0... no effect. The support engineer is getting a Kubernetes specialist, but I don't think it's a Kubernetes issue at all.

Dan Ciruli

Apr 25, 2018, 11:43:23 AM
to Alex Van Boxel, Google Cloud Endpoints
Alex, please send me your bug number and I will make sure all relevant teams are looking at it.
--
You received this message because you are subscribed to the Google Groups "Google Cloud Endpoints" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-endpoints+unsub...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-endpoints/47fcd904-c0de-4c20-88a7-c8e5ea9f366c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
DC


chris....@spareroom.co.uk

Apr 25, 2018, 11:52:47 AM
to Google Cloud Endpoints
We are also seeing the same issue with newly created ESP containers, currently running on diminished resilience in production. Raised case #15628583.

Alex Van Boxel

Apr 25, 2018, 11:59:29 AM
to Dan Ciruli, Google Cloud Endpoints
Hi Dan,

Case 15624898: servicemanagement service broken?!

They are looking in the wrong place. I don't think it has anything to do with GKE.

The last thing I tried was giving the service account Editor rights (can't go higher). I still keep getting the same errors:

auth-proxy Apr 25, 2018, 5:48:17 PM ERROR:Fetching service config failed (status code 403, reason Forbidden, url https://servicemanagement.googleapis.com/v1/services/auth.endpoints.quantum-cloud-test.cloud.goog/config?configId=2018-04-24r0)
auth-proxy Apr 25, 2018, 5:48:17 PM INFO:Fetching the service configuration from the service management service
auth-proxy Apr 25, 2018, 5:48:17 PM INFO:Refreshing access_token
auth-proxy Apr 25, 2018, 5:48:17 PM INFO:Service account email: qm-endpo...@quantum-cloud-test.iam.gserviceaccount.com
auth-proxy Apr 25, 2018, 5:48:17 PM INFO:Constructing an access token with scope https://www.googleapis.com/auth/service.management.readonly
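For anyone collecting these ESP startup failures across several pods, here is a minimal Python sketch that pulls the status code, service name, and config ID out of an error line like the ones above. The function name parse_esp_fetch_error and the regular expression are assumptions based only on the log lines quoted in this thread, not an official ESP log format.

```python
import re
from urllib.parse import urlparse, parse_qs

# Pattern modelled on the ERROR lines quoted in this thread (an assumption,
# not a documented ESP log format).
ERROR_RE = re.compile(
    r"ERROR:Fetching (?P<what>service config|rollouts) failed "
    r"\(status code (?P<status>\d+), reason (?P<reason>[^,]+), url (?P<url>\S+)\)"
)


def parse_esp_fetch_error(line):
    """Return a dict describing a failed fetch, or None if the line does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    url = urlparse(match.group("url"))
    # The path looks like /v1/services/<service name>/config (or /rollouts).
    parts = url.path.split("/")
    service = parts[3] if len(parts) > 3 else None
    query = parse_qs(url.query)
    return {
        "what": match.group("what"),
        "status": int(match.group("status")),
        "reason": match.group("reason"),
        "service": service,
        # configId is absent for rollouts requests, so this may be None.
        "config_id": query.get("configId", [None])[0],
    }
```

Grouping the parsed results by service and status makes it easier to see whether every service in every project fails in the same way, which is what pointed to a problem on the Service Management side here.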







--
  _/
_/ Alex Van Boxel

Dan Ciruli

Apr 25, 2018, 2:15:27 PM
to Alex Van Boxel, Google Cloud Endpoints
Sorry -- I've been away. Root cause has been found; fix and rollout are occurring now.
--
DC

Alex Van Boxel

Apr 25, 2018, 6:09:44 PM
to Dan Ciruli, Google Cloud Endpoints
I'm happy the root cause has been found. Thanks.

Alex Van Boxel

Apr 26, 2018, 7:42:12 AM
to Dan Ciruli, Google Cloud Endpoints
Hi Dan,

I see the issue is resolved. Thanks for escalating.

Dan Ciruli

Apr 26, 2018, 2:13:04 PM
to Alex Van Boxel, Google Cloud Endpoints
Yes, the fix actually rolled out about 24 hours ago. Everyone should have seen a return to normal functionality.

Please let us know if you have not!

DC
--
DC

Alex Van Boxel

Apr 26, 2018, 4:08:33 PM
to Dan Ciruli, Google Cloud Endpoints
Hey Dan, things are back to normal. Luckily we are at an off-site and the disturbance was manageable.

The thing that worries me is that the issue wasn't detected by the SRE team itself. But hey, we can talk about it next week :)

Dan Ciruli

Apr 29, 2018, 10:49:22 PM
to Alex Van Boxel, Google Cloud Endpoints
Yes -- we have a full postmortem in process. The remedies will include testing to prevent this, better detection, and faster time to get the right teams involved for repairs.

I look forward to talking to you in Copenhagen!

DC

--
DC

Alan Cabrera

May 19, 2018, 4:33:44 PM
to pr...@helix.re, Google Cloud Endpoints
Are you sure the service account that your containers are running under has access to that project's service API?

On Sat, May 19, 2018 at 12:58 AM, <pr...@helix.re> wrote:
Hi - I get the same error now. Is there an issue now?

ERROR:Fetching service config failed (status code 403, reason Forbidden, url https://servicemanagement.googleapis.com/v1/services/XXXX/config?configId=2018-05-19r0)

Thanks,
Prem


pr...@helix.re

May 21, 2018, 11:57:41 AM
to Google Cloud Endpoints
Sorry Guys. It was my mistake. The Service Account did not have the relevant IAM role. It works as expected. Thanks for the quick response.

chandrachu...@stillmind.in

Jul 23, 2018, 9:19:11 PM
to Google Cloud Endpoints
Hi,

I am unable to get nginx (Extensible Service Proxy) working. Here are the contents of /var/log/nginx/error.log. I think I have set all the metadata values correctly. (Initially I had not set the config ID / service version, but I had to set it to remove one error.)

Now I am facing this error. Is it due to a missing API key? When I paste the URL into a browser, it says "The request is missing a valid API key."

Can you please help? Thanks.

cat /var/log/nginx/error.log
INFO:Fetching an access token from the metadata service
INFO:Fetching the service name from the metadata service
INFO:Fetching the service config rollout strategy from the metadata service
INFO:Service config rollout strategy: managed
INFO:Fetching the service config ID from the metadata service
INFO:Service config ID:2018-07-23r3
INFO:Fetching the service configuration from the service management service

lei...@google.com

Jul 24, 2018, 6:12:00 PM
to Google Cloud Endpoints

kl...@sertiscorp.com

Jul 12, 2019, 2:19:22 AM
to Google Cloud Endpoints
I experienced this error again.


INFO:Constructing an access token with scope https://www.googleapis.com/auth/service.management.readonly
INFO:Service account email: xxx...@xxxxxx.iam.gserviceaccount.com
INFO:Refreshing access_token
INFO:Fetching the service config ID from the rollouts service
ERROR:Fetching rollouts failed (status code 403, reason Forbidden, url https://servicemanagement.googleapis.com/v1/services/xxxxxxx.endpoints.xxxxxxx.cloud.goog/rollouts?filter=status=SUCCESS)

I followed the steps at https://cloud.google.com/endpoints/docs/grpc/get-started-kubernetes#create_credentials but to no avail.

Please help ASAP. This affects production.

Best Regards,

kl...@sertiscorp.com

Jul 12, 2019, 3:09:53 AM
to Google Cloud Endpoints
I fixed it by adding the permission on the service's page (Cloud Endpoints). I added the Project Editor role and the Service Controller role to the service account, following this page: https://cloud.google.com/endpoints/docs/grpc/get-started-kubernetes#create_credentials. But I think the Project Editor role is too broad, and I'm not sure whether removing it will cause the problem again.

Tomasz Boczkowski

Jul 12, 2019, 10:35:27 AM
to kl...@sertiscorp.com, Google Cloud Endpoints
If viewing Service Management rollouts is the only action you need, the Project Viewer role will suffice.
You can also create a custom role and grant it only the Service Management permissions it needs. The full list is available at https://cloud.google.com/service-infrastructure/docs/service-management/access-control.
For the action described, the relevant permission is "servicemanagement.services.get".


kl...@sertiscorp.com

Jul 15, 2019, 3:01:53 AM
to Google Cloud Endpoints
Hi Tomasz,

Thank you for your response. I just realized that I already have a custom role named Custom Cloud Endpoint User, which includes all the essential permissions that an Endpoints user must have, including "servicemanagement.services.get". The problem is that the Cloud Endpoints permission page doesn't show the custom role. It offers only a limited list of role categories, and custom roles are not among them. So my workaround is to use the Project Viewer role as you suggested. I just wonder why the permissions in Cloud Endpoints have a separate page from IAM.

Ralph

Jul 16, 2019, 11:14:45 AM
to Google Cloud Endpoints
I am following the "Getting started with Cloud Endpoints on Compute Engine" tutorial.

As soon as I run the ESP container, it exits immediately:
sudo docker run --name=esp --detach --publish=80:8080 --net=esp_net gcr.io/endpoints-release/endpoints-runtime:1 --service=echo-api --rollout_strategy=managed --backend=echo:8080

Logs show the 403 error:
sudo docker logs -t esp
2019-07-16T15:05:10.413346506Z INFO:Fetching an access token from the metadata service
2019-07-16T15:05:10.479765556Z INFO:Fetching the service config ID from the rollouts service
2019-07-16T15:05:10.672513927Z ERROR:Fetching rollouts failed (status code 403, reason Forbidden, url https://servicemanagement.googleapis.com/v1/services/echo-api/rollouts?filter=status=SUCCESS)

I also tried creating a custom service account with the above-mentioned roles (servicemanagement.serviceController and cloudtrace.agent) and created a new VM using that account, but I still get the same error.

Kindly advise
Thanks

ralph....@ngpems.com.mt

Jul 16, 2019, 11:30:22 AM
to Google Cloud Endpoints
I've also exported the service account key to a JSON file and passed it to the container, but that did not work either:

sudo docker run -v /esp/serviceaccount.json:/esp/serviceaccount.json --name=esp --detach --publish=80:8080 --net=esp_net gcr.io/endpoints-release/endpoints-runtime:1 --service=echo-api --rollout_strategy=managed --backend=echo:8080 --service_account_key=/esp/serviceaccount.json

marty.k...@partner.main-incubator.com

Oct 26, 2019, 1:34:36 PM
to Google Cloud Endpoints
Have you solved it?

I am also experiencing the same issue when doing the GKE tutorial with Endpoints.
I also tried a custom service account key for a service account that even has admin rights, and still nothing changed.

Wayne Zhang

Oct 28, 2019, 11:17:51 AM
to marty.k...@partner.main-incubator.com, Google Cloud Endpoints
It is possible that your service account doesn't have permission to fetch the service config; maybe it is from a different project?
Another suspect is the service name, which may be wrong. Usually the service name is suffixed with "cloud.goog", not just "echo-api".
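A quick way to check the second point is sketched below in Python. The helper names service_flag and has_cloud_goog_suffix are hypothetical. The sketch reads the --service= value out of an ESP command line and tests whether it carries the "cloud.goog" suffix. It is only a heuristic that assumes the default Endpoints naming, and it says nothing about permissions.

```python
import shlex


def service_flag(command):
    """Return the value passed as --service=... in a command line, or None."""
    for token in shlex.split(command):
        # --service_account_key=... does not match because of the "=" check.
        if token.startswith("--service="):
            return token[len("--service="):]
    return None


def has_cloud_goog_suffix(service_name):
    """Heuristic: default Endpoints service names end in '.cloud.goog'."""
    return service_name.endswith(".cloud.goog")
```

Applied to the docker run command earlier in the thread, service_flag returns "echo-api", which fails the suffix check and matches the suspicion above.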


ralph....@ngpems.com.mt

Oct 28, 2019, 11:27:31 AM
to Google Cloud Endpoints
Hi,

No, I have not solved this issue. I was following the tutorial step by step. Then I tried the service accounts, to no avail. As I understand it, when the proxy container runs, it tries to fetch something and fails. Technically, whoever hosts whatever the container is trying to access needs to grant the permission, but this is something so public that I presume it would be unauthenticated. Maybe someone from Google can blindly follow the instructions to see it fail.

Rgds
Ralph



Wayne Zhang

Oct 28, 2019, 12:55:16 PM
to ralph....@ngpems.com.mt, Google Cloud Endpoints
ESP uses the default service account of the GCE instance to access the Google Service Management service and fetch the service config. The GCE instance that runs ESP should be in the same project as the one you published your service to with "gcloud endpoints services deploy".
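To make the same-project point concrete, here is a rough Python sketch that compares the project embedded in a service name with the project embedded in a service account email. It assumes the default naming shapes seen in this thread, NAME.endpoints.PROJECT.cloud.goog and NAME@PROJECT.iam.gserviceaccount.com. The function names are hypothetical, and the check returns None whenever a name does not follow those shapes (for example the Compute Engine default account, whose email does not contain the project ID).

```python
import re

# Shapes assumed from the names quoted in this thread; custom domains and
# default Compute Engine accounts will not match and yield None.
SERVICE_RE = re.compile(r"^[^.]+\.endpoints\.(?P<project>[^.]+)\.cloud\.goog$")
ACCOUNT_RE = re.compile(r"^[^@]+@(?P<project>[^.]+)\.iam\.gserviceaccount\.com$")


def project_of_service(service_name):
    """Project ID embedded in a default Endpoints service name, or None."""
    match = SERVICE_RE.match(service_name)
    return match.group("project") if match else None


def project_of_account(email):
    """Project ID embedded in a user-managed service account email, or None."""
    match = ACCOUNT_RE.match(email)
    return match.group("project") if match else None


def same_project(service_name, email):
    """True or False when both projects can be read from the names, else None."""
    service_project = project_of_service(service_name)
    account_project = project_of_account(email)
    if service_project is None or account_project is None:
        return None
    return service_project == account_project
```

A False result only flags a cross-project setup worth checking. A service account from another project can still work if it has been granted access to the service.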


Wayne Zhang

Oct 28, 2019, 7:10:51 PM
to Google Cloud Endpoints
I just followed these instructions to create a GCE instance with the default service account (format: xxx-com...@developer.gserviceaccount.com) and the Project Editor role. It is in the same project where I ran "gcloud endpoints services deploy". It works: ESP was able to start inside the GCE instance.

Thanks
-Wayne