Cloud Run error: Internal system error, system will retry later.


Mark Terrel

Oct 8, 2020, 7:20:47 PM
to Google Cloud Developers
Hi all,

There's no Cloud Run group that I could find, so hopefully this is the right place.

I'm attempting to deploy a Cloud Run service. This is done via our CI/CD system and has worked successfully hundreds of times previously.

The service gets created but the first revision never gets deployed. When I look at the newly created service in the Console, it shows "Cloud Run error: Internal system error, system will retry later." as the main status message for the service.

The command line that is failing is:

gcloud --configuration=adapt-cloud-gcloud-testing --quiet run deploy cloud-run-gen-name-a179e65d6fdfc19abc57e15df563d8cb --platform=managed --format=json --no-allow-unauthenticated --memory=128M --cpu=1 --image=gcr.io/adapt-ci/http-echo --region=us-central1 --port=5678 --set-env-vars=ADAPT_TEST_DEPLOY_ID=MockDeploy-aymb --args=-text,Adapt Test

Then the output from that command (the dots after Creating Revision just keep going):

Deploying container to Cloud Run service [cloud-run-gen-name-a179e65d6fdfc19abc57e15df563d8cb] in project [adapt-ci] region [us-central1]
Deploying new service...
Creating Revision....................................................................................................................

The YAML tab in the Console also shows that message for each of the three status conditions (see below).

Is Cloud Run having issues? Is there something I can do to troubleshoot?

Thanks!
Mark

status:
  observedGeneration: 1
  conditions:
  - type: Ready
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.844314Z'
  - type: ConfigurationsReady
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.755212Z'
  - type: RoutesReady
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.844314Z'
  latestCreatedRevisionName: cloud-run-gen-name-3bab80f75cfd57cf87ad89d9d2c18ba3-00001-fus
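
For anyone scripting a check on this, here is a minimal Python sketch that lists the conditions that are not yet ready. It assumes the status block has already been loaded into a dict (for example from JSON output). The helper name `unready_conditions` is hypothetical, and the values are transcribed from the YAML above.

```python
MSG = "Cloud Run error: Internal system error, system will retry later."

# Status block from the post, transcribed by hand into a Python dict.
STATUS = {
    "observedGeneration": 1,
    "conditions": [
        {"type": "Ready", "status": "Unknown", "message": MSG,
         "lastTransitionTime": "2020-10-08T21:07:20.844314Z"},
        {"type": "ConfigurationsReady", "status": "Unknown", "message": MSG,
         "lastTransitionTime": "2020-10-08T21:07:20.755212Z"},
        {"type": "RoutesReady", "status": "Unknown", "message": MSG,
         "lastTransitionTime": "2020-10-08T21:07:20.844314Z"},
    ],
}


def unready_conditions(status):
    """Return (type, status, message) for every condition whose status is not 'True'."""
    return [
        (c["type"], c["status"], c.get("message", ""))
        for c in status.get("conditions", [])
        if c.get("status") != "True"
    ]


for ctype, cstatus, message in unready_conditions(STATUS):
    print(f"{ctype}: {cstatus} ({message})")
```

A status of "Unknown" on all three conditions, as here, means the revision has neither succeeded nor failed, which matches the deploy command never returning.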

Mark Terrel

Oct 9, 2020, 12:54:35 AM
to Google Cloud Developers
Sorry, I should have given the command line more carefully. The command line from our CI/CD system is not typed into a shell by a human. It is passed to the OS as an array of strings, and the error output that I copied here makes no attempt at shell quoting. If the command line I gave were correctly quoted for bash, the last argument would be '--args=-text,Adapt Test'.
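
The difference between the argument array and the unquoted log line can be sketched in Python with the standard `shlex` module. The argument values are copied from the command in the first post. The variable names are illustrative, and this is not necessarily how the CI system itself builds or logs the command.

```python
import shlex

# Argument vector as it would be handed to the OS: one string per argument.
argv = [
    "gcloud",
    "--configuration=adapt-cloud-gcloud-testing",
    "--quiet",
    "run",
    "deploy",
    "cloud-run-gen-name-a179e65d6fdfc19abc57e15df563d8cb",
    "--platform=managed",
    "--format=json",
    "--no-allow-unauthenticated",
    "--memory=128M",
    "--cpu=1",
    "--image=gcr.io/adapt-ci/http-echo",
    "--region=us-central1",
    "--port=5678",
    "--set-env-vars=ADAPT_TEST_DEPLOY_ID=MockDeploy-aymb",
    "--args=-text,Adapt Test",  # a single argument that contains a space
]

# Joining with spaces loses the argument boundaries (what the log showed).
naive = " ".join(argv)

# shlex.join quotes each argument so a POSIX shell would parse it back unchanged.
quoted = shlex.join(argv)

print(quoted)
```

Splitting `quoted` with `shlex.split` gives back the original list, while splitting `naive` yields an extra stray `Test` argument.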

Mark Terrel

Oct 11, 2020, 3:39:32 PM
to Google Cloud Developers

After quite a bit of trial and error, I got everything working again.

The first thing that made some progress was disabling the Cloud Run Admin API and re-enabling it. After that change, I was able to create a service using the example container from the Console, logged in as the project owner. I was also able to create a service using the example container from the CLI, logged in as the CI service account. However, the original command from my question still behaved exactly as before. I have no idea how the project got into a state where even the project owner couldn't use Cloud Run.
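
The post does not say how the API was toggled. One way to do it from a script is sketched below, assuming the `gcloud services disable` and `gcloud services enable` commands and the service name `run.googleapis.com` for the Cloud Run Admin API. The function name is hypothetical, and the sketch only prints the commands for review.

```python
import shlex

PROJECT = "adapt-ci"            # project name taken from the deploy output in the thread
SERVICE = "run.googleapis.com"  # service name of the Cloud Run Admin API


def toggle_api_commands(project, service=SERVICE):
    """Build the disable-then-enable gcloud invocations as argument lists."""
    return [
        ["gcloud", "services", "disable", service, f"--project={project}"],
        ["gcloud", "services", "enable", service, f"--project={project}"],
    ]


# Print the commands for review instead of running them.
for argv in toggle_api_commands(PROJECT):
    print(shlex.join(argv))
```

To actually run them, each list could be passed to `subprocess.run(argv, check=True)`, which stops at the first failing command.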

The second thing I did was to re-push the container image I was trying to use (gcr.io/adapt-ci/http-echo) to GCR. I pushed the exact same image as was there previously. This finally allowed the CI system to successfully create the Service.

As part of my earlier troubleshooting, I had looked at Google Container Registry for this project and confirmed that the needed image was still present. However, we had somewhat recently enabled a lifecycle policy on the Cloud Storage bucket that backs the registry, to delete items older than a certain age. So my best guess is that the policy deleted some, but not all, of the files associated with the gcr.io/adapt-ci/http-echo image, and that this produced the internal error instead of an error saying that the container image couldn't be found.
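
The partial-deletion guess could in principle be checked by comparing the digests an image manifest references with the blobs still in storage. The sketch below assumes a Docker v2 schema 2 style manifest (`config.digest` and `layers[].digest`). Fetching the manifest and listing the bucket are left out. The function names and the shortened digests are hypothetical placeholders, not values from the actual image.

```python
def referenced_digests(manifest):
    """Collect the config digest and every layer digest from an image manifest."""
    digests = [manifest["config"]["digest"]]
    digests.extend(layer["digest"] for layer in manifest.get("layers", []))
    return digests


def missing_blobs(manifest, present):
    """Return referenced digests that are absent from `present`, in manifest order."""
    present = set(present)
    return [d for d in referenced_digests(manifest) if d not in present]


# Hypothetical manifest with shortened placeholder digests, for illustration only.
EXAMPLE_MANIFEST = {
    "schemaVersion": 2,
    "config": {"digest": "sha256:cfg0"},
    "layers": [{"digest": "sha256:aaa1"}, {"digest": "sha256:bbb2"}],
}

# Pretend a lifecycle rule removed one layer but left the rest in place.
PRESENT = {"sha256:cfg0", "sha256:bbb2"}

print(missing_blobs(EXAMPLE_MANIFEST, PRESENT))
```

A non-empty result would mean the image still appears in the registry listing while being impossible to pull in full, which is consistent with the behavior described above.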

Herman Banken

Oct 29, 2020, 4:08:37 PM
to Google Cloud Developers
I'm also seeing this error now, but I can't disable the Cloud Run Admin API [1] as there are still resources in the project that use it.

Does anyone else know what might technically cause this? My problems seem to have started when I uploaded a 499 MB image and created a revision from it. That was a crazy large image, I know, but I wanted to check whether it was possible at all (having those resources on a CDN is not really convenient in our case).

On Sunday, October 11, 2020 at 21:39:32 UTC+2, mte...@gmail.com wrote:

Herman Banken

Oct 29, 2020, 4:22:57 PM
to Google Cloud Developers
Ah. In this case it is a global thing... :-(

On Thursday, October 29, 2020 at 21:08:37 UTC+1, Herman Banken wrote:
Screenshot 2020-10-29 at 21.21.30.png

Olu

Oct 30, 2020, 3:15:22 PM
to Google Cloud Developers
Quite right, Herman. There was a system-wide issue reported for deployments on Cloud Run. According to Cloud Run Incident #20004 on the GCP Status Dashboard[1], the issue appears to have been resolved now.

Thanks for your understanding and for pointing this thread to that info.
