We have a cluster of several machines running the latest GKE Kubernetes version, which we use in a setup composed of GitLab + GitLab CI.
The pod is composed of 3 containers and it fails like this:
{kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Failed to create docker container "svc-0" of pod "runner-33463c05-project-250-concurrent-0h7wpj_gitlab(8d713316-0986-11e7-9f41-42010a84002f)" with error: operation timeout: context deadline exceeded
gitlab 2017-03-15 14:59:29 +0100 CET 2017-03-15 14:55:27 +0100 CET 3 runner-33463c05-project-250-concurrent-0h7wpj Pod Warning FailedSync {kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Error syncing pod, skipping: failed to "StartContainer" for "svc-0" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"
gitlab 2017-03-15 14:53:00 +0100 CET 2017-03-15 14:50:58 +0100 CET 2 runner-33463c05-project-250-concurrent-011455 Pod spec.containers{build} Warning Failed {kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Failed to create docker container "build" of pod "runner-33463c05-project-250-concurrent-011455_gitlab(fd79c5ab-0985-11e7-9f41-42010a84002f)" with error: operation timeout: context deadline exceeded
gitlab 2017-03-15 14:53:00 +0100 CET 2017-03-15 14:50:59 +0100 CET 2 runner-33463c05-project-250-concurrent-011455 Pod Warning FailedSync {kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Error syncing pod, skipping: failed to "StartContainer" for "build" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"
gitlab 2017-03-15 14:53:45 +0100 CET 2017-03-15 14:53:45 +0100 CET 1 runner-a1b569a9-project-119-concurrent-0pn0wx Pod spec.containers{svc-0} Warning Failed {kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Failed to create docker container "svc-0" of pod "runner-a1b569a9-project-119-concurrent-0pn0wx_gitlab(3e7f3b46-0986-11e7-9f41-42010a84002f)" with error: operation timeout: context deadline exceeded
gitlab 2017-03-15 14:53:45 +0100 CET 2017-03-15 14:53:45 +0100 CET 1 runner-a1b569a9-project-119-concurrent-0pn0wx Pod Warning FailedSync {kubelet gke-spark-op-services-gitlab-ci-0dcd135c-j1zf} Error syncing pod, skipping: [failed to "StartContainer" for "build" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"
, failed to "StartContainer" for "svc-0" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"
It seems that sometimes one of the nodes gets stuck with this error, and nothing can be done except deleting the node and letting the instance group provision a replacement automatically.
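To spot which node is stuck before deleting it, the events can be tallied per kubelet. The following is only a rough sketch, assuming the events have been saved as plain text lines in the format shown above; the function names and the threshold value are hypothetical, not part of any Kubernetes or GitLab tooling.

```python
import re
from collections import Counter

# Matches the "{kubelet <node-name>}" source field seen in the events above.
KUBELET_RE = re.compile(r"\{kubelet (\S+?)\}")
TIMEOUT_MARKER = "operation timeout: context deadline exceeded"


def count_timeouts_per_node(event_lines):
    """Count timeout events per node name (hypothetical helper)."""
    counts = Counter()
    for line in event_lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = KUBELET_RE.search(line)
        # Continuation lines without a kubelet field are skipped.
        if match:
            counts[match.group(1)] += 1
    return counts


def nodes_over_threshold(event_lines, threshold=3):
    """Return nodes with at least `threshold` timeout events (assumed cutoff)."""
    counts = count_timeouts_per_node(event_lines)
    return sorted(node for node, n in counts.items() if n >= threshold)
```

In the events pasted above, every line that names a kubelet points at the same node, gke-spark-op-services-gitlab-ci-0dcd135c-j1zf, which is consistent with a single node being stuck.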
Do you have any ideas about this very specific problem? Googling for "operation timeout: context deadline exceeded" gives a lot of different results, or patches applied to the latest version of Docker, but on GKE we are stuck with Docker 1.11.2.
Any idea or suggestion is highly appreciated.