[google-cloud-sdk] Issues running gcloud on jenkins slaves


Antonio Marino

Mar 16, 2017, 5:19:14 PM
to Google Cloud Developers
Hello! 

I've set up a Jenkins master-slave deployment on GKE using the official Helm chart. I modified the slave image configuration to pull my custom image (in which I installed the Google Cloud SDK) instead of "jenkinsci/jnlp-slave", and I'm trying to run gcloud commands as part of a shell script in a Jenkins job.

I've tried running other jobs (of any size) that don't depend on gcloud, and they run without issue. But when I run "gcloud info" (or any other gcloud command except "gcloud version"), the connection from the slave seems to drop.

This is what I see in the logs:

[gcloud-test] $ /bin/sh -xe /tmp/hudson5806299107057008746.sh
+ gcloud info
Google Cloud SDK [146.0.0]

Platform: [Linux, x86_64]
Python Version: [2.7.9 (default, Mar  1 2015, 12:57:24)  [GCC 4.9.2]]
Python Location: [/usr/bin/python2]
Site Packages: [Disabled]

Installation Root: [/opt/google-cloud-sdk]
Installed Components:
  kubectl: []
  core-nix: [2016.11.07]
  core: [2017.02.28]
  gcloud-deps: [2017.02.28]
  gcloud: []
  gsutil-nix: [4.18]
  gcloud-deps-linux-x86_64: [2017.02.21]
  gsutil: [4.22]
  bq: [2.0.24]
  kubectl-linux-x86_64: [1.5.3]
  bq-nix: [2.0.24]
System PATH: [/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games]
Cloud SDK on PATH: [False]
Kubectl on PATH: [False]

Installation Properties: [/opt/google-cloud-sdk/properties]
User Config Directory: [/home/jenkins/.config/gcloud]
Active Configuration Name: [default]
Active Configuration Path: [/home/jenkins/.config/gcloud/configurations/config_default]

Project: [xxx]

Current Properties:
  [core]
    project: [xx]
    disable_usage_reporting: [False]
    log_http: [true]

Logs Directory: [/home/jenkins/.config/gcloud/logs]
Last Log File: [None]


FATAL: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
at hudson.remoting.Request.abort(Request.java:307)
at hudson.remoting.Channel.terminate(Channel.java:888)
at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:617)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at ......remote call to Channel to /10.112.7.167(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1537)
at hudson.remoting.Request.call(Request.java:172)
at hudson.remoting.Channel.call(Channel.java:821)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:984)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:540)
at hudson.model.Run.execute(Run.java:1728)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:404)
Caused by: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:617)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.jenkinsci.remoting.nio.FifoBuffer$CloseCause: Buffer close has been requested
at org.jenkinsci.remoting.nio.FifoBuffer.close(FifoBuffer.java:426)
at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:332)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:565)
... 6 more
Finished: FAILURE


This is the Dockerfile for my custom slave image:

FROM jenkinsci/jnlp-slave
USER root
# Run non-interactively so the SDK installer never prompts.
ENV CLOUDSDK_CORE_DISABLE_PROMPTS 1
ENV PATH /opt/google-cloud-sdk/bin:$PATH
# Update and install in one layer so the apt cache can't go stale.
RUN apt-get update -y && apt-get install -y jq
# Install the Cloud SDK under /opt and expose gcloud on the default PATH.
RUN curl https://sdk.cloud.google.com | bash && mv google-cloud-sdk /opt
RUN ln -s /opt/google-cloud-sdk/bin/gcloud /usr/local/bin/gcloud
RUN gcloud components install kubectl


The same issue happens on minikube. I'm also using the Jenkins Google OAuth Credentials Plugin to import Google Cloud credentials from instance metadata or from a JSON file, but that makes no difference.

Any help would be much appreciated.
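For anyone hitting this, one way to check whether the agent container is simply running out of memory is to inspect its cgroup limit and the kernel log from inside the pod. This is a diagnostic sketch, not from the original post; the cgroup path depends on whether the node uses cgroup v1 or v2, and reading dmesg may require privileges:

```shell
#!/bin/sh
# Print the container's memory limit; the file name differs between
# cgroup v1 (memory.limit_in_bytes) and cgroup v2 (memory.max).
LIMIT=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null \
    || cat /sys/fs/cgroup/memory.max 2>/dev/null \
    || echo "unknown")
echo "container memory limit: ${LIMIT}"

# Kernel OOM kills are recorded in the ring buffer, if it is readable.
dmesg 2>/dev/null | grep -i 'out of memory' || echo "no OOM messages visible"
```

If the limit is small (e.g. 256Mi) and gcloud's Python process pushes the container past it, the kernel kills the JNLP process and Jenkins sees exactly this kind of abrupt channel EOF.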

George Lu

Aug 1, 2017, 12:29:47 AM
to Google Cloud Developers
I am getting the same issue: whenever I run any gcloud command in my Jenkinsfile, that Jenkins slave node immediately dies.
I am running Jenkins via the Kubernetes Helm chart (Jenkins 2.67).

This can be reproduced by putting a timeout into the Jenkinsfile, sshing into the pod before it runs any of the gcloud commands and dies, and verifying that running any gcloud command (e.g. plain gcloud) kills the pod.

Pulling the Jenkins slave image (gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest) and running gcloud in it directly does not reproduce the issue, though. It only seems to happen when the Jenkins master calls up the agent during a build pipeline.

For example, this doesn't cause the issue:

$ docker run -it --entrypoint /bin/bash gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest
gcloud info

The last lines from my Jenkins web console are:
[Pipeline] sh
[sample-app] Running shell script
+ gcloud container builds submit --config=cloudbuild.yaml .
Cannot contact default-2c3a35f60cd6: java.io.IOException: remote file operation failed: /home/jenkins/workspace/i_cd_jenkins_example_master-W7U2IE4MHWTC6U26FSD2GPI3RIP3ARNZIXMQGB35J53UF2QWTH3A/sample-app at hudson.remoting.Channel@e3cbdac:JNLP4-connect connection from 10.52.4.48/10.52.4.48:38190: hudson.remoting.ChannelClosedException: channel is already closed

George Lu

Aug 7, 2017, 5:06:04 PM
to Google Cloud Developers
I was able to get past this issue by giving the Jenkins slaves more memory, via the Jenkins UI:

Manage Jenkins -> Configure System -> Cloud -> Kubernetes Pod Template -> Container Template -> click Advanced -> update the Request Memory and Limit Memory (I changed mine to 1024Mi). I'm not sure yet what the minimum is, but the default of 256Mi did not seem to be enough.
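For Helm-based deployments, the same change can be made in the chart's values instead of the UI, so it survives redeploys. The key names vary by chart version (the fragment below assumes an `agent.resources` layout; check your chart's values.yaml for the exact keys):

```yaml
# values.yaml fragment -- key names depend on the Helm chart version in use.
agent:
  resources:
    requests:
      memory: "512Mi"
    limits:
      memory: "1024Mi"
```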

David Classen

Sep 19, 2017, 9:01:17 AM
to Google Cloud Developers
I was having the same issue with another cloud provider.

However, increasing resourceRequestMemory: '512Mi' and resourceLimitMemory: '512Mi' resolved it for me too.
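For pipelines that declare their own pod, those two parameters go on the Kubernetes plugin's containerTemplate. A sketch, assuming the plugin's podTemplate DSL; the label and image here are placeholders to adapt to your setup:

```groovy
// Jenkinsfile fragment -- assumes the Kubernetes plugin's podTemplate step.
podTemplate(label: 'gcloud-agent', containers: [
    containerTemplate(
        name: 'jnlp',
        image: 'gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest',
        resourceRequestMemory: '512Mi',
        resourceLimitMemory: '512Mi'
    )
]) {
    node('gcloud-agent') {
        sh 'gcloud info'
    }
}
```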