[google-cloud-sdk] Issues running gcloud on jenkins slaves


Antonio Marino

Mar 16, 2017, 5:19:14 PM
to Google Cloud Developers
Hello! 

I've set up a Jenkins master-slave deployment on GKE using the official Helm chart. I modified the slave image configuration to pull my custom image (in which I installed the Google Cloud SDK) instead of "jenkinsci/jnlp-slave", and I'm trying to run gcloud commands as part of a shell script in a Jenkins job.

I've tried running other jobs (of any size) that don't depend on gcloud, and they run without issue. But when I run "gcloud info" (or any other gcloud command except "gcloud version"), the connection from the slave seems to drop.

This is what I see in the logs:

[gcloud-test] $ /bin/sh -xe /tmp/hudson5806299107057008746.sh
+ gcloud info
Google Cloud SDK [146.0.0]

Platform: [Linux, x86_64]
Python Version: [2.7.9 (default, Mar  1 2015, 12:57:24)  [GCC 4.9.2]]
Python Location: [/usr/bin/python2]
Site Packages: [Disabled]

Installation Root: [/opt/google-cloud-sdk]
Installed Components:
  kubectl: []
  core-nix: [2016.11.07]
  core: [2017.02.28]
  gcloud-deps: [2017.02.28]
  gcloud: []
  gsutil-nix: [4.18]
  gcloud-deps-linux-x86_64: [2017.02.21]
  gsutil: [4.22]
  bq: [2.0.24]
  kubectl-linux-x86_64: [1.5.3]
  bq-nix: [2.0.24]
System PATH: [/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games]
Cloud SDK on PATH: [False]
Kubectl on PATH: [False]

Installation Properties: [/opt/google-cloud-sdk/properties]
User Config Directory: [/home/jenkins/.config/gcloud]
Active Configuration Name: [default]
Active Configuration Path: [/home/jenkins/.config/gcloud/configurations/config_default]

Project: [xxx]

Current Properties:
  [core]
    project: [xx]
    disable_usage_reporting: [False]
    log_http: [true]

Logs Directory: [/home/jenkins/.config/gcloud/logs]
Last Log File: [None]


FATAL: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
at hudson.remoting.Request.abort(Request.java:307)
at hudson.remoting.Channel.terminate(Channel.java:888)
at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:617)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at ......remote call to Channel to /10.112.7.167(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1537)
at hudson.remoting.Request.call(Request.java:172)
at hudson.remoting.Channel.call(Channel.java:821)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:984)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:540)
at hudson.model.Run.execute(Run.java:1728)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:404)
Caused by: java.io.IOException: Unexpected EOF while receiving the data from the channel. FIFO buffer has been already closed
at org.jenkinsci.remoting.nio.NioChannelHub$3.run(NioChannelHub.java:617)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.jenkinsci.remoting.nio.FifoBuffer$CloseCause: Buffer close has been requested
at org.jenkinsci.remoting.nio.FifoBuffer.close(FifoBuffer.java:426)
at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:332)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:565)
... 6 more
Finished: FAILURE


This is the Dockerfile for my custom slave image:

FROM jenkinsci/jnlp-slave
USER root
# Run non-interactively so the SDK installer never prompts.
ENV CLOUDSDK_CORE_DISABLE_PROMPTS 1
ENV PATH /opt/google-cloud-sdk/bin:$PATH
# Update and install in one layer so the apt cache can't go stale.
RUN apt-get update -y && apt-get install -y jq
# Install the Cloud SDK under /opt and expose gcloud on the default PATH.
RUN curl https://sdk.cloud.google.com | bash && mv google-cloud-sdk /opt
RUN ln -s /opt/google-cloud-sdk/bin/gcloud /usr/local/bin/gcloud
RUN gcloud components install kubectl


The same issue happens on minikube. I'm also using the Jenkins Google OAuth Credentials Plugin to import Google Cloud credentials from instance metadata or from a JSON file, but that makes no difference.

Any help would be much appreciated.
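For anyone hitting this, one way to check whether the agent container is simply running out of memory is to inspect its cgroup limit and the kernel log from inside the pod. This is a diagnostic sketch, not from the original post; the cgroup path depends on whether the node uses cgroup v1 or v2, and reading dmesg may require privileges:

```shell
#!/bin/sh
# Print the container's memory limit; the file name differs between
# cgroup v1 (memory.limit_in_bytes) and cgroup v2 (memory.max).
LIMIT=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null \
    || cat /sys/fs/cgroup/memory.max 2>/dev/null \
    || echo "unknown")
echo "container memory limit: ${LIMIT}"

# Kernel OOM kills are recorded in the ring buffer, if it is readable.
dmesg 2>/dev/null | grep -i 'out of memory' || echo "no OOM messages visible"
```

If the limit is small (e.g. 256Mi) and gcloud's Python process pushes the container past it, the kernel kills the JNLP process and Jenkins sees exactly this kind of abrupt channel EOF.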

George Lu

Aug 1, 2017, 12:29:47 AM
to Google Cloud Developers
I am getting the same issue: whenever I run any gcloud command in my Jenkinsfile, that Jenkins slave node immediately dies.
I am running Jenkins via the Kubernetes Helm chart (Jenkins 2.67).

This can be reproduced by putting a timeout into the Jenkinsfile, sshing into the pod before it runs any of the gcloud commands and dies, and verifying that running any gcloud command (e.g. plain gcloud) kills the pod.

Pulling the Jenkins slave image (gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest) and running gcloud in it directly does not reproduce the issue, though. It only seems to happen when the Jenkins master calls up the agent during a build pipeline.

For example, this doesn't cause the issue:

$ docker run -it --entrypoint /bin/bash gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest
gcloud info

The last lines from my Jenkins web console are:
[Pipeline] sh
[sample-app] Running shell script
+ gcloud container builds submit --config=cloudbuild.yaml .
Cannot contact default-2c3a35f60cd6: java.io.IOException: remote file operation failed: /home/jenkins/workspace/i_cd_jenkins_example_master-W7U2IE4MHWTC6U26FSD2GPI3RIP3ARNZIXMQGB35J53UF2QWTH3A/sample-app at hudson.remoting.Channel@e3cbdac:JNLP4-connect connection from 10.52.4.48/10.52.4.48:38190: hudson.remoting.ChannelClosedException: channel is already closed

George Lu

Aug 7, 2017, 5:06:04 PM
to Google Cloud Developers
I was able to get past this issue by giving the Jenkins slaves more memory, via the Jenkins UI:

Manage Jenkins -> Configure System -> Cloud -> Kubernetes Pod Template -> Container Template -> click Advanced -> update the Request Memory and Limit Memory (I changed mine to 1024Mi). I'm not sure yet what the minimum is, but the default of 256Mi did not seem to be enough.
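For Helm-based deployments, the same change can be made in the chart's values instead of the UI, so it survives redeploys. The key names vary by chart version (the fragment below assumes an `agent.resources` layout; check your chart's values.yaml for the exact keys):

```yaml
# values.yaml fragment -- key names depend on the Helm chart version in use.
agent:
  resources:
    requests:
      memory: "512Mi"
    limits:
      memory: "1024Mi"
```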

David Classen

Sep 19, 2017, 9:01:17 AM
to Google Cloud Developers
I was having the same issue with another cloud provider.

However, increasing resourceRequestMemory: '512Mi' and resourceLimitMemory: '512Mi' resolved it for me too.
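For pipelines that declare their own pod, those two parameters go on the Kubernetes plugin's containerTemplate. A sketch, assuming the plugin's podTemplate DSL; the label and image here are placeholders to adapt to your setup:

```groovy
// Jenkinsfile fragment -- assumes the Kubernetes plugin's podTemplate step.
podTemplate(label: 'gcloud-agent', containers: [
    containerTemplate(
        name: 'jnlp',
        image: 'gcr.io/cloud-solutions-images/jenkins-k8s-slave:latest',
        resourceRequestMemory: '512Mi',
        resourceLimitMemory: '512Mi'
    )
]) {
    node('gcloud-agent') {
        sh 'gcloud info'
    }
}
```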