[JIRA] [docker-workflow-plugin] (JENKINS-34289) docker.image.inside fails unexpectedly with Jenkinsfile

docwhat@gerf.org (JIRA)

Apr 15, 2016, 5:13:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje created an issue
 
Jenkins / Bug JENKINS-34289
docker.image.inside fails unexpectedly with Jenkinsfile
Issue Type: Bug
Assignee: Jesse Glick
Components: docker-workflow-plugin
Created: 2016/Apr/15 9:12 PM
Environment: Jenkins 1.651.1 
CloudBees Docker Pipeline 1.4 
Pipeline 2.0
Priority: Major
Reporter: Christian Höltje

With a simple Jenkinsfile when building, at some point it'll fail for no obvious reason.

An example Jenkinsfile:

{{
def img = 'centos:7';

node('docker') {
  stage "pulling";
  sh "docker pull ${img}"; // workaround for JENKINS-34288

  checkout scm;

  docker.image(img).inside {
    sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    sh 'ls -alh --color';
  }
}
}}

Partial output:

{{
[Pipeline] Run build steps inside a Docker container : Start
$ docker run -t -d -u 995:993 -w /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master -v /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master:/var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master:rw -v /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master@tmp:/var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master@tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** centos:7 cat
[Pipeline] withDockerContainer {
[Pipeline] sh
[master] Running shell script
++ seq 30
+ for i in '$(seq 30)'
+ sleep 1
[Pipeline] } //withDockerContainer
$ docker stop 7fcbfd6ab39cf05257a43a774bd20b670bc39674a2047777fe603ee1a3162b10
$ docker rm -f 7fcbfd6ab39cf05257a43a774bd20b670bc39674a2047777fe603ee1a3162b10
[Pipeline] Run build steps inside a Docker container : End
[Pipeline] } //node
[Pipeline] Allocate node : End
[Pipeline] End of Pipeline
}}

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

docwhat@gerf.org (JIRA)

Apr 15, 2016, 5:17:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje updated an issue
Change By: Christian Höltje
With a simple {{Jenkinsfile}} when building, at some point it'll fail for no obvious reason.

An example {{Jenkinsfile}}:

{code:java}

def img = 'centos:7';

node('docker') {
  stage "pulling";
  sh "docker pull ${img}"; // workaround for JENKINS-34288

  checkout scm;

  docker.image(img).inside {
    sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    sh 'ls -alh --color';
  }
}
{code}

Partial output:

{noformat}

[Pipeline] Run build steps inside a Docker container : Start
$ docker run -t -d -u 995:993 -w /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master -v /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master:/var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master:rw -v /var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master@tmp:/var/lib/jenkins/workspace/tron/docwhat-test-jenkinsfile/master@tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** centos:7 cat
[Pipeline] withDockerContainer {
[Pipeline] sh
[master] Running shell script
++ seq 30
+ for i in '$(seq 30)'
+ sleep 1
[Pipeline] } //withDockerContainer
$ docker stop 7fcbfd6ab39cf05257a43a774bd20b670bc39674a2047777fe603ee1a3162b10
$ docker rm -f 7fcbfd6ab39cf05257a43a774bd20b670bc39674a2047777fe603ee1a3162b10
[Pipeline] Run build steps inside a Docker container : End
[Pipeline] } //node
[Pipeline] Allocate node : End
[Pipeline] End of Pipeline
{noformat}

docwhat@gerf.org (JIRA)

Apr 18, 2016, 5:03:02 PM
to jenkinsc...@googlegroups.com
Christian Höltje updated an issue
Change By: Christian Höltje
Environment:
Jenkins 1.651.1 
CloudBees Docker Pipeline 1.4 
Pipeline 2.0

Docker 1.11.0
RHEL 7.2

docwhat@gerf.org (JIRA)

Apr 18, 2016, 5:45:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje commented on Bug JENKINS-34289
 
Re: docker.image.inside fails unexpectedly with Jenkinsfile

So part of the problem appears to be the -u 995:993 option passed to docker run. This user doesn't exist inside the container and certain commands cause the container to just "exit" with an unsuccessful exit code.

I made a simple Dockerfile that creates the user and group and it helps with some cases (e.g. ps -ef works now) but my for loop still dies on the first sleep 1.

New Jenkinsfile :

{noformat}
def img = 'centos:7';

node('docker') {
  stage "pulling";
  sh "docker pull ${img}"; // workaround for JENKINS-34288

  checkout scm;

  docker.build('fish', '.').inside {
    sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    sh 'ls -alh --color';
  }
}
{noformat}

Dockerfile:

FROM centos:7

RUN /sbin/groupadd -g 993 bgroup
RUN /sbin/adduser -m -u 995 -g 993 build
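
The missing-user theory can be checked by hand. The sketch below is only an illustration and is not taken from the plugin: `uid_in_passwd` is a hypothetical helper that reports whether a numeric uid has an entry in a passwd file, which is roughly what tools such as `ps` need in order to map a uid to a user name.

```shell
#!/bin/bash

# uid_in_passwd UID [PASSWD_FILE]
# Hypothetical helper: succeeds if UID appears in the third field of
# PASSWD_FILE (default /etc/passwd), fails otherwise.
uid_in_passwd() {
  local uid="$1" passwd_file="${2:-/etc/passwd}"
  awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$passwd_file"
}

# Report on the user running this script.
if uid_in_passwd "$(id -u)"; then
  echo "uid $(id -u) has a passwd entry"
else
  echo "uid $(id -u) has no passwd entry"
fi
```

Run inside the stock `centos:7` image with `-u 995:993`, I would expect this to report no passwd entry; inside the image built from the Dockerfile above it should find the `build` user.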

docwhat@gerf.org (JIRA)

Apr 18, 2016, 5:45:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje edited a comment on Bug JENKINS-34289
So part of the problem appears to be the {{-u 995:993}} option passed to {{docker run}}.  This user doesn't exist inside the container and certain commands cause the container to just "exit" with an unsuccessful exit code.


I made a simple {{Dockerfile}} that creates the user and group and it helps with some cases (e.g. {{ps -ef}} works now) but my {{for}} loop still dies on the first {{sleep 1}}.

New {{Jenkinsfile}} :

{noformat}
def img = 'centos:7';

node('docker') {
  stage "pulling";
  sh "docker pull ${img}"; // workaround for JENKINS-34288

  checkout scm;

  docker.build('fish', '.').inside {
    sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    sh 'ls -alh --color';
  }
}
{noformat}


Dockerfile:

{noformat}

FROM centos:7

RUN /sbin/groupadd -g 993 bgroup
RUN /sbin/adduser -m -u 995 -g 993 build
{noformat}

docwhat@gerf.org (JIRA)

Apr 19, 2016, 11:45:01 AM
to jenkinsc...@googlegroups.com

Looking into the logs, I see this error when it dies:

Apr 19, 2016 11:42:51 AM FINE org.jenkinsci.plugins.docker.workflow.client.DockerClient
Executing docker command docker inspect -f {{.Image}} ab2e6f4420e8f74916096a64cc5dda59739f63d5585e6d0eee23ee4ed47fd411
Apr 19, 2016 11:42:51 AM FINE org.jenkinsci.plugins.docker.workflow.WithContainerStep
execution failure container=ab2e6f4420e8f74916096a64cc5dda59739f63d5585e6d0eee23ee4ed47fd411
hudson.AbortException: script returned exit code -1
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:198)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:150)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

docwhat@gerf.org (JIRA)

Apr 19, 2016, 12:00:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje edited a comment on Bug JENKINS-34289
So part of the problem appears to be the {{-u 995:993}} option passed to {{docker run}}.  This user doesn't exist inside the container and certain commands cause the container to just "exit" with an unsuccessful exit code.

I made a simple {{Dockerfile}} that creates the user and group and it helps with some cases (e.g. {{ps -ef}} works now) but my {{for}} loop still dies on the first {{sleep 1}}.

New {{Jenkinsfile}} :

{noformat}
def img = 'centos:7';

node('docker') {
  stage "pulling";
  sh "docker pull ${img}"; // workaround for JENKINS-34288

  checkout scm;

  docker.build('fish', '.').inside {
    sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    sh 'ls -alh --color';
  }
}
{noformat}

Dockerfile:

{noformat}
FROM centos:7

RUN /sbin/groupadd -g 993 bgroup
RUN /sbin/adduser -m -u 995 -g 993 build
{noformat}


Edits:

* Removed bogus steps

docwhat@gerf.org (JIRA)

Apr 19, 2016, 12:05:01 PM
to jenkinsc...@googlegroups.com

How about this?

node('docker') {
    def img = docker.image('busybox');
    img.pull();
    img.inside {
        sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    }
}

docwhat@gerf.org (JIRA)

Apr 19, 2016, 12:10:01 PM
to jenkinsc...@googlegroups.com

I'm pretty sure that the shell step isn't even executed inside the container and that AbortException is being returned from something else. If I change the step to run sleep 1 || true it still aborts with -1.
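
The reasoning behind the `|| true` test can be shown with a plain shell, no Jenkins or Docker involved. This is a minimal sketch; `run_and_report` is a made-up helper name. A command list ending in `|| true` always exits 0, and a process exit status is confined to 0 through 255, so a reported -1 most likely comes from whatever is collecting the status rather than from the script itself.

```shell
#!/bin/bash

# run_and_report CMD
# Hypothetical helper: runs CMD in a child shell and prints its exit status.
run_and_report() {
  local status=0
  sh -c "$1" || status=$?
  echo "$status"
}

run_and_report 'false'           # prints 1
run_and_report 'false || true'   # prints 0
run_and_report 'exit 255'        # prints 255, the largest possible status
```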

docwhat@gerf.org (JIRA)

Apr 19, 2016, 12:34:02 PM
to jenkinsc...@googlegroups.com

Interesting.

At the moment I don't have slaves; I'm using master to build stuff... Jenkins is running as the non-root user jenkins and using DOCKER_HOST=tcp://127.0.0.1:2375 via configuring the master node.

If I stand up a clean Jenkins running as root then my example code works. If I stand up a clean Jenkins running as a non-root user, then my example code fails on the first sleep 1.
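
One way the root and non-root results might fit with the earlier `-u 995:993` observation, assuming the `-u` value is derived from the uid and gid of the account Jenkins runs as (an inference from the log, not something confirmed in this thread). The helper name `docker_user_arg` is hypothetical.

```shell
#!/bin/bash

# docker_user_arg
# Hypothetical helper: prints the -u argument that would result if it were
# built from the uid and gid of the user running this script.
docker_user_arg() {
  printf -- '-u %s:%s\n' "$(id -u)" "$(id -g)"
}

docker_user_arg
```

Under that assumption, a Jenkins running as root would pass `-u 0:0`, a user that exists in practically any image, while the `jenkins` account passes ids that the container knows nothing about.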

docwhat@gerf.org (JIRA)

Apr 19, 2016, 12:36:01 PM
to jenkinsc...@googlegroups.com
Christian Höltje edited a comment on Bug JENKINS-34289
How about this?

{code:java}
node('docker') {
    def img = docker.image('busybox');
    img.pull();
    img.inside {
        sh 'for i in $(seq 30); do sleep 1; echo $i; done';
    }
}
{code}

docwhat@gerf.org (JIRA)

Apr 19, 2016, 1:06:01 PM
to jenkinsc...@googlegroups.com

Here's my "set up a test Jenkins" script, in case I'm doing something stupid. I run it both as root and as a new user, jbugs.

I copy the plugins from the existing "live-ish" Jenkins to speed things up, and the init.groovy.d script only sets the UpdateCenter URL to "https://updates.jenkins-ci.org/stable-1.651/update-center.json"

#!/bin/bash

set -euo pipefail
set -x

if [ "$(id -u)" = 0 ]; then
  jenkins_port=5000
else
  jenkins_port=5010
fi

cd

rm -rf .jenkins
mkdir -p .jenkins/plugins

cp -avr /var/lib/jenkins/init.groovy.d/ .jenkins/init.groovy.d/
cp -avr /var/lib/jenkins/plugins/*.jpi .jenkins/plugins/

cat <<CONFIGXML > .jenkins/config.xml
<?xml version='1.0' encoding='UTF-8'?>
<hudson>
  <version>1.0</version>
  <nodeProperties>
    <hudson.slaves.EnvironmentVariablesNodeProperty>
      <envVars serialization="custom">
        <unserializable-parents/>
        <tree-map>
          <default>
            <comparator class="hudson.util.CaseInsensitiveComparator"/>
          </default>
          <int>1</int>
          <string>DOCKER_HOST</string>
          <string>tcp://127.0.0.1:2375</string>
        </tree-map>
      </envVars>
    </hudson.slaves.EnvironmentVariablesNodeProperty>
  </nodeProperties>
  <globalNodeProperties/>
</hudson>
CONFIGXML

exec java -jar /var/lib/jenkins/jenkins.war --httpPort="$jenkins_port"

docwhat@gerf.org (JIRA)

Apr 19, 2016, 3:49:02 PM
to jenkinsc...@googlegroups.com

If it helps, I was able to use auditd to see that execve() was called for the docker exec for the command in question and execve() returned 0 (which is not the exit code, but just says the syscall worked). So the command seems to be run by Jenkins. It just isn't getting the exit code back correctly (and gets -1 instead).

Looking around, could this be related to JENKINS-25727? If I'm reading the code correctly, this is falling afoul of this code: https://github.com/jenkinsci/durable-task-plugin/blob/66d80d2b9761ebdb4f0d3bb7b9edb82357e33399/src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java#L172-L174
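
To make the suspected failure mode concrete, here is a simplified model of a status poller that can report -1 even though the command ran. It is a sketch under my own assumptions: the control-directory layout (`pid` and `result` files) and the `poll_status` function are invented for illustration and are not the durable-task plugin's actual code.

```shell
#!/bin/bash

# poll_status DIR  (hypothetical model, not the plugin's real logic)
#   prints the recorded exit code if DIR/result exists and is non-empty,
#   prints -1 if DIR/pid names a process that cannot be signalled,
#   prints "running" otherwise.
poll_status() {
  local dir="$1" pid
  if [ -s "$dir/result" ]; then
    cat "$dir/result"
    return
  fi
  if [ -s "$dir/pid" ]; then
    pid="$(cat "$dir/pid")"
    # kill -0 only probes the process; it also fails when the process exists
    # but is not visible or not signalable from where the probe runs.
    if ! kill -0 "$pid" 2>/dev/null; then
      printf '%s\n' "-1"
      return
    fi
  fi
  echo running
}

# Demo: a wrapper records its PID, runs the "script", then records the status.
control_dir="$(mktemp -d)"
sh -c 'echo $$ > "$1/pid"; sh -c "exit 3"; echo $? > "$1/result"' _ "$control_dir"
poll_status "$control_dir"   # prints 3
rm -rf "$control_dir"
```

In this model, a liveness probe that looks in the wrong place (a different user or a different process namespace, say) would conclude the script had died before any result was written and report -1, which would be consistent with the `execve()` succeeding while Jenkins still sees a failure.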

docwhat@gerf.org (JIRA)

Apr 25, 2016, 3:18:01 PM
to jenkinsc...@googlegroups.com

So I got some slaves up-and-running and the problem persists there. The slave.jar runs as the user "jenkins", not root. I suspect that if I ran it as root, it'd work like the master above.

docwhat@gerf.org (JIRA)

Apr 25, 2016, 3:36:03 PM
to jenkinsc...@googlegroups.com

I just upgraded github-organization-folder from version 1.2 to 1.3 and lo-and-behold! It works!

W00T!

I also upgraded github-api from 1.72.1 to 1.75, but I doubt that affected this.

You guys are the greatest!

docwhat@gerf.org (JIRA)

Apr 25, 2016, 3:36:03 PM
to jenkinsc...@googlegroups.com
Christian Höltje resolved as Fixed
 
Change By: Christian Höltje
Status: Open -> Resolved
Resolution: Fixed

jglick@cloudbees.com (JIRA)

Aug 3, 2016, 6:09:02 PM
to jenkinsc...@googlegroups.com
Jesse Glick reopened an issue
 

No fix was made to address this problem; you just stopped running into it for some reason TBD.

Change By: Jesse Glick
Resolution: Fixed
Status: Resolved -> Reopened
This message was sent by Atlassian JIRA (v7.1.7#71011-sha1:2526d7c)

jglick@cloudbees.com (JIRA)

Aug 3, 2016, 6:09:03 PM
to jenkinsc...@googlegroups.com
Jesse Glick resolved as Cannot Reproduce
Change By: Jesse Glick
Status: Reopened -> Resolved
Resolution: Cannot Reproduce

docwhat@gerf.org (JIRA)

Aug 3, 2016, 6:40:04 PM
to jenkinsc...@googlegroups.com
 
Re: docker.image.inside fails unexpectedly with Jenkinsfile

Agreed. But it was definitely something in the change between 1.72.1 and 1.75. I should have closed it myself. Sorry.
