[JIRA] (JENKINS-59903) durable-task v1.31 breaks sh steps in pipeline when running in a Docker container


n.jesper.a@gmail.com (JIRA)

Oct 23, 2019, 2:30:05 PM
to jenkinsc...@googlegroups.com
Jesper Andersson created an issue
 
Jenkins / Bug JENKINS-59903
durable-task v1.31 breaks sh steps in pipeline when running in a Docker container
Issue Type: Bug
Assignee: Unassigned
Components: durable-task-plugin
Created: 2019-10-23 14:29
Environment: Centos 7.7
Jenkins ver. 2.190.1
Durable Task Plugin v. 1.31
Priority: Blocker
Reporter: Jesper Andersson

A pipeline like this:

pipeline {
    agent {
        docker {
            label 'docker'
            image 'busybox'
        }
    }
    stages {
        stage("Test sh script in container") {
            steps {
              sh label: 'Echo "Hello World...', script: 'echo "Hello World!"'
            }
        }
    }
}

Fails with this log:

Running in Durability level: PERFORMANCE_OPTIMIZED
[Pipeline] Start of Pipeline
[Pipeline] node
Running on docker-node in /...
[Pipeline] {
[Pipeline] isUnix
[Pipeline] sh
+ docker inspect -f . busybox
.
[Pipeline] withDockerContainer
got-legaci-3 does not seem to be running inside a container
$ docker run -t -d -u 1002:1002 -w <<hidden>> busybox cat
$ docker top 645fd28fda5fa3c61a4b49e8a38e46e0eec331ddf6037d3f77821dd6984a185f -eo pid,comm
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Test sh script in container)
[Pipeline] sh (Echo "Hello World...)
process apparently never started in /...
(running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
$ docker stop --time=1 645fd28fda5fa3c61a4b49e8a38e46e0eec331ddf6037d3f77821dd6984a185f
$ docker rm -f 645fd28fda5fa3c61a4b49e8a38e46e0eec331ddf6037d3f77821dd6984a185f
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code -2
Finished: FAILURE

Adding the -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true parameter gives this log:

Running in Durability level: PERFORMANCE_OPTIMIZED
[Pipeline] Start of Pipeline
[Pipeline] node
Running on docker-node in /...
[Pipeline] {
[Pipeline] isUnix
[Pipeline] sh
+ docker inspect -f . busybox
.
[Pipeline] withDockerContainer
got-legaci-3 does not seem to be running inside a container
$ docker run -t -d -u 1002:1002 -w <<hidden>> busybox cat
$ docker top 31b7474756f8ff5b1f0d12d0df952347e584b47113108d1f965adeeb0ee78e5e -eo pid,comm
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Test sh script in container)
[Pipeline] sh (Echo "Hello World...)
OCI runtime exec failed: exec failed: container_linux.go:346: starting container process caused "exec: \"/var/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64\": stat /var/jenkins/caches/durable-task/durable_task_monitor_1.31_unix_64: no such file or directory": unknown
process apparently never started in /...
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
$ docker stop --time=1 31b7474756f8ff5b1f0d12d0df952347e584b47113108d1f965adeeb0ee78e5e
$ docker rm -f 31b7474756f8ff5b1f0d12d0df952347e584b47113108d1f965adeeb0ee78e5e
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code -2
Finished: FAILURE

Tested on three different Jenkins masters with similar, but not identical, configurations.

Reverting to Durable Task Plugin v. 1.30 "solves" the problem.

This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

n.jesper.a@gmail.com (JIRA)

Oct 23, 2019, 2:33:01 PM
to jenkinsc...@googlegroups.com
Jesper Andersson updated an issue
Change By: Jesper Andersson
Environment: Centos 7.7
Jenkins ver. 2.190.1
(installed by yum)
Durable Task Plugin v. 1.31

n.jesper.a@gmail.com (JIRA)

Oct 23, 2019, 2:58:02 PM
to jenkinsc...@googlegroups.com
Jesper Andersson commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

A different workaround is adding args '-v /var/jenkins-legaci-lab/caches:/var/jenkins-legaci-lab/caches' to the docker { ... } declaration in the pipeline, like this:

pipeline {
    agent {
        docker {
            label 'docker'
            image 'busybox'
            args '-v /var/jenkins/caches:/var/jenkins/caches'
        }
    }
    stages {
        stage("Test sh script in container") {
            steps {
              sh label: 'Echo "Hello World...', script: 'echo "Hello World!"'
            }
        }
    }
}
 
                                                            
  • Perhaps this should be solved in some declarative pipeline component?

zerkms@zerkms.ru (JIRA)

Oct 24, 2019, 1:22:03 AM
to jenkinsc...@googlegroups.com

I confirm it affects me as well:

Jenkins ver. 2.190.1, docker, kubernetes and other plugins: latest stable version.

Jenkins runs on linux (inside a docker container)

r.fuereder@xortex.com (JIRA)

Oct 24, 2019, 6:20:03 AM
to jenkinsc...@googlegroups.com

Also a problem when running Jenkins outside of Docker and executing the pipeline on the Jenkins master (=> psst!); the problem is definitely caused by a bug in the new Durable Task plugin v1.31:

  • Latest version of Jenkins core (v2.201) running on Ubuntu 16.04
  • The logs seem to indicate the problem already appears when trying to start the Docker container in the pipeline, but maybe the logs are just mangled?
    • See the "// !!!" comments in the build log below...
  • Build log (proprietary):
    ...
    [Pipeline] // stage
    [Pipeline] stage
    [Pipeline] { (linkchecker)
    [Pipeline] script
    [Pipeline] {
    [Pipeline] withEnv
    [Pipeline] {
    [Pipeline] withDockerRegistry
    [Pipeline] {
    [Pipeline] isUnix
    [Pipeline] sh
    06:37:20  + docker inspect -f . ACME/linkchecker:5
    06:37:20  
    06:37:20  Error: No such object: ACME/linkchecker:5
    06:37:20  06:37:20.408218 durable_task_monitor.go:63: exit status 1              // !!!
    [Pipeline] isUnix
    [Pipeline] sh
    06:37:20  + docker inspect -f . dockerregistry.ACME.com/ACME/linkchecker:5
    06:37:20  .
    [Pipeline] withDockerContainer
    06:37:20  Jenkins does not seem to be running inside a container
    06:37:20  $ docker run -t -d -u 10112:10005 -w /var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline -v /var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline:/var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline:rw,z -v /var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline@tmp:/var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** dockerregistry.ACME.com/ACME/linkchecker:5 cat
    06:37:21  $ docker top 39f784ea27cbf6593fd40c1faaf04948daae94e97eb8ba42517f7c2f5e40c21e -eo pid,comm
    [Pipeline] {
    [Pipeline] sh
    06:42:27  process apparently never started in /var/lib/jenkins/workspace/Sandbox/ACME.linkCheckerPipeline@tmp/durable-aed939a9      // !!!
    06:42:27  (running Jenkins temporarily with -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true might make the problem clearer)
    [Pipeline] }
    ...
    ERROR: script returned exit code -2
    Finished: FAILURE
    
  • Pipeline code (in shared Jenkins pipeline library):
    ...
      void execute(Closure configBody) {
        LinkCheckerRunDSL config = calcConfiguration(configBody)
    
        // Then build, based on the configuration provided:
        script.docker.withRegistry(Constants.ACME_DOCKER_REGISTRY_URL) {
          script.docker.image(config.dockerImage).inside() { c ->
            script.sh 'linkchecker --version'
    ...
    

n.jesper.a@gmail.com (JIRA)

Oct 24, 2019, 7:35:04 PM
to jenkinsc...@googlegroups.com
Jesper Andersson updated an issue
Change By: Jesper Andersson
Environment: Centos 7.7
Jenkins ver. 2.190.1 (installed by yum, not in container)

Durable Task Plugin v. 1.31

cosbug@gmail.com (JIRA)

Oct 24, 2019, 9:02:03 PM
to jenkinsc...@googlegroups.com
Constantin Bugneac commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

Having the same issue with the 1.31 version of durable-task plugin. Had to rollback to 1.30.

eric.gehrman@projekt202.com (JIRA)

Oct 25, 2019, 1:02:02 AM
to jenkinsc...@googlegroups.com

Also having the same issue with the 1.31 version of the durable-task plugin; also fixed by rolling back to 1.30.

dbeck@cloudbees.com (JIRA)

Oct 25, 2019, 9:22:10 AM
to jenkinsc...@googlegroups.com

bjoern.pedersen@frm2.tum.de (JIRA)

Oct 25, 2019, 9:32:04 AM
to jenkinsc...@googlegroups.com
Björn Pedersen commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

Here we are also affected (rolled back to 1.30) for docker.inside steps.

The problem is that the new wrapper binary is at a location that is (on purpose) not exposed inside the Docker container.

Only the workspace and the auxiliary workspace (workspace@tmp) are currently mapped into the container by default.
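
This explanation can be sketched in plain shell without Docker (temp directories stand in for the agent host and the container root filesystem; all paths are illustrative): because only the workspace is bind-mounted, a file under the agent's caches directory simply does not exist in the container's filesystem view, which matches the "stat ...: no such file or directory" error in the logs above.

```shell
# Pretend agent home: 1.31 drops the monitor binary under a caches dir here.
host=$(mktemp -d)
mkdir -p "$host/caches/durable-task" "$host/workspace/job"
touch "$host/caches/durable-task/durable_task_monitor_1.31_unix_64"

# Pretend container rootfs: only the workspace path is "mounted" (copied here).
container=$(mktemp -d)
mkdir -p "$container$host/workspace/job"

# Inside the container the monitor path resolves to nothing.
monitor="$host/caches/durable-task/durable_task_monitor_1.31_unix_64"
if [ -e "$container$monitor" ]; then
  echo "monitor visible in container"
else
  echo "stat $monitor: no such file or directory"
fi
```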

nitrogear@gmail.com (JIRA)

Oct 25, 2019, 11:53:05 AM
to jenkinsc...@googlegroups.com

Faced the same issue on Jenkins 2.201. Downgrading to durable-task plugin v1.30 resolved the issue.

kuisathaverat@gmail.com (JIRA)

Oct 25, 2019, 12:52:03 PM
to jenkinsc...@googlegroups.com

The same; we face it with docker.inside after upgrading to 2.201. The worst thing is that you do not see the error until you enable `org.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS`.

pipeline {
    agent { label 'linux && immutable' }
    stages {
        stage("Test sh script in container") {
            steps {
                script {
                    docker.image('node:12').inside() {
                        echo "Docker inside"
                        sh label: 'Echo "Hello World...', script: 'echo "Hello World!"'
                    }
                }
            }
        }
    }
}

I'd share the script to change the property from the Jenkins console:

import static org.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS

println("LAUNCH_DIAGNOSTICS=" + LAUNCH_DIAGNOSTICS)
LAUNCH_DIAGNOSTICS = true
println("LAUNCH_DIAGNOSTICS=" + LAUNCH_DIAGNOSTICS)

chisel@chizography.net (JIRA)

Oct 25, 2019, 12:58:02 PM
to jenkinsc...@googlegroups.com

We hit the same issue after upgrading this morning. Downgrading to 1.30 resolved the issue for us too.

kuisathaverat@gmail.com (JIRA)

Oct 25, 2019, 1:20:07 PM
to jenkinsc...@googlegroups.com

In our case it seems related to docker.inside; if we run a similar docker command directly, it works:

```
pipeline {
    agent { label 'linux && immutable' }
    stages {
        stage("Test sh script in container") {
            steps {
                sh label: 'This works', script: """
                  docker run -t -v ${env.WORKSPACE}:${env.WORKSPACE} -u \$(id -u):\$(id -g) -w ${env.WORKSPACE} -e HOME=${env.WORKSPACE} node:12 echo 'Hello World!'
                """
                script {
                    docker.image('node:12').inside() {
                        echo "Docker inside"
                        sh label: 'Im gonna fail', script: 'echo "Hello World!"'
                    }
                }
            }
        }
    }
}
```

kuisathaverat@gmail.com (JIRA)

Oct 25, 2019, 2:04:06 PM
to jenkinsc...@googlegroups.com

I found a workaround, but it is horrible. For some reason the durable task is looking for the Jenkins cache inside the Docker container, where it obviously is not present, so if you mount the cache folder you resolve the issue. But this means I have to change every docker.inside. I think we could go back to 1.29, before the latest changes to the way the sh step is managed.

```
pipeline {
    agent { label 'linux && immutable' }
    stages {
        stage("Test sh script in container") {
            steps {
                script {
                    docker.image('node:12').inside("-v /var/lib/jenkins/caches/durable-task:/var/lib/jenkins/caches/durable-task") {
                        echo "Docker inside"
                        sh label: 'Im gonna fail', script: 'echo "Hello World!"'
                    }
                }
            }
        }
    }
}
```

kuisathaverat@gmail.com (JIRA)

Oct 25, 2019, 2:30:03 PM
to jenkinsc...@googlegroups.com

kuisathaverat@gmail.com (JIRA)

Oct 25, 2019, 2:46:05 PM
to jenkinsc...@googlegroups.com
I have found the cause: the changes in this commit https://github.com/jenkinsci/durable-task-plugin/commit/1f59c5229b9ff83709add3e202f8e49ff463106c. It is related to a new binary launcher.

I can confirm it: if you disable the new binary launcher, it works. You can disable the property at runtime by executing this script in the Jenkins console:

{code}
import static org.jenkinsci.plugins.durabletask.BourneShellScript.FORCE_SHELL_WRAPPER

println("FORCE_SHELL_WRAPPER=" + FORCE_SHELL_WRAPPER)
FORCE_SHELL_WRAPPER = true
println("FORCE_SHELL_WRAPPER=" + FORCE_SHELL_WRAPPER)
{code}

gp@gpcentre.net (JIRA)

Oct 25, 2019, 3:09:03 PM
to jenkinsc...@googlegroups.com

I can confirm downgrading "Durable Task Plugin" to v1.30 fixed the issue for us as well. We're running Jenkins 2.176.2.

gp@gpcentre.net (JIRA)

Oct 25, 2019, 3:32:04 PM
to jenkinsc...@googlegroups.com

We're running Alpine Linux builds. I briefly saw an error in one of the failures about not being able to run `ps` – it could be related to this block: https://github.com/jenkinsci/durable-task-plugin/commit/1f59c5229b9ff83709add3e202f8e49ff463106c#diff-b7cdd655e1fb1fd95154b2fbcb20e8e3R525

switch (platform) {
    case SLIM:
        // (See JENKINS-58656) Running in a container with no init process is guaranteed to leave a zombie. Just let this test pass.
        // Debian slim does not have ps
        // [...]
}

do {
    // [...]
} while (psString.contains(exitString));

aaaustin10@gmail.com (JIRA)

Oct 25, 2019, 5:52:03 PM
to jenkinsc...@googlegroups.com

I feel that I should mention that I also have the issue with 1.31 (downgrade to 1.30 fixes it) without using Docker.

pipeline {
    agent {
        label 'raspberry-build'
    }
    stages {
        stage("Test sh script in container") {
            steps {
              sh label: 'Echo "Hello World...', script: 'echo "Hello World!"'
            }
        }
    }
}

coreybolson@gmail.com (JIRA)

Oct 25, 2019, 8:06:03 PM
to jenkinsc...@googlegroups.com

I just ran into this issue today too. Anyone know where I can get the .hpi file in order to downgrade to v1.30?
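
For what it's worth, old plugin releases are published on the Jenkins update site under a predictable path, so the 1.30 .hpi can be fetched directly. The snippet below just builds that URL; the manual-install commands in the comments are illustrative (JENKINS_HOME path and service name vary by install):

```shell
# Build the download URL for an older plugin release on the Jenkins update site.
plugin=durable-task
version=1.30
url="https://updates.jenkins.io/download/plugins/${plugin}/${version}/${plugin}.hpi"
echo "$url"
# Then, for a manual downgrade (illustrative paths; adjust for your install):
#   wget "$url"
#   cp "${plugin}.hpi" /var/lib/jenkins/plugins/"${plugin}.jpi"   # .jpi pins the version
#   systemctl restart jenkins
```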

chisel@chizography.net (JIRA)

Oct 25, 2019, 9:00:05 PM
to jenkinsc...@googlegroups.com

n.jesper.a@gmail.com (JIRA)

Oct 25, 2019, 9:50:05 PM
to jenkinsc...@googlegroups.com

Austin Stewart: From the looks of the label in your example, you might be having the JENKINS-59907 problem, where the new wrapper doesn't work on all platforms.

Philip Zozobrado: I suspect you might be having a similar problem, if you are not using the agent { docker { ... } } or docker.inside() pipeline instructions.

n.jesper.a@gmail.com (JIRA)

Oct 25, 2019, 9:52:06 PM
to jenkinsc...@googlegroups.com
Jesper Andersson updated an issue
 
Change By: Jesper Andersson
Component/s: docker
Component/s: docker-plugin
Component/s: pipeline

coreybolson@gmail.com (JIRA)

Oct 25, 2019, 10:13:03 PM
to jenkinsc...@googlegroups.com
Corey Olson commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

Chisel Wright: this was a fresh install today, so I didn't have that downgrade option. Thanks for the link; that worked.

n.jesper.a@gmail.com (JIRA)

Oct 25, 2019, 11:27:05 PM
to jenkinsc...@googlegroups.com

Also confirming that

org.jenkinsci.plugins.durabletask.BourneShellScript.FORCE_SHELL_WRAPPER = true

is a working workaround, and that it can be set when Jenkins starts, just like LAUNCH_DIAGNOSTICS.

It would be great if the durable-task plugin could detect that it's running inside a container started by Jenkins, and only disable the wrapper in those steps, if that is to be the solution. "durable_task_monitor_1.31_unix_64" probably contains something of value, so disabling it system-wide doesn't feel like a solution.
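
For reference, setting both properties at JVM startup on a yum-installed Jenkins (as in this report) usually means adding them to JENKINS_JAVA_OPTIONS; the file path below is the conventional CentOS location and may differ on your system:

```shell
# /etc/sysconfig/jenkins (conventional location on a yum install; verify locally)
JENKINS_JAVA_OPTIONS="-Dorg.jenkinsci.plugins.durabletask.BourneShellScript.LAUNCH_DIAGNOSTICS=true -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.FORCE_SHELL_WRAPPER=true"
```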

jenkins@albersweb.de (JIRA)

Oct 26, 2019, 2:44:04 PM
to jenkinsc...@googlegroups.com

I also have a use case where no docker.inside is involved.

  • Jenkins master is the official docker image jenkins/jenkins:2.190.1-alpine
  • Agent is based on the adoptopenjdk/openjdk11:x86_64-ubuntu-jdk-11.0.4_11 image and connects to the master via the swarm plugin
  • Master and agent running on Docker 19.03.4 in swarm mode. The hosts are Ubuntu 18.04 LTS on VMware.

This pipeline code:

node('jdk11') {
    stage('test') {
        sh 'echo hi.'
    }
}
  • works on both master and agent with durable-task-plugin 1.30.
  • works on the master with durable-task-plugin 1.31.
  • fails on the agent when durable-task-plugin 1.31 is installed:
[Pipeline] Start of Pipeline
[Pipeline] node
Running on build_agent-java11-docker4 in /workspace/hugo
[Pipeline] {
[Pipeline] stage
[Pipeline] { (test)
[Pipeline] sh
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 192.168.0.6/192.168.0.6:35540
		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
		at hudson.remoting.Channel.call(Channel.java:957)
		at hudson.FilePath.act(FilePath.java:1072)
		at hudson.FilePath.act(FilePath.java:1061)
		at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:169)
		at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:99)
		at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:317)
		at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:286)
		at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:179)
		at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
		at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
		at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
		at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
		at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
		at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:160)
		at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
		at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:157)
		at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:158)
		at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:162)
		at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:132)
		at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:132)
		at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
		at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:84)
		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
		at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
		at com.cloudbees.groovy.cps.Next.step(Next.java:83)
		at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
		at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
		at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
		at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
		at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
		at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
		at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
		at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:186)
		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:370)
		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:93)
		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:282)
		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:270)
		at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:66)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
		at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
		at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at java.lang.Thread.run(Thread.java:748)
java.nio.file.AccessDeniedException: /caches
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:385)
	at java.nio.file.Files.createDirectory(Files.java:689)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:796)
	at java.nio.file.Files.createDirectories(Files.java:782)
	at org.jenkinsci.plugins.durabletask.BourneShellScript$GetAgentInfo.invoke(BourneShellScript.java:473)
	at org.jenkinsci.plugins.durabletask.BourneShellScript$GetAgentInfo.invoke(BourneShellScript.java:440)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3052)
	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:369)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
	at java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
	at java.lang.Thread.run(Thread.java:834)
Finished: FAILURE

Looks like a problem accessing some cache: java.nio.file.AccessDeniedException: /caches.

gp@gpcentre.net (JIRA)

Oct 27, 2019, 3:56:03 PM
to jenkinsc...@googlegroups.com

> Philip Zozobrado: I suspect you might be having a similar problem, if you are not using the agent { docker { ... } } or docker.inside() pipeline instructions.

Yes. This is what we're using:

withDockerContainer([image: "php:latest", args: "-v ${WORKSPACE}:/project"]) {
    sh "echo 'started a container'"
}

gunter@grodotzki.co.za (JIRA)

Oct 27, 2019, 5:12:04 PM
to jenkinsc...@googlegroups.com

I don't want to sound awful in the comments section, and I highly appreciate the endless efforts of (mostly?) volunteer developers. However, it is a really bad experience when a minor version introduces breaking changes.

Why not just bump the major version in such cases?

gunter@grodotzki.co.za (JIRA)

Oct 27, 2019, 6:00:09 PM
to jenkinsc...@googlegroups.com
Gunter Grodotzki edited a comment on Bug JENKINS-59903
I don't want to sound awful in the comments section, and I highly appreciate the endless efforts of (mostly?) volunteering developers. However, it is a really bad experience when a minor version introduces breaking changes.

 

Why not just bump the version in such cases?


 

BTW, a more portable fix, piggybacking on [~njesper]'s solution:

 
{code:java}
args '--user=root --privileged -v ${HOME}/caches:${WORKSPACE}/../../caches'{code}
 

This _should_ fix it no matter what your configuration is, because it uses the same logic as implemented in the plugin: [https://github.com/jenkinsci/durable-task-plugin/pull/106/files#diff-b7cdd655e1fb1fd95154b2fbcb20e8e3R485]
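As I read the plugin logic linked above, the cache directory ends up two levels above the workspace, i.e. in the agent's remote root, which is why the `${WORKSPACE}/../../caches` mount target lines up with it. A throwaway shell sketch of that path arithmetic (all paths here are made-up stand-ins, not the plugin's real ones):

```shell
# Illustrative sketch (not the plugin itself): <remote root>/caches is
# reachable from the workspace as ${WORKSPACE}/../../caches.
ROOT=$(mktemp -d)                     # stand-in for the agent's remote root
WORKSPACE="$ROOT/workspace/myjob"     # stand-in for a job workspace
mkdir -p "$WORKSPACE"
mkdir -p "$WORKSPACE/../../caches"    # resolves to $ROOT/caches
ls -d "$ROOT/caches"
```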

pyssling@ludd.ltu.se (JIRA)

Oct 27, 2019, 10:28:04 PM10/27/19
to jenkinsc...@googlegroups.com

I noticed on the merge request for this change that the approach isn't that great either: they are shipping a statically compiled Go binary to use as an execution wrapper. This is bad, as it breaks Jenkins on architectures other than x86, such as ARM and PPC.

 

jaap@jcz.nl (JIRA)

unread,
Oct 28, 2019, 5:56:03 AM10/28/19
to jenkinsc...@googlegroups.com
Jaap Crezee commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

> Reverting to Durable Task Plugin v. 1.30 "solves" the problem.

This + restart works for me (for now).

jaap@jcz.nl (JIRA)

Oct 28, 2019, 5:56:09 AM10/28/19
to jenkinsc...@googlegroups.com
Jaap Crezee edited a comment on Bug JENKINS-59903
> Reverting to Durable Task Plugin v. 1.30 "solves" the problem.

This + restart (Jenkins) works for me (for now).

cchiou@cloudbees.com (JIRA)

Oct 28, 2019, 7:56:06 AM10/28/19
to jenkinsc...@googlegroups.com

So, apologies for this taking so long to address. There is a fix in the works right now for this issue and for JENKINS-59907 as well. I will also update the changelog that is currently being migrated to GitHub. Caching will be disabled when the cache directory is unavailable to the agent.

The PR can be found here: https://github.com/jenkinsci/durable-task-plugin/pull/114
ci.jenkins.io is quite unstable right now. Hopefully things will get better soon.

n.jesper.a@gmail.com (JIRA)

unread,
Oct 28, 2019, 9:02:03 AM10/28/19
to jenkinsc...@googlegroups.com

Great news, Carroll Chiou!

Apart from this issue ("cache folder not available") and JENKINS-59907 ("binary can't run"), I've seen some comments indicating that there might be a third kind of problem (which might not have its own issue yet): odd/rare distros or special configurations/installations where the new binary wrapper can't find all the libs/tools it needs. Judging from the description of PR 114, this third type of problem might not be addressed.

If this use of a cache folder on the node follows Jenkins design guidelines, I think it would be a good idea to raise with the relevant Docker integration plugin(s) the question of why the cache folder isn't mounted when Jenkins starts the container. I guess the new wrapper brings some kind of value, so it should be made to work in containers as well, if relevant.

kuisathaverat@gmail.com (JIRA)

Oct 28, 2019, 9:23:03 AM10/28/19
to jenkinsc...@googlegroups.com
IMHO the introduction of the new binary is a mistake; I do not see a reason not to implement the same behavior in Java. It also copies into the workspace a binary that is not mentioned anywhere and comes from outside the workspace; I see potential security issues in that behavior. [~wfollonier] WDYT?

The libc against which the binary was compiled could be a compatibility issue on different Linux distributions.

If the stream is not open, launcherCmd would be "". What happens to the script in that case? I think it will not execute anything and will not show any error:
https://github.com/jenkinsci/durable-task-plugin/blob/b122d6b0f924c533b0a26d99a71779bbafc3c543/src/main/java/org/jenkinsci/plugins/durabletask/BourneShellScript.java#L207-L222

The interpreter would be launched with the '-xe' options; this could leak commands that you don't want to show:
https://github.com/jenkinsci/durable-task-plugin/blob/b122d6b0f924c533b0a26d99a71779bbafc3c543/src/main/go/org/jenkinsci/plugins/durabletask/durable_task_monitor.go#L86

I don't know how this command behaves in Cygwin, for example:
https://github.com/jenkinsci/durable-task-plugin/blob/b122d6b0f924c533b0a26d99a71779bbafc3c543/src/main/go/org/jenkinsci/plugins/durabletask/durable_task_monitor.go#L109
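To illustrate the '-xe' concern above, a minimal sketch: with -x the shell traces every command to stderr before running it, so anything interpolated into a command line ends up in the build log. The script content here is a made-up example.

```shell
# Demo of 'sh -xe' tracing: each command is printed ('+ ...') before it
# runs, so values you did not intend to show would leak into the log.
SCRIPT=$(mktemp)
printf 'echo hello\n' > "$SCRIPT"
sh -xe "$SCRIPT" 2>&1    # prints the trace line, then the command output
rm -f "$SCRIPT"
```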

william.whittle@gmail.com (JIRA)

Oct 28, 2019, 12:05:05 PM10/28/19
to jenkinsc...@googlegroups.com

We've also seen essentially this issue on Solaris, AIX, and IBM i (AS/400). It was OK on Linux amd64 and Windows x64. Reverting to Durable Task Plugin 1.30 resolved the issues.

haridara@gmail.com (JIRA)

Oct 28, 2019, 12:31:04 PM10/28/19
to jenkinsc...@googlegroups.com

I got this issue resolved by reverting to the previous version, but I am wondering why the console or Jenkins log had no information on what the underlying issue is. Isn't there a lack of sufficient logging and perhaps some sort of error handling here?

jonathan@riv.al (JIRA)

Oct 28, 2019, 2:30:03 PM10/28/19
to jenkinsc...@googlegroups.com
Jonathan B edited a comment on Bug JENKINS-59903
Since upgrading durable-task 1.30 to 1.31, we're seeing a lot of intermittent

{code}
Cannot run program "/home/ubuntu/caches/durable-task/durable_task_monitor_1.31_unix_64" (in directory "/home/ubuntu/workspace/path/to/job"): error=26, Text file busy
{code}

and, less often,

{code}
Cannot run program "/home/ubuntu/caches/durable-task/durable_task_monitor_1.31_unix_64" (in directory "/home/ubuntu/workspace/path/to/job"): error=13, Permission denied
{code}


This is all running on standard amd64 Ubuntu (no exotic OS or architecture) and not in Docker agents. Should I file a separate issue?

dnusbaum@cloudbees.com (JIRA)

Oct 28, 2019, 3:23:03 PM10/28/19
to jenkinsc...@googlegroups.com

Hi everyone, sorry for the issues. I filed https://github.com/jenkins-infra/update-center2/pull/305 to suspend durable-task 1.31 from distribution for now. As a workaround, you can roll back to 1.30 or add org.jenkinsci.plugins.durabletask.BourneShellScript.FORCE_SHELL_WRAPPER=true as a system property to the JVM running Jenkins (or set the same Groovy variable to true dynamically via the script console, though that will be unset if you restart Jenkins).
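A minimal sketch of passing that system property to the Jenkins JVM, assuming a JAVA_OPTS-style launch (package installs typically read these options from /etc/default/jenkins or a systemd unit; the jenkins.war path below is an assumption about your install):

```shell
# Append the workaround flag to the options the Jenkins JVM is started with.
JAVA_OPTS="${JAVA_OPTS:-} -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.FORCE_SHELL_WRAPPER=true"
echo "$JAVA_OPTS"
# java $JAVA_OPTS -jar /usr/share/jenkins/jenkins.war   # (not executed here)
```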

For some context, the new binary was intended to improve some long-standing robustness issues with the existing shell wrapper, by being able to use utilities like setsid, and to make the plugin more maintainable going forward. The code that detects whether to use the binary or the existing shell wrapper obviously needs to handle additional cases, and we need to add more testing for other platforms where possible, in particular the Docker-based workflows that were broken by the change. Ideally, changes to implementation details like this would be transparent to users and wouldn't cause breaking changes, but this plugin handles a lot of subtly different platforms at the same time and can only test on some of them, so changes always seem to cause problems.

cchiou@cloudbees.com (JIRA)

Oct 28, 2019, 5:34:04 PM10/28/19
to jenkinsc...@googlegroups.com

Jonathan B I ran into this issue when I was testing out the fix and running tests on it for the very first time. It was solved immediately by running mvn clean install, so unfortunately I was not able to investigate the issue more deeply, as I still can't reproduce it. I think this issue will be solved by reverting to 1.30 and installing again. If that does not solve it, I think it warrants a separate issue.

Jesper Andersson I don't think there has been anything official on using caches on the agent. Not many plugins use caching, but I think it is something we should explore further, since most people want to reduce the workload of the masters.

Hari Dara Yes, there should be more error handling involved. I am looking to add that in. The tricky part is that the script is supposed to be launched as fire-and-forget; this applies to the original shell wrapper as well. But of course, it's one thing if your shell fails to launch vs. this binary.

theeandrewlane@gmail.com (JIRA)

Oct 28, 2019, 6:56:03 PM10/28/19
to jenkinsc...@googlegroups.com

Ubuntu 18.04, Jenkins 2.202 - I can confirm downgrading durable-task-plugin resolved this issue.

kuisathaverat@gmail.com (JIRA)

Oct 28, 2019, 8:01:04 PM10/28/19
to jenkinsc...@googlegroups.com

Carroll Chiou if all of this is about reducing load on the master, I think there are simpler and better ways to do it. First, the basics: avoid insanely verbose console output. Having 5-10 GB of console logs is pointless; if you need to review such a file, you will drive yourself crazy trying to open it, so reducing console output is key. If you need verbose output for some command, redirect it to files and archive them on Jenkins at the end of the job. If after all that you still think this cache is needed, build it with something standard; do not reinvent the wheel. Named pipes work on every Unix implementation, there is an implementation for Windows as well, and they are plain files that are easy to manage from Java, so you would avoid a ton of platform-related problems.

Because durable-task-plugin is something you cannot get rid of if you use pipelines, it is a critical component. Maybe this cache could go in another plugin and durable-task could be kept as it is, or users could be allowed to drop durable-task completely; it causes more issues than benefits if you do not want to restart pipelines from an arbitrary point after a failure, which is an antipattern IMHO: a pipeline should pass in one round, and if it does not, it is not well designed and you should split it.

cchiou@cloudbees.com (JIRA)

Oct 28, 2019, 9:12:03 PM10/28/19
to jenkinsc...@googlegroups.com

So a new release 1.32 is out. Until we have a fix out resolving this ticket and, at least, JENKINS-59907, the binary will be disabled by default.

Ivan Fernandez Calvo Hi Ivan, actually the caching was added as a way to reduce the number of times the master transmits the binary to the agent. What was not taken into account was that the chosen cache directory may not be accessible to the job. A fix is in the works.

The binary wrapper itself was added to make the original shell wrapper script more maintainable rather than mystical. There was also an attempt to reduce the issues where the script itself was being terminated for unknown reasons. One of the ways to do this was to use setsid instead of nohup (See JENKINS-25503). The reason the launched script's output is being redirected to a file is so that the output can be transmitted to master in order to display the script's output.
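The fire-and-forget launch pattern described above can be sketched in a few lines of shell: detach the script, send its output to a log file, and read the log back afterwards. The file names here are illustrative stand-ins, not the plugin's actual control files.

```shell
# Sketch of "launch detached, capture output to a file for later relay".
LOG=$(mktemp)
# nohup detaches the child from hangup signals (the plugin now prefers
# setsid for a full new session; nohup is used here for portability).
nohup sh -c 'echo hello from the detached script' >"$LOG" 2>&1 &
wait $!          # in the real plugin, the log is polled instead
cat "$LOG"
rm -f "$LOG"
```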

jenkins@albersweb.de (JIRA)

Oct 29, 2019, 8:42:07 AM10/29/19
to jenkinsc...@googlegroups.com

Carroll Chiou 1.32 does not resolve the issue for me.

When running a sh step remotely on a dockerized agent as described above, I still get java.nio.file.AccessDeniedException: /caches, see details above.

n.jesper.a@gmail.com (JIRA)

Oct 29, 2019, 10:08:08 AM10/29/19
to jenkinsc...@googlegroups.com

Harald Albers How are you running your container?

I'm guessing wildly here, but to me it looks like your node config sets "Remote root directory" to /. I'm also guessing that you run the container as a specific user, e.g. '-u jenkins:jenkins', probably mount the workspace with something like '-v /home/jenkins/workspace:/workspace', and then start the agent inside the container.

With such a setup the Jenkins agent will probably not have enough permissions to create '/caches', which the plugin perhaps is still trying to do even when it's set not to use the new wrapper.

Try adding e.g. '-v /home/jenkins/caches:/caches' (adjusted to your config), or pre-create a /caches folder in your image that is owned by 'jenkins:jenkins' (the user you run the container as).

jenkins@albersweb.de (JIRA)

Oct 29, 2019, 1:48:04 PM10/29/19
to jenkinsc...@googlegroups.com
Harald Albers commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

Jesper Andersson Your questions pointed me to a solution, thanks a lot.

But first the answers:

The Docker image of the agent runs as the user jenkins. The swarm client plugin sets the "Remote root directory" to "/" when connecting to the master and dynamically creating an agent. The image has an existing /workspace directory that is writable for the user jenkins. The user jenkins obviously does not have sufficient permissions to create a directory in /.

The swarm client can be configured to use a specific root directory. If I set it to a directory where the user jenkins has write permission, the build will successfully create a directory caches alongside the workspace directory.

Another solution would be to pre-create the /caches directory in the image as well.

I'm fine with this solution.

But the bottom line is that we need documentation stating that the user who performs the build must have sufficient permissions to create directories in the build root, or that specific directories must exist with appropriate permissions.
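The precondition described here is easy to check as the build user: can that user create a caches directory under the remote root? A throwaway sketch (ROOT is a temp dir standing in for the agent's remote root; on a real agent you would use that path instead):

```shell
# Sketch of the permission precondition check for <remote root>/caches.
ROOT=$(mktemp -d)
if mkdir -p "$ROOT/caches" 2>/dev/null; then
  echo "caching possible"
else
  echo "caching would be disabled"
fi
```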

cchiou@cloudbees.com (JIRA)

Oct 29, 2019, 3:23:05 PM10/29/19
to jenkinsc...@googlegroups.com

I apologize: what 1.32 did was disable the binary wrapper by default, but it did not resolve the caching issue, because the plugin still tries to create the cache dir. I am in the process of merging my current fix (https://github.com/jenkinsci/durable-task-plugin/pull/114) into master.

Harald Albers once the fix gets through, those users who do not have permissions to create directories in the build root will have caching disabled.

cchiou@cloudbees.com (JIRA)

Oct 29, 2019, 9:25:03 PM10/29/19
to jenkinsc...@googlegroups.com

So version 1.33 has now been released. This includes the fix for disabling cache when there are insufficient permissions to access the cache dir. The binary is still disabled by default.

jenkins@albersweb.de (JIRA)

Oct 30, 2019, 11:04:05 AM10/30/19
to jenkinsc...@googlegroups.com

Carroll Chiou 1.33 works for my use case (build root in /, user not having permissions to create the /caches directory).

cchiou@cloudbees.com (JIRA)

Oct 30, 2019, 10:29:08 PM10/30/19
to jenkinsc...@googlegroups.com
Carroll Chiou started work on Bug JENKINS-59903
 
Change By: Carroll Chiou
Status: Open → In Progress

cchiou@cloudbees.com (JIRA)

Oct 31, 2019, 6:26:06 PM10/31/19
to jenkinsc...@googlegroups.com
Carroll Chiou resolved as Fixed
 
Change By: Carroll Chiou
Status: In Progress → Resolved
Resolution: Fixed
Released As: 1.33

don@hardwarehacks.org (JIRA)

Mar 6, 2020, 7:42:04 PM3/6/20
to jenkinsc...@googlegroups.com
Don L commented on Bug JENKINS-59903
 
Re: durable-task v1.31 breaks sh steps in pipeline when running in a Docker container

I've found this also reproduces when using build agents in Kubernetes, not just Docker. The problem here is that Kubernetes launches two containers into a pod with a shared mount: a JNLP agent container, in which Jenkins does have permission to write the cache directory, and a build container where code actually runs (in my case kubectl, but it could be any container without a Jenkins user), which does not necessarily have the same permission. The plugin runs its test inside the JNLP container, enables the wrapper, and then exhibits the same hanging behavior when commands are run in the kubectl container.

Tests run on the latest (v1.33) of durable-task.

Logs with LAUNCH_DIAGNOSTICS set:

sh: 1: cannot create /home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47/jenkins-log.txt: Permission denied
sh: 1: cannot create /home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47/jenkins-result.txt.tmp: Permission denied
touch: cannot touch '/home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47/jenkins-log.txt': Permission denied
mv: cannot stat '/home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47/jenkins-result.txt.tmp': No such file or directory
touch: cannot touch '/home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47/jenkins-log.txt': Permission denied
[ last line repeated ~100 times ]                                                                                          
process apparently never started in /home/jenkins/agent/workspace/REDACTED_PR-5140@tmp/durable-cca9ec47  

In the JNLP container:

bash-4.4$ cd /home/jenkins/agent/caches
bash-4.4$ ls -l
total 0
drwxr-xr-x    2 jenkins  jenkins          6 Mar  6 15:47 durable-task 

In the kubectl container:

I have no name!@<REDACTED>:/home/jenkins/agent/caches$ ls -l
total 0
drwxr-xr-x 2 1000 1000 6 Mar  6 15:47 durable-task
I have no name!@<REDACTED>:/home/jenkins/agent/caches$ id
uid=1001 gid=0(root) groups=0(root)

I've had some success today working around this by adding a security context to my pods, forcing a run as Jenkins's UID (which for me is 1000 - YMMV depending on how Jenkins is running), e.g.:

kind: Pod
metadata:
  name: kubectl
spec:
  containers:
  - command:
    - cat
    image: bitnami/kubectl:1.14
    imagePullPolicy: Always
    name: kubectl
    tty: true
  securityContext:
    runAsUser: 1000
This message was sent by Atlassian Jira (v7.13.12#713012-sha1:6e07c38)

dnusbaum@cloudbees.com (JIRA)

Mar 6, 2020, 8:46:05 PM3/6/20
to jenkinsc...@googlegroups.com

Don L Please open a new issue instead of commenting here. In durable-task 1.33, the caches directory is not actually used by default, so I think you can ignore it. The problem in your case looks like permissions on the control directory for the script, and I think that you would run into the same problems on durable-task 1.30 or older, so I would check for similar bugs reported against Durable Task Plugin and/or Kubernetes Plugin.
