[JIRA] (JENKINS-59893) bat calls hang in Windows Docker container in declarative pipeline script

anonymousaccounts@icloud.com (JIRA)

Oct 22, 2019, 2:22:03 PM
to jenkinsc...@googlegroups.com
a b updated an issue
 
Jenkins / Bug JENKINS-59893
bat calls hang in Windows Docker container in declarative pipeline script
Change By: a b
Summary: bat calls hang in Windows Docker container in declarative pipeline script
This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

anonymousaccounts@icloud.com (JIRA)

Oct 22, 2019, 2:23:02 PM
to jenkinsc...@googlegroups.com
a b updated an issue
h3. Description

{{bat}} steps hang (endless spinning wheel in the job's console output), even for simple Windows containers.
{code:java}
bat "echo test inside"{code}
h3. Troubleshooting & Additional info

powershell and all other commands tried so far work without issue. Even using powershell to wrap cmd.exe commands works fine. Example:

 
{code:java}
powershell "cmd /c echo test inside"{code}
 

Running the image manually on the node host exhibits no issues, i.e. we can run {{docker run -it microsoft/windowsservercore:ltsc2016}} and happily use cmd and all other commands without issue.

Similarly we can attach to the container spun up by the Jenkins job while it's hung and execute the same echo command (or any other) without issue.
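The manual check described here (running the same command by hand against the hung container) can be scripted so that a hang shows up as a timeout rather than a spinning wheel. A minimal sketch in Python, assuming Python is available on the node host; `probe` is a hypothetical helper and not part of Jenkins or Docker.

```python
import subprocess


def probe(command, timeout_seconds=30):
    """Run `command` and classify the result as 'ok', 'failed', or 'hung'.

    Returns a (status, output) tuple. A command that does not finish within
    `timeout_seconds` is killed and reported as 'hung'.
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
            universal_newlines=True,   # decode output as text
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return "hung", ""
    status = "ok" if completed.returncode == 0 else "failed"
    return status, completed.stdout.strip()
```

For instance, `probe(["docker", "exec", "<container-id>", "cmd", "/c", "echo test inside"])` (with the placeholder replaced by the ID of the container the job started) would be expected to report `ok` if the container itself responds, which would point at the way the step is launched rather than at the container.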

Others have not had this issue, so it could be something specific to our setup, but I have not been able to pinpoint anything. [https://github.com/jenkinsci/docker-workflow-plugin/pull/184#issuecomment-539213785]

The job console output shows no errors and neither does the main Jenkins log under /log/all. There are no errors of any kind while the job is running / hung.
h3. Setup

Jenkins node host: {{Windows Server 2016 (1607)}}

Docker image: {{microsoft/windowsservercore:ltsc2016}}

Happens regardless of whether the {{docker {}}} or {{dockerfile {}}} syntax is used.

This is specifically with declarative pipeline scripts. I have not tested other methods.
{code:java}
pipeline {
    agent {
        docker {
            image 'microsoft/windowsservercore:ltsc2016'
            label 'windows'
        }
    }
    stages {
        stage('Example Build') {
            steps{
                bat "echo test inside"
            }
        }
    }
}{code}

henryborchers@yahoo.com (JIRA)

Oct 22, 2019, 2:47:04 PM
to jenkinsc...@googlegroups.com
Henry Borchers commented on Bug JENKINS-59893
 
Re: bat calls hang in Windows Docker container in declarative pipeline script

I'm also having this problem. However, for me it's hanging on powershell as well.

anonymousaccounts@icloud.com (JIRA)

Oct 22, 2019, 4:33:02 PM
to jenkinsc...@googlegroups.com
a b commented on Bug JENKINS-59893

What host OS and Docker images are you using?

henryborchers@yahoo.com (JIRA)

Oct 22, 2019, 5:58:02 PM
to jenkinsc...@googlegroups.com

a b

Host OS: Windows Server 2019

Docker images tried:

Same experience on every one.

Currently my test pipeline looks like this:

 

pipeline {
    agent any
    stages {
        stage('hello world') {
            parallel {
                // this one doesn't work
                stage('Hello world on Windows') {
                    agent {
                        docker {
                            label 'Windows&&Docker&&aws'
                            image 'mcr.microsoft.com/windows/servercore:ltsc2019'
                        }
                    }
                    options {
                        timeout(1) // abort after 1 minute in case the pipeline hangs
                    }
                    steps {
                        // hangs here
                        powershell 'powershell "cmd /c echo test inside"'
                    }
                }
                // This one works just fine
                stage('Hello world on Linux') {
                    agent {
                        docker {
                            label 'linux&&docker&&aws'
                            image 'alpine:latest'
                        }
                    }
                    options {
                        timeout(1) // abort after 1 minute in case the pipeline hangs
                    }
                    steps {
                        sh 'echo "hello world"'
                    }
                }
            }
        }
    }
}

 

 

anonymousaccounts@icloud.com (JIRA)

Oct 22, 2019, 7:22:02 PM
to jenkinsc...@googlegroups.com
a b edited a comment on Bug JENKINS-59893
Interesting. So a different Windows Server OS and different images than mine. Also interesting that your powershell commands hang while ours do not. Do things work as expected when you manually start or attach to the containers?

Might be irrelevant, but what method did you use to install Docker on the host node? We used this:
{code:java}
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name docker -ProviderName DockerMsftProvider
Restart-Computer -Force{code}
We also had to do these steps (install git-bash and update paths) to get around a {{nohup}} error when running the Jenkins jobs.

[https://stackoverflow.com/questions/45140614/jenkins-pipeline-sh-fail-with-cannot-run-program-nohup-on-windows/53395989#53395989]

I'm curious whether the chosen workaround for nohup has something to do with it in our case. There is a PR to address the root of the nohup issue. Perhaps I will try building that PR and see if it solves the issue. But I would still like to know more about your setup.

[https://github.com/jenkinsci/pipeline-model-definition-plugin/pull/354]
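Since the nohup workaround depends on the Unix tools being resolvable on the agent's PATH, one quick way to rule it in or out is to check that from the account the Jenkins agent runs as. A small sketch in Python, assuming Python is installed on the node; `missing_tools` is a hypothetical helper name, and the default tool list is only a guess at what the workaround needs.

```python
import shutil


def missing_tools(tools=("nohup", "sh")):
    """Return the names in `tools` that cannot be resolved on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("nohup and sh are both resolvable on PATH")
```

Running it both interactively and from a Jenkins job on the same node may help show whether the two environments see different PATH values.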

henryborchers@yahoo.com (JIRA)

Oct 23, 2019, 4:18:03 PM
to jenkinsc...@googlegroups.com

I'm taking a research day, so I'm working from home and haven't tried anything on the Windows server at work.

Anyways, I tried to get a Windows Docker container working in Jenkins on my home Windows 10 machine and it worked.

I'm thinking the server at work must have a configuration incorrectly set. I just wish I could figure out what the heck the problem is. I'm not even sure where to look.

anonymousaccounts@icloud.com (JIRA)

Oct 23, 2019, 6:48:03 PM
to jenkinsc...@googlegroups.com
a b commented on Bug JENKINS-59893

I agree it may be server config related, but it's hard to understand why or where to start down that path. I feel like the nohup thing is a logical place to start, as it has to do with how the plugin interfaces with the container shells.

Did you run into the nohup error initially and, if so, how did you solve it?

josephp90@gmail.com (JIRA)

Oct 28, 2019, 6:12:02 PM
to jenkinsc...@googlegroups.com

I'd highly encourage using Windows Server 2019 because it allows you to pipe your Docker engine.

Our experience with Windows Server 2016 was a HUGE nope.

josephp90@gmail.com (JIRA)

Oct 28, 2019, 6:18:05 PM
to jenkinsc...@googlegroups.com
Joseph Petersen edited a comment on Bug JENKINS-59893
We haven't experienced this.

Our host OS is Windows 2019.

nohup is never called during bat.

Our workaround for nohup when using `sh` is to install git using chocolatey with the GitAndUnixToolsOnPath parameter.

This is the Dockerfile for our Jenkins Windows agents (our Windows base images only have Docker installed).

Today I would use the existing Docker image created in the Jenkins org: [https://github.com/jenkinsci/docker-jnlp-slave/blob/master/Dockerfile-windows]
{code:java}
# escape=`
# The parser directive above makes the backtick the line-continuation character.
FROM mcr.microsoft.com/windows/servercore:ltsc2019

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

RUN Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1')); `
  cinst docker-cli docker-compose adoptopenjdk8jre powershell-preview -y --no-progress; `
  cinst git --params '/GitAndUnixToolsOnPath /SChannel' -y --no-progress

# temporary fix for powershell preview
RUN [Environment]::SetEnvironmentVariable("Path", $env:Path + ";C:\Program Files\PowerShell\7-preview", [EnvironmentVariableTarget]::Machine)

# Adding Jenkins Setup last to avoid rebuild
ADD JenkinsAgentSetup.ps1 .

ENTRYPOINT ["powershell.exe","-executionpolicy", "bypass", "./JenkinsAgentSetup.ps1"]
{code}

This is how we run the agent:
{code:java}
docker run `
  --dns 10.0.0.1 `
  --dns 1.1.1.1 `
  --dns-search company.io `
  --name jenkins.agent `
  -v "$($WORKSPACE):$($JENKINS_WORKSPACE)" `
  -v \\.\pipe\docker_engine:\\.\pipe\docker_engine `
  -e "JENKINS_MASTER_URL=$JENKINS_MASTER_URL" `
  -e "JENKINS_AGENT_NAME=$JENKINS_AGENT_NAME" `
  -e "JENKINS_AGENT_PARAMETERS=$JENKINS_AGENT_PARAMETERS" `
  -e "JENKINS_AGENT_SECRET=$JENKINS_AGENT_SECRET" `
  artifactory.company.io/docker/jenkins.agent.windows:latest
{code}

henryborchers@yahoo.com (JIRA)

Oct 28, 2019, 7:15:02 PM
to jenkinsc...@googlegroups.com

It started working on our end when we upgraded the version of Docker to the latest. 

fabio.heer@dvbern.ch (JIRA)

Oct 30, 2019, 10:46:03 AM
to jenkinsc...@googlegroups.com
Fabio Heer edited a comment on Bug JENKINS-59893
It looks related to JENKINS-59903, which was fixed in Jenkins version 2.201.
Sorry, it was reported, not fixed.

anonymousaccounts@icloud.com (JIRA)

Oct 30, 2019, 4:17:03 PM
to jenkinsc...@googlegroups.com
a b edited a comment on Bug JENKINS-59893
[~henryborchers] Which version are you currently running? We installed fresh just a couple of weeks ago but I can't check the version right now.

[~casz] Our current server is 2016, so we more or less cannot use base images beyond microsoft/windowsservercore:ltsc2016. As per [~henryborchers], a 2019 host and images can exhibit the same issues.

However, for other reasons we need to provision a second server, so we are shooting for 2019. I will try to keep the rest of the setup the same and see if we experience the same issue.

Our use case is different and we are not running Jenkins agents inside the container, but I suppose we could try that Jenkins image you reference if we continue to have problems on 2019. We won't be able to use it on 2016 because it seems to be based on windowsservercore 1809, which is beyond what we can run on 2016 as far as I can tell.


*Edit*: Confirmed... 
{code:java}
>docker pull jenkins/agent:latest-windows
latest-windows: Pulling from jenkins/agent
a Windows version 10.0.17763-based image is incompatible with a 10.0.14393 host{code}
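The error compares Windows build numbers (the third field of each version). As a rough sketch of the compatibility rule Microsoft documents for these Windows Server releases: process isolation needs the image and host builds to match, while Hyper-V isolation lets a host run images of the same or an older build, but not a newer one. The function names below are hypothetical and the rule is simplified (later Windows releases relax the process-isolation requirement), so this is an illustration rather than a definitive check.

```python
import re

# Matches the Docker error text quoted in this thread.
_INCOMPATIBLE_RE = re.compile(
    r"a Windows version (\d+\.\d+\.\d+)-based image "
    r"is incompatible with a (\d+\.\d+\.\d+) host"
)


def build_number(version):
    """Extract the build number, e.g. '10.0.17763' -> 17763."""
    return int(version.split(".")[2])


def can_run(image_version, host_version, isolation="process"):
    """Simplified compatibility rule for Windows container images."""
    image_build = build_number(image_version)
    host_build = build_number(host_version)
    if isolation == "process":
        # Process isolation shares the host kernel, so builds must match.
        return image_build == host_build
    if isolation == "hyperv":
        # Hyper-V isolation can run same-or-older images, never newer ones.
        return image_build <= host_build
    raise ValueError("unknown isolation mode: " + isolation)


def parse_incompatibility(message):
    """Return (image_version, host_version) from the error text, or None."""
    match = _INCOMPATIBLE_RE.search(message)
    return match.groups() if match else None
```

Under this rule a 10.0.17763 (1809) image cannot run on a 10.0.14393 (Server 2016) host in either mode, which is consistent with the pull failure shown.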

jerrywiltse@gmail.com (JIRA)

Nov 3, 2019, 11:30:02 AM
to jenkinsc...@googlegroups.com

If you pass `--isolation=hyperv` you can run images based on any Windows kernel, regardless of what kernel the host is on.

anonymousaccounts@icloud.com (JIRA)

Nov 4, 2019, 3:10:07 PM
to jenkinsc...@googlegroups.com
a b edited a comment on Bug JENKINS-59893
[~solvingj] Does not work for me. At least not when trying to pull a 1903 image on Server 2019 1809. Maybe it only works for backward compatibility, not forward?
{code:java}
>docker build --isolation="hyperv" -t "test_full" -f Dockerfile_1903 .
Sending build context to Docker daemon 13.82kB
Step 1/7 : FROM mcr.microsoft.com/powershell:7.0.0-preview.5-nanoserver-1903
7.0.0-preview.5-nanoserver-1903: Pulling from powershell
a Windows version 10.0.18362-based image is incompatible with a 10.0.17763 host{code}
*Edit*: If I try an older version like _nanoserver-1803_ I get "_The container operating system does not match the host operating system_." on a powershell step without the --isolation flag. When adding the flag I get "_The request is not supported_." on the same step.

jerrywiltse@gmail.com (JIRA)

Nov 4, 2019, 3:10:07 PM
to jenkinsc...@googlegroups.com
jerry wiltse edited a comment on Bug JENKINS-59893
I don't use quotes, but I don't think that's the issue. I think you either need to be on a newer version of Docker, or you need to enable experimental features.

anonymousaccounts@icloud.com (JIRA)

Nov 4, 2019, 3:14:03 PM
to jenkinsc...@googlegroups.com
a b commented on Bug JENKINS-59893

We're on the latest version, 19.03.4, on Server 2019 (1809) now. I just edited my previous comment with more info. I get different results from docker build if I add the --isolation flag. It fails either way, but getting a different result makes me think it's attempting to apply the isolation setting. Either way, I don't think it will solve our issues.

jerrywiltse@gmail.com (JIRA)

Nov 4, 2019, 3:20:02 PM
to jenkinsc...@googlegroups.com

anonymousaccounts@icloud.com (JIRA)

Nov 6, 2019, 1:02:02 PM
to jenkinsc...@googlegroups.com
a b commented on Bug JENKINS-59893

We've now tried this on a Windows Server 2019 host (1809) with a vanilla mcr.microsoft.com/windows/servercore:1809 image and the bat call still hangs. Even worse, when we try a nanoserver image such as mcr.microsoft.com/powershell:nanoserver-1809, it hangs on both powershell and bat, and when the Jenkins job is manually cancelled the server experiences a critical error and reboots itself!

 

Critical: The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

 

josephp90@gmail.com (JIRA)

Nov 8, 2019, 12:08:03 PM
to jenkinsc...@googlegroups.com

anonymousaccounts@icloud.com (JIRA)

Nov 8, 2019, 12:21:03 PM
to jenkinsc...@googlegroups.com
a b commented on Bug JENKINS-59893

That is the method we used for install.

The curious thing is that sometimes the bat calls work. We have some large and complex pipeline scripts in which many or even most of the bat calls work but some still fail, and the hello world example above fails every time.

I suspect that something else in the larger pipelines is inadvertently sidestepping / “fixing” the issue in real time. So there might be some kind of context in which the calls work and another (like the hello world) where they don't. I haven't been able to narrow it down yet.

josephp90@gmail.com (JIRA)

Nov 8, 2019, 12:27:02 PM
to jenkinsc...@googlegroups.com

Wish I could help you, but we haven't experienced the issue.
