[JIRA] (JENKINS-60701) reuseNode should default to true / docker agent stage: Waiting for next available executor


accounts.jenkins.io@terrorise.me.uk (JIRA)

Jan 8, 2020, 1:41:02 PM
to jenkinsc...@googlegroups.com
frankie fisher created an issue
 
Jenkins / Bug JENKINS-60701
reuseNode should default to true / docker agent stage: Waiting for next available executor
Issue Type: Bug
Assignee: Carlos Sanchez
Attachments: jenkins docker agent reproduction.txt
Components: core, docker
Created: 2020-01-08 18:40
Environment: Jenkins 2.190.3
docker pipeline 1.21
docker commons 1.16
Priority: Minor
Reporter: frankie fisher

If you have a multi-stage pipeline with an outer "agent any" specification, and one of its stages runs inside a docker container through an "agent {docker{}}" section, you will deadlock all your executors as soon as you set off as many jobs of this type as you have executors.

This happens because each build keeps holding the executor allocated by its outer agent while its docker stage asks for a second executor. Once you have set off as many jobs of this type as you have executors, every executor is held by an outer agent, so no docker stage is ever given one, and no outer agent ever finishes and frees its own. Deadlock. I suspect this problem is not exclusive to docker agents.

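You can solve this problem by setting "reuseNode true" inside the docker agent specification, which makes the container run on the node the outer agent already holds instead of requesting a new one.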
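
For illustration, a stage-level agent block with this option set might look like the sketch below. The image name is a placeholder chosen for the example and is not taken from the attached pipeline.

```groovy
// Hypothetical stage-level agent block; 'alpine:3.11' is a placeholder image.
agent {
    docker {
        image 'alpine:3.11'
        // Run the container on the node the outer agent already holds,
        // rather than asking the queue for another executor.
        reuseNode true
    }
}
```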

Solution

Because it is so easy to deadlock all executors this way, I think reuseNode=true should be the default value for agents. I don't know enough about Jenkins to understand the possible downsides of doing this, so there may be good reasons to reject this Issue. But it would be a good idea for someone who understands the implications of reuseNode to look at this Issue and make the decision.

Reproduction

To reproduce this issue, create a build slave named "test_executor" with 1 executor. Then set off a single build of the attached minimal Jenkins pipeline, with "reuseNode" unset or set to false. What you should observe is that the first stage completes, then the job gets stuck at the start of the second stage with "Waiting for next available executor". The same thing can happen if you have N executors and set off N instances of a job like this at the same time.

Once you've done this, cancel that build, update the attached pipeline so that "reuseNode" is set to true, and rebuild the job. You should see that the build completes successfully.
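
A pipeline of the shape described above might look roughly like the following. This is only a sketch and not the attached reproduction file: the stage names, the image and the shell command are made up, and it assumes that the node can be selected with the label 'test_executor', that Docker is available on it, and that no other executor is free.

```groovy
// Sketch only, not the attached file. Stage names, image and commands are placeholders.
pipeline {
    // Outer agent: holds the single executor on "test_executor" for the whole build.
    agent { label 'test_executor' }
    stages {
        stage('On outer agent') {
            steps {
                echo 'First stage runs on the outer executor'
            }
        }
        stage('In container') {
            agent {
                docker {
                    image 'alpine:3.11'
                    // false (or omitted): this stage requests another executor
                    // and waits in the queue while the outer agent holds the only one.
                    // true: the container runs on the node the outer agent already holds.
                    reuseNode false
                }
            }
            steps {
                sh 'echo running inside the container'
            }
        }
    }
}
```

Changing the single "reuseNode" line to true is the only difference between the run that gets stuck and the run that completes.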

This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)