[JIRA] [maven-plugin] (JENKINS-26048) Jenkins no longer cleaning up child processes when build stopped - as of 1.587


rddesmond@gmail.com (JIRA)

Jun 19, 2015, 11:25:01 AM
to jenkinsc...@googlegroups.com
Ryan Desmond commented on Bug JENKINS-26048
 
Re: Jenkins no longer cleaning up child processes when build stopped - as of 1.587

I just marked JENKINS-28968 as a duplicate of this. Over there, I have a simple procedure for creating an offending Maven build type job that has this problem. One thing I noticed was that only one of the two processes (the /bin/sh wrapper) is killed after aborting; the Surefire java process keeps running.

Just before aborting:

$ ps aux | grep sure
[user] 4220 0.0 0.0 113120 1188 ? S 10:12 0:00 /bin/sh -c cd /home/ussuser/jenkins/workspace/sleeptest && /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51-2.4.5.5.el7.x86_64/jre/bin/java -jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefirebooter449566822541979931.jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire7684357083774633779tmp /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire_01856690741005869733tmp
[user] 4222 13.0 0.1 6102248 31668 ? Sl 10:12 0:00 java -jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefirebooter449566822541979931.jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire7684357083774633779tmp /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire_01856690741005869733tmp

After aborting:
$ ps aux | grep sure
[user] 4222 5.2 0.1 6102248 31612 ? Sl 10:12 0:00 java -jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefirebooter449566822541979931.jar /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire7684357083774633779tmp /home/[user]/jenkins/workspace/sleeptest/target/surefire/surefire_01856690741005869733tmp

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

rddesmond@gmail.com (JIRA)

Jun 26, 2015, 3:19:01 PM
to jenkinsc...@googlegroups.com

I had a chance to look at it a little more. I don't understand it fully, but I think the big difference is that the process tree is not referenced in the Maven job. As such, we kill the Maven instance itself, but child processes (such as Surefire) aren't stopped. I'm not sure why Maven isn't taking care of that itself: does Jenkins kill -9 Maven?

I think the other question is why Jenkins isn't using the process tree to kill the entire tree in Maven jobs. It seems like it would be easy for a Maven job to spin off a child (like Surefire), which could exec another child.
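
To make the "kill the entire tree" idea concrete, here is a minimal sketch, assuming a Linux host with a POSIX shell and awk. The function name descendants is hypothetical and is not part of Jenkins or Maven; it only shows how every child, grandchild, and so on of one process could be listed from PID/PPID pairs.

```shell
# Hypothetical helper (not Jenkins code): print every descendant of a PID.
# Reads "PID PPID" pairs on stdin, for example from `ps -eo pid=,ppid=`.
descendants() {
    awk -v root="$1" '
        { parent[$1] = $2 }              # remember the parent of each process
        END {
            for (pid in parent) {
                # Walk up the ancestry of pid until we hit root or run out
                # of known parents; the hop limit guards against cycles.
                p = parent[pid]; hops = 0
                while ((p in parent) && p != root && hops++ < 1000)
                    p = parent[p]
                if (p == root) print pid
            }
        }' | sort -n
}
```

It could be used as `ps -eo pid=,ppid= | descendants "$MAVEN_PID"`, where MAVEN_PID is a placeholder for the PID of the Maven process. The walk only works while the parent links are intact: once an intermediate process has exited, its children are typically re-parented to PID 1 and no longer show up under the original root.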

Alexander@Kriegisch.name (JIRA)

Feb 9, 2016, 9:18:05 AM
to jenkinsc...@googlegroups.com

This has been a massive impediment for our team for almost a year now. We have a story branch workflow with 10+ jobs in parallel on average. Every time a build is aborted I have to manually log onto the server via SSH and kill Surefire or Failsafe instances along with their child PhantomJS instances because otherwise we run out of memory quickly.

Please fix the bug and in the meantime offer a workaround for Maven jobs, if possible.

grayaii@gmail.com (JIRA)

Feb 9, 2016, 9:31:07 AM
to jenkinsc...@googlegroups.com

We get bitten by this too. A bunch of our jobs have a pre-build step that basically does this:

ps aux | grep [the-thing-you-want-to-kill] | grep -v grep | awk "{ print \$2 }"

and passes those PIDs to kill -9.

That kills every "zombie/parent-less" process BEFORE the build runs. This ensures the build will run in a relatively clean environment.
Just be careful what you kill

This is only a workaround, but it works for us.
Hope this helps!
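
A variation on this pre-build cleanup, sketched under the assumption of a POSIX shell and awk. The function name matching_pids is made up for illustration. It selects PIDs by command line like the pipeline above, but skips the shell that runs it, so the grep -v grep step is not needed.

```shell
# Hypothetical helper: print PIDs whose command line matches a regex.
# Reads "PID ARGS..." lines on stdin, for example from `ps -eo pid=,args=`.
matching_pids() {
    awk -v pat="$1" -v self="$$" '
        {
            pid = $1
            $1 = ""      # drop the PID field so only the command line is matched
            if (pid != self && $0 ~ pat) print pid
        }'
}
```

For example, `ps -eo pid=,args= | matching_pids 'surefirebooter'` prints candidate PIDs, which can be reviewed before being handed to kill. Sending a plain kill (SIGTERM) first and reserving kill -9 for processes that ignore it gives the targets a chance to clean up.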

Steffen.Breitbach@1und1.de (JIRA)

Feb 9, 2016, 10:24:04 AM
to jenkinsc...@googlegroups.com

After a bit of investigation, we found that all of the processes that remain after a job has been terminated "abnormally" have INIT as their parent process. So if you kill the process (tree) that starts with a java process which has INIT as its parent, you should be safe (and, in my opinion, safer than grepping for strings).

You could, for example, create a cron job for this task.
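
A minimal sketch of the selection step for such a cleanup, assuming Linux, a POSIX shell, and awk. The function name orphans_named is hypothetical. It picks out processes with a given command name whose parent is PID 1.

```shell
# Hypothetical helper: print PIDs of processes named NAME whose parent is
# PID 1 (init). Reads "PID PPID COMM" lines on stdin, for example from
# `ps -eo pid=,ppid=,comm=`.
orphans_named() {
    awk -v name="$1" '$2 == 1 && $3 == name { print $1 }'
}
```

Usage might look like `ps -eo pid=,ppid=,comm= | orphans_named java`. One caveat to check before putting anything like this in cron: a Jenkins master or agent that was started as a daemon may itself be a java process with PID 1 as its parent, so the resulting list should be filtered (for example by user or by command line) before anything is killed. On hosts where orphans are adopted by a process other than PID 1, the test on the parent would need adjusting.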

dbogardus1@yahoo.com (JIRA)

Feb 9, 2016, 11:59:02 AM
to jenkinsc...@googlegroups.com

I've kept my company on Jenkins version 1.586 this whole time. It does not suffer from this bug and cleans up all surefire and phantomjs processes nicely.

mihelich@google.com (JIRA)

Feb 23, 2016, 8:30:02 PM
to jenkinsc...@googlegroups.com

I have a simple repro case with a freestyle project and "Execute shell" build steps. I'm on Jenkins 1.625.3, Linux master, Linux slave.

I hit this because we have freestyle projects where several build processes execute in parallel, but use flock to gate access to a shared resource. Sometimes, when a job is aborted, one or more of these child processes persist and prevent future jobs on the slave from acquiring the shared lock. The flock stuff below isn't necessary to repro (a plain sleep gives similar behavior), but it matches my use case and makes it easy to identify affected processes with fuser.

Create a freestyle project with two build steps:

  1. Execute shell
    #!/bin/bash -ex
    nohup flock /var/lock/mylockfile sleep 1h &
    
  2. Execute shell
    #!/bin/bash -ex
    sleep 1h
    

Then abort the job (manually or by timeout). flock and its child sleep process persist, and continue to hold the lock.

This is the simplest project configuration I could construct. In all of these cases, the child processes are killed as expected:

  • Omitting the second "Execute shell."
  • Combining them into a single "Execute shell."
  • Failing by means other than abort, e.g. /bin/false in the second "Execute shell."

Sample results below. While the job is running, the lock is in use as expected:

$ fuser /var/lock/mylockfile
22733 22734

$ ps -p 22733,22734 -o pid,ppid,stat,lstart,args
  PID  PPID STAT                  STARTED COMMAND
22733     1 S    Wed Feb 24 00:57:51 2016 flock /var/lock/mylockfile sleep 1h
22734 22733 S    Wed Feb 24 00:57:51 2016 sleep 1h

Then abort the job:

[experimental_jenkins_26048] $ /bin/bash -ex /tmp/hudson8042917752397215577.sh
+ nohup flock /var/lock/mylockfile sleep 1h
[experimental_jenkins_26048] $ /bin/bash -ex /tmp/hudson4924658810125221857.sh
+ sleep 1h
Build timed out (after 3 minutes). Marking the build as aborted.
Build was aborted
Finished: ABORTED

Afterwards, the processes are still alive:

$ ps -p 22733,22734 -o pid,ppid,stat,lstart,args
  PID  PPID STAT                  STARTED COMMAND
22733     1 S    Wed Feb 24 00:57:51 2016 flock /var/lock/mylockfile sleep 1h
22734 22733 S    Wed Feb 24 00:57:51 2016 sleep 1h

BUILD_ID is unchanged, so ProcessTreeKiller should find them:

$ strings /proc/22733/environ | grep BUILD_ID
BUILD_ID=17
$ strings /proc/22734/environ | grep BUILD_ID
BUILD_ID=17
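
The manual check above can be generalized into a small scan. This is only a sketch, assuming Linux /proc semantics and, as stated above, that BUILD_ID is the variable used to recognize a build's processes. The function name pids_with_env and its optional third argument are hypothetical; the third argument exists only so the function can be tried against a fake directory tree.

```shell
# Hypothetical helper: list PIDs whose environment contains NAME=VALUE.
# The third argument is the proc root and defaults to /proc (Linux only).
pids_with_env() {
    name=$1 value=$2 proc_root=${3:-/proc}
    for env_file in "$proc_root"/[0-9]*/environ; do
        # Skip processes whose environment we are not allowed to read.
        [ -r "$env_file" ] || continue
        # environ entries are NUL-separated; turn them into lines and
        # require one line to equal NAME=VALUE exactly.
        if tr '\0' '\n' < "$env_file" | grep -qxF -- "$name=$value"; then
            pid=${env_file%/environ}
            echo "${pid##*/}"
        fi
    done
}
```

For the case above, `pids_with_env BUILD_ID 17` would be the equivalent of running the strings and grep check against every readable process.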

mihelich@google.com (JIRA)

Feb 23, 2016, 9:18:01 PM
to jenkinsc...@googlegroups.com
 

Adding core component, since I can repro without using plugins.

sorin.sbarnea@gmail.com (JIRA)

Jun 21, 2016, 3:17:06 PM
to jenkinsc...@googlegroups.com
Sorin Sbarnea commented on Bug JENKINS-26048
 

I faced the same bug with Jenkins 2.9 and some shell jobs started via pipelines. It does not always happen so I don't really know what is causing it.


claudio.curzi@4ts.it (JIRA)

Aug 1, 2016, 4:29:05 AM
to jenkinsc...@googlegroups.com
Claudio Curzi edited a comment on Bug JENKINS-26048
Hi, I have the same problem on my Windows server when a build runs on the slave machine.

Right now I have resolved by running builds criticism on the master.

I use Jenkins 2.11.

claudio.curzi@4ts.it (JIRA)

Aug 1, 2016, 5:17:01 AM
to jenkinsc...@googlegroups.com
Claudio Curzi assigned an issue to Unassigned

jordans@crytek.de (JIRA)

Aug 8, 2016, 12:42:01 PM
to jenkinsc...@googlegroups.com
Jordan Stefanelli edited a comment on Bug JENKINS-26048
I have also seen Jenkins not killing off subprocesses when a job is aborted (regular, multiphase, and the new pipeline plugin jobs), using Jenkins 2.15 on Windows 10 systems.  I'm happy to provide further machine specs and a repro.

My test case was running 'python -c "import time;time.sleep(100000)"'  and aborting/terminating the job (both) from Jenkins.  Using pipeline.

This can leave file handles open and block subsequent attempts to build in the same workspace, causing failures.

Claudio Curzi, what is the "build criticism on master" you are referring to?

claudio.curzi@4ts.it (JIRA)

Aug 9, 2016, 3:32:02 AM
to jenkinsc...@googlegroups.com

I have only builds using "Execute Windows Batch Command".

When any of my builds runs on the slave server and is aborted by the timeout setting (as shown in the log), the process on the slave server continues to run.

ppaczyn@gmail.com (JIRA)

Sep 19, 2016, 10:54:01 AM
to jenkinsc...@googlegroups.com

For us it seems to work fine when jobs are run on master, but processes are not killed when job runs on slave (Maven job type).

claudio.curzi@4ts.it (JIRA)

Sep 19, 2016, 11:07:06 AM
to jenkinsc...@googlegroups.com
Claudio Curzi updated an issue
 
Change By: Claudio Curzi
Environment: CentOS 6, JRE 7 or 8
Windows Server 2008 R2, JRE 7 or 8

jchatham@acuitus.com (JIRA)

Sep 30, 2016, 2:54:02 PM
to jenkinsc...@googlegroups.com
jchatham commented on Bug JENKINS-26048
 

I recently encountered this issue in the context of using the build-timeout plugin to stop a maven build; I don't have an actual fix for the bug in the maven module, but was able to put together some code to make the build timeout plugin kill off appropriate child processes. See my comment on JENKINS-28125 if you think that code might be of use to you.

tib@hms.se (JIRA)

Nov 22, 2016, 8:14:06 AM
to jenkinsc...@googlegroups.com
Timmy Brolin edited a comment on Bug JENKINS-26048
We are seeing this issue.
Linux master, Windows slaves.
No Maven.
Just regular pipeline jobs, starting a background process like this:

bat "start MyProc.exe"

From my understanding the process should be killed when the job is done? But it keeps running.

jglick@cloudbees.com (JIRA)

Jan 24, 2017, 11:48:06 AM
to jenkinsc...@googlegroups.com

Timmy Brolin: currently, Pipeline sh/bat steps kill all processes when the build is interrupted, but not when the main process simply exits on its own. That could be filed under durable-task-plugin. It is unrelated to this issue.

o.v.nenashev@gmail.com (JIRA)

Mar 4, 2018, 6:18:04 PM
to jenkinsc...@googlegroups.com

Apparently the 32-bit OS support was partially broken in WinP: https://github.com/kohsuke/winp/issues/48
Does anybody see the issue on 64-bit systems?


o.v.nenashev@gmail.com (JIRA)

Jan 2, 2019, 5:12:06 AM
to jenkinsc...@googlegroups.com
 

I will have no time to work on it anytime soon; please see https://groups.google.com/d/msg/jenkinsci-dev/uc6NsMoCFQI/AIO4WG1UCwAJ for the context. I will unassign it so that somebody else can work on it.


zolnaczpiotr8@gmail.com (JIRA)

Jul 15, 2019, 2:58:03 AM
to jenkinsc...@googlegroups.com
Piotr Zolnacz commented on Bug JENKINS-26048
 

Hi,

I'm going to fix this bug; it is quite critical for my daily work with Jenkins.

Oleg Nenashev, it seems like it doesn't appear on 64-bit slaves.
