Jenkins Job spawning a thread that creates an ever-growing log file


Jeff Dickerson

Jul 15, 2015, 1:07:08 PM
to jenkins...@googlegroups.com
I have a Jenkins job running some JBehave tests. The job fails, but isn't listed in the build queue any more. Unfortunately, it appears to have spawned a thread that is stuck in an infinite loop. It keeps throwing an InterruptedException, printing a stack trace, and then doing it again. The problem is that it's writing the exception and stack trace to a log file in the /tmp directory on the Jenkins master. The /tmp directory on the master is on its own mount point, but once it fills up, Jenkins can no longer perform any builds, since temporary log files are written there.

The only way to recover from this is to restart the Jenkins service on the master. If I kill the slave, the log file stops growing, but the disk space isn't released, even if the file is deleted, until the Jenkins master restarts.
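
The disk-space symptom is consistent with how POSIX filesystems handle a file that is deleted while a process still has it open: the name disappears, but the blocks are freed only when the last open handle is closed. Whether the Jenkins master is the process holding this particular log open is an assumption here, not something confirmed in the thread. A minimal Java sketch of the general behaviour (the class name and the `rogue-job-` file prefix are made up for illustration):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeletedButOpenFile {

    /**
     * Opens a temp file, deletes it while the stream is still open, and
     * keeps writing. Returns true if the later write succeeded and the
     * path no longer exists. Assumes a POSIX-style filesystem; other
     * platforms may refuse the delete instead.
     */
    static boolean writableAfterDelete() {
        try {
            // Hypothetical file name, standing in for the runaway log.
            Path log = Files.createTempFile("rogue-job-", ".log");
            try (OutputStream out = Files.newOutputStream(log)) {
                out.write("first line\n".getBytes());
                Files.delete(log);                 // removes the directory entry only
                out.write("still writing\n".getBytes());
                out.flush();                       // the open handle keeps the data alive
                return !Files.exists(log);
            }                                      // space is reclaimed once the stream closes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write after delete succeeded: " + writableAfterDelete());
    }
}
```

If that is what is happening, deleting the file cannot free space on its own; the holder has to close it, which a restart of the holding process does as a side effect.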

Both the master and the slave are running on RHEL6. I'm running Jenkins 1.580.3 LTS. I have the logfilesizechecker plugin installed, but it doesn't catch this log.

Has anyone encountered a similar case?

I really don't like the idea that one rogue job can bring down the entire system.

James Nord

Jul 15, 2015, 5:37:19 PM
to jenkins...@googlegroups.com
On 15/07/2015 18:07, Jeff Dickerson wrote:
> The only way to recover from this, is to restart the Jenkins service
> on the master.
Well, depending on what that thread is, you could kill it in the script console.
(Killing threads is considered 7/10 on my evil scale.)
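
Before killing anything, it helps to confirm which thread is stuck. The following is a minimal Java sketch, not a Jenkins API: it scans the threads of whichever JVM it runs in and returns those whose stack contains a given frame. The class name `StuckThreadFinder` and the method `threadsWithFrame` are made up for illustration, and the script console executes Groovy, so this would need adapting there. It also only sees threads in its own JVM, so it would not find a thread living in a separate build process on a slave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class StuckThreadFinder {

    /**
     * Returns the live threads in this JVM whose current stack trace has a
     * frame in a class whose name contains classNameFragment and whose
     * method name equals methodName.
     */
    static List<Thread> threadsWithFrame(String classNameFragment, String methodName) {
        List<Thread> matches = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().contains(classNameFragment)
                        && frame.getMethodName().equals(methodName)) {
                    matches.add(entry.getKey());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Frame names taken from a trace like the one later in this thread.
        for (Thread t : threadsWithFrame("ExecJavaMojo", "joinThread")) {
            System.out.println(t.getName() + " state=" + t.getState());
        }
    }
}
```

This only identifies candidates; what to do with a matching thread is a separate and riskier decision.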


> I really don't like the idea that one rogue job can bring down the
> entire system.

It doesn't sound like it is the job that is the problem but a plugin.
Searching for a similar ticket in the issue tracker
(issues.jenkins-ci.org) and, if none is present, filing a new ticket
against the plugin may help.

I just wanted to check: you're not also running the build on the
master, are you? You are using slaves?

/James

Jeff Dickerson

Jul 16, 2015, 12:12:05 PM
to jenkins...@googlegroups.com
I am not running the build on the master. They run on a separate slave. Thanks for the suggestion to look for a ticket in the issue tracker for the plugin. I'll try that.

Jeff

Jeff Dickerson

Jul 21, 2015, 2:36:51 PM
to jenkins...@googlegroups.com
After a lot of investigation, I now have a way to reliably reproduce this error, and can avoid it. The issue occurs when Jenkins tries to abort the job during JBehave testing. It originally occurred when someone manually aborted a job, but became consistent when I enabled a Jenkins plugin that monitored the build time and aborted the job if it ran too long. Since the Dev/QA team had recently added a bunch of new tests, the job was taking significantly longer, and so Jenkins was trying to abort the job.

The thing is, the job would successfully abort, but JBehave would spin up a thread on the SLAVE that was writing a log file to the /tmp directory on the MASTER, with the following stack trace, over and over, in an infinite loop until the disk filled up. The class/method referenced in the error message is the main JBehave class.

[WARNING] interrupted while joining against thread Thread[com.xxx.xxx.xxx.test.TestOrderCallMain.main(),5,com.com.xxx.xxx.xxx.test.TestOrderCallMain]
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at org.codehaus.mojo.exec.ExecJavaMojo.joinThread(ExecJavaMojo.java:415)
    at org.codehaus.mojo.exec.ExecJavaMojo.joinNonDaemonThreads(ExecJavaMojo.java:405)
    at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:317)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
    at org.jvnet.hudson.maven3.launcher.Maven3Launcher.main(Maven3Launcher.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:329)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:239)
    at org.jvnet.hudson.maven3.agent.Maven3Main.launch(Maven3Main.java:178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at hudson.maven.Maven3Builder.call(Maven3Builder.java:134)
    at hudson.maven.Maven3Builder.call(Maven3Builder.java:69)
    at hudson.remoting.UserRequest.perform(UserRequest.java:121)
    at hudson.remoting.UserRequest.perform(UserRequest.java:49)
    at hudson.remoting.Request$2.run(Request.java:324)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
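
The trace shows `Thread.join` throwing `InterruptedException` inside a method named `joinNonDaemonThreads`. One plausible reading, which is an assumption and not something verified against the plugin source, is that the caller logs the exception, leaves the interrupt flag set, and retries the join while the other thread is still alive, so every retry fails immediately and logs again. The self-contained Java sketch below reproduces that general pattern with made-up names (`InterruptedJoinLoop`, `joinWithRetry`) and a cap on retries so the demo terminates:

```java
import java.util.concurrent.CountDownLatch;

public class InterruptedJoinLoop {

    /**
     * Joins worker, retrying after each interrupt. Restoring the interrupt
     * flag before retrying makes the next join() fail immediately, so the
     * loop spins. maxWarnings exists only to keep this demo finite.
     * Returns the number of times an interrupt was "logged".
     */
    static int joinWithRetry(Thread worker, int maxWarnings) {
        int warnings = 0;
        while (worker.isAlive() && warnings < maxWarnings) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // flag is set again...
                warnings++;                         // ...stand-in for logging a stack trace
            }
        }
        return warnings;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await();                    // stays alive until released
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        worker.start();

        Thread.currentThread().interrupt();         // simulate the build being aborted
        int warnings = joinWithRetry(worker, 5);

        Thread.interrupted();                       // clear the flag so cleanup can block
        release.countDown();
        worker.join();
        System.out.println("warnings logged: " + warnings);
    }
}
```

Without the cap, the loop would run for as long as the worker stays alive, writing one stack trace per iteration, which would match the ever-growing log.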

I've mostly avoided the issue by disabling the Jenkins plugins that automatically abort the job run. There is still risk, however, because users can manually abort runs.