Build hang at start while cleaning up

Allen Cronce

Jun 21, 2012, 10:10:28 AM
to Jenkins Users
Hi all,

One of our nightly builds has started regularly failing due to a hang. It gets stuck at the very beginning of the job while attempting to clean up the workspace. When I look at the console output, it just shows the spinning progress wheel.

The job in question is configured with the "Always check out a fresh copy" SVN checkout strategy. The job runs on a build slave via ssh. The job configuration itself has not changed in a long time, but the SVN project is very large and very active. The only recent Jenkins environment changes have been the addition of some plugins used by other jobs, including, ironically, the "Jenkins Workspace Cleanup Plugin".

There is nothing interesting in the log when the failure happens. I only see this when I manually stop the build after it's been trying to clean up for 6 hours:

--snip--
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at hudson.remoting.Request.call(Request.java:127)
    at hudson.remoting.Channel.call(Channel.java:681)
    at hudson.FilePath.act(FilePath.java:777)
    at hudson.FilePath.act(FilePath.java:770)
    at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:743)
    at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:685)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1197)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:579)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:468)
    at hudson.model.Run.run(Run.java:1410)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:238)
--snip--

Operating under the assumption that an open file might be hanging the build, I tried restarting the build slave, then restarting the build. Same hang. The only workaround has been to manually throw the workspace away on the slave, then start the build.

The problem sounds exactly like a post to this group with the subject "jenkins hanging on build" from August 10, 2011. But I don't see any reply or resolution.

This problem is really killing us because we're in total crunch mode right now and this is the first build in a series that takes nearly 6 hours. Every time the build hangs, we have to restart it manually, which loses valuable testing time.

As a desperate workaround attempt, I'm trying the Jenkins Workspace Cleanup Plugin, which we recently installed but which is not currently used with this job. It's been churning away for 20 minutes with no visible progress at the console level. But at least it seems to be doing something under the hood. I see the java task on the client side at 100+% CPU (it's a multi-core system), and according to fs_usage (it's a Mac) it is actually unlinking files. Maybe someday it will finish...

If anyone else has any experience with this problem or suggestions for a work around, I'd appreciate it.

Best,
--
Allen Cronce

Allen Cronce

Jun 22, 2012, 10:59:03 AM
to Jenkins Users
Hi all,

We ended up working around this problem by using a script to perform the clean up of the workspace. The script takes about 10 seconds to run, as opposed to the estimated hour it used to take the Java svn implementation before it started taking literally forever.
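
The cleanup script itself was not posted. A minimal sketch of the idea, assuming the workspace can simply be deleted and recreated outside of Jenkins (for example from a shell build step); the function name `clean_workspace` and the example path are hypothetical:

```python
import os
import shutil


def clean_workspace(workspace):
    """Delete the whole workspace tree and leave an empty directory behind."""
    if os.path.isdir(workspace):
        # Remove the tree with the local filesystem API instead of going
        # through the SVN library.
        shutil.rmtree(workspace)
    # Recreate the directory so the following checkout step has a target.
    os.makedirs(workspace)
    return workspace


if __name__ == "__main__":
    # Hypothetical path; on a real slave this would be the job's workspace.
    clean_workspace("/tmp/example-workspace")
```

Read-only files or files held open by another process can still make the delete fail, so a real script would likely need extra error handling.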

While we were at it, we decided to move to using a script to perform a platform specific, sparse directory checkout. This is the same script that our developers use to get only the stuff that they need to build on a given platform. No need to have pre-built Windows libraries checked out on a Mac, for example.
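
The checkout script was not posted either. As a rough sketch of how a sparse checkout can be driven with the command-line svn client: check out the top level with `--depth empty`, then pull in only the wanted directories with `svn update --set-depth infinity`. The repository URL, workspace name, and directory names below are made up, and the sketch assumes the wanted directories sit directly under the checkout root:

```python
import subprocess


def sparse_checkout_commands(repo_url, workspace, subdirs):
    """Return the svn command lines for a sparse checkout of `subdirs`."""
    # Check out only the top-level directory entry, with no children.
    commands = [["svn", "checkout", "--depth", "empty", repo_url, workspace]]
    for subdir in subdirs:
        # Pull each wanted top-level directory in at full depth.
        commands.append(
            ["svn", "update", "--set-depth", "infinity", f"{workspace}/{subdir}"]
        )
    return commands


def run_sparse_checkout(repo_url, workspace, subdirs):
    """Run the commands in order, stopping at the first failure."""
    for command in sparse_checkout_commands(repo_url, workspace, subdirs):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical URL and directory names, for illustration only.
    for cmd in sparse_checkout_commands(
        "https://svn.example.com/repo/trunk", "workspace", ["src", "libs-mac"]
    ):
        print(" ".join(cmd))
```

A per-platform list of directories passed as `subdirs` would give the "only what this platform needs" behavior described above.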

In total, moving away from letting Jenkins and the underlying SVN library do the clean up and checkout has shaved 2 hours off of our build time.

What we've lost is the ability to nicely associate svn changes with a given build on Jenkins. To try to address this, I had the build job that's downstream of the cleanup/checkout job do an SVN update of the workspace, but after watching it run for an hour, I gave up and terminated it. The same update of a sparse checkout workspace with the command line svn tool takes 7 seconds.

Honestly, I don't know if the current underlying Java svn lib simply cannot handle updating a sparse directory (which would be a bug), or if it's just so slow that it's unreasonable to use with our source tree. Either way, it looks like we're going to stick with the pragmatic approach of going "native" for portions of our jobs when Jenkins doesn't give us the results we need.

Best,
--
Allen Cronce

Sami Tikka

Jun 23, 2012, 4:51:25 PM
to jenkins...@googlegroups.com
Jenkins usually isn't much slower than the command-line client at checking out source code. I do not use many SVN repositories, but I did have problems with one SVN repo that had a couple of gigabytes of files: Jenkins was running out of memory.

You should check the output of the Jenkins process and/or the system logs on the Manage Jenkins page. If you see out-of-memory exceptions, you can try to increase the relevant memory pools of the JVM. You can also use jconsole to connect to the JVM running Jenkins and monitor how the pools are doing. Google will tell you how you need to start the JVM to allow jconsole to connect to it.

You will also have much faster builds if you configure Jenkins to only update an existing workspace and not check out a fresh copy on every build. If you are concerned that you will not get a clean build otherwise, I believe Jenkins has an option to delete any files not controlled by SVN, which should give you the same result.
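
For comparison, the same "delete everything SVN does not control" idea can be sketched outside Jenkins with the command-line client. This is only an illustration, not how Jenkins implements its option: it assumes `svn status --no-ignore` prints `?` for unversioned and `I` for ignored items, with the path starting at the ninth column as in svn 1.7 output. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def unversioned_paths(status_output):
    """Extract unversioned and ignored paths from `svn status` output."""
    paths = []
    for line in status_output.splitlines():
        # '?' marks unversioned items and 'I' ignored ones. The path is
        # assumed to start at index 8 (svn 1.7 column layout).
        if line[:1] in ("?", "I"):
            paths.append(line[8:])
    return paths


def remove_unversioned(workspace):
    """Delete every file or directory that svn does not track."""
    result = subprocess.run(
        ["svn", "status", "--no-ignore", workspace],
        check=True, capture_output=True, text=True,
    )
    for path in unversioned_paths(result.stdout):
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        elif os.path.lexists(path):
            os.remove(path)
```

After this runs, a plain `svn update` should leave the workspace close to what a fresh checkout would produce, without re-downloading the whole tree.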

Another tip: When Jenkins is hanging or otherwise not doing what it should, go to $JENKINS_URL/threadDump and have a read. If you cannot figure it out, include the thread dump in your question on this list.
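
If the hang happens overnight, it may help to grab the dump from a script. A minimal sketch, assuming the page is reachable without authentication (a secured instance would need credentials added); the host name and output file are made up:

```python
import urllib.request


def thread_dump_url(jenkins_url):
    """Build the thread dump URL from the Jenkins base URL."""
    return jenkins_url.rstrip("/") + "/threadDump"


def save_thread_dump(jenkins_url, out_path, timeout=30):
    """Download the thread dump page and save it to `out_path`."""
    url = thread_dump_url(jenkins_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    with open(out_path, "wb") as out:
        out.write(data)
    return out_path


if __name__ == "__main__":
    # Hypothetical Jenkins URL and output file.
    save_thread_dump("http://jenkins.example.com:8080", "threadDump.html")
```

The saved page is HTML, so it can be opened in a browser or pasted into a message to the list.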

-- Sami