Re: Job is getting stuck before it can even start building

Kohsuke Kawaguchi

Nov 23, 2010, 12:40:25 AM
to hudson...@googlegroups.com, sho...@yahoo.com, geoffrey...@gmail.com
Please see http://wiki.hudson-ci.org/display/HUDSON/Build+is+hanging
and get us the stack trace, so that we can see where the hang is
happening.

2010/11/21 shobhad <sho...@yahoo.com>:
>
> Hello,
>
>   I am facing the following issue of a stuck job execution.
>
>   Once a build is fired, the console output keeps showing the first 2 lines
> of job execution.
>
>   Started by user xyz
>    Building remotely on  slave-machine-name
>
>
>   I cannot even cancel the job; I have to restart Tomcat. That is painful
> since other jobs are building at the time. It happens randomly, on any
> machine and any job, which makes it harder to debug. Jobs that do not use
> the Perforce SCM plugin are also affected.
>
>  This is what I have done so far to debug:
>    1.  Added -XX:MaxPermSize=256m for Tomcat in the Options registry key,
> and added the keys JVMms=218 and JVMmx=512.
>    2.  Upgraded Hudson to version 1.381 after finding out that some
> defects were fixed for the same or similar issues.
>
>   But yesterday the bug crept in again. The Hudson server is on a Windows 7
> machine. The hudson and catalina logs don't show much.
>
> Appreciate the help.
>
> Thanks
> Shobha
>
>
> --
> View this message in context: http://hudson.361315.n4.nabble.com/Job-is-getting-stuck-before-it-can-even-start-building-tp3053144p3053144.html
> Sent from the Hudson users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-un...@hudson.dev.java.net
> For additional commands, e-mail: users...@hudson.dev.java.net
>
>

--
Kohsuke Kawaguchi

ShobhaD

Dec 15, 2010, 11:28:04 PM
to Hudson Users
I thought the problem had gone away after upgrading Hudson and the
Perforce plugin, but it is reappearing.

It happened today, and here is the threadDump for the particular
job/slave that got stuck.

Channel reader thread: slave2
"Channel reader thread: slave2" Id=123 Group=main RUNNABLE (in native)
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
- locked java.io.BufferedInputStream@483fd4
at java.io.FilterInputStream.read(Unknown Source)
at hudson.remoting.BinarySafeStream$1._read(BinarySafeStream.java:149)
at hudson.remoting.BinarySafeStream$1.read(BinarySafeStream.java:80)
at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)

...
...

Executor #0 for slave2 : executing job2-Continuous #1403
"Executor #0 for slave2 : executing job2-Continuous #1403" Id=70 Group=main BLOCKED on hudson.remoting.Channel@d68b39 owned by "Workspace clean-up thread" Id=1419
at hudson.remoting.Request.call(Request.java:100)
- blocked on hudson.remoting.Channel@d68b39
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.FilePath.act(FilePath.java:742)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.mkdirs(FilePath.java:801)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1090)
at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)

From the threadDump, it appears that the executor is blocked, waiting
on the "Workspace clean-up thread". I don't know how to kill that
thread either, as it does not appear in the Task Manager.

Let me know if you need anything more from the threadDump; it is too
large to post here in full.

Thanks
Shobha
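
When reading a dump like the one in this post, the headers worth finding first are the ones that say BLOCKED on a lock and name its owner. Below is a minimal Python sketch that pulls those pairs out of pasted threadDump text. It assumes headers shaped like the ones quoted here ("name" Id=N Group=G STATE on LOCK owned by "owner" Id=M); other Hudson versions may format the page differently, and the names HEADER, blocked_threads and SAMPLE are made up for illustration.

```python
import re

# Header shape assumed from the dump quoted in this post; \s+ also tolerates
# headers that a mail client wrapped across several lines.
HEADER = re.compile(
    r'"(?P<name>[^"]+)"\s+Id=(?P<id>\d+)\s+Group=(?P<group>\S+)\s+'
    r'(?P<state>[A-Z_]+)'
    r'(?:\s+\(in native\))?'
    r'(?:\s+on\s+(?P<lock>\S+)'
    r'(?:\s+owned\s+by\s+"(?P<owner>[^"]+)"\s+Id=(?P<owner_id>\d+))?)?'
)


def blocked_threads(dump_text):
    """Return (thread, lock, owner) for every thread reported as BLOCKED."""
    found = []
    for m in HEADER.finditer(dump_text):
        if m.group("state") == "BLOCKED":
            found.append((m.group("name"), m.group("lock"), m.group("owner")))
    return found


# Two headers excerpted from the dump in this post, wrapped as posted.
SAMPLE = '''
"Channel reader thread: slave2" Id=123 Group=main RUNNABLE (in native)
at java.io.FileInputStream.readBytes(Native Method)

"Executor #0 for slave2 : executing job2-Continuous #1403" Id=70
Group=main BLOCKED on hudson.remoting.Channel@d68b39 owned by
"Workspace clean-up thread" Id=1419
at hudson.remoting.Request.call(Request.java:100)
'''

for thread, lock, owner in blocked_threads(SAMPLE):
    print(f"{thread!r} waits for {lock} held by {owner!r}")
```

The owner named on a BLOCKED header is the next stack trace to look up in the full dump, since that thread is the one holding the lock.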

ShobhaD

Feb 7, 2011, 11:28:00 PM
to Jenkins Users, itssh...@gmail.com
This issue is still not resolved for me.
I have to restart Tomcat each time it happens. I have now started
Tomcat in a command window.
This is the first time I have deployed Hudson/Tomcat on a Windows
(Windows 7 32-bit) machine. I never had this issue with my
previous Linux-based Hudson setups.

I noticed in the command window that Hudson is stuck at node
monitoring whenever a job is stuck while building (only the first two
lines, up to "Building remotely on <slave x>", are displayed).
I have gathered the console output of Tomcat (started with
Tomcat6.exe //TS//Tomcat6) from when this happens:

Feb 8, 2011 9:20:37 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:20:37 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Disk Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:20:37 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Temp Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:20:37 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Architecture monitoring activity still in progress. Interrupting
Feb 8, 2011 9:20:37 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Clock Difference monitoring activity still in progress. Interrupting
Feb 8, 2011 9:31:57 AM org.apache.coyote.http11.Http11AprProtocol pause
INFO: Pausing Coyote HTTP/1.1 on http-8080
<<-- cancelled execution at this point
Feb 8, 2011 9:31:57 AM org.apache.coyote.ajp.AjpAprProtocol pause
INFO: Pausing Coyote AJP/1.3 on ajp-8009
Feb 8, 2011 9:31:58 AM org.apache.catalina.core.StandardService stop
INFO: Stopping service Catalina
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Response Time monitoring activity still in progress. Interrupting
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Disk Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Temp Space monitoring activity still in progress. Interrupting
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Architecture monitoring activity still in progress. Interrupting
Feb 8, 2011 9:32:24 AM hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Clock Difference monitoring activity still in progress. Interrupting
Feb 8, 2011 9:33:27 AM hudson.slaves.SlaveComputer tryReconnect
INFO: Attempting to reconnect <slave x..>
Feb 8, 2011 9:33:31 AM com.youdevise.hudson.slavestatus.SlaveListenerInitiator onOnline
INFO: Starting slave-status listener on <slave x>
Feb 8, 2011 9:33:31 AM hudson.slaves.CommandLauncher launch
INFO: slave agent launched for <slave x>
Feb 8, 2011 9:36:59 AM org.apache.catalina.startup.Catalina stopServer
SEVERE: Catalina.stop:
java.net.ConnectException: Connection refused: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(Unknown Source)
at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:408)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:338)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:416)

Also, I get a message on the Dashboard under Manage Hudson that says
"there are too many SCM polling threads running than can be
handled".

Appreciate any help.

Thanks
Shobha
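
To see how often each node monitor reports that its previous run never finished, the warnings in a saved console log can be tallied. The following is a rough Python sketch, assuming the message wording shown in the console output in this post; the names STALLED, stalled_monitors and SAMPLE_LOG are hypothetical, and the sample is only an excerpt of that output.

```python
import re
from collections import Counter

# Message wording copied from the console output in this post; \s+ tolerates
# the line wraps introduced when the log was pasted into mail.
STALLED = re.compile(
    r"Previous\s+(?P<monitor>[\w ]+?)\s+monitoring\s+activity"
    r"\s+still\s+in\s+progress"
)


def stalled_monitors(log_text):
    """Count 'still in progress' warnings per node monitor name."""
    counts = Counter()
    for m in STALLED.finditer(log_text):
        counts[m.group("monitor")] += 1
    return counts


# Three records excerpted from the log above, wrapped as posted.
SAMPLE_LOG = """\
Feb 8, 2011 9:20:37 AM
hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in
progress. Interrupting
Feb 8, 2011 9:32:24 AM
hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Response Time monitoring activity still in progress.
Interrupting
Feb 8, 2011 9:32:24 AM
hudson.node_monitors.AbstractNodeMonitorDescriptor$Record <init>
WARNING: Previous Free Swap Space monitoring activity still in
progress. Interrupting
"""

for monitor, count in stalled_monitors(SAMPLE_LOG).most_common():
    print(f"{monitor}: {count}")
```

The tally only shows which monitors keep overrunning; it does not say why, so it is best read next to a thread dump taken at the same time.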

Mark Waite

Feb 8, 2011, 12:30:57 AM
to jenkins...@googlegroups.com
I wonder if I'm seeing the same condition you are, even though I'm not
running Tomcat. My builds become "stuck", and there may be a correlation
between the stuck builds and the Windows slave that hosts one of them.

The Hudson master on Linux consumes 100% of the CPU and the build on Windows
makes no further progress. A request to view console output for the "stuck"
Windows job from the web interface never returns.

Can you tell me how you generated the thread dump on Windows, so that I can
generate a similar one?

Mark Waite

ShobhaD

Feb 8, 2011, 2:01:11 AM
to Jenkins Users
Open http://<url to your hudson server>/hudson/threadDump in a browser.
It gives a summary of what is happening in Hudson. In my case it shows that
the SCM polling thread is stuck and the build execution is hung, but I am
not sure why.

Please update this posting with your findings as well.


Thanks
Shobha
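
For anyone who wants to save that page each time a build hangs, here is a small Python sketch. The /threadDump path comes from this post; everything else (the function names, the timeout, the output file) is an assumption for illustration. It also assumes the page is reachable without logging in, which may not hold if security is enabled on the server.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def thread_dump_url(base_url):
    """Build the threadDump URL from the server's root URL."""
    # Ensure a trailing slash so urljoin appends to the path instead of
    # replacing its last segment (".../hudson" -> ".../hudson/threadDump").
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "threadDump")


def save_thread_dump(base_url, out_path, timeout=30):
    """Download the threadDump page and write it to out_path unchanged."""
    # No authentication is attempted here; that is an assumption about the
    # server's configuration, not something this sketch handles.
    with urlopen(thread_dump_url(base_url), timeout=timeout) as response:
        body = response.read()
    with open(out_path, "wb") as out:
        out.write(body)
    return out_path
```

A call such as save_thread_dump("http://<url to your hudson server>/hudson", "threadDump.html") would keep a copy of the page for comparing hangs over time.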

Wayne Fay

Feb 8, 2011, 2:03:58 PM
to jenkins...@googlegroups.com
> Can you tell me how you generated the thread dump on Windows, and I can perform
> a similar thread dump?

Get the PID from Task Manager (or ps -ef on *nix) and run "jstack
<pid>" on the command line. This prints the stack traces of any Java
program, on Windows and other OSes alike.

Wayne
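
The jstack step can also be scripted so that a dump is saved to a file whenever a hang is noticed. This is a minimal Python sketch under a few assumptions: the JDK's jstack is on PATH, the script runs as the same user as the target JVM (a Tomcat running as a Windows service under another account may refuse the connection), and the function names are hypothetical.

```python
import subprocess


def jstack_command(pid):
    """Build the argument list for `jstack <pid>`."""
    # int() rejects anything that is not a plain process id.
    return ["jstack", str(int(pid))]


def capture_stack(pid, out_path):
    """Run jstack against a running JVM and save its output to out_path."""
    # check=True raises CalledProcessError if jstack exits with an error,
    # for example when it cannot attach to the process.
    result = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    )
    with open(out_path, "w") as out:
        out.write(result.stdout)
    return out_path
```

With the PID read from Task Manager, capture_stack(<pid>, "stack.txt") writes the same text that jstack would print to the console.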
