Hanging Master

Shanti Subramanyam (gmail)

Feb 20, 2011, 10:21:26 PM
to faban...@googlegroups.com
I've run into this issue several times now. Here's the scenario.
I have a Driver that throws ConfigurationException if something is incorrectly specified in the config file. This is pretty standard.
What I'm noticing is that when the driver throws this exception, the run doesn't exit.

I do see the following in the log:
TestAgent[0].0: Error initializing driver object.

and the stack trace shows the exception correctly. 

But after this, the driver continues and I see the following messages:
INFO:       Ramp up started
WARNING: TestAgent[0]: Killing benchmark run
INFO:       TestAgent[0]: Performing busy timer check
...

INFO:       Steady state completed


So, even though there is a WARNING message that it is killing the benchmark run, it actually doesn't do so. The run proceeds and I get to Steady state completed. At this point everything hangs. I have to manually find and kill the MasterImpl java process. 

I took a jstack dump before killing the process, and the main thread shows this (let me know if you want the full jstack output and I can send it):
"main" prio=5 tid=103000800 nid=0x100501000 waiting on condition [100500000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <10987a460> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:207)
        at com.sun.faban.driver.engine.AgentImpl.join(AgentImpl.java:589)
        at com.sun.faban.driver.engine.MasterImpl.executeRun(MasterImpl.java:788)
        at com.sun.faban.driver.engine.MasterImpl.runBenchmark(MasterImpl.java:276)
        at com.sun.faban.driver.engine.MasterImpl.main(MasterImpl.java:1567)

Larry1234

Feb 21, 2011, 8:23:06 AM
to faban-users
I have this issue too. It's never been that big of a deal, but I'd
gladly test any fix.

On Feb 20, 9:21 pm, "Shanti Subramanyam (gmail)"
<shanti.subraman...@gmail.com> wrote:

Akara Sucharitakul

Feb 22, 2011, 2:47:49 AM
to faban...@googlegroups.com, Shanti Subramanyam (gmail)
I see. AgentImpl seems to be waiting on a latch that never counts down. We need to see what the other threads are up to and which code path causes these threads to miss the countdown.

-Akara
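
[Illustration] The failure mode described above, a thread that hits an error during initialization and returns without counting down the latch its parent is waiting on, can be sketched in a few lines of plain Java. This is not Faban's code: LatchSketch, initDriver and run are hypothetical names, and the 500 ms timeout exists only so the demo terminates instead of hanging the way the real run does. The sketch contrasts a path that skips countDown() on error with one that counts down in a finally block.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Illustrative only: hypothetical names, not Faban's actual classes. */
final class LatchSketch {

    /** Stand-in for driver initialization that may reject its configuration. */
    static void initDriver(boolean fail) {
        if (fail) {
            throw new IllegalStateException("bad config");
        }
    }

    /**
     * Starts `workers` threads; the first one fails during initialization.
     * Returns true if every worker counted down before the timeout expired.
     */
    static boolean run(int workers, boolean countDownInFinally) {
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            final boolean fail = (i == 0);
            Thread t = new Thread(() -> {
                if (countDownInFinally) {
                    try {
                        initDriver(fail);
                    } catch (RuntimeException e) {
                        // A real agent would log the error here.
                    } finally {
                        done.countDown(); // always reached, even on failure
                    }
                } else {
                    try {
                        initDriver(fail);
                    } catch (RuntimeException e) {
                        return; // bug: leaves without counting down
                    }
                    done.countDown();
                }
            });
            t.setDaemon(true);
            t.start();
        }
        try {
            // Bounded wait so the sketch itself cannot hang.
            return done.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("countDown skipped on error: " + run(3, false));
        System.out.println("countDown in finally:       " + run(3, true));
    }
}
```

With an unbounded await(), as in the "main" stack trace earlier in the thread, the first variant would park forever. Whether the actual fix belongs in a finally block or elsewhere in the agent depends on the real code path, which this thread has not yet identified.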

Larry Sanders

Mar 1, 2011, 4:08:13 PM
to faban...@googlegroups.com
On Tue, Feb 22, 2011 at 1:47 AM, Akara Sucharitakul
<akara.suc...@gmail.com> wrote:
> I see. AgentImpl seems to be waiting on a latch that never counts down. Need
> to see the other threads what they're up to and what code path causes these
> thread to miss the countdown.
> -Akara


C:\Program Files\Java\jdk1.6.0_16\bin>jstack.exe 13980
2011-03-01 15:06:33
Full thread dump Java HotSpot(TM) Client VM (14.2-b01 mixed mode, sharing):

"DestroyJavaVM" prio=6 tid=0x002b7000 nid=0xc44 waiting on condition [0x00000000]
java.lang.Thread.State: RUNNABLE

"RMI Reaper" prio=6 tid=0x02e71800 nid=0x3594 in Object.wait() [0x0320f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22a822d0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x22a822d0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at sun.rmi.transport.ObjectTable$Reaper.run(ObjectTable.java:333)
at java.lang.Thread.run(Thread.java:619)

"RMI TCP Accept-0" daemon prio=6 tid=0x02e73400 nid=0x61c runnable [0x031bf000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
- locked <0x22a823a8> (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:453)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
at java.lang.Thread.run(Thread.java:619)

"GC Daemon" daemon prio=2 tid=0x02e5c400 nid=0x3fa8 in Object.wait() [0x0316f000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22ee6a28> (a sun.misc.GC$LatencyLock)
at sun.misc.GC$Daemon.run(GC.java:100)
- locked <0x22ee6a28> (a sun.misc.GC$LatencyLock)

"RMI RenewClean-[172.18.120.221:4225]" daemon prio=6 tid=0x02e5f000 nid=0x34a4 in Object.wait() [0x0311f000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22ee6678> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x22ee6678> (a java.lang.ref.ReferenceQueue$Lock)
at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(DGCClient.java:516)
at java.lang.Thread.run(Thread.java:619)

"RMI Scheduler(0)" daemon prio=6 tid=0x02e5d800 nid=0x1564 waiting on condition [0x030cf000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x22ed6448> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)

"Low Memory Detector" daemon prio=6 tid=0x02ad9000 nid=0x1850 runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x02ad6800 nid=0x3740 waiting on condition [0x00000000]
java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x02ad1800 nid=0xe8 waiting on condition [0x00000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x02ad0400 nid=0x3c00 runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x02a8e800 nid=0xeb8 in Object.wait() [0x02c5f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22eac2c8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x22eac2c8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x02a8d000 nid=0x120c in Object.wait() [0x02c0f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22eac350> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x22eac350> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x02a8b800 nid=0x109c runnable

"VM Periodic Task Thread" prio=10 tid=0x02adb400 nid=0x6ec waiting on condition

JNI global references: 918
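
[Illustration] In the dump above, the only Java threads not marked "daemon" are "DestroyJavaVM" and "RMI Reaper", and there is no "main" thread. A "DestroyJavaVM" thread normally appears once main() has returned and the JVM is waiting for the remaining non-daemon threads to finish, so in this instance the process may be held open by something RMI-related rather than by a parked main thread. That reading is an inference from the dump, not something confirmed in this thread. When jstack is inconvenient, the same question ("which non-daemon threads are still alive?") can be asked from inside the process. A minimal sketch, using only the standard Thread API; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative helper, not part of Faban. */
final class NonDaemonThreads {

    /** Returns the sorted names of live threads that are not daemon threads. */
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        // A short-lived non-daemon thread so the listing has something to show.
        Thread keeper = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
                // Nothing to clean up in this sketch.
            }
        }, "keeper");
        keeper.start();
        System.out.println(liveNonDaemonThreadNames());
        keeper.join();
    }
}
```

Any thread in that list other than the caller is a candidate for what keeps the JVM from exiting.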
