Webserver executor info refresh times out, causing "Reached handleNoExecutorSelectedCase stage"

qingz...@gmail.com

May 24, 2016, 11:11:49 PM
to azkaban
Hi,

I ran into a problem. After I restarted the three executors and then restarted the webserver, the webserver timed out while refreshing executor info and then reached the handleNoExecutorSelectedCase stage.

Logs:

2016/05/17 11:20:37.685 +0800 ERROR [ExecutorManager] [Azkaban] Timed out while waiting for ExecutorInfo refreshclassa-bigdata0.server.163.org:12321 (id: 1)
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at azkaban.executor.ExecutorManager.refreshExecutors(ExecutorManager.java:259)
        at azkaban.executor.ExecutorManager.access$1500(ExecutorManager.java:65)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.processQueuedFlows(ExecutorManager.java:1929)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.run(ExecutorManager.java:1897)
2016/05/17 11:20:42.686 +0800 ERROR [ExecutorManager] [Azkaban] Timed out while waiting for ExecutorInfo refreshhzabj-bigdata4.server.163.org:12321 (id: 2)
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at azkaban.executor.ExecutorManager.refreshExecutors(ExecutorManager.java:259)
        at azkaban.executor.ExecutorManager.access$1500(ExecutorManager.java:65)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.processQueuedFlows(ExecutorManager.java:1929)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.run(ExecutorManager.java:1897)
2016/05/17 11:20:47.687 +0800 ERROR [ExecutorManager] [Azkaban] Timed out while waiting for ExecutorInfo refreshhzabj-bigdata5.server.163.org:12321 (id: 3)
java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at azkaban.executor.ExecutorManager.refreshExecutors(ExecutorManager.java:259)
        at azkaban.executor.ExecutorManager.access$1500(ExecutorManager.java:65)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.processQueuedFlows(ExecutorManager.java:1929)
        at azkaban.executor.ExecutorManager$QueueProcessorThread.run(ExecutorManager.java:1897)
2016/05/17 11:20:47.687 +0800 INFO [ExecutorManager] [Azkaban] Using dispatcher for execution id :60691
2016/05/17 11:20:47.688 +0800 INFO [ExecutorManager] [Azkaban] Reached handleNoExecutorSelectedCase stage for exec 60691 with error count 0


The logs show all three executors timing out. However, when I invoked the refresh call manually with curl, only executor 1 timed out; the other two responded fine.
After I shut down executor 1, the webserver could fetch info from the other two executors.

My question: it seems that one executor's refresh timeout causes the refresh of the other executors to time out as well. And why did that executor time out in the first place? (I forgot to dump the stack...)
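
To illustrate one way a single hung executor could make every refresh look timed out, here is a minimal sketch. It is not Azkaban's code: the class `RefreshTimeoutSketch`, the method `refreshAll`, the pool sizes and the 200 ms timeout are all hypothetical. It assumes the refresh tasks go through a shared thread pool and that the results are then awaited one after another with a fixed timeout, which would match the timestamps in the log above being 5 seconds apart. Whether the webserver's pool was actually exhausted in this case is an assumption I have not verified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RefreshTimeoutSketch {

    // Hypothetical stand-in for a refresh loop (NOT Azkaban's implementation).
    // Executor 1 never answers; executors 2 and 3 answer immediately.
    public static List<String> refreshAll(int poolSize, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch neverReleased = new CountDownLatch(1);

        // Submit one refresh task per executor.
        List<Future<String>> futures = new ArrayList<>();
        for (int id = 1; id <= 3; id++) {
            final int executorId = id;
            futures.add(pool.submit(() -> {
                if (executorId == 1) {
                    neverReleased.await(); // simulates an executor that hangs
                }
                return "ok";
            }));
        }

        // Wait for each result in turn, with the same timeout for each.
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            try {
                results.add(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
            } catch (TimeoutException e) {
                results.add("timeout");
            } catch (ExecutionException e) {
                results.add("error");
            }
        }

        pool.shutdownNow(); // interrupts the hung task so the JVM can exit
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        // One worker thread: the hung task blocks the queue behind it.
        System.out.println("pool=1: " + refreshAll(1, 200));
        // Three worker threads: only the hung executor times out.
        System.out.println("pool=3: " + refreshAll(3, 200));
    }
}
```

With a single worker thread, the tasks for executors 2 and 3 never start because the hung task occupies the only thread, so all three waits time out. With three worker threads only executor 1 times out. If something similar happened here, a thread dump of the webserver taken during the timeouts should show the refresh threads blocked on the call to executor 1.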
                                                                                                      
qingzhao

