Cascading waits for a long time even when all Hadoop jobs are stopped


Pushpender Garg

Apr 17, 2015, 11:43:07 AM
to cascadi...@googlegroups.com
I'm not sure where I am going wrong. We are using CDH 5.2.
We have a use case where, if any flow step fails, we want to abort the entire flow. So we used a FlowStepListener, and on a throwable we call flow.stop(). This kills all the Hadoop jobs (the rest of the flow steps), but even after they are killed the flow waits for 5 minutes after printing the message
"shutting down job executor"
I checked the code and found that BaseFlow calls this method:

  protected void handleExecutorShutdown()
    {
    if( spawnStrategy.isCompleted( this ) )
      return;

    logInfo( "shutting down job executor" );

    try
      {
      spawnStrategy.complete( this, 5 * 60, TimeUnit.SECONDS );
      }
    catch( InterruptedException exception )
      {
      // ignore
      }

    logInfo( "shutdown complete" );
    }
This spawnStrategy.complete call then times out after 5 minutes, which is the duration passed in. On checking further, it calls ThreadPoolExecutor.awaitTermination.

I'm not sure why this method is still waiting for termination when all the Hadoop jobs have already been killed remotely by calling flow.stop().
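
For reference, the waiting behaviour described above can be reproduced with plain java.util.concurrent, independent of Cascading. The sketch below is not Cascading code: the class and method names are made up for illustration, and a latch stands in for whatever a worker thread might still be blocked on. It shows that awaitTermination waits for the pool's own threads to finish, so it runs to the full timeout as long as one submitted task has not returned, whatever has happened to the remote work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class, not part of Cascading.
class AwaitTerminationSketch {

  // Stands in for a worker that is still blocked on some call.
  static void blockUntil(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Returns true if the pool terminated within the timeout, false otherwise.
  static boolean shutdownAndWait(ExecutorService pool, long timeout, TimeUnit unit) {
    pool.shutdown(); // rejects new tasks; running tasks are not interrupted
    try {
      return pool.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    CountDownLatch release = new CountDownLatch(1);

    // One task that has not returned yet.
    pool.execute(() -> blockUntil(release));

    // The wait runs to the timeout because the task is still blocked.
    boolean first = shutdownAndWait(pool, 200, TimeUnit.MILLISECONDS);
    System.out.println("terminated within timeout: " + first);

    // Once the task returns, the pool terminates promptly.
    release.countDown();
    boolean second = shutdownAndWait(pool, 5, TimeUnit.SECONDS);
    System.out.println("terminated after release: " + second);
  }
}
```

If the same thing is happening here, a thread dump (for example with jstack) taken during the 5 minute wait should show which call the executor's threads are still blocked in.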

Andre Kelpe

Apr 17, 2015, 2:07:33 PM
to cascadi...@googlegroups.com
It is possible that the API calls towards the RM have not returned
yet. Technically CDH 5.2 has not been verified to work with Cascading,
only 5.3 has: http://www.cascading.org/support/compatibility/ . Would
you mind running the compat tests to see if all tests pass?
https://github.com/Cascading/cascading.compatibility

- André

Pushpender Garg

Apr 18, 2015, 6:40:32 AM
to cascadi...@googlegroups.com
Possible. Is there any way I can check that?
Sorry, I actually have CDH 5.1 and Cascading 2.5.3.

Pushpender Garg

Apr 18, 2015, 7:07:53 AM
to cascadi...@googlegroups.com
One more thing: I tried using the Hadoop API to kill the jobs inside the flow step listener instead of calling flow.stop(), and that works fine, as it then doesn't wait for the timeout.

Andre Kelpe

Apr 20, 2015, 5:17:18 AM
to cascadi...@googlegroups.com
Yes, you can check by running the compat tests:
https://github.com/Cascading/cascading.compatibility
It is all explained in the README and should be straightforward.

- André
