Jonathan Herzig
Mar 17, 2011, 11:09:37 AM
to Jaql Users
Hi,
I'm running Jaql on a cluster of 6 machines.
When I run my jobs on small data, they run smoothly.
However, when I use larger data (~4 GB), the following occurs:
I can see that a lot of tasks that had already completed go back to the
"pending" state.
When this happens, I get exceptions that look like:
"Task attempt_201103161639_0002_m_000000_0 failed to report status for
607 seconds. Killing!"
Most of the time the cluster gets stuck after a while, apparently because
it runs out of memory, and has to be restarted.
What could be wrong?
Are there any parameters I should change?
Thanks,
Jonathan