Monitor Gearman Job's Status


Edwin Fuquen

Oct 3, 2012, 12:17:19 PM
to Gearman
This is a specific question about a general problem: monitoring
specific jobs in Gearman. Most previous threads I've seen discuss
monitoring from a very coarse-grained perspective (number of jobs in
queues, number of workers on queues, etc.), using the information available via the
workers and status commands. We have a use case where we would like
to monitor some long running jobs to see their status. Unfortunately
it seems that the gearman server doesn't provide much (if any) help in
introspecting on what jobs are queued or running on the server.

Looking at the different status calls in the binary protocol,
specifically GET_STATUS and WORK_STATUS, the primary use case seems to
be to have status communication from the worker to the single gearman
client that submitted a background job. The relevant excerpt is from
the WORK_STATUS doc, quoted below:

"The job server should store this information so a client who issued a
background command may retrieve it later with GET_STATUS request."

What we want is to be able to monitor all long running background jobs
on a queue (they would be doing video transcoding) to see their
status. From the docs you would need a job handle to do the
GET_STATUS call, but there doesn't seem to be a way to query the
gearman server to get any job handles, for the entire server or for a
specific queue. The only option I see available would be to store the
job handles somewhere ourselves every time a job is created and then
query the gearman servers with GET_STATUS calls. This seems
somewhat wasteful considering that the gearman server itself has
knowledge of all the jobs it's running and we would rather not have to
keep track of all the handles ourselves when gearman is already doing
it for us.
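
For concreteness, a rough sketch of what that workaround would look like
with the pecl gearman extension ($handleStore here is hypothetical, standing
in for whatever shared storage we would use):

$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

// Submit a long-running background transcode and remember its handle.
$handle = $client->doBackground('transcode', json_encode(array('video_id' => 42)));
$handleStore->add($handle);

// Later, from a monitoring script, poll every handle we know about.
foreach ($handleStore->all() as $handle) {
    // jobStatus() issues GET_STATUS: array(known, running, numerator, denominator)
    list($known, $running, $num, $den) = $client->jobStatus($handle);
    if (!$known) {
        $handleStore->remove($handle);   // finished, or the server forgot it
    } elseif ($running && $den > 0) {
        printf("%s: %d%%\n", $handle, 100 * $num / $den);
    }
}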

Has anyone else had this use case? Is there a better solution with
the gearman-server? If the feature doesn't exist can we get it in
there? It would seem like it would be something easy to add.

Thanks!

Clint Byrum

Oct 3, 2012, 1:00:19 PM
to gearman
Excerpts from Edwin Fuquen's message of 2012-10-03 09:17:19 -0700:
Right, querying for all the handles would not scale well, as the server would
have to lock the structure to make a copy and then slam it out the network port.

Gearmand is kept simple to reduce the impact and chance of failure. The
more you can push off to workers and clients, the more distributed and
scalable your system becomes.

For this problem, you can run lots of workers which keep track; there
is no need to persist anything unless you really are super paranoid
about losing track.

In pseudo php:

// AggregateStatusThing is a stand-in for whatever shared status store
// you want to report into (memcached, redis, a DB table, ...).
$statusObject = new AggregateStatusThing('memcached_server+key_or_something');

$client = new GearmanClient();
$client->addServer();
$worker = new GearmanWorker();
$worker->addServer();

// Called whenever a real transcode worker sends WORK_STATUS.
function record_status($task) {
    global $statusObject;
    $statusObject->updateReport($task->jobHandle(),
                                $task->taskNumerator(),
                                $task->taskDenominator());
}
$client->setStatusCallback('record_status');

// Called when a real transcode worker completes a job.
function record_complete($task) {
    global $statusObject;
    $statusObject->finishJob($task->jobHandle());
}
$client->setCompleteCallback('record_complete');

// This tracker worker accepts 'transcode' jobs and re-submits them to
// the real workers as '_real_transcode', so it holds all the handles.
function marshall_work($job) {
    global $client;
    $client->addTask('_real_transcode', $job->workload());
}
$worker->addFunction('transcode', 'marshall_work');

// Better to use NON_BLOCKING mode here so everything can
// run at one time, but I think you get the point
while ($worker->work()) {
    $client->runTasks();
}

Worst case, one of these workers dies and you lose track of the work it
was tracking (the work will still run, you just won't have insight into
completion). You can mitigate the impact of these failures by running
many of these tracker workers.

>
> Has anyone else had this use case? Is there a better solution with
> the gearman-server? If the feature doesn't exist can we get it in
> there? It would seem like it would be something easy to add.
>

I'd rather see the above included as a helper worker to run in many
places than see it built into gearmand.

Brian Aker

Oct 7, 2012, 3:31:08 AM
to gea...@googlegroups.com
Hi,

On Oct 3, 2012, at 9:17 AM, Edwin Fuquen <efu...@gmail.com> wrote:

> workers and status commands. We have a use case where we would like
> to monitor some long running jobs to see their status. Unfortunately
> it seems that the gearman server doesn't provide much (if any) help in
> introspecting on what jobs are queued or running on the server.

If you know the jobs, you can just queue up some requests to find the state.

You can always find out what the state of all jobs is, but the interface is very primitive (this is how gearadmin works). It is likely to change/be formalized at some point in the near future.
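
For what it's worth, gearadmin just speaks gearmand's plain-text admin protocol on the same port as the binary protocol, so you can pull the same numbers yourself. A minimal PHP sketch (host and port are assumptions, and note this only returns per-function totals, not individual job handles):

$fp = fsockopen('127.0.0.1', 4730, $errno, $errstr, 5);
fwrite($fp, "status\n");     // one line per function: NAME\tTOTAL\tRUNNING\tAVAILABLE_WORKERS
while (($line = fgets($fp)) !== false) {
    $line = rtrim($line, "\r\n");
    if ($line === '.') {     // a lone dot terminates the response
        break;
    }
    echo $line, "\n";
}
fclose($fp);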

> "The job server should store this information so a client who issued a
> background command may retrieve it later with GET_STATUS request."

Right now the only way to know about status once the job has completely finished is by mining the logs, which you can just hook a worker up to do (look online for "distributed grep" to get an idea of how this works).
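
If you want to go that route, here is a rough sketch of such a worker (the function name and log path are just assumptions for illustration):

$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);

// Run one of these on each gearmand host; it greps the local log for a
// pattern (e.g. a job handle) and returns the matching lines.
$worker->addFunction('grep_gearmand_log', function (GearmanJob $job) {
    $pattern = $job->workload();            // e.g. "H:somehost:1234"
    $logFile = '/var/log/gearmand.log';     // adjust to your setup
    $matches = array();
    $fh = fopen($logFile, 'r');
    while (($line = fgets($fh)) !== false) {
        if (strpos($line, $pattern) !== false) {
            $matches[] = rtrim($line);
        }
    }
    fclose($fh);
    return implode("\n", $matches);
});

while ($worker->work()) {
    // keep serving grep requests
}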

> status. From the docs you would need a job handle to do the
> GET_STATUS call, but there doesn't seem to be a way to query the
> gearman server to get any job handles, for the entire server or for a
> specific queue. The only option I see available would be to store the

Look at how gearadmin gets its data.

> somewhat wasteful considering that the gearman server itself has
> knowledge of all the jobs it's running and we would rather not have to
> keep track of all the handles ourselves when gearman is already doing
> it for us.

You can also just mine the logs with a worker.

> Has anyone else had this use case? Is there a better solution with
> the gearman-server? If the feature doesn't exist can we get it in
> there? It would seem like it would be something easy to add.

Typically I have seen folks track the state of the jobs either by keeping a log of the handles they wish to watch (sometimes by sending them to a worker that does the tracking for many jobs at once), or by mining the logs.

Cheers,
-Brian