Execute Jobs based on load

KriRad

Feb 7, 2012, 2:58:44 AM
to gearman
Hi,
I am trying to understand how to use Gearman to post jobs and
execute them based on a load policy. We receive streaming messages, and
these have to be added to a queue for processing. They can either be
processed immediately or after we aggregate a certain number of
messages.

How do I specify a load policy like this? By the end of the day all
messages should be processed, but we can do it incrementally.

Are there any articles about this?
What are some of the larger Gearman deployments that I can refer to?

Thanks,

Shane Harter

Feb 7, 2012, 4:12:23 PM
to gea...@googlegroups.com
So what you're saying is that sometimes you want to stop processing messages as they're added to the queue so you can wait until N messages have been enqueued and then process them in batch? 

Obviously, if you expect to get any benefit out of that, you'll have to code it into your workers. For example, code a worker that knows to read in N jobs and process them as a batch.
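To make the "read in N jobs and process them as a batch" idea concrete, here is a minimal sketch of the buffering logic such a worker might contain. It is plain Python with no Gearman calls; `BatchBuffer`, `process_batch` and `max_wait` are hypothetical names, not part of any Gearman library. One caveat, as far as I understand the usual client libraries: a worker reports a job as complete when its callback returns, so payloads sitting in an in-memory buffer are lost if the worker dies before flushing.

```python
import time
from typing import Any, Callable, List, Optional


class BatchBuffer:
    """Collects job payloads and passes them to `process_batch` in groups.

    A batch is handed over when `batch_size` payloads have accumulated, or
    (via `flush_if_stale`) when the oldest buffered payload has waited at
    least `max_wait` seconds.
    """

    def __init__(
        self,
        batch_size: int,
        process_batch: Callable[[List[Any]], None],
        max_wait: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.batch_size = batch_size
        self.process_batch = process_batch
        self.max_wait = max_wait
        self.clock = clock
        self._items: List[Any] = []
        self._first_at: Optional[float] = None

    def add(self, payload: Any) -> bool:
        """Buffer one payload; return True if this triggered a flush."""
        if not self._items:
            self._first_at = self.clock()
        self._items.append(payload)
        if len(self._items) >= self.batch_size:
            self.flush()
            return True
        return False

    def flush_if_stale(self) -> bool:
        """Flush a partial batch that has waited too long; True if flushed."""
        if self.max_wait is None or not self._items:
            return False
        if self.clock() - self._first_at >= self.max_wait:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Process whatever is buffered, even if the batch is not full."""
        if not self._items:
            return
        batch, self._items = self._items, []
        self._first_at = None
        self.process_batch(batch)
```

The worker's job callback would call `buf.add(payload)` for each job, and something (a timer, or a final call at shutdown) would call `buf.flush()` so that every message is processed by the end of the day.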

So taking all that into account: 

The only real way to accomplish this will be for you to write code to dynamically start/stop workers. There is nothing in the gearman protocol or daemon that can assist you with this. 

I would probably handle this by implementing a daemon that watches your persistent queue store (whether it's MySQL/Drizzle or anything else), looks at how many jobs are enqueued, and dynamically starts and stops worker processes accordingly. For example, you could have a worker designed to process jobs as they come in, and a variation of that worker designed to process them in batch. The managing daemon could then stop the one-by-one worker and start the batch worker (both would listen to the same queue name).
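A rough sketch of such a managing daemon's core loop, in Python, might look like the following. Everything named here is an assumption for illustration: `get_depth` stands in for whatever query counts enqueued jobs in your queue store, the two thresholds are arbitrary, and the worker commands are placeholders. The gap between the two thresholds is there only to stop the daemon from flapping between workers when the depth hovers around one value.

```python
import subprocess
from typing import Any, Callable, Dict, List, Optional


def choose_mode(depth: int, current: str, batch_at: int, single_at: int) -> str:
    """Pick which worker variant should run for the given queue depth.

    Switch to "batch" once the backlog reaches `batch_at`, back to "single"
    once it has drained to `single_at`, and otherwise keep the current mode.
    """
    if depth >= batch_at:
        return "batch"
    if depth <= single_at:
        return "single"
    return current


class WorkerSupervisor:
    """Keeps exactly one worker process running, swapping variants on demand."""

    def __init__(
        self,
        commands: Dict[str, List[str]],   # mode name -> command line to launch
        get_depth: Callable[[], int],     # hypothetical: counts enqueued jobs
        batch_at: int = 100,
        single_at: int = 10,
        spawn: Callable[[List[str]], Any] = subprocess.Popen,
    ) -> None:
        self.commands = commands
        self.get_depth = get_depth
        self.batch_at = batch_at
        self.single_at = single_at
        self.spawn = spawn
        self.mode: Optional[str] = None
        self.proc: Any = None

    def tick(self) -> str:
        """Check the queue once and restart the worker if the mode changed."""
        wanted = choose_mode(
            self.get_depth(), self.mode or "single", self.batch_at, self.single_at
        )
        if wanted != self.mode:
            if self.proc is not None:
                self.proc.terminate()  # stop the old variant first
                self.proc.wait()
            self.proc = self.spawn(self.commands[wanted])
            self.mode = wanted
        return wanted
```

In use, the daemon would build one supervisor, for example with `{"single": ["python", "single_worker.py"], "batch": ["python", "batch_worker.py"]}` (both script names hypothetical), and call `tick()` every few seconds in a loop.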

I would prefer to code such a daemon in Java or perhaps Python. But if you want to use PHP for whatever reason, I've released an OSS daemon library that I've used in production with Gearman here: https://github.com/shaneharter/PHP-Daemon. There's also a full-featured Gearman daemon for managing workers called Gearman Manager (I'm sure you can google for it). However, that is designed primarily to start and keep running N worker processes, and it does so by forking itself for each worker you want running. I've used it before, and in my opinion it's just not a great fit for the work you're describing. Brian Moon, its author, could offer his own opinion, though; I know he reads this thread.

Good Luck!
Shane

KriRad

Feb 9, 2012, 2:46:18 AM
to gearman
Can I ask about some of the larger use cases?

What are some of the larger Gearman deployments that I can refer to?

Thanks.

Ankur Gupta

Feb 9, 2012, 3:28:57 AM
to gea...@googlegroups.com
During my tenure at HP we deployed Gearman to process approximately 40 TB worth of log files over 2 machines with 15 cores and 50 GB of RAM each. We ran approximately 24 Gearman workers with one server and one client. The software would find violations of thresholds in near real time and report them. The workers were in Perl, sed and awk; the client was in Python.

Regards
Ankur
--
Homepage -> http://uptosomething.in

Ankur Gupta

Feb 9, 2012, 3:32:38 AM
to gea...@googlegroups.com
I am sorry, I merely read the line "Can I ask about some of the larger use cases?" and sent the email before reading the entire text. My bad. Apologies.

Ankur

KriRad

Feb 10, 2012, 12:56:13 AM
to gearman
Your deployment is large.

I was trying to locate such deployments in banks or other trading
firms where financial settlements are processed. That is our use case
here.

Mohan
