Setting task frequency at runtime

Paul Tiseo

Aug 15, 2016, 3:50:24 PM
to actionHero.js
I want users to be able to supply a config file that can set the frequency of various tasks. Task frequency seems to be "hard-coded" into the task's definition itself. If I want to let users set this, is the best approach to set frequency = 0 then load it via api.tasks.enqueue... functions in an initializer?

Also, if I read the docs correctly, only one task runs at a time per server. If the task is long-running, and we have a cluster of two servers, will a second one begin to run if the frequency dictates it, even if the first hasn't completed?

And, from http://www.actionherojs.com/docs/#job-schedules, should I ensure that only the master scheduler performs the api.tasks.enqueue() call, if I expect there to be multiple servers?

Evan Tahler

Aug 15, 2016, 4:05:33 PM
to actionHero.js
There's a lot to unpack here:

Let's start from the bottom up:

The point of api.resque.scheduler.master (which will be true or false) is that only one node in your cluster will be true at any moment.  It uses pessimistic locking, so there may also be a few moments (default 3 min) when NO server is acting as master.  If there's something you want to happen only once throughout your whole cluster, no matter how many nodes you have, use this method.  It relies on Redis, so be sure that every node is connected to the same Redis server.
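
A minimal sketch of guarding a once-per-cluster action behind that flag. The helper `enqueueIfMaster` and the stub `makeStubApi` are hypothetical and exist only so the snippet runs by itself; the `api.tasks.enqueue(name, params, queue, callback)` call shape is an assumption based on the docs linked above.

```javascript
// Hypothetical helper: only the node currently elected scheduler master enqueues.
function enqueueIfMaster (api, taskName, params, queue, callback) {
  const scheduler = api.resque && api.resque.scheduler
  if (!scheduler || scheduler.master !== true) {
    // Not the master (or no scheduler on this node): do nothing.
    return callback(null, false)
  }
  api.tasks.enqueue(taskName, params, queue, function (error) {
    callback(error, !error)
  })
}

// Stub standing in for the real ActionHero api object (assumption, demo only).
function makeStubApi (isMaster) {
  const enqueued = []
  return {
    enqueued: enqueued,
    resque: { scheduler: { master: isMaster } },
    tasks: {
      enqueue: function (name, params, queue, cb) {
        enqueued.push({ name: name, params: params, queue: queue })
        cb(null)
      }
    }
  }
}
```

Because there can be short windows with no master at all, anything guarded this way should tolerate being skipped occasionally.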

The number of workers running on a NODE is set by api.config.tasks.minTaskProcessors and api.config.tasks.maxTaskProcessors.  I say "node" because you should be using `actionhero start cluster` to run more than one process per server (# nodes ~= number of CPU cores, assuming you have the RAM for it).  Since tasks (just like web requests) are async, you can probably run more than one of them at a time, as you'll be spending most of the time waiting for an external service (database, etc).  Setting this number is a function of your workload, but it's likely to be more than 1.

Regarding the frequency, that can be more technically explained as: "When I complete, if I'm successful, how far in the future from now should I enqueue my next instance to be run?"  So if you have a 5s job with a frequency of 1 min, you'll actually see the job run every 65 seconds.  And yes, normal queue priority applies here, so if the queue you place these jobs into is long, or all workers are busy doing something else, it will take some time to get to the job in question.  You can mitigate this for high-priority jobs via queue order and which queues collections of workers work on.  You can assign all workers to work on * (default), or assign some explicitly to "high-priority" and the rest to *.
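
The arithmetic above, as a small sketch. The function name `effectivePeriodMs` and the `queueWaitMs` parameter are illustrative assumptions, not part of ActionHero; queue wait is modeled as a simple additive delay.

```javascript
// frequencyMs: the task's `frequency` setting
// runTimeMs:   how long one run of the task takes
// queueWaitMs: time spent waiting for a free worker (0 on an idle cluster)
function effectivePeriodMs (frequencyMs, runTimeMs, queueWaitMs) {
  return frequencyMs + runTimeMs + (queueWaitMs || 0)
}

// A 5s job with a 1 min frequency starts roughly every 65 seconds.
console.log(effectivePeriodMs(60 * 1000, 5 * 1000) / 1000) // 65
```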

Finally, the frequency is only set for RECURRING tasks. You can always enqueue a non-recurring task directly from an action, another task, etc.  See http://www.actionherojs.com/docs/#enqueuing-a-task.  
If you really do want "user-configurable" tasks, you have a few options.  You can mess with `api.tasks.jobs` directly in the server's memory.  You can also require your config file from within the task definition file and set it that way, i.e.: task.frequency = require(__dirname + '/../myConfig/tasks.js')[name_of_task].frequency
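
A sketch of the second option, assuming the object-style task definition of this era's ActionHero. The inline `userTaskConfig` object stands in for whatever `require(__dirname + '/../myConfig/tasks.js')` would return; its contents, the task name `cleanupSessions`, and the helper `frequencyFor` are all hypothetical.

```javascript
// Stands in for the user-supplied config file's exports (hypothetical contents).
const userTaskConfig = {
  cleanupSessions: { frequency: 5 * 60 * 1000 }
}

// Hypothetical helper: use the user's frequency if it is a non-negative number,
// otherwise fall back to a default so a missing or bad entry cannot break the task.
function frequencyFor (config, taskName, defaultMs) {
  const entry = config[taskName]
  const value = entry && entry.frequency
  return (typeof value === 'number' && value >= 0) ? value : defaultMs
}

// Task definition (shape assumed; in a real project this would be exported
// from a file in the tasks directory).
const task = {
  name: 'cleanupSessions',
  description: 'remove expired sessions',
  queue: 'default',
  frequency: frequencyFor(userTaskConfig, 'cleanupSessions', 60 * 1000),
  run: function (api, params, next) {
    next()
  }
}
```

The fallback is a design choice: since the file is user-edited, validating the value at load time is cheaper than debugging a task that never recurs.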

Paul Tiseo

Aug 17, 2016, 12:30:48 PM
to actionHero.js
If there are multiple (fifo?) task queues, how does AH read from the set of them? Round-robin?

Evan Tahler

Aug 17, 2016, 1:41:16 PM
to actionHero.js
It's explicit ordering.  If you list `api.config.tasks.queues = ['a', 'b', 'c']`, all jobs from A will be worked before all jobs from B, before all jobs from C.  In your case, you might have queue names like ['high', 'low'].  You might want one node to work ['high', 'low'], and then another to work ['low', 'high'].  This way, you can be sure that at least one node will be working your HIGH queue in all cases, but the LOW queue won't back up that far either.

The default, "*", tells ActionHero/Resque to just ask Redis, at boot, for a list of all the queues that exist in the namespace, and to work them in a (probably) alphabetical priority order.
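
To illustrate the explicit ordering, here is a small in-memory simulation. It is not how Resque is implemented (the real queues live in Redis); `nextJob` and the sample job names are made up for the example.

```javascript
// Pick the next job by strict queue priority: the first non-empty queue wins.
function nextJob (queueOrder, queues) {
  for (const name of queueOrder) {
    const jobs = queues[name]
    if (jobs && jobs.length > 0) {
      return { queue: name, job: jobs.shift() } // FIFO within a queue
    }
  }
  return null // nothing to do; a real worker would sleep here
}

const queues = { high: ['h1', 'h2'], low: ['l1'] }
const worked = []
let item
while ((item = nextJob(['high', 'low'], queues)) !== null) {
  worked.push(item.job)
}
console.log(worked) // [ 'h1', 'h2', 'l1' ]
```

A node configured with ['low', 'high'] would drain `l1` first in the same simulation, which is why mixing the two orderings across nodes keeps either queue from starving.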

Paul Tiseo

Aug 18, 2016, 10:40:02 AM
to actionHero.js
So, if I understand correctly:

- using clustering, I can start one or more AH nodes
- for each node, I can start one or more task processors
- the master node delegates the tasks (from the designated Redis or Redis-like store) out of the queues to any available processor on any node (presumably in some order based on the list of nodes known to the master)
- in config/tasks.js, I can name/create one or more fifo queues
- the queues are processed in the order defined in the api.config.tasks.queues array, one queue at a time until fully depleted or there are no tasks due by timestamp, then moving on in the order stated in the array
- how often a periodic task runs (one that has a frequency property > 0) is the sum of its run time + its frequency; a periodic task auto-issues an enqueueIn() at the end to re-schedule itself
- any blocking/synchronous activity in one task prevents any other task or action from executing for a whole node
- if there are no queued tasks for an available processor, it will sleep for api.config.tasks.timeout ms.

Is that correct?

PS: What happens if I put a task's queue property to one not listed in api.config.tasks.queues?
PPS: What exactly do api.config.tasks.checkTimeout and api.config.tasks.maxEventLoopDelay control?

Evan Tahler

Aug 18, 2016, 11:54:10 AM
to actionHero.js
1) yep!
2) yep!
3) no -> The only thing the master does is the SCHEDULER.  The scheduler is the one moving jobs from the delayed queues into the normal queues once their time has come.  That's it.  Since there is a leader-election process happening here, it's a convenient thing to hook into for any other activities that you want to happen only once in your cluster. Nothing bad happens if there's more than one scheduler process running at once... it would just add unneeded load to Redis.
4) almost: The fifo queues do not have timestamps.  It's one queue.  *all* delayed tasks are put in separate queues (named after the time they should be worked) and are then moved into the normal queues by the scheduler at the proper time.
5) yep!
6) yep! (just like all of node in general)
7) yep!
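
The scheduler behaviour described in 3) and 4) can be pictured with an in-memory sketch like the following. This is only a model: the real scheduler works against Redis, and `promoteDueJobs` and the data shapes are assumptions made for illustration.

```javascript
// delayed: map of run-at timestamp (ms) -> jobs due at that time, mirroring
// "separate queues named after the time they should be worked".
// queues: map of normal queue name -> FIFO array of job names.
function promoteDueJobs (delayed, queues, now) {
  const dueTimestamps = Object.keys(delayed)
    .map(Number)
    .filter(function (ts) { return ts <= now })
    .sort(function (a, b) { return a - b })

  dueTimestamps.forEach(function (ts) {
    delayed[ts].forEach(function (job) {
      if (!queues[job.queue]) { queues[job.queue] = [] }
      queues[job.queue].push(job.name) // the normal queue carries no timestamp
    })
    delete delayed[ts]
  })
  return dueTimestamps.length // number of delayed buckets promoted
}
```

Workers only ever look at the normal queues, so a job whose time has come still waits behind whatever is already queued ahead of it.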

ps: it will make that new queue, and probably no one will work it.  This is, however, very handy when you have some apps producing jobs for other applications to consume.
pps: The comments in the config files should help you here:
- timeout: how long to sleep between jobs / scheduler checks when there is no work to do
- checkTimeout: how often should we check the event loop to spawn more taskProcessors?
- maxEventLoopDelay: how many ms would constitute an event loop delay to halt taskProcessors spawning?
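
For orientation, the settings discussed in this thread might sit together roughly as below. The key names come from the messages above; the values are illustrative assumptions, not necessarily the shipped defaults, and in a real project this object would be returned from config/tasks.js rather than assigned to a local constant.

```javascript
// Illustrative values only (assumptions); check the comments in your own
// config/tasks.js for the actual defaults of your ActionHero version.
const tasksConfig = {
  // Queues this node works, in strict priority order ('*' means all queues).
  queues: ['high', 'low'],
  // How long (ms) to sleep between jobs / scheduler checks when idle.
  timeout: 5000,
  // Lower and upper bounds on task processors for this node.
  minTaskProcessors: 1,
  maxTaskProcessors: 4,
  // How often (ms) to check the event loop to decide on spawning more processors.
  checkTimeout: 500,
  // Event loop delay (ms) above which no further processors are spawned.
  maxEventLoopDelay: 5
}
```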