Scheduler Replacement


Boris Aramis Aguilar Rodríguez

Nov 21, 2018, 1:31:13 PM
to web2py-users
TL;DR: Do any of you know of, or have experience with, an alternative to web2py's scheduler that you would recommend? We use features such as scheduling tasks at a specific time, repetitions, timeouts, and checking a task's current state. Any further comments, advice, and experience are really appreciated.

Details:

We've been experiencing several issues with the limits of web2py's scheduler. Our scenario currently looks something like this:

 - 30,000+ tasks queued daily.
 - 40+ workers distributed across four different servers (around 10 workers per server). We do this because of the nature of the tasks we are distributing: they need to run on a specific server located somewhere in the world that has the access required to do what we need.
 - A centralized database that holds tasks+runs (PostgreSQL)
 - A centralized Redis instance that manages the scheduler (using the redis_scheduler.py module in gluon/contrib/redis_scheduler.py)

We previously used web2py's default (non-Redis) scheduler, but we found that the database could only support around 13-15 workers, too few for our usage scenario. Above that number the database started to produce deadlocks and the ticker would fail to assign tasks. So we tried redis_scheduler, which worked for some time (with one really weird bug that we found a way to work around but have not yet found a fix for) until we needed more workers. Now we are having issues again with the relational database that holds the tasks (the scheduler_task and scheduler_run tables): the database connection gets closed after some time once we have 50+ workers.

So as far as I can see, we need to replace web2py's scheduler in our application to support more workers and bring stability back to our worker-hungry project...

Do any of you know of, or have experience with, an alternative scheduler? I've seen several options (rabbit-mq, python-rq, mrq, and so on), but I'm not sure about the limitations of those schedulers... Any further comments are really appreciated.

Dave S

Nov 21, 2018, 4:19:40 PM
to web2py-users


On Wednesday, November 21, 2018 at 10:31:13 AM UTC-8, Boris Aramis Aguilar Rodríguez wrote:
TL;DR: Do any of you know of, or have experience with, an alternative to web2py's scheduler that you would recommend? We use features such as scheduling tasks at a specific time, repetitions, timeouts, and checking a task's current state. Any further comments, advice, and experience are really appreciated.
[...]
Do any of you know of, or have experience with, an alternative scheduler? I've seen several options (rabbit-mq, python-rq, mrq, and so on), but I'm not sure about the limitations of those schedulers... Any further comments are really appreciated.

Nope, and my scheduler runs all of 8 tasks a day (creeping up to 16), so I've hardly pushed the limit.

But I would look at just running web2py instances on those other servers and having the "master" do nothing more than send HTTPS requests to each of them; each local scheduler then deals with its own workers only.  Not quite like sharding on Elasticsearch, but definitely division of labor.
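
A rough sketch of what the "master" side of that could look like, using only the standard library. Everything named here is an assumption made for illustration: the region keys, the example.com URLs, and the idea that each remote web2py app exposes a controller action that queues the task with its own local scheduler.

```python
import json
import urllib.request

# Hypothetical per-region endpoints. Each one is assumed to be a web2py
# controller action on that server which queues the task locally.
REGION_ENDPOINTS = {
    "eu": "https://eu.example.com/app/tasks/enqueue",
    "us": "https://us.example.com/app/tasks/enqueue",
}


def build_request(region, task_name, params):
    """Build the POST request asking one regional server to queue a task."""
    url = REGION_ENDPOINTS[region]
    body = json.dumps({"task": task_name, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dispatch(region, task_name, params, timeout=10):
    """Send the request and return the HTTP status code of the reply."""
    request = build_request(region, task_name, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With this split, the central database would only need to know which region a job belongs to; the scheduler_task and scheduler_run rows would live in each server's own database.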

If Niphlod were still monitoring the group, that would get you the best comments on the in-box scheduler.  He's the expert -- wrote the current scheduler and the test suite, runs databases on diverse systems (including MSSQL behind IIS, IIRC). He also did the JWT implementation.

/dps

Niphlod

Nov 23, 2018, 2:57:08 AM
to web2py-users

Dave S

Nov 23, 2018, 3:20:53 AM
to web2py-users


On Thursday, November 22, 2018 at 11:57:08 PM UTC-8, Niphlod wrote:

Simone, I bow before the master.

/dps

Niphlod

Nov 23, 2018, 3:20:59 AM
to web2py-users
Jokes aside, yeah, you have definitely hit the scheduler's limits, or rather the limits of using a relational database as a queue table.
web2py's scheduler can still be optimized, and I feel that 30k tasks are manageable if they are spread throughout the day (about 20 tasks a minute: 30,000 / 1,440 minutes ≈ 20.8, if my math is not failing me).
Managing 15-20 workers is not a problem with a good database backend; 30ish or more is asking for disasters to happen, and that's why redis_scheduler was born.
The Redis-backed one moves the heavy-concurrency part to Redis, namely the scheduler_worker table and the polling algorithm that looks for new tasks, but it doesn't move EVERYTHING out of the database.
60-80 workers is probably the limit for the Redis-backed one.
Both of them are probably less than 2k lines of code, and they use the DAL and Redis but nothing else beyond the standard library, so they're definitely good, but not on par with "specialized" alternatives.

One "grin" of the implementation is that it forks a new process for every task. That follows from a basic design principle that others do not enforce: terminating a long-running process that hits its timeout (Python can't kill a thread, only a process). But that's a problem for people running 40 tasks per second, not 20 a minute.
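
To illustrate the process-per-task idea, here is a minimal sketch using the standard multiprocessing module. It is not the scheduler's actual code: run_with_timeout is a made-up helper, the returned status strings are only illustrative, and it assumes a POSIX system where the "fork" start method is available.

```python
import multiprocessing
import time

# Assumption: POSIX, where the "fork" start method exists.
_ctx = multiprocessing.get_context("fork")


def run_with_timeout(func, args=(), timeout=60.0):
    """Run func(*args) in a child process and kill it if it overruns."""
    proc = _ctx.Process(target=func, args=args)
    proc.start()
    proc.join(timeout)      # wait at most `timeout` seconds
    if proc.is_alive():
        proc.terminate()    # a process can be killed; a thread cannot
        proc.join()         # reap the terminated child
        return "TIMEOUT"
    # A zero exit code means the function returned without raising.
    return "COMPLETED" if proc.exitcode == 0 else "FAILED"


if __name__ == "__main__":
    print(run_with_timeout(time.sleep, (0.1,), timeout=2.0))
    print(run_with_timeout(time.sleep, (30,), timeout=0.5))
```

The price is one process start per task, which is the overhead that only starts to matter at dozens of tasks per second.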

Things to look out for when you reach high numbers like yours (i.e. things to try to see whether they help you "stay with the scheduler"):
- trim the scheduler_task table regularly (having 1m rows in scheduler_task, if you do not need them for reporting, can definitely bring down performance)
- same deal for scheduler_run (do you need the results, and if so, for how long do you need them?!)
- create indexes on the scheduler_task table (one on status and assigned_worker_name helps the workers; one on stop_time, next_run_time and status helps the ticker)
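
The three points above could be sketched roughly as follows. This uses an in-memory SQLite database as a stand-in so it runs anywhere; the tables are cut down to a handful of columns, the index names and the trim helper are invented for the example, and which statuses count as "finished" is an assumption to adapt. On PostgreSQL the CREATE INDEX and DELETE statements would look much the same.

```python
import sqlite3


def make_db():
    """Create simplified stand-ins for the two scheduler tables, with indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE scheduler_task (
        id INTEGER PRIMARY KEY,
        status TEXT,
        assigned_worker_name TEXT,
        next_run_time TEXT,
        stop_time TEXT,
        last_run_time TEXT
    );
    CREATE TABLE scheduler_run (
        id INTEGER PRIMARY KEY,
        task_id INTEGER NOT NULL,
        status TEXT,
        stop_time TEXT
    );
    -- Helps workers find the tasks assigned to them.
    CREATE INDEX idx_task_worker
        ON scheduler_task (status, assigned_worker_name);
    -- Helps the ticker pick tasks that are due.
    CREATE INDEX idx_task_ticker
        ON scheduler_task (stop_time, next_run_time, status);
    """)
    return conn


def trim(conn, cutoff):
    """Delete finished runs, then finished tasks, older than `cutoff`.

    `cutoff` is a 'YYYY-MM-DD HH:MM:SS' string. Returns (runs, tasks) deleted.
    """
    done = ("COMPLETED", "FAILED", "TIMEOUT")  # assumed "finished" statuses
    marks = ",".join("?" for _ in done)
    with conn:  # commit both deletes together
        runs = conn.execute(
            f"DELETE FROM scheduler_run "
            f"WHERE stop_time < ? AND status IN ({marks})",
            (cutoff, *done),
        ).rowcount
        # Only drop tasks that no longer have any run rows pointing at them.
        tasks = conn.execute(
            f"DELETE FROM scheduler_task "
            f"WHERE last_run_time < ? AND status IN ({marks}) "
            f"AND id NOT IN (SELECT task_id FROM scheduler_run)",
            (cutoff, *done),
        ).rowcount
    return runs, tasks
```

Running something like trim from a periodic job keeps both tables small, provided nothing else still needs the old results for reporting.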

With that off the table, you can quite easily jump ship to a different project (my advice would be to look at rq before celery, but YMMV); both are HUGE projects.

The only problem you may have is that if you use the same backend to store the results of your tasks (because they're needed), you *may* hit the same limits, i.e. your backend may not be sized for the work your apps need to do.


Boris Aramis Aguilar Rodríguez

Nov 28, 2018, 3:52:53 PM
to web2py-users
Thanks for your answer! I'm a poor worker, lol.

Really, thanks a lot. I'm tempted to write some "interface" code that emulates what the scheduler currently does but uses the mrq.io library/framework as a backend, to avoid changing the existing code in the project. I'm still trying to see whether that is feasible and how much time it would take.
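
Roughly the shape I have in mind, as a sketch only. SchedulerFacade and InMemoryBackend are made-up names, the keyword arguments only loosely mirror what our code passes to the scheduler today, and the in-memory backend is a placeholder for wherever the real queue library would plug in (no mrq calls are shown because I have not checked its API yet). As a simplification, repetitions are expanded up front into separate jobs, so "repeat forever" is not covered.

```python
import itertools
from datetime import datetime, timedelta


class InMemoryBackend:
    """Placeholder backend that keeps jobs in a dict."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)

    def enqueue(self, function_name, args, kwargs, run_at, timeout):
        job_id = next(self._ids)
        self._jobs[job_id] = {
            "function": function_name,
            "args": args,
            "kwargs": kwargs,
            "run_at": run_at,
            "timeout": timeout,
            "status": "QUEUED",
        }
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["status"]


class SchedulerFacade:
    """Keeps a scheduler-like interface while delegating to a backend."""

    def __init__(self, backend):
        self.backend = backend

    def queue_task(self, function, pargs=None, pvars=None,
                   start_time=None, timeout=60, repeats=1, period=60):
        """Queue `repeats` runs spaced `period` seconds apart; return job ids."""
        first_run = start_time or datetime.now()
        return [
            self.backend.enqueue(
                function,
                list(pargs or []),
                dict(pvars or {}),
                first_run + timedelta(seconds=i * period),
                timeout,
            )
            for i in range(repeats)
        ]

    def task_status(self, job_id):
        """Return the backend's current status for one job."""
        return self.backend.status(job_id)
```

The existing application code would keep calling queue_task and task_status, and only the backend class would need to change when the queue library does.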

Thanks again, Niphlod, for the hints and info. It would be awesome if the scheduler limits were documented somewhere (I would do it, but I've never done it before). Anyway, thanks a lot :)