poll for scheduler features


Niphlod

Jun 28, 2012, 6:08:06 PM
to web2py-d...@googlegroups.com
This is my ultra-personal list of things I'd like to see in the scheduler. I have some free time to work on the scheduler, and I'd like to see what input I get from the developer userbase.

Current features:
- one-time-only tasks
- recurring tasks
- possibility to schedule functions at a given time
- possibility to schedule recurring tasks with a stop_time
- can operate distributed among machines, given a database reachable for all workers
- group_names to "divide" tasks among different workers
- group_names can also influence the "percentage" of assigned tasks to similar workers
- simple integration using modules for "embedded" tasks (i.e. you can use functions defined in modules directly in your app or have them processed in background)
- configurable heartbeat to reduce latency: with sane defaults and not too many tasks queued, a queued task normally waits no more than 5 seconds before it is executed
- can be started, process all available tasks and then die automatically
- integrated tracebacks
- monitorable as state is saved on the db
- integrated app environment if started as web2py.py -K
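
To make the one-time-only / recurring / stop_time features above concrete, here is a minimal sketch of how the next run of a task could be decided. It is plain Python, not the scheduler's code: the function name and the parameters (period, repeats, times_run, stop_time) are assumptions chosen for illustration, as is the convention that repeats=0 means "no limit".

```python
from datetime import datetime, timedelta

def next_run_time(last_run, period, times_run, repeats=1, stop_time=None):
    """Return when the task should run next, or None if it is finished.

    Illustrative convention (an assumption, not the scheduler's contract):
    repeats=1 is a one-time-only task, repeats=0 repeats without limit.
    """
    if repeats and times_run >= repeats:
        return None  # already ran the requested number of times
    candidate = last_run + timedelta(seconds=period)
    if stop_time is not None and candidate > stop_time:
        return None  # the next slot would fall after stop_time
    return candidate

start = datetime(2012, 6, 28, 18, 0, 0)

# One-time-only task that already ran once: nothing left to schedule.
print(next_run_time(start, period=60, times_run=1))

# Recurring task without a repeat limit, still inside its stop_time window.
print(next_run_time(start, period=60, times_run=5, repeats=0,
                    stop_time=start + timedelta(hours=1)))
```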


My personal "I'll do it shortly" todo-list:
1) stop processes dynamically (set them to "KILLED" status to terminate them)
2) parameter to allow discarding of results (never use scheduler_run table)
3) by default, if the function returns None, skip the scheduler_run record (so you can keep results for functions that need them, or discard results if you're not going to fetch them; if an exception is raised, a record with the traceback is saved anyway)
2bis and 3bis) add a new column to scheduler_task table to discard results of that function "no matter what it returns"
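
Points 2), 3) and their "bis" variants boil down to one decision: should a scheduler_run record be written for this execution? A minimal sketch of that decision follows. The function name and the discard_results flag are hypothetical, and it assumes that a traceback is stored even when results are discarded, which the list above suggests but does not state outright.

```python
def should_save_run(result, traceback_text=None, discard_results=False):
    """Decide whether an execution deserves a scheduler_run record.

    Hypothetical helper: discard_results stands in for the proposed
    per-task column, traceback_text for a captured exception.
    """
    if traceback_text is not None:
        return True   # assumption: failures are always recorded
    if discard_results:
        return False  # discard "no matter what it returns"
    return result is not None  # default: skip the record when None is returned

print(should_save_run(None))                        # nothing worth storing
print(should_save_run(42))                          # a result to fetch later
print(should_save_run(42, discard_results=True))    # explicitly discarded
print(should_save_run(None, "Traceback ...", True)) # failure is still recorded
```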

Things I'd like to implement, but I don't know how (i.e. no absolute idea on how to achieve the feature, or simply in the need of some ideas to implement them):
4) start workers dynamically (e.g. inserting a new record on scheduler_worker table fires a new worker)
5) gracefully kill workers (i.e. a better way to terminate workers, waiting for any running task to finish before terminating the process)
6) add a unique column to scheduler_task to be able to "tell" that there are strictly n functions scheduled at all times (feature requested "strongly" by Micheal Toomin)
7) assign global parameters as options to "embedded launch version", i.e. web2py.py -K myapp -o option_overriding_defaults_in_models
8) have a parameter to launch both web2py webserver and scheduler () and terminate both on ctrl+c or kill the main process
9) track pids of schedulers (i.e. provide some guidance on what process is "safe" to be killed, leaving any actual RUNNING task as "STOPPED")
10) decorator function to schedule a function and get the async result
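
As one possible reading of point 5), a graceful stop can be sketched as a worker loop that checks a stop flag only between tasks, so a running task is never interrupted. This is a toy model with threads and an in-memory queue. The real scheduler uses processes and the database, and every name below is made up for the illustration.

```python
import queue
import threading

def worker(tasks, stop_requested, done):
    """Run tasks until asked to stop; never interrupt a running task."""
    while not stop_requested.is_set():
        try:
            func = tasks.get(timeout=0.05)
        except queue.Empty:
            continue
        done.append(func())  # the running task always completes

started = threading.Event()  # set by the task once it is running
release = threading.Event()  # lets the demo control when the task ends

def slow_task():
    started.set()
    release.wait()
    return "finished"

tasks = queue.Queue()
stop_requested = threading.Event()  # stand-in for a status flag in the db
done = []
tasks.put(slow_task)
tasks.put(slow_task)

t = threading.Thread(target=worker, args=(tasks, stop_requested, done))
t.start()
started.wait()        # the first task is now in progress
stop_requested.set()  # ask for a graceful stop while it runs
release.set()         # let the running task finish
t.join()

print(done, tasks.qsize())  # the first task completed, the second stays queued
```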

I'm eager to see a good brainstorm :-P

Niphlod

Niphlod

Jul 4, 2012, 4:43:50 PM
to web2py-d...@googlegroups.com
ok, 1), 2), 3), 5) and 6) are ready. I'm a little brainwashed by all the threading and multiprocessing quirks on various platforms....

To "stimulate" the process (i.e. going towards the scheduler "formalization" process), I'm going to pack an app - quite like the "examples" one - to let anyone a) try the new features and b) test them (and to force myself to write documentation about it ^_^)

That should "cover" all the bases, since I'm not sure how to write unittests for the scheduler.

I forgot to ask another thing: do you think it would be nice to provide "preconfigured" functions to use along with the scheduler?
I'm talking about something like:
- decorator for one-time-only tasks (i.e. schedule task, return uniqueish-identifier if results not yet ready , retrieve results if ready)
- cleaning up scheduler_run table
- startup hook
- .....
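
A rough idea of what the decorator for one-time-only tasks could look like: schedule the call, hand back an identifier, and return the result only once it is ready. In this sketch a ThreadPoolExecutor stands in for the scheduler workers and a dict for the scheduler_run table. The names background, fetch_result and PENDING are invented for the example and are not part of web2py.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)  # stand-in for scheduler workers
_futures = {}                                  # stand-in for scheduler_run
PENDING = object()                             # marker for "not ready yet"

def background(func):
    """Add a .queue() method that schedules func and returns an identifier."""
    def queue(*args, **kwargs):
        task_id = uuid.uuid4().hex  # the "uniqueish" identifier
        _futures[task_id] = _executor.submit(func, *args, **kwargs)
        return task_id
    func.queue = queue
    return func  # the function itself can still be called directly

def fetch_result(task_id):
    """Return the result if it is ready, PENDING otherwise."""
    future = _futures[task_id]
    if not future.done():
        return PENDING
    return future.result()

@background
def add(a, b):
    return a + b

task_id = add.queue(2, 3)
_futures[task_id].result()    # block here only to keep the demo deterministic
print(fetch_result(task_id))  # 5
print(add(1, 1))              # 2, direct call without the scheduler
```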

Massimo DiPierro

Jul 4, 2012, 6:00:49 PM
to web2py-d...@googlegroups.com
Is there supposed to be an attachment?

massimo


Niphlod

Jul 4, 2012, 6:09:02 PM
to web2py-d...@googlegroups.com
Not until I review it completely and test all the features combined, at least on Windows and with other db engines... It currently works on Python 2.6.5, Linux, postgres and sqlite

Niphlod

Jul 5, 2012, 3:39:40 PM
to web2py-d...@googlegroups.com
Rereading the quick reply, I probably didn't quite explain myself in the "correct" way.
I implemented those features while switching between 3 machines:
- VM with xp, python 2.7, sqlite
- ubuntu 10.04, python 2.6.5, sqlite and postgres
- ubuntu 12.04, python 2.7.3, sqlite

I'd like to "enlarge" my current "test" app to be more usable also for newcomers and to be able to implement new features keeping all work consistent and trying to be "as backward compatible as possible".
That way I can simply pack the app, load it into the web2py environment on each machine, change the DAL uri if needed, run all the tests, see if they behave correctly and then "be sure" that the patch I created actually works.

As I stated before, there are very simple things working flawlessly on Linux that do not work at all on Windows (threads are somewhat consistent across platforms, multiprocessing is a little more daunting), and I'm never really sure that things that work flawlessly in my development environment also work on other platforms.

Given the app codebase, testing on, let's say, MSSQL, MySQL and Oracle should be fairly easy, and on OSX too... I'll be able to respond quickly to bug reports and, as a nice addition, users can get "comfortable" with the scheduler just by making use of the app code.

Sorry for yesterday's misspelled and probably harsh-toned post ^_^

Andrew W

Jul 6, 2012, 1:09:15 AM
to web2py-d...@googlegroups.com
Some of the things I'm looking for may be outside the scope of the core scheduler.  In particular task dependencies, related to a "stream" of jobs, some serial, some parallel.  Have a look at:

https://groups.google.com/forum/?fromgroups#!searchin/web2py/workflow/web2py/0z1BysNp8Gc/v6KOUS-LoQsJ   I must ask Ross how it's going.

Would you say that type of functionality is outside the scheduler, perhaps in an accompanying "workflow" module ?

Niphlod

Jul 6, 2012, 6:51:41 AM
to web2py-d...@googlegroups.com
Uhm, I'd need more examples to really assess those features, but the scheduler should be a pure "processor".
It has a lot of flexibility though: for a "stream" of jobs (chords or chains in celery) you could always define a task that queues another task near its end, or queue all the tasks with a "disabled" status and enable them when your workflow needs them.
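
The first option (a task that queues its follow-up near the end) can be sketched as below. It is a toy model: an in-memory deque stands in for the queued rows of scheduler_task, and queue_task plus the three task functions are hypothetical names, not scheduler API.

```python
from collections import deque

pending = deque()  # stand-in for the queued rows of scheduler_task
log = []

def queue_task(func, *args):
    """Hypothetical helper: append a task to the in-memory queue."""
    pending.append((func, args))

def extract():
    log.append("extract")
    queue_task(transform, [1, 2, 3])  # chain: queue the follow-up near the end

def transform(rows):
    log.append("transform")
    queue_task(load, [r * 2 for r in rows])

def load(rows):
    log.append("load %s" % rows)

# Only the first step is queued up front; the rest of the chain queues itself.
queue_task(extract)
while pending:  # a single "worker" draining the queue in order
    func, args = pending.popleft()
    func(*args)

print(log)  # ['extract', 'transform', 'load [2, 4, 6]']
```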

Do you need something like callable events on the tasks, like "before_process" and "after_process"?

Niphlod

Jul 8, 2012, 7:22:56 PM
to web2py-d...@googlegroups.com