Improving first-time experience using Scheduler


Michael Toomim

May 5, 2012, 11:26:57 PM
to web2py-d...@googlegroups.com
I just tried the scheduler with my app. It works well, but we could make it even easier for first-time users:
  • Launch it with rocket webserver by default.
    First-time web2py users should be able to immediately get the satisfaction of adding tasks to the scheduler, without needing to learn the cron syntax of @reboot  *  *  *  *  root python web2py.py -K <appname> or messing with /etc/init files. Advanced users can disable it if not needed.
    Can we do that? Perhaps add @reboot  *  *  *  *  root python web2py.py -K <appname> to the example app's crontab... but we need a way to fill in the python path, and I don't like how the users must remember to update <appname> if they copy and rename the welcome app and still want the scheduler to work.
  • Don't require task method names to be predefined.
    It is redundant work to specify each task1=task1 in Scheduler(db, dict(task1=task1, etc.)). It's great that web2py automatically looks up view filenames from controller function names, and the controller function names from urls... can't we look up scheduler tasks from function names too? Perhaps there is a technical limitation here, I have yet to read the scheduler source code. But even if there is, it seems we could surmount it, perhaps with a schedule() function as described below...
  • When scheduling tasks, automatically jsonify vars[] and args[].
    Every task I schedule needs this, and sometimes I accidentally forget the sj.dumps() part of vars=sj.dumps(stuff). Potential solutions: (1) have a json datatype in web2py that automatically converts json, or (2) add a wrapper function schedule(function_name, vars, args, etc.) that does the conversion for us. Right now I just have to write this function by hand for each project.
  • Import gluon.scheduler.Scheduler by default.
    And db.py should include a Scheduler(db) line by default.
    ... but I assume this is already in the works when it's ready for release.
These changes would make my first-time experience with the scheduler very smooth! Then the typical "how do I send emails in the background?" problem would be solved very simply for new users:

Replace:
mail.send(to=to, subject=subject, message=message)

in your controller with:
schedule(send_email, args=[to, subject, message])
 
# ...and this goes in models/tasks.py
def send_email(to, subject, message): mail.send(to, subject, message)
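A minimal sketch of such a schedule() helper (hypothetical; the field names follow the scheduler_task table, and in a real app you would pass the result to db.scheduler_task.insert(**fields)):

```python
import json

def schedule(function_name, args=None, vars=None, **kwargs):
    # Hypothetical helper: build the field dict for a scheduler_task
    # insert, JSON-encoding args and vars so callers never forget
    # the json.dumps() step.
    fields = dict(function_name=function_name,
                  args=json.dumps(args or []),
                  vars=json.dumps(vars or {}))
    fields.update(kwargs)  # e.g. repeats, period, timeout
    return fields
```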

Finally, two little things I noticed... (1) the scheduler is giving too much log information right now :P, and (2) it would be great to have a realtime (websockets/comet) visualization of the scheduler's activity in the admin app. I want to see a LED light turn on when it's processing a task. I want to see the queue of tasks displayed as a sequence of divs on screen:

   [ task 9 ]        Upcoming tasks
   [ task 8 ]
   [ task 7 ]
  - - - - - - - - -     
   [ task 6 ]        Completed tasks
   [ task 5 ]
   [ task 4 ]

and when I click on a task it shows me its details.

pbreit

May 6, 2012, 1:46:27 PM
to web2py-d...@googlegroups.com
I don't know if the scheduler should be "on" by default but I generally like the gist of these recommendations.

Michael Toomim

May 6, 2012, 4:15:12 PM
to web2py-d...@googlegroups.com
Ok, then let's analyze whether the scheduler should run by default.

Upsides:
  • It "just works" automatically in the dev server, just like everything else in web2py
  • New users don't need to use command-line crap

Possible downsides:
  1. Performance
  2. Backwards incompatibility
  3. Ease of use

Here's how to solve these possible downsides:

1. Performance
Maybe running another process will slow down the system for people who don't use it.
But if the scheduler is slow while doing nothing, that is a bug.
The scheduler by default does two things:
  • Waits
  • Queries the db.
Waiting takes no resources. But querying the DB can block if it's sharing a sqlite database.
So we will be ok with performance if we can verify it does not block w/ sqlite.
Perhaps it can use a different db file by default if we're using sqlite?
Or is it possible for it to check the db for tasks w/o blocking?
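One possible mitigation, assuming the scheduler can be pointed at any DAL connection: give it its own SQLite file in db.py, so its polling writes never contend with the application's database. A sketch (the URI and filename are illustrative, and this assumes the usual web2py model environment):

```python
# Illustrative sketch for a model file; 'scheduler.sqlite' is a made-up name.
from gluon.scheduler import Scheduler
db_sched = DAL('sqlite://scheduler.sqlite')  # separate file from the app db
scheduler = Scheduler(db_sched)
```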

2. Backwards incompatibility
Since we're just talking about using it with Rocket, the only incompatibility I can think of is if a user is using Rocket as their production server, and stops the server by killing the web2py process without killing the scheduler process. I think we can solve this by:
  • Killing the scheduler when main web2py process quits
  • If it somehow evades death, then have the scheduler also check for its parent process in each loop iteration and die if the parent is dead.
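The parent-liveness check in the second bullet can be done cheaply on POSIX with signal 0; a sketch of what the scheduler loop could call each iteration, given the pid recorded at startup:

```python
import os

def parent_is_alive(parent_pid):
    # os.kill with signal 0 sends nothing; it only checks that the
    # process exists (raises OSError if it does not).
    try:
        os.kill(parent_pid, 0)
        return True
    except OSError:
        return False
```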

And keep in mind that this is only for people using the built-in rocket webserver. If you run web2py via apache, I think you'll have to set up the scheduler process on your own. ... because e.g. cron doesn't even work... Right?

3. Ease of use
The only usability problem I see right now with enabling the scheduler is showing the user a bunch of logging messages they did not expect. These should be turned off.

pbreit

May 7, 2012, 4:44:50 PM
to web2py-d...@googlegroups.com
Saw this today. Queuer that takes advantage of Postgres features:

Michael Toomim

Jun 9, 2012, 4:08:29 PM
to web2py-d...@googlegroups.com
I found another problem. By default, tasks die a hidden death after a day, because they have a stop_time defaulting to 1 day:

Field('stop_time','datetime',default=now+datetime.timedelta(days=1))

Here's the scenario in which I found this to be difficult to debug:
  • I'm creating a daemon that will run forever (I've set repeats=0), processing a queue.
  • I test it, debug it, and everything is working
  • I come back after a couple days and find everything is messed up... jobs aren't being run!
  • I test everything, I look in the database and see that the scheduler_task is still sitting there, in state QUEUED, but it is not being run! If it's in state QUEUED, why isn't it being run?
  • I dive into the scheduler.py code to check line-by-line with the query used in pop_task(), to try to find what is happening that my queued task is not being run...
It seems to me that stop_time is likely to get in the way more than it is useful. Shouldn't it be a safeguard that the programmer can add to their scheduled tasks if they need it? A hidden safeguard with a hidden default is bound to confuse anybody scheduling tasks that run for more than a day.

I also do not see a way of disabling the stop_time safeguard, other than setting it to some large number with a parameter like: 

stop_time = now + timedelta(days=90000),

I propose:
  • We change stop_time's default to some enormous time_max value (like above)
  • Is there a way to make this optional, without the ugly "add 90000 days to today" trick?

szimszon

Jun 10, 2012, 9:30:25 AM
to web2py-d...@googlegroups.com
There is no hidden safeguard; it is documented in the book:

http://web2py.com/books/default/chapter/29/4#Scheduler-%28experimental%29

--- cut ---
The best practice is to queue tasks using "insert". For example:
db.scheduler_task.insert(
    status='QUEUED',
    application_name='myapp',
    task_name='my first task',
    function_name='task_add',
    args='[]',
    vars="{'a':3,'b':4}",
    enabled=True,
    start_time = request.now,
    stop_time = request.now+datetime.timedelta(days=1),
    repeats = 10, # run 10 times
    period = 3600, # every 1h
    timeout = 60, # should take less than 60 seconds
)

Notice that fields "times_run", "last_run_time" and "assigned_worker_name" are not provided at schedule time but are filled automatically by the workers.

--- cut ---

But you are right that the state shouldn't be QUEUED but something like EXPIRED (expired before executed), because there are scenarios where, if something is not done before stop_time, it must not be done at all, since it could do more harm than good.
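The proposed transition could be sketched like this (a hypothetical helper, not the scheduler's actual code; a stop_time of None here means "never expires"):

```python
# Hypothetical sketch of the proposed status transition: a task that
# still wants to run but has passed stop_time becomes EXPIRED instead
# of sitting in QUEUED forever.
QUEUED, COMPLETED, EXPIRED = 'QUEUED', 'COMPLETED', 'EXPIRED'

def next_status(run_again, next_run_time, stop_time):
    if not run_again:
        return COMPLETED
    if stop_time is not None and next_run_time > stop_time:
        return EXPIRED
    return QUEUED
```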

Michael Toomim

Jun 10, 2012, 4:51:26 PM
to web2py-d...@googlegroups.com
Thank you for the thoughtful response! I like your idea of making the state EXPIRED instead of QUEUED when time has exceeded stop_time. That sounds right. It would be much clearer when debugging!

However, let me clarify—I only have a problem with the default value for stop_time = 1 day, not with stop_time itself.

No matter what task you create, it has a default stop time of 1 day after you make it. This is very surprising! Only in certain cases do you need a stop_time. And for other cases, you end up making tasks that work fine... until a day later they all break! The default for stop_time should be None.

Here's a patch:

--- a/web2py/gluon/scheduler.py
+++ b/web2py/gluon/scheduler.py
@@ -336,7 +336,7 @@ class Scheduler(MetaScheduler):
             Field('enabled','boolean',default=True),
             Field('start_time','datetime',default=now),
             Field('next_run_time','datetime',default=now),
-            Field('stop_time','datetime',default=now+datetime.timedelta(days=1)),
+            Field('stop_time','datetime',default=None),
             Field('repeats','integer',default=1,comment="0=unlimted"),
             Field('period','integer',default=60,comment='seconds'),
             Field('timeout','integer',default=60,comment='seconds'),
@@ -381,7 +381,7 @@ class Scheduler(MetaScheduler):
             all_available = db(ts.status.belongs((QUEUED,RUNNING)))\
                 ((ts.times_run<ts.repeats)|(ts.repeats==0))\
                 (ts.start_time<=now)\
-                (ts.stop_time>now)\
+                ((ts.stop_time==None) | (ts.stop_time>now))\
                 (ts.next_run_time<=now)\
                 (ts.enabled==True)\
                 (ts.group_name.belongs(self.group_names))\

The first change defaults stop_time to None. The second ignores stop_time if it's set to None.

szimszon

Jun 11, 2012, 9:18:02 AM
to web2py-d...@googlegroups.com
+1

Niphlod

Jun 11, 2012, 5:00:19 PM
to web2py-d...@googlegroups.com
well, 1st post on this list .... somehow I always missed it.
Having worked intensively with the scheduler in the last month, I'll add my 2 cents. I know they'll seem as 20 at the end of the post, please bear with me for a while.
Please consider this post as a totally personal point of view of what the scheduler is, and excuse me from the beginning if the post is long. I know that the issue starting this thread was the stop_time default value, but it "escalated" to a "full review" of the scheduler itself. Actually, you threw in some nice ideas.

We have a scheduler that works, trying (as always in web2py) to serve a range from newbie developers to experienced users, from small sites to heavily populated production sites.
The scheduler was born to accomplish the same task as the other elephant in the room, celery, without having to cope with external software installations, queue handling, exchange queues, and so on.
I started watching celery a while ago, just after redis, out of purely personal interest. I worked a lot with MS Service Broker and it was nice to see open-source software accomplishing the same tasks. I know pretty much all the designs around for task queueing... and then the scheduler came along with what I think is one of the most concise, multi-platform, no-dependencies, no-frills pieces of code around for async task handling.
Now, I admit that the scheduler is a bit heavy for "I send one email every 2 minutes" and a bit slow (implementation-wise) for "I need to send 1000 mails per second". This is just a consequence of finding the "sweet spot" among typical web2py users' needs.

With current defaults "in idle" the scheduler needs ~1 query every 3 sec + 1 query every 15 sec, for a nice total of ~30 queries per minute for every worker. The math isn't precise because if running multiple schedulers, with trunk, one process will take the lead and the others just follow that.

Now, task handling in the scheduler is not an issue for "the big databases" (oracle, mssql, mysql, postgres, etc) but is a real pain in the ass for sqlite.
Other implementations relying on sqlite mitigate the issue thanks to SQLAlchemy, but with the DAL and sqlite some problems arise.
BTW, there could be a nice shortcut: if sqlite3 is linked to a recent SQLite DLL (> 3.7.0), one can activate WAL on the database and all the problems go away (I tested this; SQLite is really fast with WAL). If someone knows how to recompile sqlite3 against an updated DLL and maybe embed that in web2py, please come forward!
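For reference, enabling WAL is a one-line pragma through the stdlib sqlite3 module (a sketch; whether it takes effect depends on the linked SQLite being >= 3.7.0, as noted above):

```python
import sqlite3

def enable_wal(path):
    # Switch the database file to write-ahead logging; in WAL mode
    # readers no longer block the scheduler's writes.
    conn = sqlite3.connect(path)
    mode = conn.execute('PRAGMA journal_mode=WAL').fetchone()[0]
    conn.close()
    return mode  # 'wal' if the linked SQLite supports it
```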

I recently made some changes in the scheduler to make it work faster with multiple workers, involving fewer queries and making it "aware" of the other scheduler processes around (one scheduler is a TICKER, a master giving away tasks; the others, slaves, only process the tasks assigned to them), but with SQLite and >3 workers locked databases are just around the corner.
As the scheduler can be totally decoupled from the "application" database, creating it on another uri is always nice, and quite a "needed requirement" for SQLite.
If the scheduler is working on some tasks, it needs to make a lot of writes to the db, so if that is locked, it can be a major issue for the application.

Let's get to the ideas on the table....

I don't see an improvement in having it activated by default (for the above reasons): developers wanting to use the scheduler are only a few steps away from having it working, but they need to consider the implications.

As for task names having to be predefined: it could save you some typing, but the current method allows you to have a single model with all the functions the application needs, while choosing only some of them to be available to the scheduler.
If this feature is required, could someone be so nice as to give an example of the implementation? Having the function calls as values in a dict is an ultra-valuable way of passing those around. BTW, I didn't see different implementations around.

Jsonifying args and vars by default could be easy with the new "triggers" on the DAL; I like it and hope it's possible. It could break backward compatibility for those already using the scheduler, though, so I think the best solution is to provide a schedule() function.

Importing gluon.scheduler by default... well, I don't see problems with that, but having a Scheduler(db) line around could be redundant (one may not need the table definitions, just like record versioning). Maybe a commented line in the scaffolding model is better.

Having people deal with async tasks without knowing how to handle them could be a problem: taking your example, one needs to code a mail-sending task that watches for errors, requeues on certain errors, removes tasks from the queue after a few tries for other errors (if you don't want to be included super-fast in a spamlist), checks for actual sending success/error if you want to inform the user that the mailbox they provided is unreachable, etc.

The too-much-logging issue is easily fixable. I think the info level should be enough.

Realtime visualization is an issue. For a webcomet server we'd need to include something like tornado in the scheduler and that's hardly multi-platform, and quite a heavy requirement. Having to poll the db for task statuses every n seconds could be done, but we'll face the locking problems described before with SQLite, especially when the worker is processing tasks. Not to mention that the scheduler could be running on another machine than the one "querying" its status, if you're thinking to skip scheduler_* tables on the db....

Stop_time null by default: I'm ok with that; maybe the current defaults are a little too "safe".

Tasks marked as EXPIRED if (times_run < repeats or repeats == 0) and stop_time < now: I'm okay with that.

Michael Toomim

Jun 11, 2012, 11:15:52 PM
to web2py-d...@googlegroups.com
Hi Niphlod, thanks for sharing your experiences and detailed thoughts!

As you requested, here's a patch that makes "predefined tasks" optional! It looks up task functions in the models environment after trying the explicit tasks parameter.

Additionally, it now loads controller-specific subdirectories when loading the models environment. To do that, I made these design decisions:
  • Store the task's controller as 'app/controller' in the "app" field of the db.
     (The alternative would have been to add a new "controller" field to the db.)
  • By default, tasks will preserve the same app/controller environment in which they are created.

Do y'all like it?

--- a/web2py/gluon/scheduler.py
+++ b/web2py/gluon/scheduler.py
@@ -144,16 +144,21 @@ def executor(queue,task):
     try:
         if task.app:
             os.chdir(os.environ['WEB2PY_PATH'])
-            from gluon.shell import env
+            from gluon.shell import env, parse_path_info
             from gluon.dal import BaseAdapter
             from gluon import current
             level = logging.getLogger().getEffectiveLevel()
             logging.getLogger().setLevel(logging.WARN)
-            _env = env(task.app,import_models=True)
+            # Get controller-specific subdirectory if task.app is of
+            # form 'app/controller'
+            (a,c,f) = parse_path_info(task.app)
+            _env = env(a=a,c=c,import_models=True)
             logging.getLogger().setLevel(level)
             scheduler = current._scheduler
-            scheduler_tasks = current._scheduler.tasks
-            _function = scheduler_tasks[task.function]
+            f = task.function
+            # First look for the func in tasks, else look in models
+            _function = current._scheduler.tasks.get(f) or _env.get(f)
+            assert _function, 'Function %s not found in scheduler\'s environment' % f
             globals().update(_env)
             args = loads(task.args)
             vars = loads(task.vars, object_hook=_decode_dict)
@@ -312,6 +317,7 @@ class Scheduler(MetaScheduler):
 
     def define_tables(self,db,migrate):
         from gluon import current
+        from gluon.dal import DEFAULT
         logging.debug('defining tables (migrate=%s)' % migrate)
         now = datetime.datetime.now()
         db.define_table(
@@ -323,7 +329,8 @@ class Scheduler(MetaScheduler):
             Field('status',requires=IS_IN_SET(TASK_STATUS),
                   default=QUEUED,writable=False),
             Field('function_name',
-                  requires=IS_IN_SET(sorted(self.tasks.keys()))),
+                  requires=IS_IN_SET(sorted(self.tasks.keys())) \
+                      if self.tasks else DEFAULT),
             Field('args','text',default='[]',requires=TYPE(list)),
             Field('vars','text',default='{}',requires=TYPE(dict)),
             Field('enabled','boolean',default=True),
@@ -338,7 +345,9 @@ class Scheduler(MetaScheduler):
             Field('assigned_worker_name',default='',writable=False),
             migrate=migrate,format='%(task_name)s')
         if hasattr(current,'request'):
-            db.scheduler_task.application_name.default=current.request.application
+            db.scheduler_task.application_name.default = \
+                '%s/%s' % (current.request.application,
+                           current.request.controller)
 
         db.define_table(


--
mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/

Michael Toomim

Jun 11, 2012, 11:47:39 PM
to web2py-d...@googlegroups.com
Ah, I think you're right that the scheduler cannot be on by default—that breaks on sqlite, which totally ruins the first-time experience I'm hoping to improve! This is a blocker until we have threadsafe sqlite.

However, I would love to make the scheduler easier to launch. How about a web2py command-line argument that spawns any number of workers, and kills them on quit? Perhaps "python web2py.py -X a1/c1,a2/c2,a3/c3" would spawn three workers alongside the webserver, with three models environments a1/c1, a2/c2, and a3/c3.
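A rough sketch of what such a flag could do (entirely hypothetical; web2py has no -X flag today, and the sketch reuses the existing -K worker flag; the cmd parameter exists only to make the sketch testable):

```python
import atexit
import subprocess
import sys

def spawn_workers(specs, cmd=None):
    # specs like ['a1/c1', 'a2/c2']; by default each worker would run
    # 'python web2py.py -K <spec>'. cmd overrides the command prefix.
    prefix = cmd or [sys.executable, 'web2py.py', '-K']
    workers = [subprocess.Popen(prefix + [spec]) for spec in specs]
    # best effort: terminate the workers when the main process quits
    atexit.register(lambda: [w.terminate() for w in workers])
    return workers
```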

I agree with everything else you've said here. However, I do not think we should be worried about programmers getting into async tasks before they have the "chops"... I think our job is to make async tasks easier for everyone.


Niphlod

Jun 12, 2012, 4:43:29 AM
to web2py-d...@googlegroups.com
I like the implementation: in Italy it's morning, this evening I'll review it.
BTW: scheduler on sqlite is fine if on a separate db and workers < 3 . I'm "against" turning it on by default because I think that's not so "useful" for everyone, just like for record versioning.

Niphlod

Jun 13, 2012, 4:29:14 PM
to web2py-d...@googlegroups.com
Done reviewing... there's a potential problem: having the scheduler watch for attributes in _env is potentially dangerous. Even protecting with callable(_env.get(f)), one could still insert as a task_function any of the callable things web2py injects into the environment: validators, T, cache, crud, auth, etc.

Should we check that the task_function is a callable AND a task function defined in the app and not at the "env" ("gluon") level ?

PS: here's the patch against trunk for managing expired tasks. Please test it

@@ -96,6 +96,7 @@
 ACTIVE = 'ACTIVE'
 INACTIVE = 'INACTIVE'
 DISABLED = 'DISABLED'
+EXPIRED = 'EXPIRED'
 SECONDS = 1
 HEARTBEAT = 3*SECONDS

@@ -293,7 +294,7 @@
             self.die()


-TASK_STATUS = (QUEUED, RUNNING, COMPLETED, FAILED, TIMEOUT, STOPPED)
+TASK_STATUS = (QUEUED, RUNNING, COMPLETED, FAILED, TIMEOUT, STOPPED, EXPIRED)
 RUN_STATUS = (RUNNING, COMPLETED, FAILED, TIMEOUT, STOPPED)
 WORKER_STATUS = (ACTIVE,INACTIVE,DISABLED)

@@ -361,7 +362,7 @@
             Field('enabled','boolean',default=True),
             Field('start_time','datetime',default=now),
             Field('next_run_time','datetime',default=now),
-            Field('stop_time','datetime',default=now+datetime.timedelta(days=1)),
+            Field('stop_time','datetime'),
             Field('repeats','integer',default=1,comment="0=unlimted"),
             Field('period','integer',default=60,comment='seconds'),
             Field('timeout','integer',default=60,comment='seconds'),
@@ -454,7 +455,8 @@
             run_id = run_id,
             run_again = run_again,
             next_run_time = next_run_time,
-            times_run = times_run)
+            times_run = times_run,
+            stop_time = task.stop_time)

     def report_task(self,task,task_report):
         logging.debug(' recording task report in db (%s)' % task_report.status)
@@ -465,8 +467,11 @@
             result = task_report.result,
             output = task_report.output,
             traceback = task_report.tb)
+        is_expired = task.stop_time and task.next_run_time > task.stop_time and True or False
+        status = (task.run_again and is_expired and EXPIRED
+                  or task.run_again and not is_expired and QUEUED or COMPLETED)
         if task_report.status == COMPLETED:
-            d = dict(status = task.run_again and QUEUED or COMPLETED,
+            d = dict(status = status,
                      next_run_time = task.next_run_time,
                      times_run = task.times_run)
                     #I'd like to know who worked my task, reviewing some logs...
@@ -539,14 +544,17 @@
         now = datetime.datetime.now()
         all_workers = db(sw.id>0).select()
         workers = [row.worker_name for row in all_workers]
+        #set queued tasks that expired between "runs" (i.e., you turned off
+        #the scheduler, and then it wasn't expired, but now it is)
+        db(ts.status.belongs((QUEUED,ASSIGNED)))(ts.stop_time<now).update(status=EXPIRED)
+
         all_available = db(ts.status.belongs((QUEUED,ASSIGNED)))\
                 ((ts.times_run<ts.repeats)|(ts.repeats==0))\
                 (ts.start_time<=now)\
-                (ts.stop_time>now)\
+                ((ts.stop_time>now) | (ts.stop_time == None))\
                 (ts.next_run_time<=now)\
                 (ts.enabled==True)\
-                (ts.group_name.belongs(self.group_names)) #\
-                #(ts.assigned_worker_name <> self.worker_name)
+                (ts.group_name.belongs(self.group_names))
         limit = len(workers) * 50
         #if there are a moltitude of tasks, let's assign a maximum of 50 tasks per worker.
         #this can be adjusted with some added intelligence (like esteeming how many tasks will



pbreit

Jun 13, 2012, 10:26:41 PM
to web2py-d...@googlegroups.com
So is the scheduler trying to address both situations 1) performing individual tasks at a later time (frequently immediately, like sending an email) and 2) performing repetitive tasks at certain times (like cron)?

I think both of these are extremely important and I have not been that successful implementing them. Cron probably makes the most sense for #2. I know there are solutions for #1 like Celery and ZeroMQ but I had no luck whatsoever getting them going.

I'd really like a good out-of-the-box solution for #1 such as what Rails has with Delayed Job. Maybe scheduler is that thing? For repetitive tasks, I'll probably stick with cron.

Massimo DiPierro

Jun 13, 2012, 10:49:30 PM
to web2py-d...@googlegroups.com
The problem with cron is that it starts a repetitive task even if the previous instance is still running or has locked a resource, thus causing spikes in resource usage. The scheduler keeps resource consumption constant: it skips a task if the previous instance is not completed.

Niphlod

Jun 14, 2012, 7:53:13 PM
to web2py-d...@googlegroups.com
@Massimo, if I understood the code correctly, the executor function recreates the env on each function call. For heavy apps this could be a hassle... basically the scheduler does what could be done by placing the functions in a controller and calling them as web2py.py -M -N -S app/thecontroller/thefunction... sort of a cron on steroids. Obviously it does much more (manages the task queue, schedules times, handles repeats, saves the traceback, terminates after a timeout, and so on...).

I noticed this using cache.ram in my modules and observing that the cache gets destroyed and recreated for every task...

Was this required when building the scheduler in the first place, or did you encounter any issues?

Don't get me wrong, I like the scheduler, but... the scheduler should behave like a web2py process, loading modules only once, with the only difference being that it executes functions with no "forced" timeout.

Can the env be created only when scheduler starts (the main thread) and passed along to the executor function?

It would be more lightweight (e.g. could use cache.ram reliably, load all the modules only once, etc). If I understand correctly now you can leave the scheduler running, change models, and those changes would be reflected on the first task executed after saving the models. You have to restart the scheduler only to assign new functions....
Again, if I understood correctly all the roundtrips, this behaviour is nice for developing, but maybe resources could be saved for long/frequent scheduler processes.

PS: I get that a multiprocessing environment is heavier than a threaded one, and that it is more parallelizable, but having the modules reloaded each time with no possibility to cache anything... sounds bad. After all, a user who needs to send emails from a heavy app could, as a shortcut, expose the scheduler functions over xmlrpc, start a local web2py process with high timeouts, and call it to process the tasks... after all, rocket is like a pool of workers (yes, threaded, but I'm not sure many web2py users need to process Fibonacci series...).

Niphlod

Jun 15, 2012, 5:30:36 PM
to web2py-d...@googlegroups.com
Please stop me if I'm getting all wrong.
Every task gets processed in a "fresh" process, which "must" be started and killed in order to use the timeout facilities (the logic is expressed in the async function).

Further elaboration on the claim that loading the env only once would be smarter than the current behaviour:
I tried to put a time.sleep(10) in a model and watched every task hang for 10 seconds before getting processed, so that's another clue that what I said was right.
I modified the scheduler to create the env in the loop() function, just before the actual loop (while True and self.have_heartbeat), passing it to self.async, which in turn passes it to executor.
Now the scheduler hangs before the first loop only, and tasks get executed without hangs, so I'm fine with that!

But... a great big "but". I thought I had won this battle, but even though printing env.get('cache') inside "executor" returns the same instance (same identifier, as in <gluon.cache.Cache object at 0x17b6110>), my tasks seem to pick up another cache. I.e., defining a function like

def mytest():
    return cache.ram('theinvaluablekey', lambda : time.ctime(), time_expire=1500)

the returned value is never cached (I think it actually gets cached and then trashed on process exit).

Am I missing something ?


Niphlod

unread,
Jun 17, 2012, 2:10:05 PM6/17/12
to web2py-d...@googlegroups.com
ok, I caught the error: you can pass the env to the executor function, and it eventually gets modified by the task, but there is no way to pass the modified env back (env cannot be safely pickled).

@Massimo: my patch creates the env for the app before the first loop instead of on every loop: that should reduce execution times for apps that run lightweight tasks (like sending emails) but have "huge" models. If you don't see downsides I'll refactor the scheduler and send a patch against current trunk.

@anyone: if you see any way to make the scheduler essentially a web2py instance with everything loaded at start, processing tasks by just executing functions, you're more than welcome to share your ideas :D

Michael Toomim

unread,
Jun 17, 2012, 7:39:35 PM6/17/12
to web2py-d...@googlegroups.com
Ah, I like this two-part classification! This perfectly captures how I see the main beneficial use-cases of the scheduler.

My effort in using the scheduler for repetitive tasks (which I recently posted to web2py-users) is for #2. I want to find a good recipe for it. Any ideas there? This should go in the book.

Michael Toomim

unread,
Jun 17, 2012, 7:55:49 PM6/17/12
to web2py-d...@googlegroups.com
@Niphlod, Perhaps cache.ram() is a different issue than reloading models and modules. It would be nice for cache.ram() to work, but I also like reloading models for each task. These scheduler tasks run on the order of seconds, not milliseconds, so setting up a new environment for each task seems like no big deal to me. And it is better for development, and makes the scheduler work more like the rest of web2py. If loading models is fast enough for a HTTP request, shouldn't it be fast enough for running a scheduler task?

It sounds like the challenge is to get all subprocesses to share the same cache.ram(). Can we not just pass the cache object as an argument to executor(), within the call to multiprocessing.Process() in scheduler.py?

Niphlod

unread,
Jun 18, 2012, 4:12:00 AM6/18/12
to web2py-d...@googlegroups.com
we can (and in fact I did) pass something to the executor function. The env gets modified (e.g. by using cache.ram), but then you have to pass the modified env back to the main process..... the only way is to put it in the queue, but the whole env is not safely picklable, so it's a no-go.
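A two-line illustration of that pickling wall (using a stand-in env, not the real web2py environment): as soon as the dict holds something like a function or an open connection, the pickle step that a multiprocessing queue performs under the hood fails.

```python
import pickle

# a stand-in env: real web2py envs hold DB connections, functions, etc.
env = {'setting': 1, 'hook': lambda: 'not picklable'}

failed = False
try:
    pickle.dumps(env)          # what Queue.put() would do under the hood
except Exception:
    failed = True
print('env picklable:', not failed)   # env picklable: False
```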

I don't know if recreating the env means reloading all the models and modules too. I should test it, and yes, you're right that if an app is fast enough for an http request, then it's usually faster here.

Michael Toomim

unread,
Jun 18, 2012, 10:51:02 AM6/18/12
to web2py-d...@googlegroups.com
I see. The Python multiprocessing module also provides other forms of interprocess communication beyond the Queue, such as shared memory: http://docs.python.org/library/multiprocessing.html#sharing-state-between-processes

However, all interprocess communication methods have weaknesses, and this is starting to sound like a bad idea. The point of using multiple processes is to avoid shared memory, and the point of cache.ram is to share memory. If caching is needed in the scheduler, perhaps people should use cache.disk and ensure their objects are serializable.

Niphlod

unread,
Jun 18, 2012, 11:06:49 AM6/18/12
to web2py-d...@googlegroups.com
When I said "it's a no-go" I meant the same thing. cache.disk, cache.memcached, or cache.redis should be used if something heavy is going on "underneath".

Michael Toomim

unread,
Jul 7, 2012, 4:09:43 AM7/7/12
to web2py-d...@googlegroups.com
Niphlod, I disagree. Can you provide a scenario where this danger exists?

Maybe it appears dangerous to execute a function from the environment—but we were always executing functions from the environment. This just removes the need to define the function in multiple places.

Niphlod

unread,
Jul 7, 2012, 4:35:03 PM7/7/12
to web2py-d...@googlegroups.com
well, you're right; after all it's a "one-line" modification, and all I was asking for was some guidance.
I'm working hard on making the scheduler more flexible and manageable (as stated in the other post on web2py-developers, there's a fix for your problem too), and on making an example app to help users feel more comfortable using it. As soon as I finish this round of patches and the app, I'll release both the app and the new scheduler, and maybe open a poll on web2py-users for the "most-wanted" missing features.
Let's see if anyone has something to say in the next few days about being able to schedule every callable in the environment. If anyone comes up with the same need (here or on web2py-users after the new patch) I'll include it.

Massimo DiPierro

unread,
Jul 7, 2012, 5:56:40 PM7/7/12
to web2py-d...@googlegroups.com
I had missed this discussion and committed the change proposed by Michael this morning, as his patch was posted in a googlecode issue.

I agree with Michael and you that there is no harm in this patch. Perhaps, though, there is some negative effect I have not thought of.

Do you think it should be removed?

Massimo

Niphlod

unread,
Jul 8, 2012, 11:20:43 AM7/8/12
to web2py-d...@googlegroups.com
As I said before, I would have waited for someone to show up and say something about it.....
You're the boss :-P

I'll merge this with my patch and write docs about it.

Niphlod

unread,
Jul 8, 2012, 11:43:23 AM7/8/12
to web2py-d...@googlegroups.com

Michael Toomim

unread,
Jul 8, 2012, 2:06:27 PM7/8/12
to web2py-d...@googlegroups.com
Oh good eye! That was my fault... the line got cut off in my terminal. Here's the full line:

            assert _function, 'Function %s not found in scheduler\'s environment' % f

Michael Toomim

unread,
Jul 8, 2012, 2:08:08 PM7/8/12
to web2py-d...@googlegroups.com
Ok, thanks for the explanation! Sorry for my harsh tone earlier, I misunderstood your position.

I had an idea about how to solve my scheduler recurring-tasks problem... I will have to look into your updates and see how to merge this idea.

Niphlod

unread,
Jul 8, 2012, 7:21:04 PM7/8/12
to web2py-d...@googlegroups.com

Well, writing tests reeeealllly sucks. But while writing them I was able to resolve a few quirks here and there, so I guess the time was well spent.

@Michael, look into scheduler_task.uuid. It's declared as "unique", so if you set a unique identifier for your recurring task in a try/except clause, your locking problems should be resolved without further issues.
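That pattern can be sketched with plain sqlite3 (the table and column names here just mirror the scheduler's; this is not web2py/DAL code): the UNIQUE constraint on uuid turns re-queuing the same recurring task into a harmless no-op caught by try/except, so no explicit locking is needed.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE scheduler_task '
           '(id INTEGER PRIMARY KEY, task_name TEXT, uuid TEXT UNIQUE)')

def queue_once(task_name, uuid):
    try:
        db.execute('INSERT INTO scheduler_task (task_name, uuid) VALUES (?, ?)',
                   (task_name, uuid))
        return True           # first time: task queued
    except sqlite3.IntegrityError:
        return False          # already queued: duplicate silently ignored

print(queue_once('send_digest', 'digest-2012-07-08'))   # True
print(queue_once('send_digest', 'digest-2012-07-08'))   # False
```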

@all: please test the app in the zip, substituting the provided scheduler.py. The app itself is poorly coded, but for now it's ok. If no one finds anything wrong with it (or with the new scheduler), I'd say we:
- push the scheduler code into web2py trunk
- post the app on web2py-users and on github
- open a poll of "most-wanted" features

Then I (or others) implement those, update the app/docs, test, and reiterate. Once the features are more stable, it should be a breeze to integrate the docs into the book (they are in MARKMIN already)

Things I'm not sure about in the scheduler:
- writing pidfiles somewhere
- logging levels of the various messages (die, terminate, new tasks, "TICKER", etc.)
- auto-removing KILL and TERMINATE workers from the scheduler_worker table before exiting (right now they're cleaned up by other ACTIVE workers)

If this behaviour tests out correctly, nice things to have would be:
- a more flexible way to start the scheduler in "embedded mode", i.e. some parameters passed to web2py.py that go into the scheduler instantiation (e.g. max_empty_runs)
- some options (or scripts) to disable, terminate, or kill workers too.

The more testers the merrier: I was able to test it on 2 Linux boxes (sqlite and postgres) and 1 Windows box (sqlite only). All tests passed.
scheduler_tests.zip

Michael Toomim

unread,
Jul 9, 2012, 4:24:24 PM7/9/12
to web2py-d...@googlegroups.com
I got this error... but have been unable to reproduce it. It occurred when adding the uuid column. Any ideas?

I thought "perhaps UNIQUE isn't supported by sqlite" — but it totally seemed to work after I restarted the server. Perhaps this is a fluke, or perhaps it's something to look into.

<class 'sqlite3.OperationalError'> Cannot add a UNIQUE column

Traceback (most recent call last):
  File "/home/toomim/projects/utility/web2py/gluon/restricted.py", line 205, in restricted
    exec ccode in environment
  File "/home/toomim/projects/utility/web2py/applications/scheduler_tests/models/scheduler.py", line 28, in <module>
    scheduler = Scheduler(db)
  File "/home/toomim/projects/utility/web2py/gluon/scheduler.py", line 364, in __init__
    self.define_tables(db,migrate=migrate)
  File "/home/toomim/projects/utility/web2py/gluon/scheduler.py", line 398, in define_tables
    migrate=migrate,format='%(task_name)s')
  File "/home/toomim/projects/utility/web2py/gluon/dal.py", line 6320, in define_table
    polymodel=polymodel)
  File "/home/toomim/projects/utility/web2py/gluon/dal.py", line 742, in create_table
    fake_migrate=fake_migrate)
  File "/home/toomim/projects/utility/web2py/gluon/dal.py", line 830, in migrate_table
    self.execute(sub_query)
  File "/home/toomim/projects/utility/web2py/gluon/dal.py", line 1392, in execute
    return self.log_execute(*a, **b)
  File "/home/toomim/projects/utility/web2py/gluon/dal.py", line 1386, in log_execute
    ret = self.cursor.execute(*a, **b)
OperationalError: Cannot add a UNIQUE column

Michael Toomim

unread,
Jul 9, 2012, 7:44:03 PM7/9/12
to web2py-d...@googlegroups.com
Other than that, all the tests completed successfully for me. I'm using an Ubuntu VM.

I'm going to try using this scheduler.py in my main project now.

Michael Toomim

unread,
Jul 9, 2012, 7:52:59 PM7/9/12
to web2py-d...@googlegroups.com
Niphlod, you've done some good work here! The scheduler is maturing; I can see the new features coming together into a coherent whole.

As for an "introduction to the scheduler" app, I think the examples you've posted here are very helpful. I learned new features just from the examples. And in fact, as a first-time user, this is what I would want to see first. I would take the example and copy-paste to fit my situation.

The English at the beginning is a bit difficult to read, though. I suggest a few things:
  1) Put the examples first, and the detailed explanation with state-charts etc. second.
  2) Add a couple of frequent "real-world" scenarios, like "here's how to send delayed email."
      This is useful to show people just how simple it can be. Like the Rails "make a blog" video. :)
  3) Improve the language & writing style in the detailed explanation section

Niphlod

unread,
Jul 10, 2012, 12:35:04 AM7/10/12
to web2py-d...@googlegroups.com
ok, going in order:
- SQLite error: probably because the scheduler_task table was not empty. I don't know precisely how migrations work in detail, but a unique column added to an existing table needs to be populated with unique values before the unique index is created; the migration probably added the column and created the unique index before filling it with unique values. Anyway, it's unlikely to happen with newly created tables and/or an empty scheduler_task table.
- English is certainly not my mother tongue, and I'm open to all suggestions; I'll make those modifications right away, and feel free to suggest grammar corrections. I'll try to rewrite all of it.
- I'll reduce the period for repeating tasks to get a smaller time window, so there's less waiting in front of the screen.
- I'll watch that video and see if there are better-fitting examples; delayed email sending looks good :-P
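For reference, the SQLite error above can be reproduced in isolation with plain sqlite3 (the table name just mirrors the traceback; no DAL involved): ALTER TABLE ... ADD COLUMN rejects a UNIQUE constraint outright, and the two-step route of adding a plain column and then creating a unique index is what does work.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE scheduler_task (id INTEGER PRIMARY KEY)')

error = None
try:
    db.execute('ALTER TABLE scheduler_task ADD COLUMN uuid TEXT UNIQUE')
except sqlite3.OperationalError as e:
    error = str(e)
print(error)   # Cannot add a UNIQUE column

# the workaround a migration could use: plain column first, unique index after
db.execute('ALTER TABLE scheduler_task ADD COLUMN uuid TEXT')
db.execute('CREATE UNIQUE INDEX sched_task_uuid ON scheduler_task (uuid)')
```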
