Problems with Scheduler/ComfortScheduler Monitor setup on a different server than website


DeanK

Oct 18, 2013, 6:11:33 PM
to web...@googlegroups.com
So I'm working to scale up my web2py-based app a bit.  Part of that was moving the Scheduler to a separate machine.  Since doing this I've been getting some errors and weird behavior, and I could use some insight.  There are multiple issues, but I sort of need to explain it all, so I opted for one post with multiple questions instead of multiple concise posts...sorry.

My configuration:

"production server" - nginx serving my website. db connects to a mysql instance bound to production server network address. scheduler connects to mysql instance running on the "dev/workq server"
"dev/workq server" - nginx serving a copy of the same web2py directory...plan to use as development server if needed as well.  db connects to mysql instance running on production server. scheduler connects to mysql instance bound to dev server network address.

from 0_db.py:
db = DAL('mysql://dev:XXX...@production.server.edu/myapp',
         pool_size=8, check_reserved=['mysql'],
         migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE)



from scheduler.py:
scheduler = Scheduler(DAL('mysql://workq:xxxx@dev.workq.server.edu/myapp',
                          pool_size=8, check_reserved=['mysql'],
                          migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE),
                      heartbeat=2)




My steps, starting from empty MySQL databases on both instances and empty databases/ folders on both servers:
- go to site on the production server - migrate = True, fake_migrate = False -> OK
- go to site on the dev server - migrate = True, fake_migrate = False -> Error - <class 'gluon.contrib.pymysql.err.InternalError'> (1050, u"Table 'auth_user' already exists")
- go to site on the dev server - migrate = False, fake_migrate = True -> OK
- start scheduler task -> OK

Am I starting this all up improperly?  I'm a little confused since I've got two web2py instances talking to different db instances for the web app and the scheduler...but I think having to do a fake migrate on the second server makes sense.


So now I think my website is up and running properly.  I then run a function that schedules a job.  The job seems to run (judging by the ComfortScheduler monitor), but it's supposed to schedule additional jobs itself, and that never happens.


inside the single task that is scheduled and runs:

for items in my_thing:
    # ...do stuff

    # Submit tasks to process more stuff
    print "submitting job \n"
    scheduler.queue_task(my_task2, timeout=60000,
                         pvars=dict(arg1=arg1, arg2=arg2))

    db(db.collections.name == collection).update(last_id=last_id)
    db.commit()



If I try to view the details of the task in the ComfortScheduler monitor (from my production server) by clicking on the UUID link of the task, I get an error:

<type 'exceptions.AttributeError'> 'DAL' object has no attribute 'scheduler_task'


Traceback (most recent call last):
File "/home/www-data/web2py/gluon/restricted.py", line 217, in restricted
exec ccode in environment
File "/home/www-data/web2py/applications/parity/views/plugin_cs_monitor/task_details.html", line 111, in <module>
File "/home/www-data/web2py/gluon/dal.py", line 8041, in __getattr__
return ogetattr(self, key)
AttributeError: 'DAL' object has no attribute 'scheduler_task'



I think the problem here is that the ComfortScheduler code looks at the global db object, maybe?  Since I've passed a different database into Scheduler(), is that breaking things?  This might be a question only niphlod can answer since it's his app....


So, ignoring that for now: if I go and look in the table I can see the run_output "submitting job submitting job submitting job submitting job", indicating the task got to the point in the code where it should have submitted more tasks.  Any idea why the new tasks would not be getting scheduled?  I think it might be because I'm calling db.commit() in my task...but that's on my main web2py db, not the db the scheduler is using.  Can I have two global db objects?  Should the scheduler be set up more like this:

sched_db = DAL('mysql://workq:xxxx@dev.workq.server.edu/myapp',
               pool_size=8, check_reserved=['mysql'],
               migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE)
scheduler = Scheduler(sched_db, heartbeat=2)


and then I'd have to call both

db.commit()
sched_db.commit()

Anyone have a server config like this before?

Long message...lots of questions...sorry and thanks.

Dean




Niphlod

Oct 19, 2013, 8:39:07 AM
to web...@googlegroups.com
let's go in order...
 
"production server" - nginx serving my website. db connects to a mysql instance bound to production server network address. scheduler connects to mysql instance running on the "dev/workq server"
"dev/workq server" - nginx serving a copy of the same web2py directory...plan to use as development server if needed as well.  db connects to mysql instance running on production server. scheduler connects to mysql instance bound to dev server network address.

so you have two databases, one on the prod and one on the dev.
The db for the webapp is always pointing to the prod server (regardless of who is running the webapp) and the db for the scheduler is always pointing to the dev one (regardless of who is running the scheduler), correct?
 

from 0_db.py:
db = DAL('mysql://dev:XXXXXX@production.server.edu/myapp',
         pool_size=8, check_reserved=['mysql'],
         migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE)



from scheduler.py:
scheduler = Scheduler(DAL('mysql://workq:xxxx@dev.workq.server.edu/myapp',
                          pool_size=8, check_reserved=['mysql'],
                          migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE),
                      heartbeat=2)



 
this seems right, although I'd have preferred a simple
db2 = DAL('mysql://dev.workq.server....')
sched = Scheduler(db2)
BTW: Scheduler has a migrate argument of its own, just as Auth does.
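Something along these lines (a sketch reusing your connection string; db2 is just an illustrative name, and the migrate flag here is the Scheduler's own, not DAL's):

db2 = DAL('mysql://workq:xxxx@dev.workq.server.edu/myapp',
          pool_size=8, check_reserved=['mysql'])
scheduler = Scheduler(db2, heartbeat=2, migrate=ENABLE_MIGRATE)

Having the scheduler's DAL in its own named variable also lets you commit on it explicitly, which matters below.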
 

My steps, starting from empty MySQL databases on both instances and empty databases/ folders on both servers:
- go to site on the production server - migrate = True, fake_migrate = False -> OK
- go to site on the dev server - migrate = True, fake_migrate = False -> Error - <class 'gluon.contrib.pymysql.err.InternalError'> (1050, u"Table 'auth_user' already exists")
 
this is expected. dev has no .table files, but the tables were already created by the prod app. The next step is the correct way to fix the issue. Assuming the dev app has a notion of the scheduler, the scheduler tables were created by the prod app too.

- go to site on the dev server - migrate = False, fake_migrate = True -> OK
- start scheduler task -> OK

Am I starting this all up improperly?  I'm a little confused since I've got two web2py instances talking to different db instances for the web app and the scheduler...but I think having to do a fake migrate on the second server makes sense.

Makes perfect sense. BTW, once the tables are created you **should** set migrate to False to avoid unnecessary overhead.
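For example, assuming ENABLE_MIGRATE and ENABLE_FAKE_MIGRATE are plain module-level constants in your models (adapt to however you actually define them), the settled setup would look like:

# 0_db.py -- once the schema is stable on both servers
ENABLE_MIGRATE = False       # no schema changes expected
ENABLE_FAKE_MIGRATE = False  # set True only once, to rebuild missing .table files

db = DAL('mysql://dev:XXX...@production.server.edu/myapp',
         pool_size=8, check_reserved=['mysql'],
         migrate=ENABLE_MIGRATE, fake_migrate_all=ENABLE_FAKE_MIGRATE)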
 

So now I think my website is up and running properly.  I then run a function that schedules a job.  The job seems to run (judging by the ComfortScheduler monitor), but it's supposed to schedule additional jobs itself, and that never happens.


inside the single task that is scheduled and runs:

for items in my_thing:
    # ...do stuff

    # Submit tasks to process more stuff
    print "submitting job \n"
    scheduler.queue_task(my_task2, timeout=60000,
                         pvars=dict(arg1=arg1, arg2=arg2))

    db(db.collections.name == collection).update(last_id=last_id)
    db.commit()


 
Here lies the error. Inside the task you're committing on the "main" webapp db, not on the "scheduler" one. Assuming you did as I suggested above (a separate db2 DAL instance), you also need to call db2.commit(), otherwise the task will run fine but the new records will never be committed to the scheduler_task table.
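In your loop that would look roughly like this (a sketch keeping your own names; db2 is the scheduler's DAL instance):

for items in my_thing:
    # ...do stuff

    # queue the follow-up task on the scheduler's own db...
    scheduler.queue_task(my_task2, timeout=60000,
                         pvars=dict(arg1=arg1, arg2=arg2))
    # ...and commit on that connection, or the scheduler_task row never persists
    db2.commit()

    # bookkeeping on the main webapp db, committed separately
    db(db.collections.name == collection).update(last_id=last_id)
    db.commit()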
 

If I try to view the details of the task in the ComfortScheduler monitor (from my production server) by clicking on the UUID link of the task, I get an error:

<type 'exceptions.AttributeError'> 'DAL' object has no attribute 'scheduler_task'


This shouldn't happen. ComfortScheduler retrieves the db of the scheduler from the scheduler instance itself.

s = current._scheduler
dbs = s.db
st = dbs.scheduler_task
sw = dbs.scheduler_worker
sr = dbs.scheduler_run
 
However, ComfortScheduler has been battle-tested only by me (AFAIK) and there could be errors.... The main "strange thing" here is that if the page successfully generated the link for the task, it should access the task in exactly the same way.
Can you provide more details and/or a screenshot? It may very well be a bug in the plugin's code.


Anyone have a server config like this before?

Not precisely the same, but effectively the same. Having different DAL instances is totally supported. Even in w2p_tvseries there are two dbs, one for the webapp and one for the scheduler only. It works like a charm, as intended.


Niphlod

Oct 19, 2013, 9:08:13 AM
to web...@googlegroups.com
BTW, found the issue with ComfortScheduler. Will update as soon as GitHub comes back.

[edit] Done

DeanK

Oct 21, 2013, 10:26:54 AM
to web...@googlegroups.com
Awesome, thanks.  Everything is working now.

I've used the scheduler monitor a decent amount and it has worked pretty well for me. This is the first bug I've noticed, but if I find more I'll be sure to let you know.  The only tweak I've made: the output from my tasks contains lots of debug information when things go wrong, and it was hard to interpret because newline characters were not being displayed in the table.  I simply edited line 29 in run_details.html:

<td>{{=XML(r_.run_output.replace('\n', '<br />'))}}</td>

It's not the prettiest thing in the world. If there are lots of newline characters it makes the output look almost double-spaced, but it is useful and does the trick.


Thanks!

Niphlod

Oct 21, 2013, 11:24:29 AM
to web...@googlegroups.com
uhm, what if you do

<td>{{=PRE(r_.run_output)}}</td>

Without an example it's hard to guess what the correct "style" should be.

DeanK

Oct 28, 2013, 2:14:06 PM
to web...@googlegroups.com
That works well too.  It looks a little better so I'm using it.  Thanks!