web2py Scheduler with Postgres creates "idle in transaction" connections

113 views
Skip to first unread message

stingpan

unread,
Feb 17, 2015, 7:36:12 PM2/17/15
to web...@googlegroups.com
web2py version: 2.9.12-stable
Postgres server: 9.1

I have 3 databases for my application, only one is designated for Scheduler and its tables. What I noticed is that when I start scheduler (using upstart deployment and running "start web2py-scheduler" as root), all 3 databases are immediately connected to. These connections become "idle in transaction" (using ps aux | grep postgres)  which sometimes causes locking problems when I attempt make changes to the databases (either through the app or executing a script).  I'm assuming that Scheduler makes these kinds of connections so we can use DAL when defining the task functions. However the tasks I have set up for Scheduler calls a standalone script that does not rely on web2py or DAL and uses pyscopg2 to access/modify the other 2 databases. Is there a way I can have Scheduler run without holding up the other 2 databases? 

Here's a snippet of my Scheduler set up:

from gluon.scheduler import Scheduler

scheduler_db = DAL(config.get('connection'), pool_size=10, migrate=False)

scheduler = Scheduler(
    scheduler_db,
    heartbeat = 5,
    migrate = '',
    tasks = dict(
        taskA = taskA,
        taskB = taskB
    )
)


Niphlod

unread,
Feb 18, 2015, 4:04:45 PM2/18/15
to web...@googlegroups.com
uhm, the problem is more subtle.........

The main process of the scheduler is like a shell opened on that application....... it needs to reads models to see the scheduler definition. I guess that this means that it will also connect to all the databases defined in models, even if it will effectively send/receive commands only on the scheduler_db.
Those "additional" connections are not needed as a matter of fact, but shouldn't block anything too: they're idle and never used.

Every spawned process (the one that will process the task) needs to execute models, else your task won't be able to access, e.g. "db", and, to be fair, they won't be able to "see" the tasks definitions in the first place. Those connections are used, but the connections are kept open just for the time the task gets processed (the spawned process dies as soon as the task finishes).

Let me check if there is a workaround for the  connections on the main process. I'll get back to you (at most in a few days)

Niphlod

unread,
Feb 18, 2015, 5:09:22 PM2/18/15
to web...@googlegroups.com
uhm^3. The code is quite unfixable as it is (launching scheduler with web2py.py -K appname). Remember that the only thing I'm trying to fix is the main process using idle connections to all databases defined in the models of you app.

However, there's a small unknown trick: the scheduler can be started on its own, and process happily tasks defined in applications, as long as
they are reachable from within the path you're launching it.
The former translates to: if you are on the same path web2py is in, you can use another commandline to launch the scheduler, whose main process will only be aware of the "db_sched" connection.
Small fixes are needed to make the "unknown trick" work again (up until now the "trick" hasn't been tested much) but the working file is here .

You should start as

cd web2py # <<-- path where web2py.py is
python gluon
/scheduler.py -u uri_of_the_db -f folder -L 0 -b 2


where:
- uri_of_the_db is the database uri (i.e. postgresql://....., mind that for sqlite, you should pass the entire relative path, as in sqlite:///applications/appname/databases/storage.sqlite)
- folder is the relative folder where the database tables are (i.e. applications/appname/databases/ )
- the number after -L is the logging level (0 all, 100 nothing)
- the number after -b is the heartbeat in seconds

Please try it and see if the idle connections are still there or not.

Michele Comitini

unread,
Feb 18, 2015, 6:14:14 PM2/18/15
to web...@googlegroups.com
idle connection are one thing and they can be fine.
idle in transaction, usually happens when there is a missing commit and the transaction stays open, delivering all sorts of locking over the long term
usually a commit or rollback before the fork should solve the problem.

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

stingpan

unread,
Feb 18, 2015, 7:27:54 PM2/18/15
to web...@googlegroups.com
I actually get an error using your trick:

Traceback (most recent call last):
  File "gluon/scheduler.py", line 1512, in <module>
    main()
  File "gluon/scheduler.py", line 1506, in main
    utc_time=options.utc_time)
  File "gluon/scheduler.py", line 588, in __init__
    self.define_tables(db, migrate=migrate)
  File "gluon/scheduler.py", line 613, in define_tables
    from pydal.base import DEFAULT
ImportError: No module named pydal.base



Thanks for your help anyway Niphlod. But maybe I wasn't being very clear when I mentioned the connections being "idle in transaction" rather than just "idle". I did find a work-around! I segregated scheduler functions into its own application. A dedicated application for scheduler meant scheduler would only hold a single connection to its own dedicated database. No more locks accessing the other 2 databases so everything works quite well now :)

Niphlod

unread,
Feb 19, 2015, 4:20:59 AM2/19/15
to web...@googlegroups.com
in my environment, I can't see any idle in transaction........ that's why I patched the scheduler and asked to do that test.
Reply all
Reply to author
Forward
0 new messages