how to manage db connections on long running tasks

4,214 views
Skip to first unread message

Paolo Valleri

unread,
Feb 19, 2015, 1:03:33 PM2/19/15
to web2py-d...@googlegroups.com
Hi all, 
using pydal, given this example:
import time
from pydal import DAL, Field

db = DAL('postgres:://web2py:web2py@localhost:5433/test_w2p')
db.define_table('thing',Field('name'))
db.thing.insert(name='Chair')
db.commit()

while True:
    print db(db.thing).count()
    time.sleep(60)
If for example the db is restarted while the application is sleeping, as soon as the application runs the query, it will get, something like:
psycopg2.OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly

In this case I need a kind of (auto-)reconnect transparent method, namely lets try n-times to reconnect and then raise the exception.
Actually I'd avoid to wrap the above code in a try/except statement because I don't want to fill the code with:
....
    try:
        print db(db.thing).count()
    except:
        db._adapter.reconnect()
        print db(db.thing).count()

which duplicates all pydal calls with the result of very hard to maintain code
Is there a way to implement a transparent auto-reconnect ?

Paolo

Michele Comitini

unread,
Feb 19, 2015, 3:51:08 PM2/19/15
to web2py-developers
why keeping the transaction ongoning?

while True:
   print db(db.thing).count() #<- the connection pool is asked for a working connection if there is one already open that will be used
   db.commit() # <- back to the pool but still open without any transaction active
   time.sleep(60)


--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Paolo Valleri

unread,
Feb 19, 2015, 4:50:12 PM2/19/15
to web2py-d...@googlegroups.com
even with db.commit() I got
psycopg2.OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly

To reproduce the error, restart pg while the application is sleeping.

 Paolo

Michele Comitini

unread,
Feb 19, 2015, 5:29:27 PM2/19/15
to web2py-developers
That's a bug!

Paolo Valleri

unread,
Feb 20, 2015, 1:35:42 AM2/20/15
to web2py-d...@googlegroups.com
Is it a bug or a design choice? 
As far as I can see the pool is accessed (with the method reconnect) only when the dbAdapter is instantiated.


 Paolo

Massimo DiPierro

unread,
Feb 20, 2015, 9:28:41 AM2/20/15
to web2py-d...@googlegroups.com
If the database goes does while an action is in execution (and therefore a transaction) we should not reconnect.

Ideally we should try reconnect when a new http request arrives. The problem is this would require performing a db io even if we do not need one.

It was a design choice to let it fail when the database goes down (very rare) and be able to recover at the next http request without having to check the status of the database.

I am open to suggestions for improvement.

Massimo

Paolo Valleri

unread,
Feb 20, 2015, 10:11:52 AM2/20/15
to web2py-d...@googlegroups.com

Hi Massimo,
I've no http requests at all,i forgot to mantion that it's a python script based on pydal.
It' a feature pydal should have but normally not necessary in web2py (maybe for the scheduler?)

In this case, a db io is better than a failure.

Michele Comitini

unread,
Feb 20, 2015, 10:39:04 AM2/20/15
to web2py-developers
No db io to check status is necessary in most cases IMHO.  Eventually problem to be addressed is when having distributed transactions with multiple db on a DAL instance, to do that safely one should use checkpoints I guess.

One can avoid sending 'SELECT 1' or similar NOOP to db.If I recall correctly the transaction begins just before the first statement after a commit()
So as first statement after a transaction begin, just send the user query, if it fails before execution for an IOError, then drop the old connection and try to open a new connection
 and try to execute that same query.  Repeat if needed. 


Massimo DiPierro

unread,
Feb 20, 2015, 11:14:35 AM2/20/15
to web2py-d...@googlegroups.com
This would work.

Michele Comitini

unread,
Feb 20, 2015, 11:39:53 AM2/20/15
to web2py-developers
Afterthought... cant we just rely on the transtaction opening?  That has to be issued in any case.

BEGIN

if that fails -> close connection -> open new -> try to start transaction again
nothing else needed
and that can make things easy even for distributed transaction since the transaction would be in its initial state for any instance...
Too simple to be real?

Massimo DiPierro

unread,
Feb 20, 2015, 12:46:02 PM2/20/15
to web2py-d...@googlegroups.com
but sending a BEGIN requires one more executesql(BEGIN)

Michele Comitini

unread,
Feb 20, 2015, 2:53:31 PM2/20/15
to web2py-developers
How does the DAL start a transaction then?

Massimo DiPierro

unread,
Feb 20, 2015, 3:54:49 PM2/20/15
to web2py-d...@googlegroups.com
Some Engines require a BEGIN and we send it. Most engine start a transaction automatically with the first query and close it on commit.

Paolo Valleri

unread,
Feb 21, 2015, 2:59:11 AM2/21/15
to web2py-d...@googlegroups.com
I've tried the following:
while True:
    try:
        print db(db.thing).count()
        db.commit()
    except Exception as e:
                 try:
            db._adapter.close(action=None)
        except:
            pass
        db._adapter.reconnect()
    print 'Sleeping'
    time.sleep(60)


It works, is it the correct way to implement it? 
'action=None' is necessary because before closing pydal tries to send a commit which is not in a try/except.
In addition, the example catches a general Exception coming from the adapter driver.
The 'connection broken' exception is raised from the driver and pydal doesn't filter/rename/manage them in a single way.
For example, pg8000 raises a error: [Errno 32] Broken pipe, psycopg raises , OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly.

Paolo

Massimo DiPierro

unread,
Feb 21, 2015, 11:18:25 AM2/21/15
to web2py-d...@googlegroups.com
yes this is the correct logic.

Paolo Valleri

unread,
Feb 24, 2015, 11:10:39 AM2/24/15
to web2py-d...@googlegroups.com
Please have a look at this very basic working implementation https://github.com/ilvalle/pydal/tree/auto-reconnect
I tested it using the attached script using pypyodbc with sql server.

Given the following pseudo-code:
exec command1
exec command2     <-- disconnections is detected
db.commit()

Even if I'll re-execute correctly command2, I'll lose command1. 
For this reason I've to remember all executed commands in a transaction (between two consecutive commits), and re-execute all of them in the new transaction after the re-connection.

This option is enabled by default if the driver defines the general OperationalError Exceptions. Unfortunately, even if the driver defines the Exception, there are no guarantees that it will be raised. See: https://github.com/mfenniak/pg8000/issues/73
I've created a PR to fix it on pypyodbc https://github.com/jiangwen365/pypyodbc/pull/18, psycopg seems to work well.

For the moment commit/rollback are not involved by the reconnection since they have a different work-flow in pydal (they call directly drivers' methods).


 Paolo
test_connection.py

Michele Comitini

unread,
Feb 24, 2015, 2:40:25 PM2/24/15
to web2py-developers
That's difficult to handle, but a disconnection in the middle of a short transaction is something really unexpected.  It is good that all the transaction is gone.
Usually a transaction must last the minimum necessary.  For instance:

for q in queries:
   db.executesql(q)
   sleep(60)
   db.commit()

is sure looking for troubles, not only because it could disconnet, but also because the resource locked and the increasing chance of a conflict.
Keeping with the above example the correct way to handle the session would be:

for q in queries:
   db.executesql(q)
   db.commit()
   sleep(60)


Niphlod

unread,
Feb 24, 2015, 3:18:55 PM2/24/15
to
@paolo: I don't get if the previous post is "hey, that's a problem" or "hey, it's fine".

The whole thing NEEDS to behave as you describe it. If I send command1 and I don't commit, I definitely DON'T want command2 to be executed, no matter what. Actually, we should NOT encourage such pattern and if possible make it impossible to do.

if write operations are in order, committing before sleeping is what any sane developer should do, and if he doesn't it's best to discard the whole thing.

Niphlod

unread,
Feb 24, 2015, 3:20:54 PM2/24/15
to web2py-d...@googlegroups.com
tl;dr: is a good thing to make db.commit() look for reconnections in case the socket is down, but we DON'T want any operations (.select(), .update(), whatever) to recreate the connection.

Paolo Valleri

unread,
Feb 24, 2015, 3:45:38 PM2/24/15
to web2py-d...@googlegroups.com
I agree with you both, committing before sleeping is the correct approach. I've just placed a sleep between two commands to let them failing more easily during tests.
@niphlod, if the socket is down while committing, then the transaction is gone and we've two alternatives: either raising the exception to the application or reconnecting and executing all (previous) un-committed commands saved. Currently we have the former approach, we can decide to implement the latter.
The idea I've in mind, and partially implemented, is to let any command (without really knowing what really they are) to try transparently to reconnect to the server. I don't get why you don't want .select, .update and so on to reconnect.

My previous post is: I've implemented the mentioned problem by placing all un-committed commands in a list, and commit() resets that list every time, perhaps is not the best approach.

 Paolo

Michele Comitini

unread,
Feb 24, 2015, 4:01:09 PM2/24/15
to web2py-developers
There is one special case where reconnecting and starting a transaction does not harm, in a transactional flow.
That case is when the FIRST command in the transaction fails because of an IO error.
Caveat: the start of the transaction and the command must be executed without time interval between them.

Niphlod

unread,
Feb 24, 2015, 4:29:00 PM2/24/15
to web2py-d...@googlegroups.com



The idea I've in mind, and partially implemented, is to let any command (without really knowing what really they are) to try transparently to reconnect to the server. I don't get why you don't want .select, .update and so on to reconnect.


because it would be too much magic, and consume potentially a lot of memory (keeping in a list every non-committed insert for, e.g, a lot of blobs would hurt). Frankly, pydal doesn't really have the UnitOfWork concept embedded. My fear is that there are tons of corner-cases...

res = db(db.table.id >0).select()  <--holding 10 lines
---disconnection <-- refetch res
for row in res:
    db(db.table.id == row.id).update(mycounter=db.table.mycounter + 1)
    db.commit()
----disconnection after the 5th <--- ?!?


Paolo Valleri

unread,
Feb 25, 2015, 2:46:21 AM2/25/15
to web2py-d...@googlegroups.com
The memory usage can increase (i.e., when inserting blobs), niphlod is right. However, the memory will be freed as soon as a commit comes. Given that, I'm not sure this is it really the issue.
I've the same fear, for this reason we have this discussion. Is there something missing/not working in the proposed approach?

From niphload's example, running the select again after the first disconnection is useless because all rows are already in res. After the second disconnection no commands are in the cmd-queue, as a result no other queries are executed; the re-connection will come during the update. Do you agree ?

Avoiding to keep in the cmd-queue particular queries I guess can be hard to maintain, since disconnections are more than rare, I don't see running these read-only queries again could hurt that much.

 Paolo

Michele Comitini

unread,
Feb 25, 2015, 4:10:50 AM2/25/15
to web2py-developers
To keep a command queue is up to the user of pydal IMHO.  Something that is out of the scope of pydal itself.
What I'm trying to say is that one can hide the most common cause of IOError, which is an IDLE connection TIMEDOUT.
Its common to leave an IDLE connection for hours, even days, open and thinking that one can start a transaction and  push a command through it right away when needed.
There is no harm in trying to reestablish the connection if the command that fails is the first of a newly started transaction.

Back to the queue of commands it is important to understand that some things do change the state of db even if the transaction is rolled back afterwards. 
Sequences are never rolled back.

mic


Massimo DiPierro

unread,
Feb 25, 2015, 9:28:32 AM2/25/15
to web2py-d...@googlegroups.com
On Feb 25, 2015, at 3:10 AM, Michele Comitini <michele....@gmail.com> wrote:

To keep a command queue is up to the user of pydal IMHO.  Something that is out of the scope of pydal itself.
What I'm trying to say is that one can hide the most common cause of IOError, which is an IDLE connection TIMEDOUT.
Its common to leave an IDLE connection for hours, even days, open and thinking that one can start a transaction and  push a command through it right away when needed.
There is no harm in trying to reestablish the connection if the command that fails is the first of a newly started transaction.

 agree.

Niphlod

unread,
Feb 25, 2015, 10:41:22 AM2/25/15
to web2py-d...@googlegroups.com

From niphload's example, running the select again after the first disconnection is useless because all rows are already in res. After the second disconnection no commands are in the cmd-queue, as a result no other queries are executed; the re-connection will come during the update. Do you agree ?

not really, as that involves the idea that "res" doesn't change between reconnections, while in practice it CAN, as any other involved row. 

tl;dr coming from everybody: you can just reexecute only the first query if it fails: every other logic is not failsafe.

Paolo Valleri

unread,
Feb 26, 2015, 6:27:25 AM2/26/15
to web2py-d...@googlegroups.com
Thanks for the suggestions. I'll update the code in order to re-execute only the first command of a transaction.

 Paolo

--

Niphlod

unread,
Mar 18, 2015, 7:08:38 PM3/18/15
to web2py-d...@googlegroups.com
Don't know what's the actual status (i.e. if you think the close() changes made some improvements), but current pyDAL seems to have serious issues with pg8000.

Massimo DiPierro

unread,
Mar 18, 2015, 8:13:54 PM3/18/15
to web2py-d...@googlegroups.com
The fact is the one we used was a fork created by Mariano Reingart. He had made changes that were never approved upstream. 


On Mar 18, 2015, at 6:08 PM, Niphlod <nip...@gmail.com> wrote:

Don't know what's the actual status (i.e. if you think the close() changes made some improvements), but current pyDAL seems to have serious issues with pg8000.

Paolo Valleri

unread,
Mar 19, 2015, 3:11:52 AM3/19/15
to web2py-d...@googlegroups.com
I'll have a look asap
@niphold how do you think the change on close() affects pg8000 ?
@massimo former upstream releases worked before.

 Paolo

Niphlod

unread,
Mar 19, 2015, 3:19:29 AM3/19/15
to web2py-d...@googlegroups.com


On Thursday, March 19, 2015 at 8:11:52 AM UTC+1, Paolo Valleri wrote:
I'll have a look asap
@niphold how do you think the change on close() affects pg8000 ?

seemed a nice coincidence. Then saw that a newer release of pg8000 may be another one. 
Usually when tests hangs, it's socket-related code that is to blame. Nevertheless, I'll never trust pg8000 with my data handling: too much issues.
Reply all
Reply to author
Forward
0 new messages