[PATCH] Proposal: connection pooling with psycopg2

77 views
Skip to first unread message

jothan

unread,
Nov 16, 2008, 2:57:36 PM11/16/08
to Django developers
PsycoPg2 with a remote DB over a WAN is deadly slow. Since PsycoPg2
supports connection pooling, I have decided to patch the Django DB
adapter to use this. I could see this implemented as below (this is a
very rough draft). For example, min and max connections should not be
hardcoded.

Let me know what you think,
Jonathan

diff -uprN /usr/share/python-support/python-django/django/db/backends/
postgresql_psycopg2/base.py psycopg2_pool/base.py
--- /usr/share/python-support/python-django/django/db/backends/
postgresql_psycopg2/base.py 2008-08-28 02:49:00.000000000 -0400
+++ psycopg2_pool/base.py 2008-11-16 14:51:02.465706723 -0500
@@ -15,6 +15,7 @@ from django.utils.safestring import Safe
try:
import psycopg2 as Database
import psycopg2.extensions
+ from psycopg2.pool import PersistentConnectionPool as PoolClass
except ImportError, e:
from django.core.exceptions import ImproperlyConfigured
raise ImproperlyConfigured("Error loading psycopg2 module: %s" %
e)
@@ -25,6 +26,23 @@ IntegrityError = Database.IntegrityError
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_adapter(SafeString,
psycopg2.extensions.QuotedString)
psycopg2.extensions.register_adapter(SafeUnicode,
psycopg2.extensions.QuotedString)
+pools = {}
+
+def getconn(conn_str, opts):
+ try:
+ pool = pools[conn_str];
+ except KeyError:
+ pool = pools[conn_str] = PoolClass(3, 5, conn_str)
+
+ return pool.getconn()
+
+def putconn(conn_str, conn):
+ try:
+ pool = pools[conn_str];
+ except KeyError:
+ return
+
+ return pool.putconn(conn)

class DatabaseFeatures(BaseDatabaseFeatures):
needs_datetime_string_cast = False
@@ -72,16 +90,16 @@ class DatabaseWrapper(BaseDatabaseWrappe
if settings.DATABASE_NAME == '':
from django.core.exceptions import
ImproperlyConfigured
raise ImproperlyConfigured("You need to specify
DATABASE_NAME in your Django settings file.")
- conn_string = "dbname=%s" % settings.DATABASE_NAME
+ self.conn_string = "dbname=%s" % settings.DATABASE_NAME
if settings.DATABASE_USER:
- conn_string = "user=%s %s" % (settings.DATABASE_USER,
conn_string)
+ self.conn_string = "user=%s %s" %
(settings.DATABASE_USER, self.conn_string)
if settings.DATABASE_PASSWORD:
- conn_string += " password='%s'" %
settings.DATABASE_PASSWORD
+ self.conn_string += " password='%s'" %
settings.DATABASE_PASSWORD
if settings.DATABASE_HOST:
- conn_string += " host=%s" % settings.DATABASE_HOST
+ self.conn_string += " host=%s" %
settings.DATABASE_HOST
if settings.DATABASE_PORT:
- conn_string += " port=%s" % settings.DATABASE_PORT
- self.connection = Database.connect(conn_string,
**self.options)
+ self.conn_string += " port=%s" %
settings.DATABASE_PORT
+ self.connection = getconn(self.conn_string, self.options)
self.connection.set_isolation_level(1) # make
transactions transparent to all cursors
self.connection.set_client_encoding('UTF8')
cursor = self.connection.cursor()
@@ -94,3 +112,8 @@ class DatabaseWrapper(BaseDatabaseWrappe
# No savepoint support for earlier version of
PostgreSQL.
self.features.uses_savepoints = False
return cursor
+
+ def close(self):
+ if self.connection is not None:
+ putconn(self.conn_string, self.connection)
+ self.connection = None

alex....@gmail.com

unread,
Nov 16, 2008, 8:41:08 PM11/16/08
to Django developers
This sounds like a good feature, however, it's very difficult to read
a patch in either email or on google groups, so could you please open
a ticket in trac, and post this as a patch there.

Malcolm Tredinnick

unread,
Nov 16, 2008, 8:48:54 PM11/16/08
to django-d...@googlegroups.com

On Sun, 2008-11-16 at 11:57 -0800, jothan wrote:
> PsycoPg2 with a remote DB over a WAN is deadly slow. Since PsycoPg2
> supports connection pooling, I have decided to patch the Django DB
> adapter to use this. I could see this implemented as below (this is a
> very rough draft). For example, min and max connections should not be
> hardcoded.
>
> Let me know what you think,

You've made it non-optional, which isn't ideal. The connection pooling
in psycopg2 is purely manual, in-memory stuff that shares connections
within the same process (so, for multi-threaded operations). Once you
switch to multi process operations, it's time to graduate to the "real
stuff" and use, e.g., pgpool to the connection pooling. At that point,
the extra overhead from the psycopg2 classes is probably not helping.

A different approach would be to work out which hooks are really needed
to support this (i.e. where we might need to call a function in a
subclass where the subclass is the thing that supports psycopg2's
pooling, for example) and we can write those in. Then, as hinted, we can
put the pooling in a subclass and provide it that way (maybe).

A couple of comments on the patch inline below...

You're making this sort of change all throughout this function, which
suggests there's an easier way: put a single line at the end that says

self.conn_string = conn_string

Less changes, shorter lines in the function and it's also slightly more
efficient (since local variable lookups are faster than having to
dereference the attribute all the time). All of which are extremely
minor, but it simultaneously helps cut down on the code churn.

> if settings.DATABASE_USER:
> - conn_string = "user=%s %s" % (settings.DATABASE_USER,
> conn_string)
> + self.conn_string = "user=%s %s" %
> (settings.DATABASE_USER, self.conn_string)
> if settings.DATABASE_PASSWORD:
> - conn_string += " password='%s'" %
> settings.DATABASE_PASSWORD
> + self.conn_string += " password='%s'" %
> settings.DATABASE_PASSWORD
> if settings.DATABASE_HOST:
> - conn_string += " host=%s" % settings.DATABASE_HOST
> + self.conn_string += " host=%s" %
> settings.DATABASE_HOST
> if settings.DATABASE_PORT:
> - conn_string += " port=%s" % settings.DATABASE_PORT
> - self.connection = Database.connect(conn_string,
> **self.options)
> + self.conn_string += " port=%s" %
> settings.DATABASE_PORT
> + self.connection = getconn(self.conn_string, self.options)

This would seem to be the place for a hook. We call, for example, a
"connect()" function on the current class which is normally aliased to
Database.connect, but can be overridden in the pooling subclass.

> self.connection.set_isolation_level(1) # make
> transactions transparent to all cursors
> self.connection.set_client_encoding('UTF8')
> cursor = self.connection.cursor()
> @@ -94,3 +112,8 @@ class DatabaseWrapper(BaseDatabaseWrappe
> # No savepoint support for earlier version of
> PostgreSQL.
> self.features.uses_savepoints = False
> return cursor
> +
> + def close(self):
> + if self.connection is not None:
> + putconn(self.conn_string, self.connection)
> + self.connection = None

Similarly, could be a no-op in the main class and overridden in the
subclass.

Regards,
Malcolm


Reply all
Reply to author
Forward
0 new messages