Persistent connections, take 2

815 views
Skip to first unread message

Aymeric Augustin

unread,
Feb 27, 2013, 6:12:09 PM2/27/13
to django-d...@googlegroups.com
Hello,

I've integrated the feedback received on my initial proposal in a new pull request:
https://github.com/django/django/pull/733 and I think it's ready for review.

I'm just wondering if 10 minutes is a good default value for CONN_MAX_AGE.
I chose it randomly. Would "unlimited" be better?

Unfortunately, this code is difficult to test automatically, for two reasons:
- The testing framework needs a permanent database connection; it inhibits
  the closing of the connection triggered on request_started / request_finished.
- Database errors cannot be produced at will.

I've done some manual testing with gunicorn. Django recovers as expected
after a database restart, ie. one request fails and the next one succeeds.

Thanks for your feedback,

-- 
Aymeric.



Aymeric Augustin

unread,
Mar 18, 2013, 10:36:53 AM3/18/13
to django-d...@googlegroups.com
On 28 févr. 2013, at 00:12, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

> I'm just wondering if 10 minutes is a good default value for CONN_MAX_AGE.


Since I committed the patch, I discovered that persistent connections don't interact well with runserver.

By default, the development server creates a new thread for each request it handles. Not only does this negate the effect of persistent connections (they're thread-local), but it also keeps connections open until they're garbage collected, making it easy to exceed the maximum number of available connections.

Florian independently discovered the same problem with gunicorn + gevent, because the gevent worker runs each request in its own "thread" (greenlet).


This raises the following questions:

1) Do we want to enable persistent connections in production by default?
a) yes
b) yes, through a deprecation path
c) no

2) Assuming the answer to 1) is yes, can we fix runserver?
a) close connections at the end of the dev server request handler
=> creates tight coupling between the dev server and the ORM :(
b) disable persistent connections when DEBUG = True
=> badly targeted: some people may use runserver with DEBUG = False, or want persistent connections when DEBUG = True
c) ???


The lazy solution is to disable persistent connections by default, document the problem with servers that run each request in its own thread, and recommend to set `DATABASES['default']['CONN_MAX_AGE'] = 0 if settings.DEBUG else 600`.

Besides, I've observed that performance is often less of a concern that backwards-compatibility and ease of use; from this perspective, disabling persistent connections would be appropriate.


What do you think?

--
Aymeric.



Jeremy Dunck

unread,
Mar 18, 2013, 12:08:26 PM3/18/13
to django-d...@googlegroups.com, django-d...@googlegroups.com
It sounds like we need a way to tell the worker that we are done sending requests to it so that the worker can do cleanup (of which db conn close is one task). This mirrors the previous request_finished "coupling" to requests_finished.

(OS?) Signal? Sentinel queue/socket/named pipe + background thread?
> --
> You received this message because you are subscribed to the Google Groups "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Shai Berger

unread,
Mar 18, 2013, 12:10:24 PM3/18/13
to django-d...@googlegroups.com
On Monday 18 March 2013 16:36:53 Aymeric Augustin wrote:
> By default, the development server creates a new thread for each request it
> handles. Not only does this negate the effect of persistent connections
> (they're thread-local),
> [...]

> 1) Do we want to enable persistent connections in production by default?
>
> 2) Assuming the answer to 1) is yes, can we fix runserver?
> a) close connections at the end of the dev server request handler
> => creates tight coupling between the dev server and the ORM :(
> b) disable persistent connections when DEBUG = True
> => badly targeted: some people may use runserver with DEBUG = False,
> or want persistent connections when DEBUG = True
> c) ???
>

If the persistent connections are thread-local, don't you want to close them
anyway when the thread exits?

> Florian independently discovered the same problem with gunicorn + gevent,
> because the gevent worker runs each request in its own "thread"
> (greenlet).

... but that fix will kill persistent connections under gunicorn.

The solution that sounds "right" is to pool the persistent connections --
every thread that ends returns the connection to the pool.

If I understand correctly, this would tie the ORM not to the dev server, but
to the wsgi handler -- but in a sense where, actually, it should be tied.

Shai.

Aymeric Augustin

unread,
Mar 18, 2013, 1:01:13 PM3/18/13
to django-d...@googlegroups.com
On 18 mars 2013, at 17:10, Shai Berger <sh...@platonix.com> wrote:

> If the persistent connections are thread-local, don't you want to close them
> anyway when the thread exits?

Yes, do you know how this could be achieved? I haven't found how to hook
on thread termination.

> ... but that fix will kill persistent connections under gunicorn.

If you're talking of gunicorn + gevent, persistent connections simply cannot
work, so this doesn't matter as long as Django doesn't use more database
connections than "live" threads.

> The solution that sounds "right" is to pool the persistent connections --
> every thread that ends returns the connection to the pool.

Yes, using an external pooler is the right solution in cases where persistent
connections don't work.

--
Aymeric.



Anssi Kääriäinen

unread,
Mar 18, 2013, 1:29:22 PM3/18/13
to Django developers
On 18 maalis, 19:01, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
It is possible to create a pool of connections to reuse inside Django.
Close releases back to pool, get_new_connection() fetches an unused
connection from the pool or creates a new connection if none
available.

The downside is that having a real connection pool is necessarily more
complex than the current setup. The problematic cases are how to deal
with `__del__` and plain concurrency can cause problems, too. However
a real pool would solve the above problems and thus it would be
possible to default persistent connections to on. The needed amount of
code is 100-200 LOC for the as-simple-as-possible pool.

For now I would just default the persistent connections feature to
off. I believe most Django projects fall into category where
connection creation performance doesn't matter. For those projects
where connection creation performance does matter turning persistent
connection on is easy enough. The default can be changed later on if
need be.

- Anssi

Shai Berger

unread,
Mar 18, 2013, 10:13:13 PM3/18/13
to django-d...@googlegroups.com
On Monday 18 March 2013, Aymeric Augustin wrote:
> On 18 mars 2013, at 17:10, Shai Berger <sh...@platonix.com> wrote:
> > If the persistent connections are thread-local, don't you want to close
> > them anyway when the thread exits?
>
> Yes, do you know how this could be achieved? I haven't found how to hook
> on thread termination.
>

You can get a list of the "living" threads from threading.enumerate(), so it
is possible to find out that a thread has already ended. This probably quite
inefficient, but may be good enough for runserver.
Reply all
Reply to author
Forward
0 new messages