I've discovered that after upgrading Django to version 4 (currently 4.0.2), I
started to see **FATAL: sorry, too many clients already** database errors
in Sentry.
For the database I'm using containerized Postgres 14.1, and Django connects
to Postgres over a Unix socket.
Database settings look like this:
{{{
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": environ.get("POSTGRES_DB"),
        "USER": environ.get("POSTGRES_USER"),
        "PASSWORD": environ.get("POSTGRES_PASSWORD"),
        "HOST": environ.get("POSTGRES_HOST"),
        "PORT": environ.get("POSTGRES_PORT"),
        "CONN_MAX_AGE": 3600,
    }
}
}}}
In production, I'm running the Django application under ASGI
(Uvicorn 0.17.4, 4 workers).
After deploying and surfing around the Django admin site for a while,
checking Postgres's active connections with the **SELECT * FROM
pg_stat_activity;** command shows 30+ idle connections made by Django.
After surfing around the admin site some more, I can see that even more
idle connections have been opened by Django.
It looks like the database connections are not being reused. At some point
a few of the idle connections are closed, but more are opened again as
soon as Django makes more DB queries.
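For anyone reproducing this, here is a quick way to count the idle
connections from a Django shell (a minimal sketch; the `state` column
assumes Postgres 9.2+):
{{{
# Run in `python manage.py shell`: count this database's idle connections.
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute(
        "SELECT count(*) FROM pg_stat_activity "
        "WHERE state = 'idle' AND datname = current_database()"
    )
    print(cursor.fetchone()[0])
}}}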
I have one Django 3.2.11 project running in production with identical
settings; it never exceeds 10 persistent connections to the database and
everything works fine.
Is this the expected behaviour in 4.0?
--
Ticket URL: <https://code.djangoproject.com/ticket/33497>
* component: Uncategorized => Database layer (models, ORM)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:1>
* cc: Carlton Gibson (added)
* status: new => closed
* resolution: => needsinfo
Comment:
Thanks for the report. Django has a
[https://github.com/django/django/blob/f0480ddd2d3cb04b784cf7ea697f792b45c689cc/django/db/__init__.py#L34-L42
routine] to clean up old connections that is tied into the request-
response life-cycle, so idle connections should be closed. However, I
don't think you've explained the issue in enough detail to confirm a bug
in Django. This could be an issue in `psycopg2`, `uvicorn`, or in custom
middleware (see #31905); it's hard to say without a way to reproduce it.
Please reopen the ticket if you can debug your issue and provide details
about why and where Django is at fault, or if you can provide a sample
project with a reproducible scenario.
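For context, the linked routine is roughly this:
{{{
# Roughly what the linked routine does: on every request start/finish,
# close connections that errored out or have outlived CONN_MAX_AGE.
from django.core import signals
from django.db import connections

def close_old_connections(**kwargs):
    for conn in connections.all():
        conn.close_if_unusable_or_obsolete()

signals.request_started.connect(close_old_connections)
signals.request_finished.connect(close_old_connections)
}}}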
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:3>
* cc: Florian Apolloner, Andrew Godwin (added)
Comment:
Hi Stenkar.
Would you be able to put together a minimal test project here, so that
folks can reproduce this quickly?
This **may** be due to Django 4.0 having per-request contexts for the
thread sensitivity of `sync_to_async()` — see #32889.
If so, that's kind of a good thing: running into too many open resources
is what you'd expect in async code, and up to now we've not been hitting
that, as we've essentially been running serially.
Immediate thought for a mitigation would be to use a connection pool.
Equally, can we limit the number of threads in play using
[https://github.com/django/asgiref/blob/02fecb6046bb5ec0dbbad00ffcd2043e893fcea5/asgiref/sync.py#L303-L304
asgiref's `ASGI_THREADS` environment variable]? (But see
[https://github.com/django/daphne/issues/319#issuecomment-991962381 the
discussion on the related Daphne issue about whether that's the right
place for that at all].)
This is likely a topic we'll need to deal with (eventually) in Django:
once you start getting async working, you soon hit resource limits, and
handling that with structures for sequencing and maximum parallelism is
one of those hard-batteries™ that we maybe should provide. 🤔
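To illustrate the env-var mitigation (a hypothetical sketch; at the linked
asgiref commit the variable is read when `asgiref.sync` is imported, so it
must be set before anything imports Django):
{{{
# asgi.py — illustrative only: cap asgiref's default thread pool before
# Django (and therefore asgiref.sync) is imported.
import os

os.environ.setdefault("ASGI_THREADS", "4")

from django.core.asgi import get_asgi_application

application = get_asgi_application()
}}}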
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:4>
Comment (by Florian Apolloner):
I think https://github.com/django/asgiref/pull/306#issuecomment-991959863
might play into this as well. By using a single thread per connection,
persistent connections will never get cleaned up.
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:5>
* status: closed => new
* resolution: needsinfo =>
Comment:
Thank you for the comments.
I created a repository where it's possible to spin up a minimal project
with docker-compose.
https://github.com/KStenK/django-ticket-33497
I'm not sure that I can find where or what goes wrong in more detail, but
I'll give it a try.
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:6>
* type: Bug => New feature
* stage: Unreviewed => Accepted
Comment:
OK, thanks Stenkar.
I'm going to accept this as a New Feature. It's a change in behaviour from
3.2, but it's precisely in allowing multiple executors for
`sync_to_async()` that it comes up. (In 3.2 it's essentially single-
threaded, with only a single connection actually being used.) We need to
improve the story here, but it's not a bug in #32889 that we don't have
async-compatible persistent DB connections yet. (I hope that makes sense.)
A note to the docs about this limitation may be worthwhile.
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:7>
Comment (by Florian Apolloner):
Thinking more about this, I do not think the problem is new. We have the
same problem whenever persistent connections are used and a new thread is
spawned per request (for instance in runserver.py). Usually (i.e. with
gunicorn etc.) one has a rather "stable" pool of processes or threads; as
soon as you switch to a new thread per connection, this breaks. In ASGI
this behaviour is probably more pronounced, since by definition every
request runs in its own async task context, which then propagates down to
the db backend as a new connection per request (which in turn will also
never reuse connections, because the "thread" ids change).
All in all I think we are finally at the point where we need a connection
pool in Django. I'd strongly recommend using something like
https://github.com/psycopg/psycopg/tree/master/psycopg_pool/psycopg_pool
but abstracted to work for all databases in Django.
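For a flavour of what that abstraction would wrap, a minimal sketch using
`psycopg_pool` directly (the DSN and pool sizes here are made up):
{{{
# Illustrative psycopg 3 pooling: connections are borrowed from the pool
# and returned on context exit instead of being opened per request.
from psycopg_pool import ConnectionPool

pool = ConnectionPool("dbname=app user=app", min_size=1, max_size=10)

with pool.connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())
}}}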
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:8>
* cc: Anders Kaseorg (added)
Comment:
Possibly related: #33625 (for memcached connections).
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:9>
* cc: Patryk Zawadzki (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:10>
* cc: Mikail (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:11>
Comment (by Patryk Zawadzki):
This is marked as a "new feature," but it's an undocumented breaking
change between 3.2 and 4.0. Connections that were previously reused and
terminated are now just left to linger.
The {{{request_finished}}} signal does not terminate them, as they are not
idle for longer than {{{CONN_MAX_AGE}}}.
The {{{request_started}}} signal does not terminate them, as it never sees
those connections: the connection state lives in {{{asgiref.local}}} and
is discarded after every request.
Allowing parallel execution of requests is a great change, but I feel
Django should outright refuse to start if {{{CONN_MAX_AGE}}} is combined
with ASGI.
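Something along these lines, purely as a sketch (the ASGI detection is a
crude stand-in; Django has no such check today):
{{{
# Hypothetical fail-fast guard, e.g. called from an AppConfig.ready()
# hook. RUNNING_UNDER_ASGI is an assumed flag your asgi.py would set;
# it is not a Django or asgiref API.
import os

from django.conf import settings
from django.core.exceptions import ImproperlyConfigured

def refuse_persistent_connections_under_asgi():
    if os.environ.get("RUNNING_UNDER_ASGI") and any(
        db.get("CONN_MAX_AGE", 0) != 0 for db in settings.DATABASES.values()
    ):
        raise ImproperlyConfigured(
            "CONN_MAX_AGE != 0 leaks connections under ASGI; set it to 0."
        )
}}}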
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:12>
* keywords: ASGI; Database => ASGI, Database, async
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:13>
* cc: Alex (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:14>
Comment (by joeli):
Replying to [comment:12 Patryk Zawadzki]:
> This is marked as a "new feature," but it's an undocumented breaking
change between 3.2 and 4.0. Connections that were previously reused and
terminated are now just left to linger.
>
> The {{{request_finished}}} signal does not terminate them, as they are
not idle for longer than {{{CONN_MAX_AGE}}}.
>
> The {{{request_started}}} signal does not terminate them, as it never
sees those connections: the connection state lives in {{{asgiref.local}}}
and is discarded after every request.
>
> Allowing parallel execution of requests is a great change, but I feel
Django should outright refuse to start if {{{CONN_MAX_AGE}}} is combined
with ASGI.
I agree. I would even go as far as calling this a regression, not just an
undocumented breaking change. Whatever the reasons behind it or the
technical superiority of the new solution, the fact remains that under
ASGI in 3.2 our code worked fine and reused connections. In 4.x it is
broken unless you use {{{CONN_MAX_AGE = 0}}}, which disables a feature of
Django that used to work.
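For anyone hitting this in the meantime, the workaround amounts to one
line in settings.py:
{{{
# Disable persistent connections: each request opens and closes its own.
# 0 is also Django's default for this setting.
DATABASES["default"]["CONN_MAX_AGE"] = 0
}}}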
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:15>
* cc: joeli (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:16>
* cc: Marco Glauser (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:17>
* owner: nobody => rajdesai24
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:18>
* owner: rajdesai24 => (none)
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:19>
* cc: Rafał Pitoń (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:18>
* cc: Marty Cochrane (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:19>
Comment (by Florian Apolloner):
I have created a draft pull request for database connection pool support
in postgresql: https://github.com/django/django/pull/16881
It would be great if people experiencing the problems noted here could
test this (this would probably help in getting it merged).
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:20>
* cc: lappu (added)
Comment:
We just ran into this while upgrading from 3.2 to 4.2. During a QA round,
our staging environment's '''MySQL''' server, running on an AWS RDS
t3.micro instance, exceeded its max connections (around 70, while normally
the connections stay below 10).
I git-bisected the culprit to
https://github.com/django/django/commit/36fa071d6ebd18a61c4d7f1b5c9d17106134bd44,
which is what Carlton Gibson suspected.
We are also running uvicorn.
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:21>
Comment (by Andreas Pelme):
We have been using `CONN_MAX_AGE=300` since it was introduced in Django
1.6 and rely on it to avoid reconnecting to the database on each HTTP
request. This change really caught us off guard. We upgraded from
Django 3.2 to 4.0 and our site went completely down in a matter of seconds
when all database connections were instantly depleted.
Giving each HTTP request its own async context makes a lot of sense and is
a good change in itself IMO. But I would argue that this change is not
backwards compatible. `CONN_MAX_AGE` does still "technically" work, but it
clearly does not behave as it has for the last 10 years.
Specifying `CONN_MAX_AGE` is recommended in a lot of places, including
Django's own docs:
- https://docs.djangoproject.com/en/4.2/ref/databases/#persistent-database-connections
- https://devcenter.heroku.com/articles/python-concurrency-and-database-connections
At the very least, I think this needs to be clearly called out in the
release notes and in the docs on "Persistent connections". I think we need
to deprecate/remove `CONN_MAX_AGE`. Or is there even a reason to keep it
around?
I am very much in favor of getting basic db connection pooling into
Django. We will try to give https://github.com/django/django/pull/16881 a
spin, put it in production, and report back. Would love to have something
like that available out of the box in Django! We use Postgres and would be
happy with such a pool replacing `CONN_MAX_AGE` for our use case.
However, that would only work for Postgres. What is the situation with
MySQL/Oracle? Does mysqlclient come with a pool like psycopg?
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:22>
Comment (by Sarah Boyce):
Replying to [comment:22 Andreas Pelme]:
> However, that would only work for Postgres. What is the situation with
MySQL/Oracle? Does mysqlclient come with a pool like psycopg?
Oracle:
https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#connpooling
mysqlclient doesn't appear to support this out of the box. It looks like
mysql-connector-python has support, though:
https://dev.mysql.com/doc/connector-python/en/connector-python-connection-pooling.html 🤔
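For reference, python-oracledb's built-in pooling looks roughly like this
(credentials and DSN are placeholders):
{{{
# Illustrative python-oracledb pooling: acquire() borrows a connection
# from the pool, and it is released on context exit.
import oracledb

pool = oracledb.create_pool(
    user="app", password="secret", dsn="dbhost/orclpdb",
    min=1, max=4, increment=1,
)

with pool.acquire() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1 FROM dual")
        print(cur.fetchone())
}}}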
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:23>
* cc: Sarah Boyce (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:24>
* has_patch: 0 => 1
Comment:
We will need to add solutions for the other backends, but I don't see a
reason why we can't do this incrementally; the current patch is for
PostgreSQL: https://github.com/django/django/pull/17594
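If the PR lands in roughly its current shape, enabling the pool might look
something like this (the exact option name and accepted values could
change before merge):
{{{
# Sketch based on the proposed patch: pool parameters passed through
# OPTIONS, mirroring psycopg_pool's ConnectionPool kwargs. Treat this
# as illustrative, not final API.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "OPTIONS": {
            "pool": {"min_size": 2, "max_size": 10},
        },
    }
}
}}}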
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:25>
Comment (by Florian Apolloner):
Well, I guess the main question is whether we want to create our own pool
implementation or reuse what the database adapters provide, assuming they
can do it better for their database than we can do generically :D
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:26>
* owner: nobody => Sarah Boyce
* needs_better_patch: 0 => 1
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:27>
* needs_better_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:28>
* cc: Dmytro Litvinov (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:29>
Comment (by Sarah Boyce):
There is already a ticket (and now a PR) to support database connection
pools in Oracle: #7732
--
Ticket URL: <https://code.djangoproject.com/ticket/33497#comment:30>