[Django] #35853: Disable garbage collection during transaction.atomic.

12 views
Skip to first unread message

Django

unread,
Oct 21, 2024, 10:47:10 AM10/21/24
to django-...@googlegroups.com
#35853: Disable garbage collection during transaction.atomic.
-------------------------------------+-------------------------------------
Reporter: Matej Spiller Muys | Type: New
| feature
Status: new | Component:
| Uncategorized
Version: 5.1 | Severity: Normal
Keywords: gc, atomic, | Triage Stage:
transaction | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
We noticed that occasionally atomic blocks are keeping transaction open
for longer than expected.
This was traced to GC being run during atomic operation affecting the DB
performance since locks are released too late.

We added tracing to GC and transactional.atomic() and we noticed that GC
is keeping DB transaction longer from 0.5 to 1 second.

To avoid that we extended atomic object to call gc.disable before creating
new transaction and gc.enable after commit or rollback.
This positively impacted database load. Because we use mod_wsgi with a lot
of workers and horizontally scaling the app it happens quite often that GC
is run inside a DB transaction making whole system slower.

It would be nice to have this buildin in Django (opt-out, opt-in) for
better DB performance.
--
Ticket URL: <https://code.djangoproject.com/ticket/35853>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Oct 21, 2024, 3:01:22 PM10/21/24
to django-...@googlegroups.com
#35853: Disable garbage collection during transaction.atomic.
-------------------------------------+-------------------------------------
Reporter: Matej Spiller Muys | Owner: (none)
Type: New feature | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: gc, atomic, | Triage Stage:
transaction | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* component: Uncategorized => Database layer (models, ORM)
* resolution: => wontfix
* status: new => closed

Comment:

Hello Matej Spiller Muys, thank you for taking the time to create this
report.

It might be worth asking on the
[https://forum.djangoproject.com/c/internals/5 Django forum] for
help/input and then confirming if something new would need adding to
Django (and what that would look like). We usually ask contributors to get
feedback from the community through a forum discussion first, so we can
confirm there is community consensus that Django should add this feature.

From my personal view, this seems like a very specific need arising from a
niche use case. I believe you could implement your own atomic decorator on
top of Django's and disable the garbage collector on entry and re-enable
on exit? I'm not sure this applies to the broader ecosystem, and Django is
a framework designed to offer robust and accurate solutions for common
scenarios. But if you get positive feedback from the forum, please comment
back and link the forum conversation here so we reopen the ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/35853#comment:1>

Django

unread,
Oct 22, 2024, 3:01:04 AM10/22/24
to django-...@googlegroups.com
#35853: Disable garbage collection during transaction.atomic.
-------------------------------------+-------------------------------------
Reporter: Matej Spiller Muys | Owner: (none)
Type: New feature | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: gc, atomic, | Triage Stage:
transaction | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Matej Spiller Muys):

The need for having db transactions as short as possible is quite general
rule.

In other languages, GC is not blocking the main thread. In Python that is
unfortunately not the case.
Basically if you can get eliminate GC you can scale better. And also DBA
will be much happier because it won't hog the database.
We are monitoring the database performance and are trying to have
transactions duration at max 0.5 seconds.
If you have shorted transactions you also need less concurrent connections
as well.

We are currently having our own atomic decorator that is disabling GC but
I filled the issue because it seems like could be useful for wider
audience. I will write to forum and see if there is more general need for
high performance Django minding DB resources.

Example code that we use:
{{{
@contextlib.contextmanager
def disable_gc():
_gc_was_enabled = gc.isenabled()
if _gc_was_enabled:
gc.disable()
try:
yield
finally:
if _gc_was_enabled:
gc.enable()

@contextlib.contextmanager
def atomic(using=None, savepoint=True, durable=False):
with disable_gc(), transaction.atomic(using, savepoint, durable):
yield
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35853#comment:2>

Django

unread,
Oct 22, 2024, 3:10:19 AM10/22/24
to django-...@googlegroups.com
#35853: Disable garbage collection during transaction.atomic.
-------------------------------------+-------------------------------------
Reporter: Matej Spiller Muys | Owner: (none)
Type: New feature | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: gc, atomic, | Triage Stage:
transaction | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Claude Paroz):

Maybe a benchmark with as sample project could also help making a decision
here?
--
Ticket URL: <https://code.djangoproject.com/ticket/35853#comment:3>

Django

unread,
Oct 22, 2024, 8:33:07 AM10/22/24
to django-...@googlegroups.com
#35853: Disable garbage collection during transaction.atomic.
-------------------------------------+-------------------------------------
Reporter: Matej Spiller Muys | Owner: (none)
Type: New feature | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: gc, atomic, | Triage Stage:
transaction | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Natalia Bidart):

Replying to [comment:2 Matej Spiller Muys]:
> The need for having db transactions as short as possible is quite
general rule.

I disagree with this. While I recognize that managing transaction
thresholds and garbage collection may be critical for large-scale
services, such as Instagram, I believe that the average Django project
does not require such measures. Django has been around for nearly 21
years, and this is the first time such a request has been reported in the
issue tracking system. This suggests that the need for disabling GC to
allow lowering the transaction max time is not a common concern among the
broader Django users.

> We are currently having our own atomic decorator that is disabling GC
but I filled the issue because it seems like could be useful for wider
audience. I will write to forum and see if there is more general need for
high performance Django minding DB resources.

Nice, do you have any benchmarks available, following Claude's ask? Please
note that besides needing community consensus, we'd need a clear win on
performance for a change like the one proposed. This is documented in
https://docs.djangoproject.com/en/dev/internals/contributing/bugs-and-
features/#requesting-performance-optimizations.

> Example code that we use:
> {{{
> @contextlib.contextmanager
> def disable_gc():
> _gc_was_enabled = gc.isenabled()
> if _gc_was_enabled:
> gc.disable()
> try:
> yield
> finally:
> if _gc_was_enabled:
> gc.enable()
>
> @contextlib.contextmanager
> def atomic(using=None, savepoint=True, durable=False):
> with disable_gc(), transaction.atomic(using, savepoint, durable):
> yield
> }}}

I believe this code is not thread-safe nor reentrant-safe. For a proposal
like this to be considered for inclusion in the core framework, we would
need strong guarantees that it meets these criteria and that garbage
collection is not inadvertently "disabled forever" due to unforeseen bugs.

While the need for this functionality is valid and presents an interesting
challenge, I reiterate that this is a niche use case, perhaps relevant to
only 1% or even 0.5% of Django services deployed in the wild. As
previously mentioned, Django aims to provide a robust, stable, and secure
platform for common web service requirements, and this particular need
feels outside the realm of what is considered "common."
--
Ticket URL: <https://code.djangoproject.com/ticket/35853#comment:4>
Reply all
Reply to author
Forward
0 new messages