[Django] #35279: Memory Leak with `prefetch_related`

Django

Mar 7, 2024, 4:48:38 AM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-----------------------------------------+-----------------------------
Reporter: canton | Owner: nobody
Type: Bug | Status: new
Component: Uncategorized | Version: 4.2
Severity: Normal | Keywords: memory leak
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------+-----------------------------
Memory Leak after calling `queryset.prefetch_related()` or
`prefetch_related_objects()`

To reproduce:

{{{
import gc
from django.db import models
from django.db.models import prefetch_related_objects


class Foo(models.Model):
    id = models.AutoField(primary_key=True)


class Bar(models.Model):
    id = models.AutoField(primary_key=True)
    foo = models.ForeignKey(Foo, on_delete=models.CASCADE)


def prepare_data():
    if Foo.objects.exists():
        return
    foo = Foo()
    foo.save()
    bar = Bar(foo=foo)
    bar.save()


def test1():
    # no prefetch
    for foo in Foo.objects.all():
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def test2():
    # queryset.prefetch_related()
    for foo in Foo.objects.prefetch_related("bar_set").all():
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def test3():
    # prefetch_related_objects()
    foo_list = list(Foo.objects.all())
    prefetch_related_objects(foo_list, "bar_set")
    for foo in foo_list:
        for bar in foo.bar_set.all():
            print(foo.id, bar.id)


def run():
    prepare_data()

    # warm up
    test1()
    test2()
    test3()

    gc.collect()

    gc.set_debug(gc.DEBUG_LEAK)

    gc.collect()
    print(f"baseline - garbage count: {len(gc.garbage)}")

    test1()
    gc.collect()
    print(f"test1 - garbage count: {len(gc.garbage)}")

    test2()
    gc.collect()
    print(f"test2 - garbage count: {len(gc.garbage)}")

    test3()
    gc.collect()
    print(f"test3 - garbage count: {len(gc.garbage)}")

    gc.set_debug(0)


run()
}}}


Output
{{{
1 1
1 1
1 1
baseline - garbage count: 0
1 1
test1 - garbage count: 0 # no memory leak
1 1
test2 - garbage count: 23 # 23 objects leaked
1 1
test3 - garbage count: 46 # another 23 objects leaked
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35279>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Mar 7, 2024, 5:10:02 AM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------+--------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Bug | Status: new
Component: Uncategorized | Version: 4.2
Severity: Normal | Resolution:
Keywords: memory leak | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Comment (by Ken Tong):

Hi Team,

For now I work around the memory leak in my projects by adding the line
below at the appropriate places. Hopefully there will be a fix and a
documented way to properly clean up the cache.

{{{
foo._prefetched_objects_cache.pop("bar_set")
}}}

Thank you for your attention!
--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:1>

Django

Mar 7, 2024, 6:42:23 AM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------------+-------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution:
Keywords: memory leak | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* component: Uncategorized => Database layer (models, ORM)
* stage: Unreviewed => Accepted
* type: Bug => Cleanup/optimization

Comment:

Interesting, thanks for the report. Tentatively accepted for further
investigation.
--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:2>

Django

Mar 7, 2024, 11:54:15 AM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------------+-------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution:
Keywords: memory leak | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Antoine Humbert):

The following code snippet shows the same result:

{{{

import gc


class Parent:

    def __init__(self):
        self.cache = {}


class Child:

    def __init__(self, parent):
        self.parent = parent


def test():
    foo = Parent()
    bar = Child(parent=foo)
    foo.cache["bars"] = [bar]
    print(foo.cache, bar.parent)


test()
gc.collect()
print(len(gc.garbage))

gc.set_debug(gc.DEBUG_LEAK)
gc.collect()
print(len(gc.garbage))

test()
gc.collect()
print(len(gc.garbage))
}}}

This results in the following output:

{{{
{'bars': [<__main__.Child object at 0x6f520cdd90>]} <__main__.Parent object at 0x6f520cd6d0>
0
0
{'bars': [<__main__.Child object at 0x6f520b32d0>]} <__main__.Parent object at 0x6f520b1fd0>
gc: collectable <Parent 0x6f520b1fd0>
gc: collectable <Child 0x6f520b32d0>
gc: collectable <list 0x6f520b1600>
gc: collectable <dict 0x6f520b1e80>
4
}}}

With the `gc.set_debug` statement removed, `gc.garbage` is always empty,
so this looks like a side effect of `DEBUG_LEAK`.

As per the `gc` documentation:

{{{
To debug a leaking program call gc.set_debug(gc.DEBUG_LEAK). Notice that
this includes gc.DEBUG_SAVEALL, causing garbage-collected objects to be
saved in gc.garbage for inspection.
}}}

So using `DEBUG_LEAK` causes collected objects to be kept in
`gc.garbage`. I would therefore say that looking at `gc.garbage` in this
case does not identify a memory leak. On the contrary, it shows objects
that were garbage collected.
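
One way to check this without `DEBUG_SAVEALL` getting in the way might be to watch the objects through weak references. The following is only a minimal sketch, assuming CPython and reusing the same `Parent`/`Child` cycle as in the snippet above; `make_cycle` is a hypothetical helper name, not anything from Django:

```python
import gc
import weakref


class Parent:
    def __init__(self):
        self.cache = {}


class Child:
    def __init__(self, parent):
        self.parent = parent


def make_cycle():
    # Build the Parent <-> Child reference cycle and return only weak
    # references, so nothing outside the cycle keeps it alive.
    foo = Parent()
    bar = Child(parent=foo)
    foo.cache["bars"] = [bar]
    return weakref.ref(foo), weakref.ref(bar)


gc.set_debug(0)  # make sure DEBUG_SAVEALL (part of DEBUG_LEAK) is off
gc.disable()     # avoid an automatic collection between the two checks
garbage_before = len(gc.garbage)

parent_ref, child_ref = make_cycle()
# The cycle cannot be freed by reference counting alone.
alive_before = parent_ref() is not None and child_ref() is not None

gc.collect()
# After a collection the weak references should be dead.
freed_after = parent_ref() is None and child_ref() is None
new_garbage = len(gc.garbage) - garbage_before
gc.enable()

print(alive_before, freed_after, new_garbage)
```

If the weak references are dead after `gc.collect()` and nothing new shows up in `gc.garbage`, the cycle was reclaimed and is not a leak.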
--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:3>

Django

Mar 7, 2024, 10:10:28 PM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------------+-------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution:
Keywords: memory leak | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Ken Tong):

Thank you for your detailed explanation, Antoine. I confirm that the
memory leak is a false alarm, and I am sorry about it.
--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:4>

Django

Mar 7, 2024, 10:11:30 PM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------------+-------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution: invalid
Keywords: memory leak | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Ken Tong):

* resolution: => invalid
* status: new => closed

--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:5>

Django

Mar 7, 2024, 11:59:38 PM
to django-...@googlegroups.com
#35279: Memory Leak with `prefetch_related`
-------------------------------------+-------------------------------------
Reporter: Ken Tong | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution: invalid
Keywords: memory leak | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Unreviewed

Comment:

TIL
--
Ticket URL: <https://code.djangoproject.com/ticket/35279#comment:6>