[Django] #37125: Use __new__ to sanitize TaskResult instead of __post_init__ to halve memory usage


Django

May 28, 2026, 11:09:43 AM
to django-...@googlegroups.com
#37125: Use __new__ to sanitize TaskResult instead of __post_init__ to halve memory
usage
--------------------------------------+------------------------------------
Reporter: Johannes Maron | Type: Cleanup/optimization
Status: new | Component: Tasks
Version: dev | Severity: Normal
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 1 | UI/UX: 0
--------------------------------------+------------------------------------
[https://forum.djangoproject.com/t/tasks-framework-versatility-performance/45035/4
Jake and I have been discussing changes to the Task framework for 6.2 with
a focus on performance and versatility.]

One discovery was the memory overhead created by `__post_init__`, which
doubles the working memory compared to a factory method.

A `__new__` method is what I used in the benchmark and what I would
consider the most Pythonic way while also maintaining full compatibility.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

May 28, 2026, 1:05:11 PM
to django-...@googlegroups.com
Comment (by Jacob Walls):

Can you share the benchmark script? I'd like to just eyeball the setup
before drawing conclusions.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:1>

Django

May 28, 2026, 2:35:35 PM
to django-...@googlegroups.com
Comment (by Johannes Maron):

Sure, where are my manners? Here you go. It's slop, but I tweaked it to
minimize side effects:


{{{
import timeit
import tracemalloc
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

from django.utils.json import normalize_json
from django.tasks.base import Task, TaskResult, TaskResultStatus

# --- Real Task instance ---

def my_func():
    pass

real_task = Task.__new__(Task)
object.__setattr__(real_task, "func", my_func)
object.__setattr__(real_task, "priority", 0)
object.__setattr__(real_task, "backend", "default")
object.__setattr__(real_task, "queue_name", "default")
object.__setattr__(real_task, "run_after", None)
object.__setattr__(real_task, "takes_context", False)

# --- Shared kwargs ---

now = datetime.now()

common_kwargs = dict(
    task=real_task,
    id="abc123",
    status=TaskResultStatus.SUCCESSFUL,
    enqueued_at=now,
    started_at=now,
    finished_at=now,
    last_attempted_at=now,
    args=[],
    kwargs={},
    backend="default",
    errors=[],
    worker_ids=["worker-1"],
)

# --- Variant 1: __post_init__ (same as original TaskResult) ---

@dataclass(frozen=True, slots=True, kw_only=True)
class TaskResultPostInit:
    task: Any
    id: str
    status: Any
    enqueued_at: datetime | None
    started_at: datetime | None
    finished_at: datetime | None
    last_attempted_at: datetime | None
    args: list[Any]
    kwargs: dict[str, Any]
    backend: str
    errors: list
    worker_ids: list[str]
    _return_value: Any | None = field(init=False, default=None)

    def __post_init__(self):
        object.__setattr__(self, "args", normalize_json(self.args))
        object.__setattr__(self, "kwargs", normalize_json(self.kwargs))


# --- Variant 2: classmethod factory, no __post_init__ ---

@dataclass(frozen=True, slots=True, kw_only=True)
class TaskResultNew:
    task: Any
    id: str
    status: Any
    enqueued_at: datetime | None
    started_at: datetime | None
    finished_at: datetime | None
    last_attempted_at: datetime | None
    args: list[Any]
    kwargs: dict[str, Any]
    backend: str
    errors: list
    worker_ids: list[str]
    _return_value: Any | None = field(init=False, default=None)

    def __new__(cls, *args, **kwargs):
        kwargs["args"] = normalize_json(kwargs["args"])
        kwargs["kwargs"] = normalize_json(kwargs["kwargs"])
        return super().__new__(cls)


# --- Benchmark ---

N = 1_000_000

def run_bench(label, fn):
    t = timeit.timeit(fn, number=N)

    tracemalloc.start()
    instances = [fn() for _ in range(N)]
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances

    print(f"{label}")
    print(f" Time: {t:.3f}s ({t/N*1e6:.2f} µs/call)")
    print(f" Peak mem: {peak / 1024 / 1024:.2f} MB (for {N:,} instances)")
    print()

run_bench(
    "TaskResult with __post_init__",
    lambda: TaskResultPostInit(**common_kwargs),
)

run_bench(
    "TaskResult with classmethod factory",
    lambda: TaskResultNew(**common_kwargs),
)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:2>

Django

May 28, 2026, 2:36:43 PM
to django-...@googlegroups.com
Comment (by Johannes Maron):

… and the results:

{{{
TaskResult with __post_init__
Time: 1.126s (1.13 µs/call)
Peak mem: 252.19 MB (for 1,000,000 instances)

TaskResult with classmethod factory
Time: 1.326s (1.33 µs/call)
Peak mem: 137.76 MB (for 1,000,000 instances)
}}}
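
One way to read the gap between the two peaks is to divide it by the number of instances. The sketch below only redoes that arithmetic on the figures reported above and compares it with the size of an empty list plus an empty dict. The container sizes are platform-dependent, and the link to the `args`/`kwargs` containers is an assumption, not something the benchmark itself measures.

```python
import sys

N = 1_000_000
peak_post_init_mb = 252.19  # figure reported above
peak_new_mb = 137.76        # figure reported above

# Difference between the two peaks, spread over all instances.
gap_bytes = (peak_post_init_mb - peak_new_mb) * 1024 * 1024
per_instance = gap_bytes / N

# Size of one empty list plus one empty dict on this interpreter.
container_bytes = sys.getsizeof([]) + sys.getsizeof({})

print(f"gap per instance: {per_instance:.1f} bytes")
print(f"empty list + empty dict: {container_bytes} bytes")
```

If the two numbers are close on your interpreter, the gap would be consistent with one variant storing a fresh `args` list and `kwargs` dict per instance and the other storing none.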
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:3>

Django

May 28, 2026, 11:35:00 PM
to django-...@googlegroups.com
#37125: Use __new__ to sanitize TaskResult instead of __post_init__ to halve memory
usage
--------------------------------------+------------------------------------
Reporter: Johannes Maron | Owner: zky
Type: Cleanup/optimization | Status: assigned
Component: Tasks | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 1 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by zky):

* owner: (none) => zky
* stage: Unreviewed => Accepted
* status: new => assigned

Comment:

Hi Johannes, since I just assigned #37126 to myself, I'd like to handle
this memory optimization alongside it, because they both modify the
TaskResult model. I'm assigning this to myself as well so I can bundle
them in the same PR and avoid conflicts.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:4>

Django

May 29, 2026, 5:27:03 AM
to django-...@googlegroups.com
Comment (by Johannes Maron):

@zky go for it! I think those changes are fairly conflict-free, though.
IMHO, the real work will be writing good tests to prevent performance
regression. The actual patch is probably just those four lines:

{{{
def __new__(cls, *args, **kwargs):
    kwargs["args"] = normalize_json(kwargs["args"])
    kwargs["kwargs"] = normalize_json(kwargs["kwargs"])
    return super().__new__(cls)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:5>

Django

May 29, 2026, 8:17:56 AM
to django-...@googlegroups.com
#37125: Use __new__ to sanitize TaskResult instead of __post_init__ to halve memory
usage
--------------------------------------+------------------------------------
Reporter: Johannes Maron | Owner: zky
Type: Cleanup/optimization | Status: closed
Component: Tasks | Version: dev
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 1 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Jacob Walls):

* resolution: => needsinfo
* stage: Accepted => Unreviewed
* status: assigned => closed

Comment:

I think the bench incorrectly implements `__new__()`. It looks like the
`kwargs` are mutated in place, but they're actually re-bound and
discarded:

{{{#!py
>>> def a(**kwargs): return kwargs
...
>>> kw = {}
>>> inner = a(**kw)
>>> inner is kw
False
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:6>

Django

May 29, 2026, 8:52:06 AM
to django-...@googlegroups.com
Comment (by David):

Why use `__new__`, which is used for class initialization, when `args`
and `kwargs` should be instance-level attributes?
https://docs.python.org/3/reference/datamodel.html#object.__new__
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:7>

Django

May 29, 2026, 10:20:40 AM
to django-...@googlegroups.com
Comment (by Johannes Maron):

[[Image(https://media0.giphy.com/media/v1.Y2lkPTc5MGI3NjExdzZ1Y3U3YzJpeXpjbnNyNXhqOWVtazdqZXo4bTZnMjVoem1veTI3ZSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/Ra1bmpxpsppNC/giphy.gif)]]

Y'all are absolutely right. It was too good to be true. I was spending too
much time with Claude. It has successfully dumbed me down.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:8>

Django

May 29, 2026, 10:21:35 AM
to django-...@googlegroups.com
#37125: Use __new__ to sanitize TaskResult instead of __post_init__ to halve memory
usage
--------------------------------------+------------------------------------
Reporter: Johannes Maron | Owner: zky
Type: Cleanup/optimization | Status: closed
Component: Tasks | Version: dev
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 1 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Johannes Maron):

* resolution: needsinfo => invalid

--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:9>

Django

May 29, 2026, 10:45:26 AM
to django-...@googlegroups.com
Comment (by zky):

Thanks for the great catch, Jacob! After running a quick test locally, I
confirmed exactly what you pointed out: modifying `kwargs` inside
`__new__` has no effect on initialization.

Given this, would introducing a classmethod factory be a viable
alternative to solve the original memory issue?

Replying to [comment:6 Jacob Walls]:
> I think the bench incorrectly implements `__new__()`. It looks like the
> `kwargs` are mutated in place, but they're actually re-bound and
> discarded:
>
> {{{#!py
> >>> def a(**kwargs): return kwargs
> ...
> >>> kw = {}
> >>> inner = a(**kw)
> >>> inner is kw
> False
> }}}
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:10>

Django

May 29, 2026, 10:47:23 AM
to django-...@googlegroups.com
Comment (by Jacob Walls):

Thanks for confirming. At this point we would need a fresh bench
confirming that there is a memory issue at all.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:11>

Django

May 29, 2026, 10:54:53 AM
to django-...@googlegroups.com
Comment (by Johannes Maron):

I wrote a benchmark with a proper factory method, and I can confirm that
the `__post_init__` does NOT add any memory overhead. Hence I marked it as
invalid.

I'll keep looking, but I am out of ideas at this point and happy to see
that this is already very efficient.
--
Ticket URL: <https://code.djangoproject.com/ticket/37125#comment:12>