[Django] #33465: Introduce empty __slots__ protocol for SafeString & SafeData

Django

Jan 27, 2022, 12:14:49 PM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: assigned
Component: Utilities                 | Version: dev
Severity: Normal                     | Keywords:
Triage Stage: Unreviewed             | Has patch: 0
Needs documentation: 0               | Needs tests: 0
Patch needs improvement: 0           | Easy pickings: 0
UI/UX: 0                             |
-------------------------------------+-------------------------------------
This is a case-by-case proposal ultimately referencing #12826.

Because `SafeString` is used ''a lot'' and is otherwise supposed to be
treatable as an untainted `str`, we should be able to (AFAIK) update it and
its inheritance chain to use `__slots__ = ()` whilst still allowing
custom subclasses of either to add additional attributes. By defining
`__slots__` as empty on `SafeString` (**and** `SafeData`) we'd avoid
creation of a `__dict__` on the instance, which mirrors the `str()`
behaviour.
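
As a minimal sketch of the idea (the class names `SafeDataSketch`, `SafeStringSketch` and `AnnotatedSafeString` are hypothetical stand-ins, not Django's actual classes), empty `__slots__` along the whole chain removes the per-instance `__dict__`, while a subclass that omits `__slots__` gets it back:

```python
class SafeDataSketch:
    # Hypothetical stand-in for the marker base class.
    __slots__ = ()


class SafeStringSketch(str, SafeDataSketch):
    # Hypothetical stand-in for the str subclass; every class in the
    # chain declares empty __slots__, so instances carry no __dict__.
    __slots__ = ()


class AnnotatedSafeString(SafeStringSketch):
    # A custom subclass that does NOT declare __slots__ regains a
    # __dict__ and can therefore hold extra attributes.
    pass


slotted = SafeStringSketch("test")
annotated = AnnotatedSafeString("test")
annotated.note = "extra attribute"

print(hasattr(slotted, "__dict__"))  # False
print(annotated.__dict__)            # {'note': 'extra attribute'}
```

Both objects still compare equal to the plain string `"test"`, so this should only affect memory layout, not string behaviour.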

According to pympler, currently in Python `3.10`, using the following
back-of-the-napkin strings:
{{{
In [4]: s = "test" # this might be interned, as a short string?

In [5]: s2 = "test" * 100

In [6]: s3 = SafeString("test")

In [7]: s4 = SafeString("test" * 100)
}}}
we get:
{{{
In [8]: asizeof(s) # str
Out[8]: 56

In [9]: asizeof(s2) # str
Out[9]: 456

In [10]: asizeof(s3) # SafeString
Out[10]: 208

In [11]: asizeof(s4) # SafeString
Out[11]: 608
}}}
But if we swap out the implementation to be slots'd, it looks more like:
{{{
In [8]: asizeof(s) # str
Out[8]: 56

In [9]: asizeof(s2) # str
Out[9]: 456

In [10]: asizeof(s3) # SafeString
Out[10]: 104

In [11]: asizeof(s4) # SafeString
Out[11]: 504
}}}

So we're "saving" `104 bytes` per `SafeString` created, by the look of it.
I presume some fun implementation detail of something somewhere explains
why the saving is allegedly more than `56` bytes, which is the
`asizeof({})`.
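
For a rough cross-check without pympler, a sketch using only `sys.getsizeof` follows. It is shallow, so it will not reproduce the `asizeof` figures above, and the exact numbers depend on the Python version and build. `PlainStr`, `SlottedStr` and `shallow_footprint` are names made up for illustration:

```python
import sys


class PlainStr(str):
    # Ordinary str subclass: instances can grow a __dict__.
    pass


class SlottedStr(str):
    # Same subclass with empty __slots__: no per-instance __dict__.
    __slots__ = ()


def shallow_footprint(obj):
    """Size of the object plus its instance __dict__, if it has one."""
    size = sys.getsizeof(obj)
    instance_dict = getattr(obj, "__dict__", None)
    if instance_dict is not None:
        size += sys.getsizeof(instance_dict)
    return size


plain = PlainStr("test")
slotted = SlottedStr("test")
saving = shallow_footprint(plain) - shallow_footprint(slotted)

print("plain:", shallow_footprint(plain))
print("slotted:", shallow_footprint(slotted))
print("saving per instance:", saving)
```

Note that reading `__dict__` may itself cause the dictionary to be created, which is possibly part of the implementation detail mentioned above.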

A quick and dirty check over the test suite suggests that for me locally,
running `14951 tests in 512.912s` accounted for `949.0 MB` of SafeStrings,
checked by just incrementing a global integer of bytes (using
`SafeString.__new__` and `--parallel=1`) and piping that to
`filesizeformat`, so y'know, ''room for error''.
After the patch, the same tests accounted for `779.4 MB` of `SafeString`,
"saving" `170 MB` overall.

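The tallying approach described above could be sketched roughly as below. This is an assumption about the shape of the instrumentation, not the code actually used: `CountingStr` is a hypothetical stand-in for `SafeString`, and `sys.getsizeof` stands in for whichever size function was really called.

```python
import sys

# Global running total of bytes attributed to created instances.
total_bytes = 0


class CountingStr(str):
    """Hypothetical stand-in for an instrumented SafeString."""

    __slots__ = ()

    def __new__(cls, value=""):
        global total_bytes
        instance = super().__new__(cls, value)
        # Add the (shallow) size of every instance as it is created.
        total_bytes += sys.getsizeof(instance)
        return instance


first = CountingStr("test")
second = CountingStr("test" * 100)
print(total_bytes)
```
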
The only functionality this would preclude, as far as I know, is binding
arbitrary values to an instance like so:
{{{
s = SafeString('test')
s.test = 1
}}}
which would raise `AttributeError` if `__slots__` were added, just like
trying to assign attributes to `str()` directly does.
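
A small sketch of that behaviour, using a hypothetical `SlottedStr` class in place of the patched `SafeString` and a made-up helper `try_set_attribute`:

```python
class SlottedStr(str):
    # Hypothetical stand-in for SafeString with empty __slots__.
    __slots__ = ()


def try_set_attribute(obj):
    """Return the AttributeError raised by `obj.test = 1`, or None."""
    try:
        obj.test = 1
    except AttributeError as exc:
        return exc
    return None


slotted_error = try_set_attribute(SlottedStr("test"))
builtin_error = try_set_attribute("test")

# Both the slotted subclass and the built-in str reject the assignment.
print(type(slotted_error).__name__, type(builtin_error).__name__)
```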

I don't believe this will cause any marked performance change, as neither
`SafeString` nor `SafeData` actually has any extra attributes, only
methods.

I have a branch which implements this, and tests pass for me locally.

--
Ticket URL: <https://code.djangoproject.com/ticket/33465>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Jan 27, 2022, 12:15:35 PM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: assigned
Component: Utilities                 | Version: dev
Severity: Normal                     | Resolution:
Keywords:                            | Triage Stage: Unreviewed
Has patch: 0                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Keryn Knight:

Old description:

New description:

that it is allegedly accounting for more than `64` bytes, which is the
`asizeof({})`

--
Ticket URL: <https://code.djangoproject.com/ticket/33465#comment:1>

Django

Jan 27, 2022, 4:08:04 PM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: assigned
Component: Utilities                 | Version: dev
Severity: Normal                     | Resolution:
Keywords:                            | Triage Stage: Unreviewed
Has patch: 0                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz):

I think this has a good potential for reducing memory footprint, +1.

--
Ticket URL: <https://code.djangoproject.com/ticket/33465#comment:2>

Django

Jan 28, 2022, 12:08:36 AM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: assigned
Component: Utilities                 | Version: dev
Severity: Normal                     | Resolution:
Keywords:                            | Triage Stage: Accepted
Has patch: 1                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* has_patch: 0 => 1
* stage: Unreviewed => Accepted


Comment:

Sounds reasonable.

[https://github.com/django/django/pull/15370 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/33465#comment:3>

Django

Jan 29, 2022, 7:54:27 AM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: assigned
Component: Utilities                 | Version: dev
Severity: Normal                     | Resolution:
Keywords:                            | Triage Stage: Ready for checkin
Has patch: 1                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/33465#comment:4>

Django

Jan 29, 2022, 12:49:16 PM
to django-...@googlegroups.com
#33465: Introduce empty __slots__ protocol for SafeString & SafeData
-------------------------------------+-------------------------------------
Reporter: Keryn Knight               | Owner: Keryn Knight
Type: Cleanup/optimization           | Status: closed
Component: Utilities                 | Version: dev
Severity: Normal                     | Resolution: fixed
Keywords:                            | Triage Stage: Ready for checkin
Has patch: 1                         | Needs documentation: 0
Needs tests: 0                       | Patch needs improvement: 0
Easy pickings: 0                     | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"55022f75c1e76e92206e023a127532d97cedd5b7" 55022f75]:
{{{
#!CommitTicketReference repository=""
revision="55022f75c1e76e92206e023a127532d97cedd5b7"
Fixed #33465 -- Added empty __slots__ to SafeString and SafeData.

Despite inheriting from the str type, every SafeString instance gains
an empty __dict__ due to the normal, expected behaviour of type
subclassing in Python.

Adding __slots__ to SafeData is necessary, because otherwise inheriting
from that (as SafeString does) will give it a __dict__ and negate the
benefit added by modifying SafeString.
}}}
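
The point about the mixin can be illustrated with a short sketch. All four class names here are hypothetical stand-ins rather than Django's classes: a `str` subclass with empty `__slots__` still ends up with a `__dict__` if any other base class lacks `__slots__`.

```python
class UnslottedMarker:
    # Mixin without __slots__: its instances have a __dict__.
    pass


class SlottedMarker:
    # Mixin with empty __slots__: contributes no __dict__.
    __slots__ = ()


class LeakyStr(str, UnslottedMarker):
    # Declares __slots__, but inherits a __dict__ from UnslottedMarker.
    __slots__ = ()


class TightStr(str, SlottedMarker):
    # Every class in the chain is slotted, so no __dict__ at all.
    __slots__ = ()


print(hasattr(LeakyStr("test"), "__dict__"))  # True
print(hasattr(TightStr("test"), "__dict__"))  # False
```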

--
Ticket URL: <https://code.djangoproject.com/ticket/33465#comment:5>
