[Django] #35139: Performing a save() with an ImageField where width_field or height_field is set results in an extra read operation

37 views
Skip to first unread message

Django

unread,
Jan 24, 2024, 4:03:12 PM1/24/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
------------------------------------------------+------------------------
Reporter: john-parton | Owner: nobody
Type: Bug | Status: new
Component: File uploads/storage | Version: 5.0
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
------------------------------------------------+------------------------
I have prepped a github repo here with the basic tests: https://github.com
/john-parton/django-image-field-extra-read

**Conditions for behavior**

You must have an ImageField with the `width_field` or `height_field`
arguments set.

**Description of current behavior**

When a model is saved, the image file is written out using the Storage
API, and then
in order for the width and height fields to be updated, the file is read
back out
and then the width and height are extracted from the image.

In the case the storage is local, the performance impact is probably
negligible,
unless the application is seriously IO constrained, however if the storage
is
remote, the performance impact is significant, and there can be other
impacts on
operations.

For instance, if using S3 as backing, more GET operations are performed
than
strictly necessary. This could be a few dollars a day of operational costs
if your
scale is small, but can be significant if egress charges are high.

As another example, CloudFlare Images rate limits requests. This
effectively cuts
the rate limit in half because every save operations requires an
additional GET.

**Proposed behavior**

The proposed behavior is to simple read the image file which is resident
in memory
without reading it back out from the storage backend.

**Possible breaking issues**

The vast majority of storage backends and use cases likely guarantee that
if you
write a file into storage, and then retrieve it, you will get the same
file back.

However, for some image-specific services, they will compress or crush
larger images.

For users who specifically have this use case, they may end up with the
`width_field`
and `height_field` not representing the actual size of the image in the
store, but
rather the size of the image at time of upload.

**Explanation of current behavior**

It looks like when a model's save() method is called, the field's
pre_save() method
is called which results in the descriptor for the field having its __get__
method
called and then immediately having its __set__ method called with a
similar value.

The effect is to coerce the value of the field to `ImageFieldFile` which
is a subclass
of `ImageFile`. The `ImageFieldFile` instance is assigned a property of
`file` which
is the wrapped original value.

The image field then saves and persists the data using the storage API,
and then the
wrapped file isn't referred to later to get the width and height. When the
width and
height are requested, the file is read back out of storage.

**Proposed fix**

No specific fix at this time.

**Mitigating breaking issues**

Considering how unusual this use case is, it may be sufficient to document
the change
in behavior and provide a code snippet to wire up a signal handler to do
the
additional read for those users who have the unusual storage backends and
actually
care about the width/height being what is on disk. This would also allow
users to
customize the behavior. For instance, maybe if the image is under a
certain resolution,
the storage provider guarantees they don't mangle the image. A user could
enshrine
that logic in the signal handler, so they could still get the performance
uplift where
appropriate.

**Summary**

It seems pretty unlikely that this is the intended behavior, and seems to
largely be a byproduct of heavy descriptor use and "magic" in the code.
I've tried mitigating this in my application code, but it requires me to
monkey patch some private methods, and I'm not even sure I got it right.
Any attempt to "fix" this at a level higher than Django's internals will
just result in an unmaintainable mess. Back when Django was new, I'm not
sure I would make this argument, but now storage is usually backed by some
API like S3, I think it makes sense to make avoiding an extra network
request the "sane default."
--
Ticket URL: <https://code.djangoproject.com/ticket/35139>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 24, 2024, 4:12:30 PM1/24/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
-------------------------------------+-------------------------------------
Reporter: john-parton | Owner: nobody
Type: Bug | Status: closed
Component: File | Version: 5.0
uploads/storage |
Severity: Normal | Resolution: duplicate

Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => duplicate

Comment:

Looks like a duplicate of #8307. Can you add your analysis as a comment in
the original ticket?
--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:1>

Django

unread,
Jan 24, 2024, 4:18:23 PM1/24/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
-------------------------------------+-------------------------------------
Reporter: john-parton | Owner: nobody
Type: Bug | Status: closed
Component: File | Version: 5.0
uploads/storage |
Severity: Normal | Resolution: duplicate
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by john-parton):

Wow, okay. I did a search and thought it was thorough. Didn't realize it
was a Pre- 1.0 existing issue. I'll update it there.
--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:2>

Django

unread,
May 21, 2024, 4:53:36 AM5/21/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
--------------------------------------+------------------------------------
Reporter: john-parton | Owner: nobody
Type: Bug | Status: new
Component: File uploads/storage | Version: 5.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Sarah Boyce):

* resolution: duplicate =>
* stage: Unreviewed => Accepted
* status: closed => new

Comment:

After looking into the other ticket, I can see that one was a duplicate of
#35139 which has been fixed. This described a separate concern.
--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:3>

Django

unread,
May 21, 2024, 4:53:46 AM5/21/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
-------------------------------------+-------------------------------------
Reporter: john-parton | Owner: john-
| parton
Type: Bug | Status: assigned
Component: File | Version: 5.0
uploads/storage |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* has_patch: 0 => 1
* owner: nobody => john-parton
* status: new => assigned

--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:4>

Django

unread,
May 21, 2024, 4:58:45 AM5/21/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
-------------------------------------+-------------------------------------
Reporter: john-parton | Owner: john-
| parton
Type: Bug | Status: assigned
Component: File | Version: 5.0
uploads/storage |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:5>

Django

unread,
May 21, 2024, 6:26:07 PM5/21/24
to django-...@googlegroups.com
#35139: Performing a save() with an ImageField where width_field or height_field is
set results in an extra read operation
-------------------------------------+-------------------------------------
Reporter: john-parton | Owner: john-
| parton
Type: Bug | Status: closed
Component: File | Version: 5.0
uploads/storage |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"9c5fe93349bd4339c41d057b87046e5d28be6f77" 9c5fe933]:
{{{#!CommitTicketReference repository=""
revision="9c5fe93349bd4339c41d057b87046e5d28be6f77"
Fixed #35139 -- Prevented file read after ImageField is saved to storage.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35139#comment:6>
Reply all
Reply to author
Forward
0 new messages