[Django] #35846: Reproducibility of staticfiles manifests


Django

Oct 16, 2024, 7:36:04 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: lheckemann | Type:
| Cleanup/optimization
Status: new | Component:
| contrib.staticfiles
Version: 5.0 | Severity: Normal
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Paths in staticfiles manifests appear in a nondeterministic order in the
resulting JSON file. I assume this would often reflect the order in which
files are listed by the operating system, given dict's insertion order
preservation, but there are probably many more factors affecting this.

This can sometimes (but may not always -- this depends heavily on
filesystem behaviour) be reproduced by running collectstatic in projects
using ManifestStaticFilesStorage across different copies of the project
source.

Sorting them would result in more comparable results, smaller diffs and
(depending on the environment) more efficient deployments.
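
To illustrate the insertion-order point, here is a minimal sketch. It is
not Django code, and the file names and hashes are made up. Two dicts with
the same entries but different insertion order serialise to different
JSON, while sorting the keys gives byte-identical output:

```python
import json

# Hypothetical manifest entries, inserted in two different orders, as might
# happen when two filesystems list the same files differently.
paths_a = {
    "css/base.css": "css/base.27e20196a850.css",
    "js/app.js": "js/app.0f1b2c3d4e5f.js",
}
paths_b = {
    "js/app.js": "js/app.0f1b2c3d4e5f.js",
    "css/base.css": "css/base.27e20196a850.css",
}

# The dicts compare equal, but json.dumps() preserves insertion order.
unsorted_a = json.dumps({"paths": paths_a})
unsorted_b = json.dumps({"paths": paths_b})

# sort_keys=True makes the output independent of insertion order.
sorted_a = json.dumps({"paths": paths_a}, sort_keys=True)
sorted_b = json.dumps({"paths": paths_b}, sort_keys=True)

print(paths_a == paths_b)        # True
print(unsorted_a == unsorted_b)  # False
print(sorted_a == sorted_b)      # True
```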
--
Ticket URL: <https://code.djangoproject.com/ticket/35846>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 16, 2024, 10:53:27 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* resolution: => needsinfo
* status: new => closed

Comment:

Hi Linus, would you be able to create a test project with some
instructions on how to reproduce this behavior?
If this is an issue, we will need to be able to confirm it's fixed.
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:1>

Django

Oct 17, 2024, 1:49:07 PM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Andreas Pelme):

It may be hard to show this exact problem since it depends on the OS/file
systems used. The problem is the list of paths in the staticfiles
manifest.

Somewhat related, see
https://github.com/django/django/pull/16411/files#diff-f5c7100e3528e9f6edb98cd5a3d33133bdde6286a89fa472f89980fb07364a8eR486
That line sorts the paths before hashing them. If it did not sort the
keys, the hash could change. I tried changing sorted() to list() on that
line and ran the test suite, and there were no test failures. So this is a
bit tricky to test properly!
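
A small sketch of why the sorted() call matters, and why a test suite that
always sees the same input order would presumably not notice its removal.
The function name, file names and the use of SHA-256 are assumptions for
illustration only. This is not the code from the PR:

```python
import hashlib
import json


def manifest_hash(hashed_files, sort=True):
    """Hash the (path, hashed path) pairs of a manifest-like dict.

    SHA-256 is a stand-in here; it is not meant to mirror Django's file_hash().
    """
    items = hashed_files.items()
    items = sorted(items) if sort else list(items)
    return hashlib.sha256(json.dumps(items).encode()).hexdigest()[:12]


# Same entries, opposite insertion order (hypothetical file names).
forward = {"a.css": "a.111.css", "b.js": "b.222.js"}
backward = dict(reversed(list(forward.items())))

# With sorting, the hash does not depend on insertion order.
same_when_sorted = manifest_hash(forward) == manifest_hash(backward)

# Without sorting, the hash only differs if the input order differs, which
# a fixed set of test fixtures on one filesystem may never exercise.
same_when_unsorted = manifest_hash(forward, sort=False) == manifest_hash(
    backward, sort=False
)
```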
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:2>

Django

Jan 15, 2025, 3:46:54 PM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Matthias Kestenholz):

* needs_tests: 0 => 1

Comment:

[https://github.com/django/django/pull/18676 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:3>

Django

Jun 13, 2025, 8:11:32 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Matthew Stell):

The root cause of this issue is the ordering of the files returned from
the Finders.

For example, FileSystemFinder lists files through
FileSystemStorage.listdir(), which uses os.scandir() to "find" the files.
https://github.com/django/django/blob/efb7f9ced2dcf71294353596a265e3fd67faffeb/django/core/files/storage/filesystem.py#L188
{{{
def listdir(self, path):
    path = self.path(path)
    directories, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                directories.append(entry.name)
            else:
                files.append(entry.name)
    return directories, files
}}}

**The Python docs state that ''The entries are yielded in arbitrary
order'' [https://docs.python.org/3/library/os.html#os.scandir].**

The order in which they are yielded is the order in which they are added
to the found_files dictionary
https://github.com/django/django/blob/cf9da6fadd44cc7654681026d202387022b30d8d/django/contrib/staticfiles/management/commands/collectstatic.py#L124
which is passed through to the save_manifest function via the post_process
method.

https://github.com/django/django/blob/1ba5fe19ca221663e6a1e9391dbe726bb2baaf8a/django/contrib/staticfiles/storage.py#L498
{{{
def save_manifest(self):
    self.manifest_hash = self.file_hash(
        None, ContentFile(json.dumps(sorted(self.hashed_files.items())).encode())
    )
    payload = {
        "paths": self.hashed_files,
        "version": self.manifest_version,
        "hash": self.manifest_hash,
    }
    if self.manifest_storage.exists(self.manifest_name):
        self.manifest_storage.delete(self.manifest_name)
    contents = json.dumps(payload).encode()
    self.manifest_storage._save(self.manifest_name, ContentFile(contents))
}}}


The ordering of the items in staticfiles.json is therefore "arbitrary" by
definition and cannot be guaranteed to be consistent.


The most thorough test I can think of is to mock os.scandir so that we can
force the files to be returned in a different order. However, I think this
is overkill, and reordering the storage.staticfiles_storage.hashed_files
dictionary might be a more appropriate test?

For example:

{{{
def test_staticfile_content_consistency(self):
    manifest_file_content_orig = storage.staticfiles_storage.read_manifest()
    hashed_files = storage.staticfiles_storage.hashed_files
    # Change the order of the hashed files.
    storage.staticfiles_storage.hashed_files = dict(
        reversed(hashed_files.items())
    )
    # The manifest file content should not change.
    storage.staticfiles_storage.save_manifest()
    manifest_file_content = storage.staticfiles_storage.read_manifest()
    self.assertEqual(manifest_file_content_orig, manifest_file_content)
}}}
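
For comparison, the os.scandir mocking approach mentioned above could be
sketched roughly as follows. This is an illustration, not a proposed
Django test: listdir() is a simplified standalone copy of the method
quoted earlier, and fake_entry() and fake_scandir() are hypothetical
helpers.

```python
import os
from types import SimpleNamespace
from unittest import mock


def listdir(path):
    """Simplified standalone version of the listdir() quoted earlier."""
    directories, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                directories.append(entry.name)
            else:
                files.append(entry.name)
    return directories, files


def fake_entry(name, is_dir=False):
    # Minimal object exposing the two attributes listdir() relies on.
    return SimpleNamespace(name=name, is_dir=lambda: is_dir)


def fake_scandir(entries):
    # os.scandir() is used as a context manager, so the replacement has to
    # support "with"; MagicMock provides __enter__/__exit__.
    scandir = mock.MagicMock()
    scandir.return_value.__enter__.return_value = entries
    return scandir


entries = [fake_entry("b.css"), fake_entry("a.css"), fake_entry("img", is_dir=True)]

# Force two different listing orders for the same directory contents.
with mock.patch("os.scandir", fake_scandir(entries)):
    forward = listdir("ignored")
with mock.patch("os.scandir", fake_scandir(list(reversed(entries)))):
    backward = listdir("ignored")

print(forward)   # (['img'], ['b.css', 'a.css'])
print(backward)  # (['img'], ['a.css', 'b.css'])
```

The two results contain the same names in a different order, which is the
situation a manifest test would need to cover.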

What do you think?

Many thanks,
Matthew
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:4>

Django

Jun 18, 2025, 6:05:09 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Linus Heckemann):

Yes, that's the analysis we arrived at as well, and it is how we ended up
with https://github.com/django/django/pull/18676 . Your test approach
sounds reasonable. However, we're now working around the issue by
subclassing ManifestStaticFilesStorage to sort the keys ourselves, and I'm
not particularly motivated to continue on this myself, so feel free to
pick it up.
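
A workaround of this kind might look roughly like the sketch below. It is
not the code used in the project mentioned above. To keep the example
runnable without Django, FakeManifestStorage is a hypothetical stand-in
for ManifestStaticFilesStorage, and SortedManifestMixin is an assumed
name:

```python
import json


class FakeManifestStorage:
    """Hypothetical stand-in for ManifestStaticFilesStorage (illustration only)."""

    def __init__(self, hashed_files):
        self.hashed_files = hashed_files
        self.saved = None

    def save_manifest(self):
        # Like the save_manifest() quoted earlier, "paths" is written in
        # whatever order the dict happens to have.
        self.saved = json.dumps({"paths": self.hashed_files})


class SortedManifestMixin:
    """Sort hashed_files by key before the parent class writes the manifest."""

    def save_manifest(self):
        self.hashed_files = dict(sorted(self.hashed_files.items()))
        super().save_manifest()


class SortedFakeStorage(SortedManifestMixin, FakeManifestStorage):
    pass


# Same entries, different insertion order (hypothetical file names).
one = SortedFakeStorage({"b.js": "b.222.js", "a.css": "a.111.css"})
two = SortedFakeStorage({"a.css": "a.111.css", "b.js": "b.222.js"})
one.save_manifest()
two.save_manifest()

print(one.saved == two.saved)  # True
```

In a real project the mixin would presumably be combined with
ManifestStaticFilesStorage instead of the stand-in class.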
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:5>

Django

Jun 24, 2025, 2:39:30 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: new
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Matthew Stell):

* resolution: needsinfo =>
* status: closed => new

--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:6>

Django

Jun 24, 2025, 2:39:44 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: assigned
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Matthew Stell):

* owner: (none) => Matthew Stell
* status: new => assigned

--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:7>

Django

Jun 24, 2025, 2:41:25 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: assigned
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Matthew Stell):

* needs_tests: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:8>

Django

Jun 24, 2025, 7:40:59 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: assigned
Component: contrib.staticfiles | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* stage: Unreviewed => Accepted
* version: 5.0 => dev

Comment:

Thank you everyone for the detailed analysis and the initial PR. I think
the issue and fix make sense, accepting the ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:9>

Django

Jun 24, 2025, 7:49:35 AM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: assigned
Component: contrib.staticfiles | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* needs_better_patch: 0 => 1

--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:10>

Django

Jul 1, 2025, 12:46:36 PM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: assigned
Component: contrib.staticfiles | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:11>

Django

Jul 1, 2025, 2:24:45 PM
to django-...@googlegroups.com
#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: Matthew
Type: | Stell
Cleanup/optimization | Status: closed
Component: contrib.staticfiles | Version: dev
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by nessita <124304+nessita@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"7feafd79a481216cdd85b4828e749fc5efacb8db" 7feafd79]:
{{{#!CommitTicketReference repository=""
revision="7feafd79a481216cdd85b4828e749fc5efacb8db"
Fixed #35846 -- Ensured consistent path ordering in
ManifestStaticFilesStorage manifest files.

This change reuses the existing sorting of `hashed_files` in
`ManifestStaticFilesStorage.save_manifest` to also store a sorted
`paths` mapping in the manifest file. This ensures stable manifest
output that does not change unnecessarily.
}}}
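
The idea described in the commit message can be sketched as follows. This
is an illustration under assumptions, not the committed patch:
build_manifest_payload() is a hypothetical standalone function, SHA-256
stands in for the storage's file_hash(), and the version string is a
placeholder.

```python
import hashlib
import json


def build_manifest_payload(hashed_files, manifest_version="1.1"):
    """Build a manifest payload whose "paths" mapping is sorted by key."""
    # Sort once and reuse the result for both the hash and the stored paths.
    sorted_items = sorted(hashed_files.items())
    digest = hashlib.sha256(json.dumps(sorted_items).encode()).hexdigest()[:12]
    return {
        "paths": dict(sorted_items),
        "version": manifest_version,
        "hash": digest,
    }


# Same entries, different insertion order (hypothetical file names).
p1 = build_manifest_payload({"b.js": "b.222.js", "a.css": "a.111.css"})
p2 = build_manifest_payload({"a.css": "a.111.css", "b.js": "b.222.js"})

# The serialised manifests are byte-identical regardless of input order.
print(json.dumps(p1) == json.dumps(p2))  # True
```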
--
Ticket URL: <https://code.djangoproject.com/ticket/35846#comment:12>