#35846: Reproducibility of staticfiles manifests
-------------------------------------+-------------------------------------
Reporter: Linus Heckemann | Owner: (none)
Type: | Status: closed
Cleanup/optimization |
Component: contrib.staticfiles | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Matthew Stell):
The root cause of this issue is the ordering of the files returned from
the Finders.
For example, the FileSystemSotorageFinder uses os.scandir() to "find" the
files.
https://github.com/django/django/blob/efb7f9ced2dcf71294353596a265e3fd67faffeb/django/core/files/storage/filesystem.py#L188
{{{
def listdir(self, path):
path = self.path(path)
directories, files = [], []
with os.scandir(path) as entries:
for entry in entries:
if entry.is_dir():
directories.append(
entry.name)
else:
files.append(
entry.name)
return directories, files
}}}
**The Python docs state that ''The entries are yielded in arbitrary
order'' [
https://docs.python.org/3/library/os.html#os.scandir].**
The order in which they are yielded is the order in which they are added
to the found_files dictionary
https://github.com/django/django/blob/cf9da6fadd44cc7654681026d202387022b30d8d/django/contrib/staticfiles/management/commands/collectstatic.py#L124
which is passed through to the save_manifest function via the post_process
method.
https://github.com/django/django/blob/1ba5fe19ca221663e6a1e9391dbe726bb2baaf8a/django/contrib/staticfiles/storage.py#L498
{{{
def save_manifest(self):
self.manifest_hash = self.file_hash(
None,
ContentFile(json.dumps(sorted(self.hashed_files.items())).encode())
)
payload = {
"paths": self.hashed_files,
"version": self.manifest_version,
"hash": self.manifest_hash,
}
if self.manifest_storage.exists(self.manifest_name):
self.manifest_storage.delete(self.manifest_name)
contents = json.dumps(payload).encode()
self.manifest_storage._save(self.manifest_name,
ContentFile(contents))
}}}
Therefore the ordering of the items in staticfiles.json is by definition
"arbitrary" and therefore cannot be guaranteed to be consistent.
The most thorough test I can think of is to mock os.scandir so that we can
force the files to be returned in a different order. However I think this
is overkill and instead reordering the
storage.staticfiles_storage.hashed_files dictionary might be a more
appropriate test?
For example:
{{{
def test_staticfile_content_consistency(self):
manifest_file_content_orig =
storage.staticfiles_storage.read_manifest()
hashed_files = storage.staticfiles_storage.hashed_files
# Change the order of the hashed files
storage.staticfiles_storage.hashed_files =
dict(reversed(hashed_files.items()))
# The manifest file content should not change.
storage.staticfiles_storage.save_manifest()
manifest_file_content =
storage.staticfiles_storage.read_manifest()
self.assertEqual(manifest_file_content_orig,
manifest_file_content)
}}}
What do you think?
Many thanks,
Matthew
--
Ticket URL: <
https://code.djangoproject.com/ticket/35846#comment:4>