Hi Brian! Comments below:
* Brian Stewart [2022-12-01 05:23]:
> Does the image upload system create a hash for every picture uploaded?
>
> 1. Does a hash check occur to possibly flag/alert that this is a duplicate
> of a picture that already exists, and provide a direct link to the issue?
No, each upload is treated as just an independent file which gets an id
in the database (and some resized versions for display in the indexes).
Nothing in the code checks the contents of the file (either via a hash
or via another mechanism). We trust the uploaders to upload the correct
image and the editors checking the uploads to catch occasional mistakes
or problems.
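
To illustrate what such a content check could look like (again, nothing
like this exists in our code today), here is a minimal Python sketch
using only the standard library. The names content_digest and
DuplicateIndex are hypothetical, and the in-memory dict merely stands in
for a database column. Note that it only catches byte-identical files,
not rescans or re-encoded copies of the same cover.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file contents."""
    return hashlib.sha256(data).hexdigest()


class DuplicateIndex:
    """Hypothetical in-memory map from content digest to first upload id."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, int] = {}

    def register(self, upload_id: int, data: bytes) -> Optional[int]:
        """Record an upload; return the id of an earlier identical file, if any."""
        digest = content_digest(data)
        existing = self._by_digest.get(digest)
        if existing is None:
            # First time these exact bytes are seen: remember this upload.
            self._by_digest[digest] = upload_id
        return existing


index = DuplicateIndex()
first = index.register(1, b"cover scan bytes")
second = index.register(2, b"cover scan bytes")  # same bytes, new upload
```

Here first is None (nothing to flag), while second is 1, i.e. the id of
the earlier upload that an approver could be pointed to.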
Apart from that, while the cover and other scans we keep are a nice extra
and important for identification and documentation purposes, our
original mission began with the issue indexes so our search tools focus
on the database data, not on the pictures.
> a. Handy during uploading to alert submitter
> b. handy for approvers to quickly find/compare a possible issue
Do you mean a user uploading the same image twice by mistake? Or
uploading an image to the incorrect issue after having added it to the
correct one? Or another scenario?
> 2. If all the images were hashed then a dupe report could be generated in
> the special reports section for a deeper review
Hopefully, there are no unnecessary dupes because the editors catch
them during approval. Of course there are various cases where the covers
of different issues might legitimately be very close visually, e.g. for
variants, second printings, etc., as you mention later in your message.
For this reason, I don't think even a visual similarity / perceptual
hash, e.g. using the Python ImageHash lib
(https://pypi.org/project/ImageHash/), would help much in our normal
workflow.
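
For illustration only: perceptual hashes are usually compared by
counting the bits in which they differ (the Hamming distance) and
treating a small distance as "similar". The sketch below assumes 64-bit
hashes represented as plain integers; the two hash values and the
max_bits threshold are made up for the example, not taken from real
covers.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def looks_similar(hash_a: int, hash_b: int, max_bits: int = 6) -> bool:
    """Treat two hashes as similar if they differ in at most max_bits bits."""
    return hamming_distance(hash_a, hash_b) <= max_bits


# Made-up hashes: a cover and a variant that differs only in a small detail.
original_cover = 0xF0F0F0F0F0F0F0F0
variant_cover = original_cover ^ 0b101  # flip two low bits

distance = hamming_distance(original_cover, variant_cover)  # 2
flagged = looks_similar(original_cover, variant_cover)      # True
```

A variant or second printing would be flagged here just like a true
duplicate, which is why such a report would mostly list legitimate
near-identical covers.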
Alexandros