Proposal: Use timestamp or hash to determine name when uploading duplicate filenames


Areski

Aug 4, 2014, 2:44:30 PM
to django-d...@googlegroups.com
Hi,

I opened a ticket suggesting a new setting to determine how duplicate filenames are named: https://code.djangoproject.com/ticket/23183

Tim asked me to bring this to the mailing list to discuss the change further and decide whether it is worth considering or a no-go.

You will notice that this can be achieved with a custom storage, but it might be worth considering a change to the default behavior in future versions. The main gain is to keep the names of existing uploaded files on a server obfuscated. Right now an incremental naming scheme is used, which makes it easy for someone to find out which files have been uploaded previously.
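
For illustration, the obfuscated naming could look something like the sketch below. This is only a minimal example, not the code from the ticket: `obfuscated_name` and its `exists` callback are hypothetical names, and in a real project the logic would presumably live in a custom storage's `get_available_name()` override.

```python
import os
import secrets


def obfuscated_name(name, exists, nbytes=4):
    """Return `name` if it is free, otherwise add a random hex suffix.

    `exists` is a hypothetical callback that reports whether a name is
    already taken (a storage backend would supply its own check).
    """
    directory, filename = os.path.split(name)
    root, ext = os.path.splitext(filename)
    candidate = name
    # Keep drawing random suffixes until an unused name is found.
    while exists(candidate):
        suffix = secrets.token_hex(nbytes)  # 2 * nbytes hex characters
        candidate = os.path.join(directory, f"{root}_{suffix}{ext}")
    return candidate
```

Unlike a counter, the random suffix does not reveal how many files with the same name were uploaded before.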

What do you think about this issue?


Yours,
/Areski

Collin Anderson

Aug 4, 2014, 9:59:31 PM
to django-d...@googlegroups.com
It seems to me a setting is not the way to go; a custom storage would be better. I've made a custom storage that computes the md5 of the content and uses that as the directory of the file, though that generates a whole lot of directories. In a lot of ways, I think it would be great if Django could check whether the duplicate files also have the same content and reuse the first file if so. Otherwise I'm still of the mindset that filename_1, filename_2 is an improvement over filename_, filename__, filename___. :)

It would be nice if upload_to functions had access to the file content.
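
A rough sketch of the md5-as-directory idea, for anyone following along. This is not the actual storage class described above: `content_addressed_name` and `save_once` are hypothetical helpers, a plain dict stands in for the storage backend, and the `uploads` prefix is an assumption.

```python
import hashlib
import posixpath


def content_addressed_name(filename, content, prefix="uploads"):
    """Build '<prefix>/<md5 of content>/<filename>' using forward slashes."""
    digest = hashlib.md5(content).hexdigest()
    return posixpath.join(prefix, digest, posixpath.basename(filename))


def save_once(store, filename, content):
    """Save into the dict `store`, reusing the entry if name and content match."""
    name = content_addressed_name(filename, content)
    if name not in store:
        # First time this content is seen under this filename.
        store[name] = content
    return name
```

Uploading the same bytes under the same filename twice yields the same name and a single stored copy, while different content lands in a different directory, so no numeric or underscore suffix is needed.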

Areski Belaid

Aug 5, 2014, 5:42:00 AM
to django-d...@googlegroups.com
I also agree with you that a custom storage would be a more elegant solution.

On the other hand, it seems to me that reusing the same file on upload would not allow clean management of users' resources: for instance, if a user decides to delete a file, we won't be able to tell whether that file is also used by another user.
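
To make the concern concrete, here is a small sketch of the check that would be needed before deleting a shared file. Everything in it is hypothetical: `other_owners` is an illustrative helper, and the `references` mapping stands in for whatever lookup an application would have to run over its own models.

```python
def other_owners(path, owner, references):
    """Return the owners, other than `owner`, whose records point at `path`.

    `references` maps an owner name to the set of stored paths that
    owner's records refer to (a stand-in for a database query).
    """
    return sorted(
        name
        for name, paths in references.items()
        if name != owner and path in paths
    )
```

If the returned list is not empty, deleting the file on behalf of one user would break the other users' records, and without such a lookup there is no way to know.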

Carl Meyer

Aug 5, 2014, 1:42:50 PM
to django-d...@googlegroups.com
On 08/05/2014 03:42 AM, Areski Belaid wrote:
> I also agree with you that custom storage will be more elegant solution.
>
> On the other hand it seems to me that reusing the same file on upload
> will not allow a clean management of user's resources, for instance if a
> user decide to delete a file we won't be able to tell if this file is
> used by an other user too.

Note that we have strong precedent already in Django that an uploaded
file is not strongly linked to one particular FileField, but may be a
shared resource: uploaded files are never removed when a model instance
containing a FileField is deleted.

Carl