Design decision on FileField removing abandoned files?

10 views
Skip to first unread message

Tim

unread,
Nov 20, 2009, 3:24:06 PM11/20/09
to Django developers
In light of ticket #11663 (http://code.djangoproject.com/ticket/
11663), and the consequent discussion on a thread on this group
(http://groups.google.com/group/django-developers/browse_thread/thread/
491619541ba6ac75).... it doesn't seem like there was any resolution on
the topic of having an optional keyword argument to automatically
remove a file when "replaced" via a new upload.

I understand the point of view concerning the impossible knowing of
the file being "free" for removal. But... Is there any opposition to
getting innovative with this?

In Malcom's words (taken from the previous thread, linked above):

> So knowing when it is safe to delete files requires domain-specific
> knowledge.

This is true, but, aren't *I* the one to give such a commission to the
FileField? It seems to me that I, as the developer and maintainer of
the server, should be able to provide instructions to the FileField
which constitute such "domain-specific knowledge."

If Django goes and tries to delete a file, but can't, then would it be
design-acceptable to have methods for dealing with the exception? A
default catcher could just append those underscores like normal.

Ultimately, I fail to see the difference between Django having an
option for *trying* to delete the file automatically, and me going and
hacking up the silly thing just to make it naively delete the file
anyway. Omitting this functionality doesn't really "solve" any real-
world problem. It simply refuses to make a decision about the
*existing* real-world problem of wanting to depend on a path to
provide a predictable filename.

Anyway, I don't want to go on and on for nothing. If there's any more
info on this topic, I'd be welcome to reading more discussions or
tickets. I'm rather in favor of having a declared spot for this
exception check to happen, so that it's officially documented where a
file delete / rename should happen. Doing this on the outside of the
code just feels sloppy, no matter how I approach it. Suggestions
welcome.

Tim

Alexander Schepanovski

unread,
Nov 21, 2009, 8:18:52 PM11/21/09
to Django developers
I agree, currently I am deleting old file in view handling form, which
does not feel right. I also need to check for upload errors and in
case of ImageField if it's a valid image, which feels really wrong
since the field is doing it on itself

Tim

unread,
Nov 23, 2009, 11:37:09 AM11/23/09
to Django developers
After some more searching of the current 1.2 tickets, #7048 [1] seems
like it does the job. It has a kind of goofy option for deleting
empty directories when removing a file leaves said directory empty.
If this were merged together with the existing models.FileField class,
it would perfectly fix what I suggest be done.

As the author of the patch says:

> Currently ... blank=True on FileFields currently means "blank
first, then set forever".
> Which is a not the most common use case IMHO ;)

The patch is marked as "Accepted", but there's some discussion on the
comments about "deleting files" not being within the scope of what the
patch should do. I think the intended suggestion is that the patch
should clear the DB field pointing to the file, yet not remove the
file itself.

Which sort of lands us back at the initial discussion I brought up in
the first post. Now, for the record, I'm totally against an option
for removing directories. That most certainly intrudes on the file
system far more than acceptable. It's far from a required feature for
the Django core, and has a rare use case at best. But, as for the
basic file clearing and deletion, it doesn't seem to me that it ever
was discussed beyond the ticket comments, or anything else I've found
yet.

So once again, in summary, I'd like to petition for something of a
conclusive decision about file deletion, this time on the above-noted
patch. The ticket comments mention that it would be too late for
introducing a feature like that into the development cycle, but that
looks like it was from the 1.1 days. I don't want something like this
to slip beyond the end of December, and then have it ignored until a
1.3-ish release.

It looks like Jacob had the eyes that looked over the ticket last
significant update. Any thoughts, Jacob or anyone else?

Tim

[1] http://code.djangoproject.com/ticket/7048

Jonas Pfeil

unread,
Nov 23, 2009, 11:41:31 PM11/23/09
to Django developers
> It has a kind of goofy option for deleting
> empty directories when removing a file leaves said directory empty.

Hey, it's not goofy. It's handy ;) I also took _great_ care when
writing the code for directory deletion and it's happily deleting
directories on our production server since a year or so. And it's of
course off by default. That said, if throwing directory deleting out
is the thing that's required to get the patch in, I won't object ;)

> The patch is marked as "Accepted", but there's some discussion on the
> comments about "deleting files" not being within the scope of what the
> patch should do.  I think the intended suggestion is that the patch
> should clear the DB field pointing to the file, yet not remove the
> file itself.

Originally I wrote the patch for 1.0 starting with code that Brian
gave me. As can be seen from his comment the change was a bit too big
to get in at the last minute.

It never occurred to me that we would not a least want the option to
delete the file from disk. I would guess that in 95% of projects
people have exactly one FileField in charge of a particular file. But
I can see how deleting files by default is potentially dangerous for a
framework.

In the current patch just changing the default behavior to not delete
files creates a lot of work for the probably 95% of people who want to
delete the file form disk. I personally would find an option in the
settings file most convenient. Adding an option to the model field
that let's me say "yep, this field has permission to physically delete
it's file" would also be OK.

> So once again, in summary, I'd like to petition for something of a
> conclusive decision about file deletion, this time on the above-noted
> patch.  The ticket comments mention that it would be too late for
> introducing a feature like that into the development cycle, but that
> looks like it was from the 1.1 days.  I don't want something like this
> to slip beyond the end of December, and then have it ignored until a
> 1.3-ish release.

+1 to that ;) Although as Russell pointed out here [1] we are in
feature development phase and as long as this really get's looked at
in the bug fixing phase I won't say anything ;) This was classified as
"Small/just bugs" when deciding on features for 1.1 by the way [2].

Cheers,

Jonas

[1] http://groups.google.com/group/django-developers/browse_thread/thread/48a445e2f90ff644
[2] http://code.djangoproject.com/wiki/Version1.1Features

Tim

unread,
Nov 24, 2009, 12:05:52 AM11/24/09
to Django developers
I'm with you, if not on 100% of what you said, at least 95% of it :)

> That said, if throwing directory deleting out
> is the thing that's required to get the patch in, I won't object ;)

I just foresee that if there is heavy hesitation on even deleting
files, deleting directories might be too liberal a feature for the
kind of position that the framework takes. I don't doubt that
directory deletion has its use, and that it's safe when used
properly. I think that the primary committors (of which I am most
certainly not) would not think that it was a critical piece of the
framework. So that being said, I'm just folding on that one without
any argument

> It never occurred to me that we would not a least want the option to
> delete the file from disk. I would guess that in 95% of projects
> people have exactly one FileField in charge of a particular file.

And that's where this patch has an excellent point. Like I kicked
things off by saying, letting Django help out and delete a file is
hardly different than me hacking in a blind file deletion and
'pass'ing on the exception. I don't think that anybody can argue that
there shouldn't at least be the checkbox for "clear this field", even
if the file *isn't* deleted. But, without the ability to actually
remove the file, I'm unsure that we can say that the FileField and its
derivatives are doing 100% of the job that I'm sure a lot of us expect
it to do.

For the sake of bringing the real-world use case into the picture,
I've got an admin change form which has a few FileFields for various
uses, such as cover letter documents, email lists, etc. I discard the
original file name in favor of something streamlined, like
"email_list.csv". Pretty reasonable. But even the smartest system
admin sitting next to me can't do anything about the utterly useless
duplicate files that pile up on the remote server, without some SSH
credentials.

So, where the core developers and decision makers might hesitate to
delete a file that I declare useless, this patch would instead pick up
the slack. The files left behind are 100% useless if I say so.
Consider me using a hosted solution, where I can't be bothered with
logging in each week and clearing abandoned files out of a folder.
The typical "set up a cron job" (or similar) approach doesn't work in
every case.

> This was classified as
> "Small/just bugs" when deciding on features for 1.1 by the way [2].

If that's the case, then that's find it it waits until bugfix phase.
I had thought it to be a feature addition, but I'll adjust my
understanding accordingly!

I hope that a patch which has predated version 1.0 can at last get a
decision made. I'm of course bias toward it being implemented very
nearly as-is, but a decision is at least a decision.

*peacefully waiting for bugfix phase*

Tim
> [1]http://groups.google.com/group/django-developers/browse_thread/thread...
> [2]http://code.djangoproject.com/wiki/Version1.1Features
Reply all
Reply to author
Forward
0 new messages