Altering data uploaded to FileField before save is called

157 views
Skip to first unread message

shadfc

unread,
Sep 10, 2008, 10:13:23 AM9/10/08
to Django developers
My use case is this: I'm using a custom filestorage backend to Amazon
S3. An inline model allows the upload of mp3 files and I am trying to
edit the ID3 tag on these before they get saved to backend. To
accomplish this, I've overridden the save_formset() on the parent
ModelAdmin to do a save(commit=False) and then I edit the instances
and save them individually.

The problem is that FieldFiles get saved to the storage backend even
on a save(commit=False), and (according to a more knowledgeable SeanL-
on #django), the FieldFile will only get saved once, the first time
through. So the only way I can accomplish what I want is to do two
saves to my backend -- one when the formset is saved with commit=False
and then I must call the storage backend's save() directly in order to
upload the changes I've made to the mp3 file in memory.

I think that the commit=False (or save=False in FieldFile.save() )
should not do the save to the storage backend. From what little I
could figure out with some help, the FieldFile would have to be saved
multiple times (but only one actually calling storage.save()) and that
could cause some transitory names on the FieldFile instance. For
example, on the commit=False, storage.get_available_name() would
return an available name for the file, but before the commit=True
happens, another process might save a file to the name given to the
first. In which case, when the first is actually saved, its filename
would have to change.

Another option might be to pass the save argument along to the storage
backend and give the backend designer control over the behavior --
perhaps to reserve a filename in a cache with a short ttl. This would
still require that the save() on FieldFile be called more than once
though.

Jay

Incidentally, if anyone has a "proper" procedure for the actual
editing of the FieldFile's contents, I'd appreciate it if you could
pass that along as well. My way feels like nasty hackery.

Brian Rosner

unread,
Sep 10, 2008, 10:38:14 AM9/10/08
to django-d...@googlegroups.com
On Wed, Sep 10, 2008 at 8:13 AM, shadfc <jay.wi...@gmail.com> wrote:
> The problem is that FieldFiles get saved to the storage backend even
> on a save(commit=False), and (according to a more knowledgeable SeanL-
> on #django), the FieldFile will only get saved once, the first time
> through. So the only way I can accomplish what I want is to do two
> saves to my backend -- one when the formset is saved with commit=False
> and then I must call the storage backend's save() directly in order to
> upload the changes I've made to the mp3 file in memory.

Ouch. In a formset sounds like you could be calling save up to N*2 times.

> I think that the commit=False (or save=False in FieldFile.save() )
> should not do the save to the storage backend.

Agreed. I have been considering treating FileFields the same as
many-to-many fields when commit=False in a ModelForm. In the event
that commit=False a new method is available to perform the file saving
explicitly. This would help in the event of wanting to name the file
the same as the primary key for instance.

>
> Another option might be to pass the save argument along to the storage
> backend and give the backend designer control over the behavior --
> perhaps to reserve a filename in a cache with a short ttl. This would
> still require that the save() on FieldFile be called more than once
> though.

We need to avoid calling save on a file backend as much as possible.
That operation may be very expensive.

Go look for a ticket about this. If there isn't one I would say report
one. This seems like a reasonable thing to do in general.

--
Brian Rosner
http://oebfare.com

Karen Tracey

unread,
Sep 10, 2008, 10:50:39 AM9/10/08
to django-d...@googlegroups.com

I agree with what you're saying, but wouldn't changing whether the file is saved when commit=False be a backwards-incompatible change?  Do we need to preserve current behavior for backwards-compatibility sake and find a way to enable the desired behavior going forward?

Karen

Brian Rosner

unread,
Sep 10, 2008, 12:05:40 PM9/10/08
to django-d...@googlegroups.com
On Wed, Sep 10, 2008 at 8:50 AM, Karen Tracey <kmtr...@gmail.com> wrote:
> I agree with what you're saying, but wouldn't changing whether the file is
> saved when commit=False be a backwards-incompatible change? Do we need to
> preserve current behavior for backwards-compatibility sake and find a way to
> enable the desired behavior going forward?

Yes it would be backward incompatible. I haven't thought this all the
way through, yet. In general the solution I proposed is the ultimate
fix. For a fix like this I still need to get grips on what exactly the
process for this is going to be and when exactly it can be
implemented. A less than ideal way of accomplishing this now in user
code if they need to do this now is to override save on the ModelForm
and use their own save_instance function, but seems to be the best way
avoiding some other really ugly hack and make going forward much
simpler.

Joseph Kocherhans

unread,
Sep 10, 2008, 12:09:41 PM9/10/08
to django-d...@googlegroups.com
On Wed, Sep 10, 2008 at 7:38 AM, Brian Rosner <bro...@gmail.com> wrote:
>
> On Wed, Sep 10, 2008 at 8:13 AM, shadfc <jay.wi...@gmail.com> wrote:
>> The problem is that FieldFiles get saved to the storage backend even
>> on a save(commit=False), and (according to a more knowledgeable SeanL-
>> on #django), the FieldFile will only get saved once, the first time
>> through. So the only way I can accomplish what I want is to do two
>> saves to my backend -- one when the formset is saved with commit=False
>> and then I must call the storage backend's save() directly in order to
>> upload the changes I've made to the mp3 file in memory.
>
> Ouch. In a formset sounds like you could be calling save up to N*2 times.
>
>> I think that the commit=False (or save=False in FieldFile.save() )
>> should not do the save to the storage backend.
>
> Agreed. I have been considering treating FileFields the same as
> many-to-many fields when commit=False in a ModelForm. In the event
> that commit=False a new method is available to perform the file saving
> explicitly. This would help in the event of wanting to name the file
> the same as the primary key for instance.

That sounds like a promising idea to me, but I'd want to see the
details. I was writing some code the other day that was trying to poke
the username into part of the upload_to path. I made it work with a
custom ModelForm __init__ method, but commit=False seems like it would
have been a lot cleaner, or at least that was my first instinct about
what to try.

To address Karen's backwards-incompatibility concern, perhaps a new
'commit_files' argument would be in order.

Joseph

Sean Legassick

unread,
Sep 10, 2008, 12:47:38 PM9/10/08
to django-d...@googlegroups.com

On 10 Sep 2008, at 15:38, Brian Rosner wrote:
>> I think that the commit=False (or save=False in FieldFile.save() )
>> should not do the save to the storage backend.
>
> Agreed. I have been considering treating FileFields the same as
> many-to-many fields when commit=False in a ModelForm. In the event
> that commit=False a new method is available to perform the file saving
> explicitly. This would help in the event of wanting to name the file
> the same as the primary key for instance.

Looking at the code, the issue I see with this is at the moment there
is a mechanism for the storage backend to alter the filename when
saving. If the save was performed separately after the DB update then
a second DB update would be required to store the amended name.

The default FileSystemStorage uses the name change mechanism to avoid
race conditions over conflicting names. It may be that this can be
considered unusual enough that a second DB update in this situation is
acceptable, although a mechanism for this would need to be included.
Anyway I just wanted to flag this potential issue to help with
thinking this through.

Sean

Reply all
Reply to author
Forward
0 new messages