class Song(models.Model):
album = models.CharField(maxlength=255)
title = models.CharField(maxlength=255)
artist = models.CharField(maxlength=255)
track = models.PositiveSmallIntegerField()
year = models.PositiveIntegerField()
genre = models.CharField(maxlength=40)
file = models.FileField(upload_to='songs',
filename='%(album)s_%(track)s_%(title)s.mp3')
However, this raises two concerns, both stemming from the fact that
given people a cookie will make them want a glass of milk:
* Many (though I expect not all) will believe that the filename would
update automatically when the model itself is updated. This is
obviously not the case, and the documentation would try to make this
clear, but it's a whole round of questions I expect to hear anyway.
* An often-requested variant on this scheme is the ability to
determine subdirectories based on instance data as well. For instance:
..., filename='%(genre)s/%(year)s/%(artist)s/%(album)s_%(track)s_%(title)s.mp3'
Currently, get_filename uses os.path.basename, and after filestorage
goes in, get_available_filename will do the same. This means that any
directory information is lost, unless it's preserved separately. It's
possible, but will take a bit of doing.
Frankly, I'm not sure it's worth it, given the above concerns, but
since working with filestorage, I've been paying attention to
FileField gripes, and this comes up more often than anything else I've
seen. Do you guys think this is worth implementing?
It wouldn't be part of filestorage itself, but how it gets implemented
will depend a bit on when filestorage makes it into trunk.
I don't understand this paragraph. Currently, if you use upload_to, we
store the full upload_to prefix + the filename (with the implicit
assumption that it's all appended to MEDIA_ROOT), both in the database
field and in the model attribute. Why wouldn't you do the same -- store
the full relative path -- in the new world? What am I missing.
Unless you allow subdirectory support, I doubt it's worth adding much
customisation here, since it's not going to be too useful. My impression
has been that allowing customisation inside a Model.save() override is
the real wish -- so rather than having to specify upload_to (or filename
in your case) up-front, it can be changed a bit more dynamically just
prior to saving.
Malcolm
--
On the other hand, you have different fingers.
http://www.pointy-stick.com/blog/
Well, that was probably an unnecessary point, really. That was a
drawback in the proof-of-concept implementation I threw together last
night, because I was just replacing the original filename with a
model-based one. That filename would then be appended to the upload_to
argument before being stored in the database. But os.path.basename was
dropping the subdirectory portion of the filename, so
'/songs/2007/Test_song.mp3' was being stored as
'/songs/Test_song.mp3'. Like I said though, doing it right is
possible, it just needs more work than I put into it last night. So
ignore that point.
> Unless you allow subdirectory support, I doubt it's worth adding much
> customisation here, since it's not going to be too useful. My impression
> has been that allowing customisation inside a Model.save() override is
> the real wish -- so rather than having to specify upload_to (or filename
> in your case) up-front, it can be changed a bit more dynamically just
> prior to saving.
Quite true. Unfortunately, the filestorage patch will actually
complicate the case of FileField saving even more than it already is,
by shifting the actual saving into a completely separate package.
Right now, it's possible to override either _save_FIELD_file() or its
curried counterpart to customize things a bit more. With filestorage
in place (as it stands now), it would require subclassing FileField as
well as the new File class, just to be able to override save_file()
for the necessary effect.
A simple change to how FileField figures out which File class it uses
would remove the necessity to subclass FileField, but there's still
the matter of having to import and subclass the File class to do this,
and those are guts I'd rather leave inside, except for cases where
those kinds of customizations are really necessary, such as adding new
file types.
All in all, I guess I'm torn. Some kind of filename/subdirectory
format string avoids numerous questions on how to subclass File or
ImageFile just to use a user id as a subdirectory. But filestorage's
save_file() is much smaller than _save_FIELD_file(), and so overriding
it might not be that terrible, I suppose, and it would save code on
our end, and make it more flexible. Adding in an optional
'subdirectory' argument would make overriding it even easier.
Well, I'll think about it some more and see if I can work up a
reasonable solution to it. I won't add it to the existing filestorage
patch though, that's already handling enough.
-Gul
...
> file = models.FileField(upload_to='songs',
> filename='%(album)s_%(track)s_%(title)s.mp3')
>
> However, this raises two concerns, both stemming from the fact that
> given people a cookie will make them want a glass of milk:
Why not allow passing a callable that can do whatever it likes? Then
if they want milk they can write it themselves....
file = models.FileField(upload_to='songs', filename=mysuperfilenamefunc)
def mysuperfilenamefunc(model):
return "%(album)s_%(track)s_%(title)s.mp3" % model.__dict__
Rob :)
Hi,
I am one of the people who asked for this. I only want to use the
primary key for a directory name. I think a filename is not enough:
I don't want to store the files under MEDIA_ROOT. Otherwise you can't
use access control, since the request gets served by apache/lighttpd, not
django.
Example: if an Object class should have N attachments:
class Object(models.Model):
pass
class Attachment(models.Model):
object=models.ForeignKey(Object)
filename=models.FileField()
The file should be saved under /non-public-path/attachments/OBJECT_ID/
For me, it's enough to store the basename in the database. The
leading directory (/non-puglic-path/attachments) could be stored in the
models.py file as keyword argument:
filename=models.FileField(media_root="/...")
The idea of Robert Coup to use a callback looks good. Nevertheless, using
anything other than a primary key is most of the time nonsense, since
attribute can change, but the physical filename does not (except you have a
script which updates the filesystem).
> Frankly, I'm not sure it's worth it, given the above concerns, but
> since working with filestorage, I've been paying attention to
> FileField gripes, and this comes up more often than anything else I've
> seen. Do you guys think this is worth implementing?
>
> It wouldn't be part of filestorage itself, but how it gets implemented
> will depend a bit on when filestorage makes it into trunk.
Please announce it here, if you update your patches. I will try them.
Or send a email to h...@tbz-pariv.de.
Thomas
Well, whether it's stored under MEDIA_ROOT is unrelated to whether it
uses just a filename for each instance. The current filestorage patch
supports storing files anywhere on your filesystem (as long as you set
the proper permissions), not just under MEDIA_ROOT.
> Example: if an Object class should have N attachments:
>
> class Object(models.Model):
> pass
>
> class Attachment(models.Model):
> object=models.ForeignKey(Object)
> filename=models.FileField()
>
> The file should be saved under /non-public-path/attachments/OBJECT_ID/
This is definitely a useful storage strategy. And, in fact, it
illustrates that a callback would indeed be the correct course of
action. A format string for this setup would require the use of
"object_id" explicitly, which most people shouldn't have to worry
about. And a format string would be completely incapable of following
relationships, so if you had a Category model as well, and wanted
/CATEGORY_ID/OBJECT_ID/, it just couldn't be done.
> For me, it's enough to store the basename in the database. The
> leading directory (/non-puglic-path/attachments) could be stored in the
> models.py file as keyword argument:
>
> filename=models.FileField(media_root="/...")
With the filestorage patch as it stands right now, you'd actually do
something like this:
from django.db import models
from django.core.filestorage.filesystem import FileSystemBackend
storage = FileSystemBackend(location='/non-public-path/attachments')
class Object(models.Model):
pass
class Attachment(models.Model):
object=models.ForeignKey(Object)
filename=models.FileField(backend=storage)
> The idea of Robert Coup to use a callback looks good.
The idea is growing on me, especially once I considered a convention
that would make it look really slick:
class Attachment(models.Model):
object=models.ForeignKey(Object)
def get_filename(self, filename):
return '%s/%s' % (self.object.pk, filename)
filename=models.FileField(backend=storage, filename=get_filename)
This way, get_filename() becomes a bound method on the model, and can
be used elsewhere, just to see what the filename would be for the
object, without retrieving it. Of course, this is just one convention
that a simple callback would enable, but I like it anyway.
One addition to the callback idea: As I show above, it should
definitely include a filename argument, which would be the original
filename. That way, in cases like yours, you can modify it with
details from the object, but still include the original filename.
> Nevertheless, using
> anything other than a primary key is most of the time nonsense, since
> attribute can change, but the physical filename does not (except you have a
> script which updates the filesystem).
I agree with you there, but people won't always assume this to be the case.
> Please announce it here, if you update your patches. I will try them.
> Or send a email to h...@tbz-pariv.de.
I'll definitely announce it here to get more feedback.
One last question, folks. I'm not a big fan of using the name
"filename" for the callback argument, because it might confuse people
into thinking that the callback will always be used to determine the
filename. Once it's stored, of course, the database value will be used
instead, and I think this should be made clear.
I think something like "get_filename" might work? Or perhaps even
"default"? The standard is to have the "default" callback not take any
arguments, but it might make more sense to at least reuse the name,
but with FileField calling it with the instance and filename as
arguments. Thoughts?
-Gul
In doing so, I ran once again into a concern I had a while back,
regarding the strftime stuff currently supported by FileField. I've
gotten them to play nicely with the new filename callable, but it
makes for a bit of overlap, and i was wondering if it'd be possible to
merge them somehow. Consider the following example:
def get_filename(obj, filename):
return '%s/%s' % (obj.id, filename)
file = models.FileField(upload_to='songs/%Y/%m/%d', filename=get_filename)
Currently, this would store the path as something like
'songs/2007/10/26/13/song.mp3', by having to combine the two methods.
It's not difficult to do this, but it seems to me that it might make
more sense to try to move the strftime stuff out from built-in
functionality to just another callable-driven path modification.
def get_filename(obj, filename):
return '%s/%s/%s' % (strftime('%Y/%m/%d'), obj.id, filename)
file = models.FileField(upload_to='songs', filename=get_filename)
But then there's some confusion as to what would happen if someone
tried the first example mentioned above. Should it still also respect
the strftime formatting in upload_to?
One way feels less "clean", but the other has potential real-world
problems. Any advice? Am I missing a third option that solves it all?
Also, speaking of strftime formatting, the current filestorage patch
moves that feature into the FileSystemBackend, rather than in the
backend-neutral sections. That means that any additional backends
would have to duplicate it if they want to use it. This seems
appropriate to me, though, because many other backends wouldn't
benefit as much from it. Thoughts?
-Gul
os.path.join('/my/path/', '/something/beginning/with/slash/') the
result is '/something/beginning/with/slash/' - which is not
necessarily intuitive for someone new to os.path...
Hmm. If i don't want to use upload_to, and just have a callable to
generate the entire path, then upload_to would need to be made
optional, and some check done that at least one of them is defined. It
doesn't look so nice having FileField(upload_to='',
filename=my_super_path_generator)
Rob :)
Hrm, to be honest, I hadn't really considered that. To me, upload_to
makes the most sense as a directory path only, rather than including a
filename. Especially since, how would we be able to tell whether
upload_to specifies a filename or a directory? We have to know if
we're supposed to replace the original filename or append it to a
path. If you have any ideas on how to do that, I'm all ears.
-Gul
Hrm. That would require two different sets of semantics depending on
the value passed in as the argument. I'm not really thrilled about
that idea, but it certainly does seem like a better approach than
anything else I've considered.
I don't think there's a single tactic that will feel 100% appropriate,
so any solution will probably feel at least a little bit dirty. Do any
of the core devs have an opinion on this proposal?
-Gul
I haven't been concentrating too much, since I'm pretty overloaded with
other things (both Django and real life) at the moment. But my general
impression is in line with yours, Marty: having two attributes and a
complex system of "if this and that then this happens" is flawed design.
Ideally, we should settle on one way to do this. For bonus points (and
strongly desired), make it easy to port from the existing code to the
new code -- e.g. via a simple search-and-replace -- if some kind of
backwards compatibility is needed.
Malcolm
--
The sooner you fall behind, the more time you'll have to catch up.
http://www.pointy-stick.com/blog/
Well, I can see two options for backwards-compatibility:
- Tolerate either both strings and callables as values for upload_to:
If a string is provided, it's passed to a utility function that
returns a callable to process a strftime string. Then all the rest of
the code could just assume a callable, but existing code would work
unmodified.
- Provide the utility function as a way to wrap existing upload_to
argument strings. Something like this:
Old:
file = models.FileField(upload_to='%Y/%m/%d')
New:
from django.utils.dates import lazy_strftime
file = models.FileField(upload_to=lazy_strftime('%Y/%m/%d'))
Not exactly a pure search-and-replace (unless you use regex), but
pretty close. And no, I'm not proposing any particular location for
that utility function, just throwing it somewhere for the sake of an
example.
-Gul
Aren't you still left with one option for upload_to and one for
specifying the full details of the filename in this version? Two options
is at least twice as many as we need.We have to live with (which means
maintain, document and explain) whatever we decide here for a long time.
It will be in 1.0, after all. The idea is to introduce any necessary
backwards incompatibilities into pre-1.0 code and have an API we like.
We've been fairly consistent (not perfect ,though) at keeping only one
way of doing things in place to date. That being the Python way and all,
it's a good goal.
> - Provide the utility function as a way to wrap existing upload_to
> argument strings. Something like this:
>
> Old:
>
> file = models.FileField(upload_to='%Y/%m/%d')
>
> New:
>
> from django.utils.dates import lazy_strftime
>
> file = models.FileField(upload_to=lazy_strftime('%Y/%m/%d'))
We just got rid of LazyDate. Reintroducing it seems inconsistent. I'd
rather you tried to come up with a way that a single callable can be
used to specify detailed control. Then people can write whatever they
like in their callables. I don't know what this API looks like and, as I
mentioned previously, I haven't had time to put a lot of thought into it
(and won't for a while yet), but that would be the goal I'd be putting a
lot of effort into. One approach is to answer the question "if I only
want to customise the directory portion (the bit that goes after
MEDIA_PREFIX), what do I need to put in the filename (basename) part of
the string I generate?"
Regards,
Malcolm
--
Depression is merely anger without enthusiasm.
http://www.pointy-stick.com/blog/
so, variant 1 (backward-compatible one):
my_file = models.FileField(upload_to="%Y/%m/%d")
variant 2 (works only in new revisions):
my_file = models.FileField(upload_to=get_file_path)
def get_file_path(obj):
return "%s/%s/" % (datetime.now.strftime("%Y/%m/%d"), obj.type)
variant 3 (works only in new revisions):
my_file = models.FileField(filename=get_file_path)
def get_file_path(obj, filename):
return "%s/%s/%s" % (datetime.now.strftime("%Y/%m/%d"), obj.id, filename)
--
Best regards, Yuri V. Baburov, ICQ# 99934676, Skype: yuri.baburov,
MSN: bu...@live.com
Okay, I'm fine with that. I expect FileField isn't used by the
majority of people, so its impatc will be less than some other
backwards-incompatible changes. I just don't want to trivialize how
many people are in fact using it. And if we go to just one way of
doing it, and that involves a callable, it will require changes to
every FileField instance, regardless of whether people actually want
fine-grained control.
> We just got rid of LazyDate. Reintroducing it seems inconsistent.
Agreed.
> I'd rather you tried to come up with a way that a single callable can be
> used to specify detailed control. Then people can write whatever they
> like in their callables.
Well, that's exactly the end goal of this proposal. Where a single
callable can customize the destination path and filename, based on the
model instance, original filename, other arbitrary data (such as
date/time), or any combination thereof.
> I don't know what this API looks like and, as I
> mentioned previously, I haven't had time to put a lot of thought into it
> (and won't for a while yet), but that would be the goal I'd be putting a
> lot of effort into.
To sum up what we've worked out so far, the API would look basically like this:
def make_path(obj, filename):
year = date.today().year
return 'photos/%s/%s/%s' % (year, obj.pk, filename)
file = models.FileField(upload_to=make_path)
> One approach is to answer the question "if I only
> want to customise the directory portion (the bit that goes after
> MEDIA_PREFIX), what do I need to put in the filename (basename) part of
> the string I generate?"
Well, this also raises the question of, "What do I need to know just
to specify a subdirectory"? Using a single callable approach for all
cases requires the common case to look like this:
def make_path(obj, filename):
return 'photos/%s' % filename
file = models.FileField(upload_to=make_path)
That's a far cry from the existing technique:
file = models.FileField(upload_to='photos')
It's been a long weekend and this is making my head hurt. I'll think
about it more this week and see what happens.
-Gul
Well, the original idea was indeed to offer a new argument for
cusotmizing things. What I definitely don't want to do, though, is to
offer three separate options, as you're proposing. I wasn't completely
opposed to a choice of two arguments, but also offering a callable to
be passed to upload_to ... I don't want to go there. Just documenting
it would be a nightmare. :(
-Gul
Hi,
I would call the key word argument filename_callback.
Both arguments are XOR. You must provide only one of them.
If both are given, an exception is raised.
This is backward compatible and flexible.
I think that result of filename_callback must be an absolute path.
Thomas
Am Freitag, 26. Oktober 2007 15:39 schrieb Marty Alchin:
> Okay, just a quick update. I don't have a new patch ready yet, because
> I wanted to outline a couple things and ask another question. I've
> implemented the callable method for determining filenames, including
> the ability to include a subdirectory in the filename. Also, ignore my
> comment about reusing "default" for the callable argument. Doing so
> causes more problems than it solves, and it's not really better
> anyway. So, "filename" is what I'm going with for now.
>
> In doing so, I ran once again into a concern I had a while back,
> regarding the strftime stuff currently supported by FileField. I've
> gotten them to play nicely with the new filename callable, but it
> makes for a bit of overlap, and i was wondering if it'd be possible to
> merge them somehow. Consider the following example:
>
> def get_filename(obj, filename):
> return '%s/%s' % (obj.id, filename)
>
> file = models.FileField (upload_to='songs/%Y/%m/%d', filename=get_filename)
To me having the two mutually-exclusive arguments sounds too much like trying to fake static typing in a dynamic language. I don't see the advantage of separate names for basically the same information provided in different ways. If it walks like a duck...
if callable(upload_to):
self.upload_to, filename = os.path.split(upload_to())
else:
self.upload_to = upload_to