Extending FileField (the way ImageField does)


Serge Wroclawski

Nov 19, 2016, 4:37:37 PM
to Django users
Hi all,

I've been a casual user of Django for years, but recently I've needed to make a new field based on FileField. I decided to take a look at ImageField, since it is very similar to what I'm doing.

Specifically I'm looking at using mutagen (python-mutagen) to get information about audio files. Mutagen can tell me the bitrate, duration and then tags for the audio.

That's quite similar to how ImageField uses PIL underneath, so all's good there.

But in my investigation, I'm coming across some code I am having difficulty following.

Both the ImageField itself and the ImageFileDescriptor class call a method on ImageField called update_dimension_fields.

The docstring says that it updates the field's width and height attributes, but even after reading the code many times, I am not seeing where the height and width are retrieved from the image itself, rather than either being passed in values or values already in the database.

Another question here is in regards to tags. Similar to image's exif data, these audio files can contain tags, and in the case of many formats (such as Ogg Vorbis) this tag data can be relatively arbitrary. It would seem then that I'm stuck with three ugly options:

1. I can make an explicit reference in my new OggVorbisFileField for each field that the user may care about in their model (e.g. title, artist), similar to how it works with ImageField's height and width

2. Instead of referencing each field, have an ugly "tags" field that contains something like a pickled dictionary of key/value pairs

3. Store an arbitrary set of key/value pairs in the DB through a many-to-many relationship for each of the tags

The fields that I care about (title, artist, etc.) will already be stored in my model that will contain the new OggVorbisFileField, so I'm wondering if there's a best practice here that I should try to follow.

Thanks all,

- Serge

Michal Petrucha

Nov 20, 2016, 4:26:43 AM
to Django users
On Sat, Nov 19, 2016 at 01:34:31PM -0800, Serge Wroclawski wrote:
> Hi all,
>
> I've been a casual user of Django for years, but recently have need to make
> a new field based on FileField. I decided to take a look at ImageField,
> since it is very similar to what I'm doing.
>
> Specifically I'm looking at using mutagen (python-mutagen) to get
> information about audio files. Mutagen can tell me the bitrate, duration
> and then tags for the audio.
>
> That's quite similar to how ImageField uses PIL underneath, so all's good
> there.
>
> But in my investigation, I'm coming across some code I am having difficulty
> following.
>
> Both the ImageField itself and the ImageFileDescriptor class call a method
> on ImageField called update_dimension_fields.
>
> The docstring says that it updates the field's width and height attributes,
> but even after reading the code many times, I am not seeing where the
> height and width are retrieved from the image itself, rather than either
> being passed in values or values already in the database.

All right, so ImageFileDescriptor.__set__ calls
self.field.update_dimension_fields(instance, force=True), where
instance is a model instance, and self.field is an ImageField attached
to that model. ImageField.update_dimension_fields, in turn, calls
getattr(instance, self.attname), which runs the descriptor's __get__ –
that one ensures that the result is an instance of ImageFieldFile.

So at this point inside update_dimension_fields, the file variable is
an ImageFieldFile.

Next up, you can see that it does some checking to find out which
dimension fields to update, if any, and finally gets to the important
part:

    # file should be an instance of ImageFieldFile or should be None.
    if file:
        width = file.width
        height = file.height

So here, the dimensions are retrieved from an ImageFieldFile through
its width and height attributes. Where do those come from? If you look
at the ancestors of ImageFieldFile, you'll see that it inherits from
django.core.files.images.ImageFile. This class implements the width
and height properties, which call
django.core.files.images.get_image_dimensions. This is the meat of the
dimension-extracting code, which calls Pillow to do the heavy lifting.
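
To make that chain concrete, here is a stripped-down, plain-Python sketch of
the same pattern applied to audio. It is not Django's code: AudioFile,
AudioField, AudioFileDescriptor, update_metadata_fields and fake_probe are all
made-up names, and fake_probe only stands in for whatever mutagen call you end
up using. It just shows how setting the attribute ends up reading metadata
from the file object.

```python
class AudioFile:
    """Plays the role of ImageFieldFile: metadata is read lazily from the file."""

    def __init__(self, name, probe):
        self.name = name
        self._probe = probe  # hypothetical callable; a mutagen call could go here
        self._info = None    # cache so the file is only inspected once

    def __bool__(self):
        return bool(self.name)

    def _get_info(self):
        if self._info is None:
            self._info = self._probe(self.name)
        return self._info

    @property
    def duration(self):
        return self._get_info()["duration"]

    @property
    def bitrate(self):
        return self._get_info()["bitrate"]


class AudioField:
    """Plays the role of ImageField."""

    def __init__(self, attname, probe, duration_field=None, bitrate_field=None):
        self.attname = attname
        self.probe = probe
        self.duration_field = duration_field
        self.bitrate_field = bitrate_field

    def update_metadata_fields(self, instance, force=False):
        targets = [f for f in (self.duration_field, self.bitrate_field) if f]
        if not targets:
            return
        # getattr runs the descriptor's __get__, so `file` is always an
        # AudioFile here, never a bare string or None.
        file = getattr(instance, self.attname)
        filled = all(getattr(instance, f, None) is not None for f in targets)
        if filled and not force:
            return
        if file:
            duration, bitrate = file.duration, file.bitrate
        else:
            duration = bitrate = None  # no file, so clear the metadata
        if self.duration_field:
            setattr(instance, self.duration_field, duration)
        if self.bitrate_field:
            setattr(instance, self.bitrate_field, bitrate)


class AudioFileDescriptor:
    """Plays the role of ImageFileDescriptor."""

    def __init__(self, field):
        self.field = field

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.field.attname)
        if not isinstance(value, AudioFile):
            value = AudioFile(value, self.field.probe)
            instance.__dict__[self.field.attname] = value
        return value

    def __set__(self, instance, value):
        instance.__dict__[self.field.attname] = value
        self.field.update_metadata_fields(instance, force=True)


def fake_probe(name):
    # Stand-in for real file inspection; returns fixed values.
    return {"duration": 12.5, "bitrate": 128000}


class Track:
    audio = AudioFileDescriptor(AudioField("audio", fake_probe, "duration", "bitrate"))

    def __init__(self, audio=None):
        self.duration = None
        self.bitrate = None
        self.audio = audio  # triggers __set__, which fills the fields above


track = Track("song.ogg")
print(track.duration, track.bitrate)
```

The point to notice is that nothing in update_metadata_fields inspects the
file directly; it only reads properties on the file object, which is why the
extraction code is not visible when you read ImageField on its own.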

I hope this makes it at least a little bit clearer.

> Another question here is in regards to tags. Similar to image's exif data,
> these audio files can contain tags, and in the case of many formats (such
> as Ogg Vorbis) this tag data can be relatively arbitrary. It would seem
> then that I'm stuck with three ugly options:
>
> 1. I can make an explicit reference in my new OggVorbisFileField for each
> field that the user may care about in their model (ie title, artist)
> similar to how it works with ImageField's height and width
>
> 2. Instead of referencing each field, have an ugly "tags" field that
> contains something like a pickle'd dictionary of key/value pairs
>
> 3. Make an arbitrary set of key/value pairs in the DB through a many/many
> relationship for each of the tags
>
> The fields that I care about (title, artist, etc.) will already be stored
> in my model that will contain the new OggVorbisField, so I'm wondering if
> there's a best practice here that I should try to follow.

If the set of fields that you care about is a small one, and you
already have regular model fields for them, then personally, I'd
choose the first option. If you have any good reason to store all
metadata, then both options 2 and 3 make sense, although when it comes
to option 2, you shouldn't pickle dicts. Instead, if you're using
PostgreSQL, just use an HStoreField or a JSONField; on other databases,
you can use one of the many JSON field implementations backed by a
simple TextField. Option 2 will be easier to implement and maintain
than option 3, I think, whereas option 3 is more “pure” in terms of
relational database design.
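
For the TextField-backed variant of option 2, a rough sketch of the
serialization step might look like the following. The helper names
tags_to_json and tags_from_json are made up, and it assumes the tags arrive
as a mapping from tag name to a list of strings; check what your tag library
actually hands you before relying on that shape. Lowercasing the keys is also
just a choice made here so that lookups are predictable.

```python
import json


def tags_to_json(tags):
    """Serialize a mapping of tag name -> list of values to a JSON string."""
    normalized = {
        str(key).lower(): [str(value) for value in values]
        for key, values in tags.items()
    }
    # sort_keys gives a stable string, which keeps diffs and comparisons sane.
    return json.dumps(normalized, sort_keys=True)


def tags_from_json(text):
    """Inverse of tags_to_json; an empty column value yields an empty dict."""
    return json.loads(text) if text else {}


stored = tags_to_json({"TITLE": ["Some Song"], "artist": ["A", "B"]})
print(stored)
print(tags_from_json(stored)["artist"])
```

Unlike a pickle, the stored value is plain text that other tools can read,
and loading it cannot execute arbitrary code.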

Good luck,

Michal