Proposal for making Storage API timezone aware

53 views
Skip to first unread message

James Aylett

unread,
Sep 7, 2015, 12:52:59 PM9/7/15
to Django developers (Contributions to Django itself)

This is something that's been bugging me for a while, but it's tricky
to get a good compatibility story. Here I'll lay out what's going on
now including obstacles to change, where I think we should be and a
possible route for getting there.


# Current situation

Storage backends are documented as returning naive datetime objects
from a number of calls [1]. (Currently no similar calls return aware
objects ever.)

One significant user of these calls is staticfiles, which compares
modified_time between (possibly) different storage backends when doing
collectstatic as a way of avoiding copying static asset files when not
needed.

Storage backends can be freely implemented, and a number of third
party extensions do so. Two popular ones are (there are likely more):

 * django-compressor (which actually overrides the datetime-returning
   methods; django-pipeline also has custom backends, but does not)
 
 * django-storages (including support for a large number of "remote"
   backends, and some others -- one of the most important being
   S3Boto); this is mildly complicated by at least one fork,
   django-storages-redux, although the original now seems stalled
   (which is why that fork exists)

Additionally, various other extensions (and user code) consume from
one or more of the time methods. (For instance, django-filebrowser
does; this itself has various forks.) In many cases, they won't actually
care whether they're being passed an aware or naive datetime.

This would only be a backwards compatibility problem, except that
S3Boto returns a naive datetime in UTC, but FileSystemStorage (and
others) return in the local timezone. This results in changed files not
being overwritten [2] if your time zone is "west" of UTC. (This is a bug
in django-storages, which came about from when the documentation
didn't mention timezones at all for these methods.)





# What should happen

I think the following should be the long-term goal:

1. If USE_TZ=False then datetimes returned from storage backends should be
   naive in local time
   
2. If USE_TZ=True then datetimes returned from storage backends should be
   aware in UTC

This would I believe match the behaviour of other TZ-aware systems in
Django. It should allow, for instance, for datetimes coming from
storage backends to round trip through the ORM without warnings.


# A route through this

To go straight to the end result is a significant enough breaking
change that it would annoy a lot of people. Here's one approach that I
think would be fairly painless for people relying on this to Just Work.

We want the following properties in our approach:

 * ensure that clients of storage engines get a suitable deprecation
   warning and timeline to enable them to move

 * provide a BC approach for clients of storage engines so they can write
   code that works from LTS to LTS

 * ensure that implementors of storage engines get a suitable deprecation
   warning and timeline to enable them to move (this will probably also
   involve users of those storage engines getting the warnings)

 * provide a BC approach for implementors of storage engines so they can
   write code that works from LTS to LTS

I've included steps where extensions have to change as well as changes
to Django; I think it's important in this case to make sure that
popular extensions are moved forward. This is obviously dependent
somewhat on the communities around those extensions, so there is some
risk there.

There are other ways of doing things to this, but I think this is a clean
approach. There's almost certainly something I've forgotten or simply got
wrong though.


## Step 1: introduce new methods

get_modified_time(name)
get_creation_time(name)
get_accessed_time(name)

Same signature as the old ones, but with "get_" at the front. Return
naive or aware datetime objects as dictated by USE_TZ and described
above.

Change core code (I believe this is still only staticfiles'
collectstatic) to use the new methods.

Put deprecation warnings on the three old methods in both Storage (the
base class) and FileSystemStorage in core.

Implement the new methods for FileSystemStorage.

Implement the new methods for Storage in terms of the old methods,
with deprecation warnings.


Outcomes:

 * client code using storage backends gets deprecation warnings until
   they move to the new methods

 * storage backends that don't implement the new methods will result
   in deprecation warnings when clients (including staticfiles'
   collectstatic) use new methods

 * all old code continues to work, with deprecation warnings emitted
   by Django (except where 3rd party storage backends have overridden
   the old methods)

 * lands a long-term API that supports timezones and USE_TZ correctly
   (although storage backend implementors will still have to support
   both for now)

 * there are distinct deprecation warnings for the two types of code
   that need updating.


## Step 2: update 3rd party storage backends

Work with django-storages-redux, django-compressor and any other
popular extensions to directly implement the new methods and mark
the old methods as deprecated.

(The deprecation warnings need to be extension-specific, because their
messages need to cover people on older versions of Django and discuss
the extension's policy to supporting those.)


Outcomes:

 * if using the latest Django and the latest extension, no deprecation
   warnings unless client code uses the old methods

 * the latest extension with a previous version of Django continues to
   work, with deprecation warnings emitted by the extension

 * all old client code continues to work, with deprecation warnings
   emitted by the extension

 * fixes the bug with static files not being collected sometimes (this
   could be fixed earlier separately to all this)


## Step 3: require new methods

Replace the Storage default implementation of the new methods so they
just raise NotImplementedError (as the old methods currently do).

Drop the old methods completely at this point.


Outcomes:

 * if using the latest Django and extensions from step 2, everything
   works

 * if using the latest Django and extensions from before step 2,
   you'll start seeing NotImplementedError (given we'll be able to plan
   the version of Django where this will happen, the extensions can
   be explicit about compatibility in their requirements)

 * if using an older version of Django, everything is as for step 2 (with
   deprecation warnings)


## Step 4: tidy up

django-storages-redux, django-compressor and any other popular
extensions can remove the old methods.


Outcomes:

 * drop support for pre-step 1 versions of Django

 * step 1 version of Django will use the new methods directly; no problems

 * step 3 version of Django will use the new methods directly; no problems


# Timelines

It would be preferable to get step 1 done by the next LTS (1.11), so
that third party and client code compatible with 1.8 LTS continues to
work with 1.11, with deprecation warnings.

Ideally step 2 will happen soon after step 1 (once that version of
Django is released), because it introduces a deprecation warning if
you're using a storage backend that overrides the old methods with
client code that calls those methods. (I don't think there's anything
more we can do to help here. Providing the storage backends are
updated to include the new methods, they'll be compatible with the
step 3 Django version, and if they still contain undeprecated old
methods then that's just part of their own API they need to support.)

My understanding is that step 3 could follow two releases after step
1, unless that itself is LTS.

If extensions are aiming to support two LTS releases, step 4 probably
can't be done until the next LTS _after_ 1.11. It's actually not a
huge problem if it never happens, beyond carrying additional code.


# Getting it done

Each of these steps is individually a relatively small changeset.

Once people are happy with the approach, I'm happy to try to get step
1 done [*]. For step 2 I'll certainly submit a PR for
django-storages-redux (as well as S3Boto, the SFTP and FTP backends
will need updating). Step 3 is a fairly minor cleanup (again, I'm
happy to do it when the time comes.) Step 4 would be down to the
relevant extension authors.


[*] although from what I remember last time I looked into this, the
tests around staticfiles have some...issues that may need addressing
at the same time.

Carl Meyer

unread,
Sep 7, 2015, 2:07:15 PM9/7/15
to django-d...@googlegroups.com
Hi James,

This proposal makes good sense to me. I think the problem is worth
fixing (I saw this problem on a previous project and never did get to
the bottom of what was causing it!), and you've carefully identified a
backwards-compatible route to fixing it, where no existing code breaks
(further) without a deprecation path.

Having to switch to new names for those methods doesn't bother me; the
new names you're proposing (with `get_` prepended) are at least as good
as the old ones. Methods should ideally be named as verb phrases anyway,
and properties/attributes as nouns.

Carl
> --
> You received this message because you are subscribed to the Google
> Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-develop...@googlegroups.com
> <mailto:django-develop...@googlegroups.com>.
> To post to this group, send email to django-d...@googlegroups.com
> <mailto:django-d...@googlegroups.com>.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/20ca02c9-be27-47ce-a95f-e62396c73de8%40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/20ca02c9-be27-47ce-a95f-e62396c73de8%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

signature.asc
Reply all
Reply to author
Forward
0 new messages