Multiple timezone support for datetime representation

3091 views
Skip to first unread message

Aymeric Augustin

unread,
Sep 3, 2011, 11:40:52 AM9/3/11
to django-d...@googlegroups.com
Hello,

The GSoC proposal "Multiple timezone support for datetime representation" wasn't picked up in 2011 and 2010. Although I'm not a student and the summer is over, I'd like to tackle this problem, and I would appreciate it very much if a core developer accepted to mentor me during this work, GSoC-style.

Here is my proposal, following the GSoC guidelines. I apologize for the wall of text; this has been discussed many times in the past 4 years and I've tried to address as many concerns and objections as possible.

Definition of success
---------------------

The goal is to resolve ticket #2626 in Django 1.4 or 1.5 (depending on when 1.4 is released).

Design specification
--------------------

Some background on timezones in Django and Python
.................................................

Currently, Django stores datetime objects in local time in the database, local time being defined by the TIME_ZONE setting. It retrieves them as naive datetime objects. As a consequence, developers work with naive datetime objects in local time.

This approach sort of works when all the users are in the same timezone and don't care about data loss (inconsistencies) when DST kicks in or out. Unfortunately, these assumptions aren't true for many Django projects: for instance, one may want to log sessions (login/logout) for security purposes: that's a 24/7 flow of important data. Read tickets #2626 and #10587 for more details.

Python's standard library provides limited support for timezones, but this gap is filled by pytz <http://pytz.sourceforge.net/>. If you aren't familiar with the topic, strongly recommend reading this page before my proposal. It explains the problems of working in local time and the limitations of Python's APIs. It has a lot of examples, too.

Django should use timezone-aware UTC datetimes internally
.........................................................

Example : datetime.datetime(2011, 09, 23, 8, 34, 12, tzinfo=pytz.utc)

In my opinion, the problem of local time is strikingly similar to the problem character encodings. Django uses only unicode internally and converts at the borders (HTTP requests/responses and database). I propose a similar solution: Django should always use UTC internally, and conversion should happen at the borders, i.e. when rendering the templates and processing POST data (in form fields/widgets). I'll discuss the database in the next section.

Quoting pytz' docs: "The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans." I think we can trust pytz' developers on this topic.

Note that a timezone-aware UTC datetime is different from a naive datetime. If we were using naive datetimes, and assuming we're using pytz, a developer could write:

mytimezone.localize(datetime_django_gave_me)

which is incorrect, because it will interpret the naive datetime as local time in "mytimezone". With timezone-aware UTC datetime, this kind of errors can't happen, and the equivalent code is:

datetime_django_gave_me.astimezone(mytimezone)

Django should store datetimes in UTC in the database
....................................................

This horse has been beaten to death on this mailing-list so many times that I'll keep the argumentation short. If Django handles everything as UTC internally, it isn't useful to convert to anything else for storage, and re-convert to UTC at retrieval.

In order to make the database portable and interoperable:
- in databases that support timezones (at least PostgreSQL), the timezone should be set to UTC, so that the data is unambiguous;
- in databases that don't (at least SQLite), storing data in UTC is the most reasonable choice: if there's a "default timezone", that's UTC.

I don't intend to change the storage format of datetimes. It has been proposed on this mailing-list to store datetimes with original timezone information. However, I suspect that in many cases, datetimes don't have a significant "original timezone" by themselves. Furthermore, there are many different ways to implemented this outside of Django's core. One is to store a local date + a local time + a place or timezone + is_dst flag and skip datetime entirely. Another is to store an UTC datetime + a place or timezone. In the end, since there's no obvious and consensual way to implement this idea, I've chosen to exclude it from my proposal. See the "Timezone-aware storage of DateTime" thread on this mailing list for a long and non-conclusive discussion of this idea.

I'm expecting to take some flak because of this choice :) Indeed, if you're writing a multi-timezone calendaring application, my work isn't going to resolve all your problems — but it won't hurt either. It may even provide a saner foundation to build upon. Once again, there's more than one way to solve this problem, and I'm afraid that choosing one would offend some people sufficiently to get the entire proposal rejected.

Django should convert between UTC and local time in the templates and forms
...........................................................................

I regard the problem of local time (in which time zone is my user?) as very similar to internationalization (which language does my user read?), and even more to localization (in which country does my user live?), because localization happens both on output and on input.

I want controllable conversion to local time when rendering a datetime in a template. I will introduce:
- a template tag, {% localtime on|off %}, that works exactly like {% localize on|off %}; it will be available with {% load tz %};
- two template filters, {{ datetime|localtime }} and {{ datetime|utctime }}, that work exactly like {{ value|localize }} and {{ value|unlocalize }}.

I will convert datetimes to local time when rendering a DateTimeInput widget, and also handle SplitDateTimeWidget and SplitHiddenDateTimeWidget which are more complicated.

Finally, I will convert datetimes entered by end-users in forms to UTC. I can't think of cases where you'd want an interface in local time but user input in UTC. As a consequence, I don't plan to introduce the equivalent of the `localize` keyword argument in form fields, unless someone brings up a sufficiently general use case.

How to set each user's timezone
...............................

Internationalization and localization are based on the LANGUAGES setting. There's a widely accepted standard to select automatically the proper language and country, the Accept-Language header.

Unfortunately, some countries like the USA have more than one timezone, so country information isn't enough to select a timezone. To the best of my knowledge, there isn't a widely accepted way to determine the timezones of the end users on the web.

I intend to use the TIME_ZONE setting by default and to provide an equivalent of `translation.activate()` for setting the timezone. With this feature, developers can implement their own middleware to set the timezone for each user, for instance they may want to use <http://pytz.sourceforge.net/#country-information>.

This means I'll have to introduce another thread local. I know this is frowned upon. I'd be very interested if someone has a better idea.

It might be no longer necessary to set os.environ['TZ'] and run time.tzset() at all. That would avoid a number of problems and make Windows as well supported as Unix-based OSes — there's a bunch of tickets in Trac about this.

I'm less familiar with this part of the project and I'm interested in advice about how to implement it properly.

Backwards compatibility
.......................

Most previous attempts to resolve have stumbled upon this problem.

I propose to introduce a USE_TZ settings (yes, I know, yet another setting) that works exactly like USE_L10N. If set to False, the default, you will get the legacy (current) behavior. Thus, existing websites won't be affected. If set to True, you will get the new behavior described above.

I will also explain in the release notes how to migrate a database — which means shifting all datetimes to UTC. I will attempt to develop a script to automate this task.

Dependency on pytz
..................

I plan to make pytz a mandatory dependency when USE_TZ is True. This would be similar to the dependency on on gettext when USE_I18N is True.

pytz gets a new release every time the Olson database is updated. For this reason, it's better not to copy it in Django, unlike simplejson and unittest2.

It was split from Zope some time ago. It's a small amount of clean code and it could be maintained within Django if it was abandoned (however unlikely that sounds).

Miscellaneous
.............

The following items have caused bugs in the past and should be checked carefully:

- caching: add timezone to cache key? See #5691.
- functions that use LocalTimezone: naturaltime, timesince, timeuntil, dateformat.
- os.environ['TZ']. See #14264.
- time.tzset() isn't supported on Windows. See #7062.

Finally, my proposal shares some ideas with https://github.com/brosner/django-timezones; I didn't find any documentation, but I intend to review the code.

About me
--------

I've been working with Django since 2008. I'm doing a lot of triage in Trac, I've written some patches (notably r16349, r16539, r16548, also some documentation improvements and bug fixes), and I've helped to set up continuous integration (especially for Oracle). In my day job, I'm producing enterprise software based on Django with a team of ten developers.

Work plan
---------

Besides the research that's about 50% done, and discussion that's going to take place now, I expect the implementation and tests to take me around 80h. Given how much free time I can devote to Django, this means three to six months.

Here's an overview of my work plan:

- Implement the USE_TZ flag and database support — this requires checking the capabilities of each supported database in terms of datetime types and time zone support. Write tests, especially to ensure backwards compatibility. Write docs. (20h)

- Implement timezone localization in templates. Write tests. Write docs. (10h)

- Implement timezone localization in widgets and forms. Check the admin thoroughly. Write tests. Write docs. (15h)

- Implement the utilities to set the user's timezone. Write tests. Write docs. (15h)

- Reviews, etc. (20h)

What's next?
------------

Constructive criticism, obviously :) Remember that the main problems here are backwards-compatibility and keeping things simple.

Best regards,

--
Aymeric.

Annex: Research notes
---------------------

Wiki
....

[GSOC] https://code.djangoproject.com/wiki/SummerOfCode2011#Multipletimezonesupportfordatetimerepresentation

Relevant tickets
................

#2626: canonical ticket for this issue

#2447: dupe, an alternative solution
#8953: dupe, not much info
#10587: dupe, a fairly complete proposal, but doesn't address backwards compatibility for existing data

Relevant related tickets
........................

#14253: how should "now" behave in the admin when "client time" != "server time"?

Irrelevant related tickets
..........................

#11385: make it possible to enter data in a different timezone in DateTimeField
#12666: timezone in the 'Date:' headers of outgoing emails - independant resolution

Relevant threads
................

2011-05-31 Timezone-aware storage of DateTime
http://groups.google.com/group/django-developers/browse_thread/thread/76e2b486d561ab79

2010-08-16 Datetimes with timezones for mysql
https://groups.google.com/group/django-developers/browse_thread/thread/5e220687b7af26f5

2009-03-23 Django internal datetime handling
https://groups.google.com/group/django-developers/browse_thread/thread/ca023360ab457b91

2008-06-25 Proposal: PostgreSQL backends should *stop* using settings.TIME_ZONE
http://groups.google.com/group/django-developers/browse_thread/thread/b8c885389374c040

2007-12-02 Timezone aware datetimes and MySQL (ticket #5304)
https://groups.google.com/group/django-developers/browse_thread/thread/a9d765f83f552fa4

Relevant related threads
........................

2009-11-24 Why not datetime.utcnow() in auto_now/auto_now_add
http://groups.google.com/group/django-developers/browse_thread/thread/4ca560ef33c88bf3

Irrelevant related threads
..........................

2011-07-25 "c" date formating and Internet usage
https://groups.google.com/group/django-developers/browse_thread/thread/61296125a4774291

2011-02-10 GSoC 2011 student contribution
https://groups.google.com/group/django-developers/browse_thread/thread/0596b562cdaeac97/585ce1b04632198a?#585ce1b04632198a

2010-11-04 Changing settings per test
https://groups.google.com/group/django-developers/browse_thread/thread/65aabb45687e572e

2009-09-15 What is the status of auto_now and auto_now_add?
https://groups.google.com/group/django-developers/browse_thread/thread/cd1a76bca6055179

2009-03-09 TimeField broken in Oracle
https://groups.google.com/group/django-developers/browse_thread/thread/bba2f80a2ca9b068

2009-01-12 Rolling back tests -- status and open issues
https://groups.google.com/group/django-developers/browse_thread/thread/1e4f4c840b180895

2008-08-05 Transactional testsuite
https://groups.google.com/group/django-developers/browse_thread/thread/49aa551ad41fb919

Paul McMillan

unread,
Sep 3, 2011, 2:19:32 PM9/3/11
to django-d...@googlegroups.com
Hi Aymeric,

First, thanks for all your triage work on track, it's very much
appreciated. This is a large proposal, but I think it covers the issue
pretty completely.

I agree with your approach of using UTC internally and converting on output.

My main concern with your proposal is that it may require two
different code paths for a lot of code. I have no objection to the
dependency on pytz, but I don't want this to result in pytz-aware code
that diverges significantly from the non-pytz default path.

I would like this new code to eventually become the default setting,
but acknowledge that there doesn't seem to be a good way to do that
and maintain our current backwards compatibility (and external
dependency) policy.

This isn't my area of expertise, so I'm not a good mentor candidate,
but I'd be excited to help you with this project where I can.

I'm wholeheartedly +1 on this.

Best,
-Paul

Anssi Kääriäinen

unread,
Sep 3, 2011, 3:19:31 PM9/3/11
to Django developers
I'm +1 also.

Some notes:

- Performance: The USE_L10N setting made rendering tables consisting
of numbers much slower (see #14290 for example). It would be good if
there wasn't any major regressions due to this. I suggest making a few
testcases upfront For example one could be a fetch of 10000 objects
from the DB. Another one is to then render the 10000 objects. The last
one is quite slow already if USE_L10N is True, so there is hope this
could actually improve that case.

- Concurrency: My quick reading of the documentation of pytz
suggests that this should not be a problem. But your implementation
would be thread safe, right? The main problem here is localizing the
timestamps.

- I want to save the following information in the database: The
meeting starts at 08:00 localtime 2011-09-04. Localtime is based on
some other information, maybe the location of the meeting room. I do
not want this information to change if there is a DST change. In the
proposal you are saying that you are not going to tackle this problem.
However, it would be very valuable if there were a way to save just a
plain datetime information in the database if the user so chooses. The
user is then free to interpret if it means 08:00 in Helsinki/Europe
timezone, or something else. Storing in UTC does not allow for this
because of DST changes. I guess I am saying that having a
models.DatetimeField(plain=True) would be a good thing...

- The default for USE_TZ should be False if not specified but True
for new projects. That is, it is included in the settings template as
True but it is False in the global_settings.py.

The design looks good. I am very interested in this. One more piece in
making Django more useful in the enterprise setting. I hope you find
somebody qualified to mentor your project.

- Anssi

Mikhail Korobov

unread,
Sep 3, 2011, 6:48:33 PM9/3/11
to django-d...@googlegroups.com
Great job on summarizing the issue!

I have one concern though. Can you please explain why is USE_TZ option better than introducing e.g. UtcDateTimeField?

USE_TZ=True will break all existing code (including external apps) which relies on django 1.3 documented DateTimeField behavior, this can be scary and will introduce a lot of "if getattr(settings, USE_TZ, False): #..." statements in external apps for backwards compatibility.

Good UtcDateTimeField implementation can be released as a separate package (and then eventually included in django itself). This way existing django projects will be able to use it without waiting for a release and backwards compatibility won't be broken. Are there obstacles in django itself that prevent this option? 

Daniel Swarbrick

unread,
Sep 4, 2011, 6:34:17 AM9/4/11
to Django developers
I'm wholeheartedly on +1 on this.

@Anssi Kääriäinen, re your localtime meeting example: this type of
scenario has been discussed on this list previously, and the generally
acknowledged solution is to use separate date and time fields, neither
of which (on their own) have a concept of timezone.

@Aymeric Augustin, re storing UTC in Postgres: this is somewhat moot,
since Postgres stores timestamps internally as a UTC Julian date
regardless of which timezone the client connection uses (client TZ
offset is dynamically added/subtracted at insert/update/select-time).
In the case of MySQL and SQLite however, it is a valid point.

I can understand some people's concerns about the dependency
requirement of pytz in order to achieve this. It would have been great
to see pytz added to the standard Python libs in 3.x, since (IMHO)
it's such a fundamental requirement of any language these days. In any
case, it will be great to see Django's international date/time
handling be on a par with its L10N and I18N features.

Aymeric Augustin

unread,
Sep 6, 2011, 4:47:44 PM9/6/11
to django-d...@googlegroups.com
Thanks Paul, Anssi, Mikhail and Daniel for reviewing the proposal!

Here are some answers to your remarks.


On 3 sept. 2011, at 20:19, Paul McMillan wrote:
> My main concern with your proposal is that it may require two
> different code paths for a lot of code.

Currently, my best guess is:
- limited changes in the models layer
- some new code in the forms and template layers (if USE_TZ: …)
- most of the duplication will be in the tests, because I have to test everything with and without USE_TZ.

Of course, I can't be sure before actually writing the code :)

> I would like this new code to eventually become the default setting,
> but acknowledge that there doesn't seem to be a good way to do that
> and maintain our current backwards compatibility (and external
> dependency) policy.

I'm going to use the same trick used by USE_L10N and explained by Anssi: set USE_TZ to False in global_settings.py and to True in the template of settings.py. This preserves backwards compatibility but the new code is the default for new projects.

On 3 sept. 2011, at 21:19, Anssi Kääriäinen wrote:

> - Performance: The USE_L10N setting made rendering tables consisting
> of numbers much slower (see #14290 for example). It would be good if
> there wasn't any major regressions due to this.

With USE_TZ, rendering a datetime will be slower because of the conversion to local time. I have no idea of the performance of pytz. I'll do some benchmarks.

> - Concurrency: My quick reading of the documentation of pytz
> suggests that this should not be a problem. But your implementation
> would be thread safe, right? The main problem here is localizing the
> timestamps.

Certainly, the implementation must be thread safe, because requests may run concurrently in different threads for users in different timezones. I'm pretty sure it will work out of the box, and it will be even more isolated if I can avoid setting os.environ['TZ'] and calling time.tzset().

> - I want to save the following information in the database: The
> meeting starts at 08:00 localtime 2011-09-04. Localtime is based on
> some other information, maybe the location of the meeting room.

> (...) it would be very valuable if there were a way to save just a


> plain datetime information in the database if the user so chooses.

In my opinion, and more importantly in PostgreSQL's implementation, a DateTime represents a "point in time", in an absolute sense — ignoring relativistic effects :)

For your example, I support Daniel's suggestion: use a DateField and a TimeField, plus a way to obtain the time zone (like a foreign key to the meeting room). That's exactly the information you said you wanted to store :)

I've hesitated a lot to add a "naive=False/True" keyword argument to DateTimeField. Right now, I'm still convinced that mixing naive and aware DateTimes in the same application a recipe for disaster and that Django shouldn't facilitate it.

On 4 sept. 2011, at 00:48, Mikhail Korobov wrote:

> I have one concern though. Can you please explain why is USE_TZ option better than introducing e.g. UtcDateTimeField?

In short, Django shouldn't provide two different ways to represent a "point in time". Quoting the Zen of Python: "There should be one-- and preferably only one --obvious way to do it."

> USE_TZ=True will break all existing code (including external apps) which relies on django 1.3 documented DateTimeField behavior, this can be scary and will introduce a lot of "if getattr(settings, USE_TZ, False): #..." statements in external apps for backwards compatibility.

This is an interesting question. For backwards compatibility, if a DateTimeField is set to a naive datetime (like datetime.datetime.now()) and USE_TZ is True, the value should be interpreted in local time and a warning should be raised.

With this rule, I believe that most code should work just fine with or without USE_TZ, thanks to duck typing. Basically, what works on a naive datetime also works on an aware datetime, including timedelta arithmetic.

Of course, applications dedicated to dealing with timezones may break. If you're using one, you should probably stick with USE_TZ = False.

> Good UtcDateTimeField implementation can be released as a separate package (and then eventually included in django itself). This way existing django projects will be able to use it without waiting for a release and backwards compatibility won't be broken. Are there obstacles in django itself that prevent this option?

In short, my proposal can't be implemented outside of core because it requires a modification of the template engine.

If it could, I'm sure someone would have done it already — ticket #2626 has been open for five years.

In fact, I tried to implement these ideas without modifying Django in my current project. The result is way too messy for publication, let alone for eventual inclusion in Django. My entry point in the template engine is django.utils.formats.localize, and I'm monkey patching it to convert datetimes to local time. This means I'm mixing localization and timezones. I have to set TIME_ZONE = 'UTC' to get UTC in the database and introduce another setting, LOCAL_TIME_ZONE, for the local time of the project. Finally my developers must never forget to use custom DateTimeFields in models and forms, as well as a custom ModelAdmin base class, or UTC data shows up in the admin. It's totally unwieldy.

On 4 sept. 2011, at 12:34, Daniel Swarbrick wrote:
> re storing UTC in Postgres: this is somewhat moot,
> since Postgres stores timestamps internally as a UTC Julian date
> regardless of which timezone the client connection uses

Under PostgreSQL, I assume I have to use a TIMESTAMP WITH TIME ZONE and set the connection's timezone to UTC if I want non-Django applications using the same database to obtain correct results, regardless of their connection's timezone. To be honest, this is pure speculation; I must check this for each database engine.


Best regards,

--
Aymeric Augustin.

Paul McMillan

unread,
Sep 6, 2011, 5:20:41 PM9/6/11
to django-d...@googlegroups.com
> I'm going to use the same trick used by USE_L10N and explained by Anssi: set USE_TZ to False in global_settings.py and to True in the template of settings.py. This preserves backwards compatibility but the new code is the default for new projects.

This isn't gonna work because your new code will have a hard dependency on pytz.

-Paul

Aymeric Augustin

unread,
Sep 7, 2011, 3:14:44 AM9/7/11
to django-d...@googlegroups.com
2011/9/6 Paul McMillan <pa...@mcmillan.ws>:

>> I'm going to use the same trick used by USE_L10N and explained by Anssi: set USE_TZ to False in global_settings.py and to True in the template of settings.py. This preserves backwards compatibility but the new code is the default for new projects.
>
> This isn't gonna work because your new code will have a hard dependency on pytz.

Django uses PIL for ImageField, but it isn't a hard dependency because
it's only imported within functions that actually use it. We can do
the same for pytz if it's only used in a few places.

If a module uses pytz in many functions, I suggest this pattern:

from django.conf import settings
if settings.USE_TZ:
import pytz

And it's a bug to hit code that depends on pytz when USE_TZ is False.

If we want to make some functions (like timezone conversion helpers)
available to anyone who has pytz, regardless of the value of USE_TZ,
we could use this pattern:

try:
import pytz
except ImportError:
if settings.USE_TZ:
raise

--
Aymeric Augustin.

Daniel Swarbrick

unread,
Sep 7, 2011, 1:01:06 PM9/7/11
to Django developers
On Sep 6, 10:47 pm, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:

> Under PostgreSQL, I assume I have to use a TIMESTAMP WITH TIME ZONE and set the connection's timezone to UTC if I want non-Django applications using the same database to obtain correct results, regardless of their connection's timezone. To be honest, this is pure speculation; I must check this for each database engine.

Django defaults to "timestamp with time zone" for DateTimeFields on
Postgres. Django also explicitly sets the client connection timezone
to that specified in settings.py, however normal behaviour for a
Postgres client is to inherit the TZ of the host that the client is
running on. Naive timestamps (eg. without timezone) in an INSERT are
assumed to be of the client connection timezone.

SELECTs of "timestamp with time zone" output the UTC offset, for
example, immediately after connecting, and without manually specifying
the TZ, psql inherits my host's TZ (Europe/Berlin):

talk2=> select id, last_login, date_joined from auth_user where id=1;
id | last_login | date_joined
----+-------------------------------+------------------------
1 | 2011-09-05 18:43:38.050294+02 | 2008-05-03 02:29:14+02

I can change to another TZ easily:

talk2=> set timezone to 'America/Chicago';
SET
talk2=> select id, last_login, date_joined from auth_user where id=1;
id | last_login | date_joined
----+-------------------------------+------------------------
1 | 2011-09-05 11:43:38.050294-05 | 2008-05-02 19:29:14-05

Notice the timestamps are different, and have a -05 UTC offset now
instead of a +02.

So if you want non-Django applications to use the DB, you just need
the application to be aware of the UTC offset included in the output,
and apply that accordingly. Or if you prefer, you can indeed just set
the client connect TZ to UTC:

talk2=> set timezone to UTC;
SET
talk2=> select id, last_login, date_joined from auth_user where id=1;
id | last_login | date_joined
----+-------------------------------+------------------------
1 | 2011-09-05 16:43:38.050294+00 | 2008-05-03 00:29:14+00

The main point is that whilst the internal storage of timestamps in
Postgres is UTC Julian dates, the output is always rendered as
localtime for the client connection TZ, whatever that may be.

Kirit Sælensminde (kayess)

unread,
Sep 7, 2011, 11:55:55 PM9/7/11
to Django developers
I just wanted to chime in with a couple of extra notes here. I won't
comment on most of it as I think it's all spot on :)

On Sep 3, 10:40 pm, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> Django should use timezone-aware UTC datetimes internally
> .........................................................
>
> Example : datetime.datetime(2011, 09, 23, 8, 34, 12, tzinfo=pytz.utc)

There is also a Django time field. How to handle this is kind of hard
as having a timezone on a time field may not make much sense. At the
moment it is up to application code to combine this with a date in a
sane manner. If Django switches to use UTC internally then this is
going to become harder for many users to do correctly. Maybe some
library functions that will help them to do this will work?

Using UTC internally isn't enough in the way that using Unicode
internally is to mean that application writers can factor out time
handling in global applications. For example, if I have a Django
application that tracks retail sales worldwide then a report of sales
figures for any given day probably needs to use the correct local time
at each store to make sense from a business perspective. Using UTC
internally may however make developers aware that there is an issue
for them address earlier -- a good thing.


The other thing that ought to be thought about with this is the admin
time/date widgets which have the 'now' link. This is always filled in
with the browser's current time which plays havoc when admin is used
by a user in a different time zone to the site's settings. It should
be possible to capture the UTC offset along with the time so that the
correct number of minutes is added/subtracted when the field is
processed by Django. Thankfully daylight savings can be ignored here.

Many browsers will send the local time of the request to the server.
This can be used to guess the correct timezone, but it won't get
things right (there's no way to work out the correct DST setting from
this). If the country can be identified in some way then the two
together should be good for most users. The UTC offset in the request
is all that's needed to get localize any times that are sent back
though as again, daylight savings can be ignored -- so long as we
aren't rash enough to presume that this offset tells us the actual
time zone.

Hope this is useful,


Kirit

Daniel Swarbrick

unread,
Sep 8, 2011, 10:17:34 AM9/8/11
to Django developers
On Sep 8, 5:55 am, Kirit Sælensminde (kayess)
<kirit.saelensmi...@gmail.com> wrote:

> There is also a Django time field. How to handle this is kind of hard
> as having a timezone on a time field may not make much sense. At the
> moment it is up to application code to combine this with a date in a
> sane manner. If Django switches to use UTC internally then this is
> going to become harder for many users to do correctly. Maybe some
> library functions that will help them to do this will work?

Whilst both the Python datetime.time type and the Postgres 'time with
time zone' column type support a TZ, I personally feel that it only
makes sense to attach TZ info for datetime.datetime (eg. 'timestamp
with time zone'). Other people have raised the point before about the
need to retain some way of expressing "wall clock time", and I *think*
separate date and time fields address that issue reasonably well.

Let's say we were to use TZ with simple datetime.time types. A user in
America/Chicago (UTC-05) inputs a time of 18:00. This gets normalized
to UTC and stored as 23:00. A user in Australia/Sydney (UTC+10) wishes
to view that information, and sees the value localized for them as
09:00 - *the next day*. This is why I think it's best to leave TZ out
of simple date or time types, and only use it on datetime types, that
treat the concept of time as a constant, linear progression that is
relative to the observer, not the event that takes place.

The normalization to UTC is really only a factor for databases (or
serializers) that don't already do their own internal storage as UTC.
The application developer should be unaware that it was stored as UTC,
since they are provided a non-naive datetime with tzinfo attached,
which they can then localize to any other TZ they wish.

> The other thing that ought to be thought about with this is the admin
> time/date widgets which have the 'now' link. This is always filled in
> with the browser's current time which plays havoc when admin is used
> by a user in a different time zone to the site's settings. It should
> be possible to capture the UTC offset along with the time so that the
> correct number of minutes is added/subtracted when the field is
> processed by Django. Thankfully daylight savings can be ignored here.

If admins (or users) are inputting localized datetimes, the app will
need to know their timezone, or apply a default system-wide timezone.
I don't see this as much different from the way that the language
cookie is used.

Perhaps a formfield could allow for the specification of a UTC offset
in the input, eg "2011-09-08 16:02+02" and extract that offset from
the input. DST is irrelevant since UTC offset is not the same as
timezone. A location's timezone is the same all year round, but the
UTC offset changes at various times of the year if that timezone
observes DST. That's part of the information provided by the Olson
tzdata database. Usually when a timestamp shows a UTC offset, any DST
offset is included in that UTC offset, so it can still be normalized
to UTC regardless of DST being in effect or not.

> Many browsers will send the local time of the request to the server.
> This can be used to guess the correct timezone, but it won't get
> things right (there's no way to work out the correct DST setting from
> this). If the country can be identified in some way then the two
> together should be good for most users. The UTC offset in the request
> is all that's needed to get localize any times that are sent back
> though as again, daylight savings can be ignored -- so long as we
> aren't rash enough to presume that this offset tells us the actual
> time zone.

The JavaScript Date() object has several methods that could be useful
here, including getTimezoneOffset(), which returns difference between
UTC and local time, in minutes.

Remember that Postgres is not storing the timezone. It's not even
storing the UTC offset. The timezone is only a client connection
"environment var", that gets added/subtracted to timestamps as they
are input or retrieved. Likewise I don't think the objective here is
to store a TZ with the DateTimeField, but rather just to have UTC
offset information available during handling, that facilitates the
normalization to/from UTC for non-TZ-capable databases. Whether that
UTC offset information is in the form a full timezone name, such as
America/Chicago, or simply a UTC offset such as UTC-05, is not
particularly relevant.

Aymeric Augustin

unread,
Sep 11, 2011, 5:18:09 PM9/11/11
to django-d...@googlegroups.com
Hello,

Given the positive feedback received here and on IRC, I've started the implementation.

Being most familiar with mercurial, I've forked the Bitbucket mirror. This page that compares my branch to trunk:
https://bitbucket.org/aaugustin/django/compare/..django/django

I've read a lot of code in django.db, and also the documentation of PostgreSQL, MySQL and SQLite regarding date/time types.

I've written some tests that validate the current behavior of Django. Their goal is to guarantee backwards-compatibility when USE_TZ = False.

At first they failed because runtests.py doesn't set os.environ['TZ'] and doesn't call time.tzset() , so the tests ran with my system local time. I fixed that in setUp and tearDown. Maybe we should call them in runtests.py too for consistency?

By the way, since everything is supposed to be in UTC internally when USE_TZ is True, it is theoretically to get rid of os.environ['TZ'] and time.tzset(). They are only useful to make timezone-dependant functions respect the TIME_ZONE setting. However, for backwards compatibility (in particular with third-party apps), it's better to keep them and interpret naive datetimes in the timezone defined by settings.TIME_ZONE (instead of rejecting them outright). For this reason, I've decided to keep os.environ['TZ'] and time.tzset() even when USE_TZ is True.

Best regards,

--
Aymeric Augustin.

Aymeric Augustin

unread,
Sep 17, 2011, 3:59:17 AM9/17/11
to django-d...@googlegroups.com
Hello,

This week, I've gathered all the information I need about how the database engines and adapters supported by Django handle datetime objects. I'm attaching my findings.

The good news is that the database representations currently used by Django are already optimal for my proposal. I'll store data in UTC:
- with an explicit timezone on PostgreSQL,
- without timezone on SQLite and MySQL because the database engine doesn't support it,
- without timezone on Oracle because the database adapter doesn't support it.


Currently, Django sets the "supports_timezones feature" to True for SQLite. I'm skeptical about this choice. Indeed, the time zone is stored: SQLite just saves the output of "<datetime>.isoformat(), which includes the UTC offset for aware datetime objects. However, the timezone information is ignored when reading the data back from the database, thus yielding incorrect data when it's different from the local time defined by settings.TIME_ZONE.

As far as I can tell, the "supports_timezones" and the "needs_datetime_string_cast" database features are incompatible, at least with the current implementation of "typecast_timestamp". There's a comment about this problem that dates back to the merge of magic-removal, possibly before:
https://code.djangoproject.com/browser/django/trunk/django/db/backends/util.py?annotate=blame#L79

SQLite is the only engine who has these two flags set to True. I think "supports_timezones" should be False. Does anyone know why it's True? Is it just an historical artifact?


Finally, I have read the document that describes "to_python", "value_to_string", and r"get_(db_)?prep_(value|save|lookup)". The next step is to adjust these functions in DateFieldField, depending on the value of settings.USE_TZ.

DATABASE-NOTES.html

Aymeric Augustin

unread,
Sep 24, 2011, 9:24:12 AM9/24/11
to django-d...@googlegroups.com
Hello,

This week, I've been working on a related topic that I had missed entirely in my initial proposal: serialization.

Developers will obtain aware datetimes from Django when USE_TZ = True. We must ensure that they serialize correctly.

Currently, the serialization code isn't very consistent with datetimes:
- JSON: the serializer uses the '%Y-%m-%d %H:%M:%S' format, losing microseconds and timezone information. This dates back to the initial commit at r3237. See also #10201.
- XML: the serializer delegates to DateTimeField.value_to_string, who also uses the '%Y-%m-%d %H:%M:%S' format.
- YAML: the serializer handles datetimes natively, and it includes microseconds and UTC offset in the output.

I've hesitated between converting datetimes to UTC or rendering them as-is with an UTC offset. The former would be more consistent with the database and it's recommended in YAML. But the latter avoids modifying the data: not only is it faster, but it's also more predictable. Serialization isn't just about storing the data for further retrieval, it can be used to print arbitrary data in a different format. Finally, when the data comes straight from the database (the common case), it will be in UTC anyway.

Eventually, I've decided to serialize aware datetimes without conversion. The implementation is here:
https://bitbucket.org/aaugustin/django/compare/..django/django

Here are the new serialization formats for datetimes:
- JSON: as described in the specification at http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf > 15.9.1.15 Date Time String Format.
- XML: as produced by datetime.isoformat(), ISO8601.
- YAML: unchanged, compatible with http://yaml.org/type/timestamp.html — the canonical representation uses 'T' as separator and is in UTC, but it's also acceptable to use a space and include an offset like pyyaml does.
These formats follow the best practices described in http://www.w3.org/TR/NOTE-datetime.

This fix is backwards-incompatible for the JSON and XML serializers: it includes fractional seconds and timezone information, and it uses the normalized separator, 'T', between the date and time parts. However, I've made sure that existing fixtures will load properly with the new code. I'll mention all this in the release notes.

Unrelatedly, I have switched the SQLite backend to supports_timezones = False, because it really doesn't make sense to write the UTC offset but ignore it when reading back the data.

Best regards,

--
Aymeric Augustin.

> <DATABASE-NOTES.html>


>
> On 11 sept. 2011, at 23:18, Aymeric Augustin wrote:
>
>> Hello,
>>
>> Given the positive feedback received here and on IRC, I've started the implementation.
>>
>> Being most familiar with mercurial, I've forked the Bitbucket mirror. This page that compares my branch to trunk:
>> https://bitbucket.org/aaugustin/django/compare/..django/django
>>
>> I've read a lot of code in django.db, and also the documentation of PostgreSQL, MySQL and SQLite regarding date/time types.
>>
>> I've written some tests that validate the current behavior of Django. Their goal is to guarantee backwards-compatibility when USE_TZ = False.
>>
>> At first they failed because runtests.py doesn't set os.environ['TZ'] and doesn't call time.tzset() , so the tests ran with my system local time. I fixed that in setUp and tearDown. Maybe we should call them in runtests.py too for consistency?
>>
>> By the way, since everything is supposed to be in UTC internally when USE_TZ is True, it is theoretically to get rid of os.environ['TZ'] and time.tzset(). They are only useful to make timezone-dependant functions respect the TIME_ZONE setting. However, for backwards compatibility (in particular with third-party apps), it's better to keep them and interpret naive datetimes in the timezone defined by settings.TIME_ZONE (instead of rejecting them outright). For this reason, I've decided to keep os.environ['TZ'] and time.tzset() even when USE_TZ is True.
>>

>> Best regards,
>>
>> --
>> Aymeric Augustin.
>>
>>

Aymeric Augustin

unread,
Oct 2, 2011, 8:38:26 AM10/2/11
to django-d...@googlegroups.com
Hello,

This week, I've finished the work on serialization by making the deserializers capable of handling UTC offsets. I had to rewrite DateTimeField.to_python to extract and interpret timezone offsets. Still, deserialization of aware datetimes doesn't work with PyYAML: http://pyyaml.org/ticket/202

I also implemented the storage and retrieval of aware datetime objects in PostgreSQL, MySQL and Oracle. Conversions happen:
- on storage, in `connection.ops.value_to_db_datetime`, called from `get_db_prep_value`;
- on retrieval, in the database adapter's conversion functions.
The code is rather straightforward. When USE_TZ is True, naive datetimes are interpreted as local time in TIME_ZONE, for backwards compatibility with existing applications.

SQLite is more tricky because it uses `django.db.backends.util.typecast_timestamp` to convert string to datetimes. However:
- this function is used elsewhere of Django, in combination with the `needs_datetime_string_cast` flag.
- it performs essentially the same operations as `DateTimeField.to_python`.
I'll review the history of this code, and I'll try to refactor it.

Besides adding support for SQLite, I still have to:
- check that datetimes behave correctly when they're used as query arguments, in aggregation functions, etc.
- optimize django.utils.tzinfo: fix #16899, use pytz.utc as the UTC timezone class when pytz is available, etc.

I won't have much time for this project next week. See you in two weeks for the next check-in!

Best regards,

--
Aymeric Augustin.

Aymeric Augustin

unread,
Oct 16, 2011, 5:48:58 AM10/16/11
to django-d...@googlegroups.com
Hello,

I've implemented the storage and retrieval of aware datetime objects in SQLite. This involved some refactoring, because SQLite returns datetimes as strings, and I didn't want to duplicate the parsing code. All database backends are now covered.

Although it isn't strictly necessary for my project, I suspect the django.db.backends.util.typecast_* functions could use further refactoring. They date back to r3, and the database adapters have become better at converting data to Python objects since then. This deserves more research.

I've also committed a patch for #16906 that I wanted to merge in my branch before further work, and fixed some related tickets (#16899, #16923, #4076).

I've had other priorities since my last check-in; I hope I'll have more time for this project now.

Best regards,

--
Aymeric Augustin.

Daniel Swarbrick

unread,
Oct 18, 2011, 8:48:22 AM10/18/11
to Django developers

On Oct 16, 11:48 am, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> Hello,
>
> I've implemented the storage and retrieval of aware datetime objects in SQLite. This involved some refactoring, because SQLite returns datetimes as strings, and I didn't want to duplicate the parsing code. All database backends are now covered.
>

Is it worth looking at registering a custom converter for SQLite
(http://docs.python.org/library/sqlite3.html#converting-sqlite-values-
to-custom-python-types) to handle the date objects? Converters can be
registered per-column - the SELECT query would need to be tweaked to
hint to the SQLite libs to call the registered converter. I'm not sure
if that would be too intrusive to the existing Django query builder.

Aymeric Augustin

unread,
Oct 23, 2011, 11:21:39 AM10/23/11
to django-d...@googlegroups.com
Hello,

I was getting weary of archeology in database adaptation code, so I've been working on the template layer this week.

It is now possible to control the localization of datetimes in templates. I also introduced a "current time zone" (before someone asks: yes, it's a thread local) and the possibility to change it in Python and template code. This works with and without pytz. The next step in this area is to handle conversions is forms and widgets.

I still have to figure out where it's better to use the "default time zone" (settings.TIME_ZONE) and where it's better to use "current time zone" to interpret naive datetimes. Depending on the answer, I may have to make the model layer aware of the "current time zone".

As usual, if you're curious: https://bitbucket.org/aaugustin/django/compare/..django/django

Best regards,

--
Aymeric Augustin.

Aymeric Augustin

unread,
Oct 23, 2011, 12:16:32 PM10/23/11
to django-d...@googlegroups.com
On 18 oct. 2011, at 14:48, Daniel Swarbrick wrote:

> On Oct 16, 11:48 am, Aymeric Augustin
> <aymeric.augus...@polytechnique.org> wrote:
>> Hello,
>>
>> I've implemented the storage and retrieval of aware datetime objects in SQLite. This involved some refactoring, because SQLite returns datetimes as strings, and I didn't want to duplicate the parsing code. All database backends are now covered.
>
> Is it worth looking at registering a custom converter for SQLite
> (http://docs.python.org/library/sqlite3.html#converting-sqlite-values-
> to-custom-python-types) to handle the date objects?

My sentence wasn't totally accurate: Django's currently using converters for datetimes, and I changed them to return aware datetimes.

> Converters can be
> registered per-column - the SELECT query would need to be tweaked to
> hint to the SQLite libs to call the registered converter. I'm not sure
> if that would be too intrusive to the existing Django query builder.

I didn't know that. It could be useful to convert results of aggregation. At the moment, I don't know how they are converted to Python types.

Anssi Kääriäinen

unread,
Oct 23, 2011, 1:36:09 PM10/23/11
to Django developers
On Oct 23, 7:16 pm, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> > Converters can be
> > registered per-column - the SELECT query would need to be tweaked to
> > hint to the SQLite libs to call the registered converter. I'm not sure
> > if that would be too intrusive to the existing Django query builder.
>
> I didn't know that. It could be useful to convert results of aggregation. At the moment, I don't know how they are converted to Python types.

Aggregates can define if they are ordinal or computed. In practice
computed=True means the value returned will be converted to float,
ordinal=True means an integer. If none of the above is defined (which
would be the interesting case for datetimes) the source field for the
aggregate computation is used to define the return value. The
conversion is done in connection.ops.convert_values, which I guess is
the same place used when doing the conversion normally.

For your case, if the aggregate is going to return a timezone aware
datetime and for some reason the source field is not timezone aware
datetime field, you would need to refactor the sql.Aggregate class. I
guess this will not be a problem for you. It is likely things will
just work without any special handling for aggregates. On the other
hand, if you need to register a converter beforehand for an aggregate,
I don't think that is supported at all currently.

If you grep for 'is_ordinal' and 'resolve_aggregate' in db/models/sql
directory, you will find all the interesting places for aggregate type
conversion handling.

- Anssi

Aymeric Augustin

unread,
Oct 30, 2011, 9:45:18 AM10/30/11
to django-d...@googlegroups.com
Hello,

This check-in is highly topical, as clocks rolled back one hour in most parts of Europe this morning :)

I've implemented conversions in forms and widgets, and in the admin too. I've made template filters convert their input to the current time zone when appropriate. I've added a context processor and a template tag that give access to the current time zone in templates.

I've written docs.

As a sanity check, I created a demo app: https://bitbucket.org/aaugustin/django-tz-demo

This means I've reached the point where the branch is usable. Try it! Clone https://bitbucket.org/aaugustin/django, put it in PYTHONPATH, set USE_TZ=True, and you're good to go. If you have questions, I'm mYk in #django-dev on Freenode.

I still have a few details to iron out before the branch is ready for merging:
- flesh out the docs with some code samples
- write the release notes
- check how values used as parameters and computed values behave in the ORM
- check how the timesince and timeuntil template filters behave
- ensure the cache depends on the current timezone
- investigate this comment in django/contrib/gis/gdal/field.py: # TODO: Adapt timezone information.
- fix bugs found by reviewers :)

Additional work could include:
- performance tests
- refactoring django.utils.timezone and django.utils.tzinfo -- tzinfo supports functions such as naturaltime, timesince, timeuntil, dateformat, but it wasn't compatible with the features I needed for the time zone support, so I wrote a new module.
- refactoring datetime parsing / formatting utilities throughout Django, and in particular django.db.backends.util.typecast_* -- I keep finding bugs such as #17134

Initially, I considered writing a script to automatically shift timestamps in the database backends that need it, to ease the migration of existing projects to USE_TZ=True. Upon further thought, I'm reluctant to mess with people's databases, even with a big fat disclaimer. If I write such a script, I'll upload it to djangosnippets.

Best regards,

--
Aymeric Augustin.

ptone

unread,
Oct 30, 2011, 12:07:38 PM10/30/11
to Django developers

I just did an initial scan through: https://bitbucket.org/aaugustin/django/compare/..django/django
and this looks like a great piece of work.

I do think a better label than "value" could be used for the
thread.local for the current timezone, as this is really a thread
global - something less ambiguous and conflict prone would be better
IMO. Also an explicit link to the activate command from the section
where you introduce the current time zone concept. Perhaps with a
simple code example of how you would set current time per user in a
view.

https://bitbucket.org/aaugustin/django/compare/..django/django#chg_docs/topics/i18n/timezones.txt_newline109

Was a threading.local the only way to do this - I'm sure you've given
it some thought - there are relatively few uses of thread locals in
Django - translation being one. If the timezone were stored on the
current request, would that not be available to most places in django
code that need TZ within the current thread?

On Oct 30, 6:45 am, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> Initially, I considered writing a script to automatically shift timestamps in the database backends that need it, to ease the migration of existing projects to USE_TZ=True. Upon further thought, I'm reluctant to mess with people's databases, even with a big fat disclaimer. If I write such a script, I'll upload it to djangosnippets.

I think a very brief outline of the steps required for migration may
be worthwhile in the release notes.

-Preston

Aymeric Augustin

unread,
Oct 30, 2011, 3:38:00 PM10/30/11
to django-d...@googlegroups.com
Hi Preston,

On 30 oct. 2011, at 17:07, ptone wrote:
> I do think a better label than "value" could be used for the
> thread.local for the current timezone, as this is really a thread
> global - something less ambiguous and conflict prone would be better
> IMO.

I used this name for consistency with django.utils.translation.trans_real.

Thread locals store data as attributes, and there is only one value to store in timezone._active, hence the generic name. Since it's purely internal to django.utils.timezone, there's no risk of conflicts.

I think it's fine as is.

> Also an explicit link to the activate command from the section
> where you introduce the current time zone concept.

Good point. I've added a link.

> Perhaps with a simple code example of how you would set
> current time per user in a view.

Yes, I was precisely adding an example to "Selecting the current time zone" :) I'll push it tonight.

> Was a threading.local the only way to do this - I'm sure you've given
> it some thought - there are relatively few uses of thread locals in
> Django - translation being one. If the timezone were stored on the
> current request, would that not be available to most places in django
> code that need TZ within the current thread?

One of the design principles of Django is loose coupling [1]. For instance, this means that "the template system knows nothing about Web requests" -- thanks to whoever wrote the exact quote I needed :)

The template and the form layers use the current time zone. (I believe nothing else does.) So, storing the current timezone in the request doesn't really help, since those two layers don't have access to the request.

More generally, I agree that global state should be avoided. Thread locals are just a symptom: if you want thread-safe global state, you have to use a thread local. [2]

Here, I'm using a small bit of global state to store the current time zone. It's stored exactly like the current locale. I'm not making the situation significantly worse; if we refactor the storage of the locale, it'll be trivial to refactor the storage of the timezone in the same way.

Finally, one of the problems of thread locals is that they don't provide strong isolation between threads (everything is in the same memory space after all). From this perspective, storing the current timezone in a thread local isn't nearly as bad as storing the entire request object -- think request.POST['password'].

> I think a very brief outline of the steps required for migration may
> be worthwhile in the release notes.

I agree. I've added some instructions at the bottom of the documentation page, and I'll point there from the release notes.

Thanks for your feedback!

--
Aymeric.

[1] https://docs.djangoproject.com/en/1.3/misc/design-philosophies/#loose-coupling
[2] http://groups.google.com/group/django-users/msg/7e7e75325bd086a1

Aymeric Augustin

unread,
Nov 1, 2011, 10:58:37 AM11/1/11
to django-d...@googlegroups.com
Hello,

I'm glad to announce that the time zone support branch is ready for review.

Review the changes: https://bitbucket.org/aaugustin/django/compare/..django/django
Clone the branch: hg clone https://bitbucket.org/aaugustin/django
Play with the demo app: hg clone https://bitbucket.org/aaugustin/django-tz-demo

The docs are supposed to explain everything you need, and I described many design decisions in my weekly check-ins, so I'm purposefully keeping this email short on details :)

Finally here's a teaser: http://myks.org/stuff/django-tz-demo-2.png

----

During this project, most things went according to the initial plan.

There is only one outstanding issue that I know of. `QuerySet.dates()` operates in the database timezone, ie. UTC, while an end user would expect it to take into account its current time zone. This affects `date_hierarchy` in the admin and the date based generic views. The same problem also exists for the `year`, `month`, `day`, and `week_day` lookups.

Under PostgreSQL, provided the current time zone has a valid name (e.g. it's provided by pytz), we could temporarily set the connection's timezone to this time zone and restore it to UTC afterwards. With other backends, I can't think of a simpler solution than fetching all values and doing the aggregation in Python. Neither of these thrills me, so for the time being, I've documented this as a known limitation.

Best regards,

--
Aymeric.

Luke Plant

unread,
Nov 8, 2011, 3:58:53 PM11/8/11
to django-d...@googlegroups.com
Hi Aymeric,

Finally got around to this! My review follows. I didn't follow the
original discussions very closely, so forgive any questions that have
been answered already. On the other hand, the code and comments probably
ought to answer the kind of questions I have, without needing to search
mailing list archives, so it might be useful that I haven't been
following that closely.

The code looks in very good shape. I've got quite a lot of things listed
below, but the vast majority of my comments are small or very small nits
that can be easily corrected, with the possible exception of my
questions regarding django.utils.timezone.LocalTimezone

== django/contrib/admin/util.py ==

> elif isinstance(field, models.DateField):
> return formats.localize(timezone.aslocaltime(value))
> elif isinstance(field, models.DateField) or isinstance(field,
models.TimeField):
> return formats.localize(value)

Isn't the first clause in the second elif rendered useless by the
previous elif?

== django/db/backends/mysql/base.py ==

> # Finally, MySQLdb always returns naive datetime objects.

Is it possible that a future version of MySQLdb could change this? If
so, would it be better to not make this assumption, and future proof the
code? We could perhaps just add an assertion for the assumption we make,
so that we don't end up silently corrupting data with a future version
of MySQLdb that acts differently, because at the moment the function
datetime_or_None_with_timezone_support just wipes out any existing
tzinfo field.

There is a similar issue in the Oracle backend I think.

> raise ValueError("MySQL backend does not support timezone-aware
datetimes.")

This seems to be misleading - shouldn't the message add the caveat "if
USE_TZ == False" or something?

== django/db/backends/oracle/base.py ==

> raise ValueError("Oracle backend does not support timezone-aware
datetimes.")

As above, should add "if USE_TZ == False"

> def output_type_handler(cursor, name, defaultType,
> size, precision, scale):

tiny nit - is there a reason for camelCase just for the defaultType
variable?

== django/db/backends/sqlite3/base.py ==

> supports_timezones = False

What exactly is meant by this? The timezone support in the sqlite
backend appears to be very similar to that in MySQL.

If SQLite genuinely does not support time zones in some important way,
this should probably be noted in the release notes, and definitely in
docs/ref/databases.txt

== django/utils/timezone.py ==

> import time as _time

There doesn't seem to be any clash to avoid, so I don't see the need for
importing like this.

However, my biggest question is about the LocalTimezone class. We now
have two implementations of this, one in django.utils.tzinfo, which
seems to have more extensive tests, and a new one in
django.utils.timezone. The old one is still in use, both by old code
e.g. in django/contrib/syndication/views.py and new code e.g.
django/utils/dateformat.py under DateFormat.__init__

The new one is used in other places.

There is no explanation of the need for these two, or when they should
be used, which to me seems like a minimum requirement.

Perhaps we just need to combine the two LocalTimezone implementations in
tzinfo.py? If we can't do that for some backward compatibility reasons,
can we have some explanation, and preferably a deprecation path for the
older code? Finally, do we need to address code that uses the old
LocalTimezone in some way - should it be unconditionally using our own
LocalTimezone instead of something from pytz when available?

> # This function exists for consistency with get_default_timezone_name
> def get_default_timezone_name():

I'm guessing the comment should refer to get_current_timezone_name instead?

> def is_aware()

It is probably worth adding a comment indicating that the logic is taken
straight from Python docs, with a link -
http://docs.python.org/library/datetime.html#available-types


== docs/ ==

Some instances of 'time-zone' can be found, instead of 'time zone'


== docs/howto/custom-template-tags.txt ==

> Filters and time zones

I think this section needs a 'versionadded' directive.

> to to your filter

^ typo

== docs/ref/models/querysets.txt ==

> When :doc:`time zone support </topics/i18n/timezones>` is enabled, the
> database stores datetimes in UTC, which means the aggregation is
> performed in UTC. This is a known limitation of the current
> implementation.

This implies that when time zone support is not enabled, datetimes are
not stored in UTC - which is false at least for Postgres. Perhaps this
could be reworded in some way?

Similarly with the next warning about 'year', 'month' etc. lookups.

> - It includes the time zone for aware datetime objects. It raises
> an exception for aware time objects.

First instance of 'aware' should be 'naive'. Second sentence should end
'aware datetime or time objects' for completeness.

== docs/ref/utils.txt ==

Needs a versionadded directive for django.utils.dateparse and
django.utils.timezone


== docs/topics/i18n/timezones.txt ==

> Provided the current time zone raises an exception for datetimes that
> don't exist or are ambiguous because they fall in a DST transition
> (pytz_ does), such datestimes will be reported as invalid values.

There is a typo - datestimes - but I found the paragraph confusing due
it starting with 'Providing'. Perhaps this would be better:

> If the current time zone raises an exception for datetimes that
> don't exist or are ambiguous because they fall in a DST transition
> (the timezones provided by pytz_ do this), such datetimes will be
> reported as invalid values.


== tests ==

I haven't actually reviewed any of the test code - I ran out of energy,
and test code is so dull, sorry!

I did run coverage for the tests and found:

- without pytz installed, the new LocalTimezone class

> During this project, most things went according to the initial plan.
>
> There is only one outstanding issue that I know of.
> `QuerySet.dates()` operates in the database timezone, ie. UTC, while
> an end user would expect it to take into account its current time
> zone. This affects `date_hierarchy` in the admin and the date based
> generic views. The same problem also exists for the `year`, `month`,
> `day`, and `week_day` lookups.
>
> Under PostgreSQL, provided the current time zone has a valid name
> (e.g. it's provided by pytz), we could temporarily set the
> connection's timezone to this time zone and restore it to UTC
> afterwards. With other backends, I can't think of a simpler solution
> than fetching all values and doing the aggregation in Python. Neither
> of these thrills me, so for the time being, I've documented this as
> a known limitation.

You could argue a case for saying that the behaviour has some
advantages: if one admin user filters a list of objects by date, and
sends the link to a colleague, they would probably expect the colleague
to end up with the same list of objects.

But even if that isn't very convincing, I'm happy with this limitation
given the implementation difficulties of fixing it.

Great work, BTW.

Luke

--
"Trouble: Luck can't last a lifetime, unless you die young."
(despair.com)

Luke Plant || http://lukeplant.me.uk/

Aymeric Augustin

unread,
Nov 10, 2011, 2:46:01 AM11/10/11
to django-d...@googlegroups.com
Hi Luke,

Thanks for the review! I've updated the branch with your suggestions.

Here are a few comments -- when I have something to say other than
"fixed, thanks".

2011/11/8 Luke Plant <L.Pla...@cantab.net>:


> == django/contrib/admin/util.py ==
>
>>    elif isinstance(field, models.DateField):
>>        return formats.localize(timezone.aslocaltime(value))
>>    elif isinstance(field, models.DateField) or isinstance(field,
> models.TimeField):
>>        return formats.localize(value)
>
> Isn't the first clause in the second elif rendered useless by the
> previous elif?

It's a bug: the first elif should test for models.DateTimeField. The
problem didn't appear in the tests because DateTimeField is a subclass
of DateField.


> == django/db/backends/mysql/base.py ==
>
>> # Finally, MySQLdb always returns naive datetime objects.
>
> Is it possible that a future version of MySQLdb could change this? If
> so, would it be better to not make this assumption, and future proof the
> code? We could perhaps just add an assertion for the assumption we make,
> so that we don't end up silently corrupting data with a future version
> of MySQLdb that acts differently, because at the moment the function
> datetime_or_None_with_timezone_support just wipes out any existing
> tzinfo field.

I'm now checking if the value is naive before overwriting its tzinfo.
If the database adapter returns an aware datetime (in a future
version), it's passed through unchanged.

> There is a similar issue in the Oracle backend I think.

The issue existed in SQLite, MySQL and Oracle.


>> raise ValueError("MySQL backend does not support timezone-aware
> datetimes.")
>
> This seems to be misleading - shouldn't the message add the caveat "if
> USE_TZ == False" or something?

I've updated the message for SQLite, MySQL and Oracle.


> == django/db/backends/sqlite3/base.py ==
>
>>  supports_timezones = False
>
> What exactly is meant by this? The timezone support in the sqlite
> backend appears to be very similar to that in MySQL.
>
> If SQLite genuinely does not support time zones in some important way,
> this should probably be noted in the release notes, and definitely in
> docs/ref/databases.txt

All backends but PostgreSQL have supports_timezones = False. It means
that the database engine (SQLite, MySQL) or its adapter (cx_Oracle)
doesn't support timezones. This flag existed before my branch, and it
was True for SQLite. I switched it to False because SQLite doesn't
actually support timezones. This shouldn't affect anything besides
Django's own test suite -- database features aren't part of the public
API (AFAIK).

I've added a paragraph about this change to the release notes.


> == django/utils/timezone.py ==
>
>> import time as _time
>
> There doesn't seem to be any clash to avoid, so I don't see the need for
> importing like this.

This code is taken straight from Python's docs, and I wanted to keep
it as close as possible to the original for clarity and ease of
maintenance -- see #16899 for example.

I can't use the original code as is, because the timezone isn't known
at compile time, only after the settings have been loaded. But I still
think it's worth minimizing the diff.

> However, my biggest question is about the LocalTimezone class. We now
> have two implementations of this, one in django.utils.tzinfo, which
> seems to have more extensive tests, and a new one in
> django.utils.timezone. The old one is still in use, both by old code
> e.g. in django/contrib/syndication/views.py and new code e.g.
> django/utils/dateformat.py under DateFormat.__init__
>
> The new one is used in other places.

Yes. I spent a lot of time on this problem, and I eventually decided
to add a second implementation of the LocalTimezone, because there are
too many issues in the "old" implementation to build more code upon
it. The "new" implementation focuses on correctness and simplicity.

Here are the problems of the "old" implementation.

Its __init__ method takes an argument, which is problematic for
pickling and deepcopying. From Python's docs:

> Special requirement for pickling: A tzinfo subclass must have an __init__() method that can be called with no arguments, else it can be pickled but possibly not unpickled again. This is a technical requirement that may be relaxed in the future.

Adding a __getinitargs__ method resolved this problem, but it's magic,
and we're still doing something that's officially discouraged.

Also, I don't like the hack in the "old" implementation to support
post-2037 dates. Sure, it doesn't crash, but it can return wrong data,
because DST doesn't apply during the same period every year. So much
for refusing the temptation to guess in the face of ambiguity :)

This inaccuracy is irrelevant for syndication (dates are in the past)
and acceptable for display purposes in dateformat (naturaltime will
return "in X years", an offset of 1 hour doesn't matter). But when
we're saving data in the database, we can't take the risk to corrupt
it.

> Perhaps we just need to combine the two LocalTimezone implementations in
> tzinfo.py? If we can't do that for some backward compatibility reasons,
> can we have some explanation, and preferably a deprecation path for the
> older code? Finally, do we need to address code that uses the old
> LocalTimezone in some way - should it be unconditionally using our own
> LocalTimezone instead of something from pytz when available?

There are two problems in deprecating the "old" implementation:
- the output of self.tzname() would change — it depends on the
argument taken by __init__;
- post-2037 dates would no longer be supported.

To sum up:
- the "old" implementation isn't suitable for the needs of my branch;
- there's a lot of history, and we can't deprecate it without creating
regressions: https://code.djangoproject.com/log/django/trunk/django/utils/tzinfo.py

> There is no explanation of the need for these two, or when they should
> be used, which to me seems like a minimum requirement.

I've added a comment.


>> - It includes the time zone for aware datetime objects. It raises
>>   an exception for aware time objects.
>
> First instance of 'aware' should be 'naive'. Second sentence should end
> 'aware datetime or time objects' for completeness.

The current wording describes what my branch does:
- naive datetime => output without TZ info
- aware datetime => output with TZ info
- naive time => output without TZ info
- aware time => exception


> == tests ==
>
> I haven't actually reviewed any of the test code - I ran out of energy,
> and test code is so dull, sorry!
>
> I did run coverage for the tests and found:
>
> - without pytz installed, the new LocalTimezone class

You apparently ran out of energy mid-sentence :) Do you remember what
you were going to say?

Thanks again for the review!

--
Aymeric.

Luke Plant

unread,
Nov 10, 2011, 9:06:55 AM11/10/11
to django-d...@googlegroups.com
On 10/11/11 07:46, Aymeric Augustin wrote:

>> I haven't actually reviewed any of the test code - I ran out of energy,
>> and test code is so dull, sorry!
>>
>> I did run coverage for the tests and found:
>>
>> - without pytz installed, the new LocalTimezone class
>
> You apparently ran out of energy mid-sentence :) Do you remember what
> you were going to say?

At that point I realised that a previous comment in my email was wrong
(I thought I'd found a bug in the new version of LocalTimezone), went
back and fixed it, but never deleted that line! Please ignore that half
finished sentence :-)

Luke

--
"Where a person wishes to attract, they should always be ignorant."
(Jane Austen)

Luke Plant || http://lukeplant.me.uk/

Luke Plant

unread,
Nov 11, 2011, 6:13:14 PM11/11/11
to django-d...@googlegroups.com
Just in case you were waiting for a reply, I've read your response and
everything seems good to me.

Luke


On 10/11/11 07:46, Aymeric Augustin wrote:

> Hi Luke,
>
> Thanks for the review! I've updated the branch with your suggestions.
>
> Here are a few comments -- when I have something to say other than
> "fixed, thanks".

--
"In your presence there is fullness of joy; at your right hand are
pleasures forevermore" Psalm 16:11

Luke Plant || http://lukeplant.me.uk/

Anssi Kääriäinen

unread,
Nov 18, 2011, 8:50:05 AM11/18/11
to Django developers
On Nov 1, 4:58 pm, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> I'm glad to announce that the time zone support branch is ready for review.

I did a simple performance test on PostgreSQL (fetch 1000 objs with
datetime field) and could not see any real performance regressions. I
did the same for template rendering, and according to my tests,
setting USE_TZ=True leads to around 10% slower template rendering (in
a template doing nothing else than rendering 1000 datetime fields). I
could not see any regressions when USE_TZ was False compared to Django
trunk. So, at least according to this simple test there is nothing to
worry about performance-wise. The end of the post includes the test
code for reference.

> The docs are supposed to explain everything you need, and I described many design decisions in my weekly check-ins, so I'm purposefully keeping this email short on details :)

This part of the documentation raises one question:
"""
Unfortunately, during DST transitions, some datetimes don't exist or
are
ambiguous. In such situations, pytz_ raises an exception. Other
:class:`~datetime.tzinfo` implementations, such as the local time zone
used as
a fallback when pytz_ isn't installed, may raise an exception or
return
inaccurate results. That's why you should always create aware datetime
objects
when time zone support is enabled.
"""

Won't this lead to web applications where errors suddenly start
popping up all over the place, but only twice (or once?) a year?

A setting "ON_NAIVE_DATETIMES=[do_nothing|log|error]" could be nice,
at least for testing. This way you could spot the errors early on. I
don't know if that would be too expensive performance wise, though.

> There is only one outstanding issue that I know of. `QuerySet.dates()` operates in the database timezone, ie. UTC, while an end user would expect it to take into account its current time zone. This affects `date_hierarchy` in the admin and the date based generic views. The same problem also exists for the `year`, `month`, `day`, and `week_day` lookups.
>
> Under PostgreSQL, provided the current time zone has a valid name (e.g. it's provided by pytz), we could temporarily set the connection's timezone to this time zone and restore it to UTC afterwards. With other backends, I can't think of a simpler solution than fetching all values and doing the aggregation in Python. Neither of these  thrills me, so for the time being, I've documented this as a known limitation.

I don't see this as a major issue, except if Django needs to support
this behaviour for backwards compatibility reasons. But as this is
documented as a limitation, not as a "feature", this is fine as is.
It would be nice to fix this in the future.

For PostgreSQL, tracking the current time zone for the connection
seems a bit hard - we can't rely a SET TIME ZONE is really going to be
in effect, unless we also track rollbacks/commits (see #17062). And
changing it for every DB query seems bad, too.

One idea is to use "SELECT col AT TIME ZONE 'GMT+2' as col". This way
the session time zone doesn't need to be changed. For example:
> SET TIME ZONE 'UTC';
> create table testt(dt timestamptz);
> insert into testt values(now());
> select dt at time zone 'GMT+2' as dt from testt;
2011-11-18 07:06:47.834771

> select dt at time zone 'Europe/Helsinki' as dt from testt;
2011-11-18 11:06:47.834771

Oracle seems to have a similar feature [1]. I don't know if other
databases offer similar functionality. MySQL has probably support for
something similar when using TIMESTAMP columns.

But as said, I don't see this as a major issue. It would be nice to
support this some day.

I haven't read the code in detail, but what I have seen seems to be
very good quality code. I hope to see this in trunk soon.

- Anssi

[1]: http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm
Search for AT TIME ZONE.

---- The test code

from django.core.management import setup_environ
import settings
setup_environ(settings)

# M is a model with one text field and one datetime field.
from tester.models import M
from datetime import datetime
from django.db import connection
from django.template import Template
from django.template import Context
import pytz



def load_data():
bulk = []
for i in range(0, 1000):
bulk.append(M(f1='foob', dt=datetime.now()))
M.objects.bulk_create(bulk)
load_data()

start = datetime.now()
for i in range(0, 10):
for obj in M.objects.all():
pass
print datetime.now() - start
print obj.dt

templ = """
{% for obj in objects %}
{{ obj.dt }}
{% endfor %}
"""

t = Template(templ)
c = Context({'objects': list(M.objects.all())})
start = datetime.now()
for i in range(0, 10):
output = t.render(c)
print datetime.now() - start

print 'The template generated... %s' % output[0:100]

Aymeric Augustin

unread,
Nov 18, 2011, 10:31:52 AM11/18/11
to django-d...@googlegroups.com
Hi Anssi,

2011/11/18 Anssi Kääriäinen <anssi.ka...@thl.fi>
So, at least according to this simple test there is nothing to
worry about performance-wise.

Thanks for the performance tests! They were still on my TODO list.

 
This part of the documentation raises one question:
"""
Unfortunately, during DST transitions, some datetimes don't exist or
are
ambiguous. In such situations, pytz_ raises an exception. Other
:class:`~datetime.tzinfo` implementations, such as the local time zone
used as
a fallback when pytz_ isn't installed, may raise an exception or
return
inaccurate results. That's why you should always create aware datetime
objects
when time zone support is enabled.
"""

Won't this lead to web applications where errors suddenly start
popping up all over the place, but only twice (or once?) a year?

If you're using aware datetimes internally, user input is the only source of naive datetimes in local time. Ambiguous values are handled properly by a validation error: http://myks.org/stuff/django-tz-demo-2.png

Since the hour of the DST switch is chosen to minimize human activity, this won't be an issue for most websites.

UIs that allow entering unambiguous datetimes in local time aren't intuitive — basically, you need an "is_dst" checkbox. If you really need to enter non-ambiguous data in local time during the DST switch, you probably shouldn't be working in local time at all. Armies use UTC, for instance.
 
A setting "ON_NAIVE_DATETIMES=[do_nothing|log|error]" could be nice,
at least for testing. This way you could spot the errors early on. I
don't know if that would be too expensive performance wise, though.

During the ambiguous hour, we have the choice between guessing randomly and raising an exception. The Zen of Python says: "In the face of ambiguity, refuse the temptation to guess."

That said, you can use your own tzinfo implementations if you want to customize that behavior.

 
One idea is to use "SELECT col AT TIME ZONE 'GMT+2' as col". This way
the session time zone doesn't need to be changed. For example:
> SET TIME ZONE 'UTC';
> create table testt(dt timestamptz);
> insert into testt values(now());
> select dt at time zone 'GMT+2' as dt from testt;
 2011-11-18 07:06:47.834771

> select dt at time zone 'Europe/Helsinki' as dt from testt;
 2011-11-18 11:06:47.834771

Oracle seems to have a similar feature [1]. I don't know if other
databases offer similar functionality. MySQL has probably support for
something similar when using TIMESTAMP columns.

But as said, I don't see this as a major issue. It would be nice to
support this some day.
 
This looks interesting, although I don't know how difficult it would be to integrate in the ORM.

I'm going to create a ticket and post your research there.

I hope to see this in trunk soon.

It is now :)

--
Aymeric.

Anssi Kääriäinen

unread,
Nov 18, 2011, 10:57:38 AM11/18/11
to Django developers
On Nov 18, 5:31 pm, Aymeric Augustin
<aymeric.augus...@polytechnique.org> wrote:
> Hi Anssi,
>
> 2011/11/18 Anssi Kääriäinen <anssi.kaariai...@thl.fi>
>
> > So, at least according to this simple test there is nothing to
> > worry about performance-wise.
>
> Thanks for the performance tests! They were still on my TODO list.

The tests could be better... It might be worth it to write something
testing this into djangobench. I just don't know which branch is the
current one :)

> If you're using aware datetimes internally, user input is the only source
> of naive datetimes in local time. Ambiguous values are handled properly by
> a validation error:http://myks.org/stuff/django-tz-demo-2.png

My point is that it would be very useful to know you are actually
using aware datetimes internally. Especially when migrating an old
project to use this feature, it is easy to miss some places that do
not use aware datetimes. And you might find it out when you hit an
hour where suddenly your webapp is failing all over the place. So the
setting I proposed would report usage of naive datetimes all the year,
so that you could spot the problems in testing, not in production.

> > One idea is to use "SELECT col AT TIME ZONE 'GMT+2' as col". This way
> > the session time zone doesn't need to be changed. For example:
> This looks interesting, although I don't know how difficult it would be to
> integrate in the ORM.
>
> I'm going to create a ticket and post your research there.

This is easy as far as writing the cast goes (a couple of lines in
backend/operations should do the trick), the problem is that columns
do not support parameters for the SQL, and fixing this will need to
touch n+1 places. Although allowing column parameters would be useful
in the long run even without the timezone problem. Aliasing the
columns properly could be a problem, too.

It could be possible to get rid of the SET TIME ZONE at the start of
new connections completely. This would save a couple of network round
trips for every connection. Although there might be some small
backwards compatibility problems for users using raw SQL...

> It is now :)

Nice! And thank you!

- Anssi

Aymeric Augustin

unread,
Nov 18, 2011, 11:06:59 AM11/18/11
to django-d...@googlegroups.com
2011/11/18 Anssi Kääriäinen <anssi.ka...@thl.fi>

My point is that it would be very useful to know you are actually
using aware datetimes internally. Especially when migrating an old
project to use this feature, it is easy to miss some places that do
not use aware datetimes. And you might find it out when you hit an
hour where suddenly your webapp is failing all over the place. So the
setting I proposed would report usage of naive datetimes all the year,
so that you could spot the problems in testing, not in production.

Ah, yes, that's an good point. It's very easy to intercept naive datetimes that reach the database adapter:
https://code.djangoproject.com/browser/django/trunk/django/db/models/fields/__init__.py#L788

I considered adding a warning there, but I decided against it because I thought it would be too obnoxious.

This deserves further thought.

--
Aymeric.

Hanne Moa

unread,
Nov 19, 2011, 6:24:03 AM11/19/11
to django-d...@googlegroups.com
On 18 November 2011 16:57, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:
> My point is that it would be very useful to know you are actually
> using aware datetimes internally. Especially when migrating an old
> project to use this feature, it is easy to miss some places that do
> not use aware datetimes. And you might find it out when you hit an
> hour where suddenly your webapp is failing all over the place. So the
> setting I proposed would report usage of naive datetimes all the year,
> so that you could spot the problems in testing, not in production.

Tying it to the DEBUG-option should be enough. If DEBUG is on, snitch
on naive date times. If locale is on you can give a warning when in
the danger zone, just before or after the DST-changeover.


HM

Aymeric Augustin

unread,
Feb 11, 2013, 4:03:14 PM2/11/13
to django-d...@googlegroups.com
Le 1 nov. 2011 à 15:58, Aymeric Augustin <aymeric....@polytechnique.org> a écrit :
>
> There is only one outstanding issue that I know of. `QuerySet.dates()` operates in the database timezone, ie. UTC, while an end user would expect it to take into account its current time zone. This affects `date_hierarchy` in the admin and the date based generic views. The same problem also exists for the `year`, `month`, `day`, and `week_day` lookups.


Fifteen months later, I finally got around to fixing this. I would appreciate (1) a review (2) tests under Oracle.

Ticket : https://code.djangoproject.com/ticket/17260
Patch: https://github.com/django/django/pull/715

The changes are quite involved — I didn't implement this originally for a reason :)

** Changes **

1) Add a new method QuerySet.datetimes(). This method takes a tzinfo argument that says in which time zone datetimes should be converted prior to truncation and aggregation. It defaults to the current time zone, which will produce the expected results in general.

2) Add plumbing to the ORM to pass the tzinfo argument down to the SQL generation layer, where database specific features are used to perform the time zone conversion within the database.

3) Make QuerySet.dates() return dates instead of datetimes. This is a backwards incompatible change, but:
- I don't want a method called "dates" to return aware datetimes;
- Real code may still work thanks to duck-typing;
- Replacing .dates() with .datetimes() is trivial.

4) Prevent QuerySet.dates() from operating on datetime fields. This is a backwards incompatible change, but:
- The restriction is only enforced when USE_TZ is True;
- It's necessary to stop mixing dates and datetimes unsafely when USE_TZ is True;
- QuerySet.dates() returned incorrect results in 1.4 and 1.5 when USE_TZ was True anyway.

5) Add __hour, __minute and __second lookups, and make all lookups on datetime fields in the current time zone. There's no way to pass an explicit time zone to the lookups, but the current time zone can be overridden if necessary. There's some special handling for __year lookups because they're translated to __range lookups. This is backwards incompatible if someone has fields named hour, minute or second.

** Database support **

- PostgreSQL / PostGIS: everything works.
- SQLite: everything works, requires pytz; Spatialite: untested.
- MySQL / MySQL GIS: everything works, requires loading the time zone definitions.
- Oracle / Oracle GIS: untested.

A database feature called "has_zoneinfo_database" says if the time zone definitions are available.

** Remarks **

There are similar code paths for dates and datetimes, which result in some code duplication. I've tried to minimize it through subclassing wherever possible.

It's weird to add the time zone parameter in SQLDateTimeCompiler.as_sql. This hack is necessary because the ORM doesn't handle parameters for select expressions. For example, SQLCompiler.get_columns returns "sql", unlike get_from_clause which returns "sql, params".

Thanks,

--
Aymeric.



Aymeric Augustin

unread,
Feb 12, 2013, 4:25:36 PM2/12/13
to django-d...@googlegroups.com
I spent the better part of the day struggling with Oracle and unsurprisingly nothing works :(

Given this model:

class Event(models.Model):
dt = models.DateTimeField()

I'm trying to implement this query with time zone support enabled:

Event.objects.filter(dt__month=1)

That's a test from Django's test suite:
https://github.com/aaugustin/django/blob/queryset-datetimes/tests/modeltests/timezones/tests.py#L424

----------------------------------------

I have found a syntax that should work, but it crashes Oracle.

>>> import cx_Oracle
>>> conn = cx_Oracle.connect('django', 'Im_a_lumberjack', cx_Oracle.makedsn('127.0.0.1', 1521, 'orcl'))
>>> c = conn.cursor()
>>> c.execute("ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS' NLS_TIMESTAMP_FORMAT = 'YYYY-MM-DD HH24:MI:SS.FF'")
>>> c.execute("""INSERT INTO "TIMEZONES_EVENT" ("DT") VALUES (:arg0)""", ['2010-12-31 22:30:00'])
>>> c.execute("""SELECT FROM_TZ("DT", 'UTC') AT TIME ZONE (:arg) FROM "TIMEZONES_EVENT" """, ['Africa/Nairobi']).fetchall()
[(datetime.datetime(2011, 1, 1, 1, 30),)]

* At this point we have a single object in the database with DT == 2010-12-31 22:30:00 Z == 2011-01-01 01:30:00 +03:00.
* Oracle knows that Africa/Nairobi is +03:00.

>>> c.execute("""SELECT EXTRACT(MONTH FROM CAST((FROM_TZ("DT", 'UTC') AT TIME ZONE (:arg)) AS DATE)) FROM "TIMEZONES_EVENT" """, ['Africa/Nairobi']).fetchall()
[(1,)]

* This expression dutifully extracts the month in local time.
* The time zone name is passed in argument.

>>> c.execute("""SELECT "DT" FROM "TIMEZONES_EVENT" WHERE EXTRACT(MONTH FROM CAST((FROM_TZ("DT", 'UTC') AT TIME ZONE ('Africa/Nairobi')) AS DATE)) = 1""").fetchall()
[(datetime.datetime(2010, 12, 31, 22, 30),)]

* That's almost what I want, but the time zone name is hardcoded. You'll see why in a second.

>>> c.execute("""SELECT "DT" FROM "TIMEZONES_EVENT" WHERE EXTRACT(MONTH FROM CAST((FROM_TZ("DT", 'UTC') AT TIME ZONE (:arg)) AS DATE)) = 1""", ['Africa/Nairobi']).fetchall()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
cx_Oracle.OperationalError: ORA-03113: end-of-file on communication channel

* That's exactly the same query as above, except the time zone name is passed in argument.
* Not only does it crash, but it closes the database connection!

Obviously I'm hitting a bug in Oracle. Does anyone have ideas to solve this?

----------------------------------------

One alternative is the use string interpolation to put the time zone name in the query. (It works; that's how I first implemented the feature.)

To avoid SQL injection via this parameter, heavy-handed sanitization of the time zone name will be necessary, like r'^[\w/+-]+$'.

Pro:
- Removes the hack in DateTimeSQLCompiler to inject the time zone parameter.
- Works around the bug in Oracle.

Cons:
- Prevents using arbitrary time zone names. This was discussed at length on the ticket , but I don't know how much of a problem it is in practice. Windows has non-standard, localized time zone names; do databases use them?

--
Aymeric.



Ian Kelly

unread,
Feb 12, 2013, 5:06:32 PM2/12/13
to django-d...@googlegroups.com
On Tue, Feb 12, 2013 at 2:25 PM, Aymeric Augustin
<aymeric....@polytechnique.org> wrote:
>>>> c.execute("""SELECT "DT" FROM "TIMEZONES_EVENT" WHERE EXTRACT(MONTH FROM CAST((FROM_TZ("DT", 'UTC') AT TIME ZONE (:arg)) AS DATE)) = 1""", ['Africa/Nairobi']).fetchall()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> cx_Oracle.OperationalError: ORA-03113: end-of-file on communication channel
>
> * That's exactly the same query as above, except the time zone name is passed in argument.
> * Not only does it crash, but it closes the database connection!
>
> Obviously I'm hitting a bug in Oracle. Does anyone have ideas to solve this?

I encounter the same bug in Oracle 10g XE. I tried it also on an 11.2
database, and it seemed to work, but I ran into a different issue:
neither of the time zones 'Africa/Nairobi' nor 'UTC' existed in the
database. I substituted 'Africa/Cairo' and 'Etc/GMT' for the test.
Wondering if that was just a local configuration issue, I tried it
again on another newer 11.2 database. This one did recognize all the
time zones, but also exhibited the connection loss bug.

Ian Kelly

unread,
Feb 12, 2013, 5:12:39 PM2/12/13
to django-d...@googlegroups.com
Against the possible issue of non-recognition of UTC, I suggest using
'0:00' instead of 'UTC'.

Aymeric Augustin

unread,
Feb 12, 2013, 5:35:56 PM2/12/13
to django-d...@googlegroups.com
Thanks for looking into this problem!

I tried '0:00' instead of 'UTC', and '+03:00' instead of
'Africa/Nairobi' with the same result. My first queries
show that conversions are performed correctly
between these two time zones.

When I pass an invalid time zone name I get:
ORA-01882: timezone region not found

I'm testing with the "Oracle Developer Day" Virtual Box VM
(Oracle Database 11g Enterprise Edition Release 11.2.0.2.0)
and v$timezones contains time zone definitions out of the box.

I have cx_Oracle 5.1.2 compiled against Instant Client 10.2
in 32 bits mode (the 11.2 package is broken, and the 64 bits
10.2 package segfaults).

--
Aymeric.



Aymeric Augustin

unread,
Feb 13, 2013, 2:31:36 PM2/13/13
to django-d...@googlegroups.com
I bit the bullet and implemented passing the time zone name:
- via a parameter on PostgreSQL, SQLite and MySQL,
- through string interpolation on Oracle, with a regexp to validate the value.

As a pre-requisite, the ORM needs to handle parameters in the SELECT clause:
https://github.com/aaugustin/django/commit/924a144ef8a80ba4daeeafbe9efaa826566e9d02

Structures that provide an .as_sql() method have been changed to return a
tuple of (sql string, list of params). This includes Aggregate, SQLEvaluator,
and a few others. I fixed some inconsistencies along the way. I also changed
spatial_lookup_sql to this format, because it usually delegates to as_sql.

Note that SQLCompiler.as_sql() returns (sql string, tuple of params) and isn't
involved here.

Then I made tzname a parameter of the database backend's datetime_extract_sql
and datetime_trunc_sql methods:
https://github.com/aaugustin/django/commit/01b6366f3ce67d57a58ca8f25e5be77911748638

The SQL for Oracle isn't very elegant; the final version of the query I
discussed yesterday is:
SELECT "DT" FROM "TIMEZONES_EVENT" WHERE EXTRACT(MONTH FROM CAST(TO_DATE(TO_CHAR((FROM_TZ("TIMEZONES_EVENT"."DT", 'UTC') AT TIME ZONE 'Africa/Nairobi'), 'YYYY-MM-DD HH24:MI:SS'), 'YYYY-MM-DD HH24:MI:SS') AS TIMESTAMP)) = 1

I plan to commit this in a few days; let me know if need more time to review
it!

Thanks.

--
Aymeric.

Ian Kelly

unread,
Feb 13, 2013, 2:54:18 PM2/13/13
to django-d...@googlegroups.com
On Tue, Feb 12, 2013 at 3:35 PM, Aymeric Augustin
<aymeric....@polytechnique.org> wrote:
>> Against the possible issue of non-recognition of UTC, I suggest using
>> '0:00' instead of 'UTC'.
>
> Thanks for looking into this problem!
>
> I tried '0:00' instead of 'UTC', and '+03:00' instead of
> 'Africa/Nairobi' with the same result. My first queries
> show that conversions are performed correctly
> between these two time zones.
>
> When I pass an invalid time zone name I get:
> ORA-01882: timezone region not found

I didn't mean to suggest this as a fix to the connection dropping
problem. I only meant that the final SQL should use '0:00' instead of
'UTC' to guard against databases that for whatever reason do not have
a UTC time zone definition, such as the one that I found. Otherwise
the query would always fail with an ORA-01882 on those databases.

Aymeric Augustin

unread,
Feb 13, 2013, 3:26:31 PM2/13/13
to django-d...@googlegroups.com
Le 13 févr. 2013 à 20:54, Ian Kelly <ian.g...@gmail.com> a écrit :
> I only meant that the final SQL should use '0:00' instead of
> 'UTC' to guard against databases that for whatever reason do not have
> a UTC time zone definition, such as the one that I found. Otherwise
> the query would always fail with an ORA-01882 on those databases.

Ah -- sorry, I misunderstood. Yes, it's a good idea. Fixed here:
https://github.com/aaugustin/django/commit/4f290bdb60b7d8534abf4ca901bd0844612dcbda

--
Aymeric.



Aymeric Augustin

unread,
Sep 8, 2013, 6:01:42 AM9/8/13
to django-d...@googlegroups.com
Hello,

As noted by Luke in his review of the time zone support branch, there's a large overlap between django.utils.tzinfo and django.utils.timezone. See the excerpt of our discussion below. I prepared a pull request for this long overdue refactoring.

https://github.com/django/django/pull/1601

There's one known backwards incompatibility, which is also the reason why I initially included a second definition of the local time zone. If you haven't installed pytz, and you're using datetimes before 1970 or 2038, you may encounter exceptions in:
- Django's time zone related date formats, specifically "e", "I", "O", "T" and "Z".
- the syndications framework — I don't expect blog posts to be dated before 1970 or far into the future, but I suppose the framework could be used for historical data;
- the django.utils.tzinfo.LocalTimezone class, which is a public API.

In all these cases, the fix is simply to install pytz, and the exception message tells you so (#18766). I believe we'll have to make this change at some point and now seems reasonable.

In fact, if you set USE_TZ = True, you're already in the same situation: you need pytz to support datetimes before 1970 or 2038. This hasn't been a problem in practice since Django 1.4 was released. The change I'm proposing here is slightly more disruptive because it may require pytz when upgrading from Django 1.6 to 1.7 even if you don't opt-in to time zone support, but the comparison still stands.

Thoughts?

--
Aymeric.

PS: `pip install pytz` doesn't work with pip 1.4. Use `pip install "pytz > dev"` instead. Questions go to distutils-sig, please :)
Reply all
Reply to author
Forward
0 new messages