--
Ticket URL: <https://code.djangoproject.com/ticket/17260>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by aaugustin):
Here are the comments that were sent to the mailing list:
''Aymeric Augustin (myself)''
Under PostgreSQL, provided the current time zone has a valid name (e.g.
it's provided by pytz), we could temporarily set the connection's timezone
to this time zone and restore it to UTC afterwards. With other backends, I
can't think of a simpler solution than fetching all values and doing the
aggregation in Python. Neither of these thrills me, so for the time
being, I've documented this as a known limitation.
''Luke Plant''
You could argue a case for saying that the behaviour has some advantages:
if one admin user filters a list of objects by date, and sends the link to
a colleague, they would probably expect the colleague to end up with the
same list of objects. But even if that isn't very convincing, I'm happy
with this limitation given the implementation difficulties of fixing it.
''Anssi Kääriäinen''
I don't see this as a major issue, except if Django needs to support this
behaviour for backwards compatibility reasons. But as this is documented
as a limitation, not as a "feature", this is fine as is. It would be nice
to fix this in the future. For PostgreSQL, tracking the current time zone
for the connection seems a bit hard - we can't rely a SET TIME ZONE is
really going to be in effect, unless we also track rollbacks/commits (see
#17062). And changing it for every DB query seems bad, too. One idea is to
use "SELECT col AT TIME ZONE 'GMT+2' as col". This way the session time
zone doesn't need to be changed. For example:
{{{
> SET TIME ZONE 'UTC';
> create table testt(dt timestamptz);
> insert into testt values(now());
> select dt at time zone 'GMT+2' as dt from testt;
2011-11-18 07:06:47.834771
> select dt at time zone 'Europe/Helsinki' as dt from testt;
2011-11-18 11:06:47.834771
}}}
Oracle seems to have
[http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm a
similar feature]. I don't know if other databases offer similar
functionality. MySQL has probably support for something similar when using
TIMESTAMP columns. But as said, I don't see this as a major issue. It
would be nice to support this some day.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:1>
* owner: nobody => aaugustin
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:2>
--
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:3>
Comment (by aaugustin):
For the record, I'm afraid this will quickly become contentious once 1.4
is out...
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:4>
Comment (by akaariai):
You mean contentious as in "wontfix, breaks existing programs"? In my
opinion that would be bad. It is pretty clear to me that the current
behavior is broken. The dates should be in the currently active time zone.
I think I know how to fix this:
- aggregates can return parameters (useful for other features, too
"conditional aggregates" for example). This is returning parameters from
the cols part of the query.
- other needed parts already can return parameters (where conditions
etc).
- use the AT TIME ZONE trick above, or if that is not available, use just
" + timedelta offset minutes".
- hope that everything works.
The problem is there is a _lot_ of places to fix. In addition, the ORM is
a little complex to work with sometimes. I just can't see getting this
into 1.4 in time.
Any way to mark this as a known bug which _will_ be fixed in the future?
The other way forward is introducing "at_timezone" argument for the
operations, but how to do that for `__date` for example?
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:5>
Comment (by aaugustin):
I mean "we're going to take some flak".
I plan to fix this eventually.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:6>
Comment (by akaariai):
Yes, I can see this is very complex to fix in a way that actually works
even closely the same on every backend.
It might be possible to just add the offset into the datetime, then do the
conversion to the date. This all in the DB. This could actually work, as
DST is usually around 3-4 o'clock AM, so there would not be error in date
conversions.
I am tempted to give this a try...
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:7>
* cc: kmike84@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:8>
* cc: anssi.kaariainen@… (added)
Comment:
I have a "POC" version ready. It just adds some timedeltas to the values
where needed. In the database everything is still done at time zone UTC,
we just add the offsets as Python reports them.
The biggest problem with this approach is DST handling. We use the offset
as it is _now_, however the database date might have a different offset
(offset now is 3 hours, in the summer 2 hours). So, we have an error of
one hour for summer dates.
This means that this issue is really hard to fix: if we use current
offsets, we will make an error of one hour across DST (not to mention
actual time zone changes in South America...). That is not correct enough
to call this fixed. We could use "AT TIME ZONE 'tz name'" where available,
but that has just one little problem: we don't know the time zone name.
So, I really can't see an easy fix for this. We could invent a mapping
mechanism for the Python time zone name -> DB time zone name. That isn't
easy either, I believe PostgreSQL can have different time zone names on
Windows and Linux. So, the user should actually define his converter
dictionary in the settings.py, for each database. Or we could have our own
mappings: time zone name -> list of time zones to try. On first connect
inspect what time zones are available. Endless source of bugs, but on the
other hand that isn't much different from supporting the different
locales. This is the best approach available.
After all this MySQL would still be broken. I don't believe there is
anything Django can do to make time zones work correctly together with
dates queries on that database. I really wonder if it is a good idea to
encourage people to use USE_TZ with MySQL. The setup is broken, and as far
as I see, will be broken.
The attached patch is, as said, just a proof of concept, or more correctly
a poor of the concept not working.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:9>
Comment (by akaariai):
One more possibility: just require that every time zone used is usable as
database time zone as is. Current time zones follow this rule (I mean the
configured settings' time zone). It might be hard to require this one
after 1.4 is released, and it might be also hard for those who do not use
pytz.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:10>
Comment (by akaariai):
OK, another POC patch which has some support for AT TIME ZONE when using
PostgreSQL.
As far as I can see SQLite is somewhat easy to fix, PostgreSQL and Oracle
are fixable if we can translate the Python time zone names to something
the database understands. It seems the best fix for MySQL is adding the
offset which has the DST problem.
I am now done with tinkering with this ticket. I hope the POCs have some
value for solving this properly after 1.4 has been released. At this point
my only worry is that the option of requiring that all time zone names
used must be usable as-is in the database will be lost after 1.4 is
released.
Lesson of the day: In POSIX syntax GMT+3 is GMT-3. Of course. As if the
timezone handling wasn't confusing enough already...
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:11>
Comment (by akaariai):
An update about MySQL: it has [http://dev.mysql.com/doc/refman/5.0/en
/date-and-time-functions.html#function_convert-tz convert_tz] function
which does something equivalent to AT TIME ZONE. However, there are some
problems with convert_tz:
- The timezone table for MySQL has to be loaded, if I understand the
documentation correctly this is not done by default
- If you try to convert a datetime using an unknown (to MySQL) timezone
name, you get null back
- If the datetime falls out of the range 1970-2038 you will not get any
conversion, you get the input datetime back.
So, the convert_tz function would cure the problem for MySQL, assuming the
datetime converted is in range 1970-2038. That could be documented as
known limitation. This was the last missing piece of this puzzle:
- SQLite should be easy to support by using Python code directly in the
query.
- PostgreSQL and Oracle should work using AT TIME ZONE syntax. Python
<-> DB timezone names must match or there must be a converter.
- MySQL should work with convert_tz (with the above cave-eats), Again
Python <-> DB timezone names must match or there must be a converter.
There is a lot of work left to actually implement the above properly for
all core databases. However, that should be pretty straightforward coding
to do. The big problem is converting the timezone names from Python to DB
format. Any ideas how to do that in somewhat clean fashion?
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:12>
Comment (by aaugustin):
#18217 was a duplicate.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:13>
Comment (by aaugustin):
I've given a lot of thought to this problem when preparing a talk for
DjangoCon.eu. I suggest to:
1) **Fetch all the datetimes and perform the aggregation in Python to
compute `QuerySet.dates()` on a `DateTimeField` when `USE_TZ = True`.**
- That shouldn't be hard to implement.
- Performance may suffer, but it has always been an expensive operation.
This would just shift the load from the database server to the application
server.
- We'll need a parameter to specify which time zone to use. It should
default to the default time zone, because that's what the model layer
always does. In many cases, programmers will want to use the current time
zone instead, but for the sake of explicitness and consistency, I still
prefer to use the default time zone by default. I'm open to discussing
this.
- If some backends can easily do this operation in the database
(PostgreSQL) they can provide their own efficient implementation.
2) **Deprecate the `__day/month/year/week_day` lookups on a
`DateTimeField` when `USE_TZ = True`**
- There's no way we can pass an explicit time zone argument with the
lookup syntax.
- Even if the `__year` lookup can be emulated with a range lookup, the
`__day` and `__month` lookups can't. They allow finding "all objects on
the 1st of any month" or "all objects in January of any year" which is
slightly pathological.
- Like akaariai says, "there's a lot of work left", and I think that if
we start on the path of hacking that, we'll be stuck forever trying to fix
a fundamentally broken operation.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:14>
Comment (by aaugustin):
I'm still making up my mind on this ticket. I like Anssi's last proposal
more and more for 1).
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:15>
Comment (by akaariai):
Not directly related to this ticket, but if we can convert the datetimes
to correct time zone in the DB, there are other uses than solving this
ticket, too. For example it would be very useful to be able to generate
"sum of payments per month and client" aggregate using the ORM. In these
cases getting the timezones correctly can be important. Of course, the TZ
issue isn't the only issue preventing this type of query currently...
BTW there could be qs.timezone() operation which sets the whole queryset
to work in the given timezone.
Still, as said before, in-db support will take a lot of time to get
working correctly.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:16>
Comment (by aaugustin):
My last idea was to introduce a `datetime_trunc_sql` operation in addition
to `date_trunc_sql`.
When `USE_TZ = False`, `datetime_trunc_sql` would be equivalent to
`date_trunc_sql`. When `USE_TZ = True`, `datetime_trunc_sql` would convert
to the appropriate time zone before truncating. It could take the timezone
as an optional argument.
Then `QuerySet.dates()` would use `datetime_trunc_sql` for `DateTimeField`
and `date_trunc_sql` for `DateField`.
----
By the way, our use case is totally valid — I've implemented a lot of
reports like this in Python, since it's currently not possible in DB.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:17>
Comment (by akaariai):
That datetime_trunc_sql seems like a clean solution.
I believe the timezone must be a cursor.execute() parameter, otherwise
there could be potential for sql injection. Not 100% sure of this though.
What should one do if the timezone isn't available in the DB? Just error
out? This seems like the cleanest approach, but of course, this isn't
backwards compatible. Maybe raise a warning, and continue using no
conversions? Basically we would then need a method get_db_tz_name for the
backends, which would convert the Python timezone to the DB timezone, and
error/warn if the timezone can't be converted to db timezone. This should
prevent the SQL-injection possibility too. I believe we could ask the
get_db_tz_name() to escape the time zone properly.
So, +1 for the datetime_trunc_sql approach. I really like the idea of
using the DB for this, as aggregation for example can be a lot cheaper to
do directly in the DB instead of fetching the data into Python.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:18>
Comment (by aaugustin):
I'd better keep things simple by documenting that `tzname` must return a
string that the database will understand as a time zone name. The tzinfo
database, used by `pytz`, is the de-facto standard. Adding a conversion
functions sounds like overengineering to me.
While SQL injections through time zone names are unlikely, indeed, we must
prevent them.
----
NB: SQLite will be restricted to using the default time zone, as defined
by `os.environ['TZ']`. I'm willing to live with this limitation. We can
put that under the "use a more feature complete database" umbrella.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:19>
Comment (by akaariai):
I don't buy the argument that tzinfo names are usable on every DB. Are
they on Oracle, DB2, and MSSQL? Are they usable on databases running on
Windows? If I understand correctly Windows doesn't use the tzinfo
database.
MySQL will be interesting once again. If the timezone table isn't
populated, it returns silently unconverted data. This is other part of the
get_db_tz_name() - the idea is that this function could warn/error in this
case. It would basically query the DB (on MySQL at least) if the timezone
is valid. If not, it would do the warn/error, if yes, then it would cache
that the tzinfo name is usable.
But, I think you are right here about not adding the conversion functions.
They would be absolutely horrible to maintain.
If we require the user to explicitly set the tzname by using
.use_tz(tzname) queryset method, then it is possible to document that the
user must ensure that the tzname is usable in the DB. The tzname could be
a callable (most likely one returning the current active time zone name)
which is resolved at execute time. Is this the idea you were referring to
by "documenting that tzname must return a string that the database will
understand"? If so, +1 to that approach.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:20>
* cc: semente+django@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:21>
Comment (by aaugustin):
I've started a branch to work on this problem:
https://github.com/aaugustin/django/compare/queryset-datetimes
This branch is unstable and will be rebased often, don't fork it.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:22>
Comment (by aaugustin):
I've started by documented the target public API.
Conversions/truncations will use the current time zone by default because:
- aggregation should generally happen in the end user's time zone;
- it allows using `timezone.override` as shown below.
The design is sufficiently flexible to deal with all the cases discussed
above.
Even "uncooperative" setups (MySQL, Windows) can be made to work with a
custom tzinfo whose `tzname` returns the appropriate value for the
database:
{{{
qs.datetimes('dt', 'day', timezone=non_standard_tzinfo_object)
}}}
{{{
with timezone.override(non_standard_tzinfo_object):
qs.filter(dt__day=1)
}}}
Using `.datetimes()` doesn't require a custom tzinfo class; the
appropriate time zone name for the database can be provided directly:
{{{
qs.datetimes('dt', 'day', timezone='Non-standard name")
}}}
I don't care if MySQL silently returns wrong data when it's misconfigured.
This isn't worse than all the other data truncation or corruption bugs
(for instance, when inserting a string longer than the field size) that
happen with MySQL's default configuration. If you use MySQL and care about
data integrity, you have to read its manual (and much more).
Finally I don't see the need for a `get_db_tz_name` method; the standard
`tzname` suffices.
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:23>
* has_patch: 0 => 1
Comment:
Pull request: https://github.com/django/django/pull/715
Discussion: https://groups.google.com/d/msg/django-
developers/zwQju7hbG78/2FtO50OYaWQJ
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:24>
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"e74e207cce54802f897adcb42149440ee154821e"]:
{{{
#!CommitTicketReference repository=""
revision="e74e207cce54802f897adcb42149440ee154821e"
Fixed #17260 -- Added time zone aware aggregation and lookups.
Thanks Carl Meyer for the review.
Squashed commit of the following:
commit 4f290bdb60b7d8534abf4ca901bd0844612dcbda
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Wed Feb 13 21:21:30 2013 +0100
Used '0:00' instead of 'UTC' which doesn't always exist in Oracle.
Thanks Ian Kelly for the suggestion.
commit 01b6366f3ce67d57a58ca8f25e5be77911748638
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Wed Feb 13 13:38:43 2013 +0100
Made tzname a parameter of datetime_extract/trunc_sql.
This is required to work around a bug in Oracle.
commit 924a144ef8a80ba4daeeafbe9efaa826566e9d02
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Wed Feb 13 14:47:44 2013 +0100
Added support for parameters in SELECT clauses.
commit b4351d2890cd1090d3ff2d203fe148937324c935
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Mon Feb 11 22:30:22 2013 +0100
Documented backwards incompatibilities in the two previous commits.
commit 91ef84713c81bd455f559dacf790e586d08cacb9
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Mon Feb 11 09:42:31 2013 +0100
Used QuerySet.datetimes for the admin's date_hierarchy.
commit 0d0de288a5210fa106cd4350961eb2006535cc5c
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Mon Feb 11 09:29:38 2013 +0100
Used QuerySet.datetimes in date-based generic views.
commit 9c0859ff7c0b00734afe7fc15609d43d83215072
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:43:25 2013 +0100
Implemented QuerySet.datetimes on Oracle.
commit 68ab511a4ffbd2b811bf5da174d47e4dd90f28fc
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:43:14 2013 +0100
Implemented QuerySet.datetimes on MySQL.
commit 22d52681d347a8cdf568dc31ed032cbc61d049ef
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:42:29 2013 +0100
Implemented QuerySet.datetimes on SQLite.
commit f6800fd04c93722b45f9236976389e0b2fe436f5
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:43:03 2013 +0100
Implemented QuerySet.datetimes on PostgreSQL.
commit 0c829c23f4cf4d6804cadcc93032dd4c26b8c65e
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:41:08 2013 +0100
Added datetime-handling infrastructure in the ORM layers.
commit 104d82a7778cf3f0f5d03dfa53709c26df45daad
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Mon Feb 11 10:05:55 2013 +0100
Updated null_queries tests to avoid clashing with the __second lookup.
commit c01bbb32358201b3ac8cb4291ef87b7612a2b8e6
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 23:07:41 2013 +0100
Updated tests of .dates().
Replaced .dates() by .datetimes() for DateTimeFields.
Replaced dates with datetimes in the expected output for DateFields.
commit 50fb7a52462fecf0127b38e7f3df322aeb287c43
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 21:40:09 2013 +0100
Updated and added tests for QuerySet.datetimes.
commit a8451a5004c437190e264667b1e6fb8acc3c1eeb
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 22:34:46 2013 +0100
Documented the new time lookups and updated the date lookups.
commit 29413eab2bd1d5e004598900c0dadc0521bbf4d3
Author: Aymeric Augustin <aymeric....@m4x.org>
Date: Sun Feb 10 16:15:49 2013 +0100
Documented QuerySet.datetimes and updated QuerySet.dates.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:25>
Comment (by Simon Charette <charette.s@…>):
In [changeset:"c4a876a0c1ad5e671c26898ccabb1e03837f6fcf" c4a876a]:
{{{
#!CommitTicketReference repository=""
revision="c4a876a0c1ad5e671c26898ccabb1e03837f6fcf"
Refs #17260 -- Prevented Oracle timezone conversion from stripping
microseonds.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/17260#comment:26>