[Django] #25339: Aggregation and annotation by time period and intervals (by month, week, day, hour, etc)

42 views
Skip to first unread message

Django

unread,
Sep 2, 2015, 6:17:36 AM9/2/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
----------------------------------------------+----------------------------
Reporter: malefice | Owner: nobody
Type: New feature | Status: new
Component: Database layer (models, ORM) | Version: 1.8
Severity: Normal | Keywords: QuerySet.extra
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+----------------------------
I often found that aggregation of data by dynamically generating time
periods given a desired time interval is a very sought after feature. I
personally created the snippet below to accommodate my needs, but as per
the note
[https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.extra
here], the extra() method will be deprecated soon. I think it will be a
waste to deprecate this API, since there is simply a lot of DBMS-specific
SQL functions available, but if there is any way we can do this via Django
ORM, then I am still up for it. The snippet below shows a custom QuerySet
method I often use.

{{{#!python
def get_metrics(self, frequency):
select_fields = OrderedDict()
if frequency == 'week':
select_fields['time_period'] = "date_trunc(%s,
initial_timestamp::TIMESTAMP WITH TIME ZONE AT TIME ZONE %s + '1
day'::interval) - '1 day'::interval"
else:
select_fields['time_period'] = "date_trunc(%s,
initial_timestamp::TIMESTAMP WITH TIME ZONE AT TIME ZONE %s)"
select_params = (frequency, settings.TIME_ZONE,)
queryset = self.extra(select=select_fields,
select_params=select_params).values('time_period', ....)
queryset = queryset.annotate(
# Add annotations
)
return queryset
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/25339>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 2, 2015, 6:19:05 AM9/2/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
-------------------------------------+-------------------------------------

Reporter: malefice | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:

Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by malefice):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Old description:

New description:

--

--
Ticket URL: <https://code.djangoproject.com/ticket/25339#comment:1>

Django

unread,
Sep 2, 2015, 7:37:59 AM9/2/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
-------------------------------------+-------------------------------------
Reporter: malefice | Owner: nobody
Type: New feature | Status: closed

Component: Database layer | Version: 1.8
(models, ORM) | Resolution:
Severity: Normal | worksforme

Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jarshwah):

* status: new => closed
* resolution: => worksforme


Comment:

We began talking about deprecating `.extra()` because now we have user
defined expressions that can replace (the majority of) `.extra`. See
[https://docs.djangoproject.com/en/dev/ref/models/expressions/#func-
expressions Func Expressions] specifically.

Django 1.9 will also include Day/Month/Year/Week transformations, so
you'll be able to do something similar to:

{{{
from django.db.models.lookups import MonthTransform

Sale.objects.filter(
sale_date__year=2015
).annotate(
month=MonthTransform('sale_date')
).values('month').aggregate('id')
}}}

If you find there's something you cannot do with expressions that you can
currently do with extra (with regards to SELECT), please come talk to us
in #django or post to django-users. We are already aware that extra is
still the only way to introduce custom joins.

--
Ticket URL: <https://code.djangoproject.com/ticket/25339#comment:2>

Django

unread,
Sep 11, 2015, 7:37:21 AM9/11/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
-------------------------------------+-------------------------------------
Reporter: malefice | Owner: nobody

Type: New feature | Status: closed
Component: Database layer | Version: 1.8
(models, ORM) | Resolution:
Severity: Normal | worksforme
Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by malefice):

I do not think that excerpt does the same functionality mine does. It
seems that it only filters for the year 2015, extracts the month number,
and then performs a 'GROUP BY' based on the month number. It does not
generate an actual timeseries that can easily be parsed by javascript and
charting libraries.

My excerpt outputs actual time periods based on a range of dates with
specific intervals/frequency. If for example, the `start_date` is
2015-01-01, `end_date` is 2015-07-07, and `frequency` is month, it will
aggregate the results that match the date range and group them by month
dates like so: 2015-01-01, 2015-02-01, 2015-03-01, ... 2015-07-01.

If the `frequency` is hour, then it will output each and every hour like
so: 2015-01-01 00:00:00, 2015-01-01 01:00:00, 2015-01-01 02:00:00, ...
2015-05-01 00:00:00, 2015-05-01 01:00:00 , ... , 2015-07-07 22:00:00, and
2015-07-07 23:00:00. You will have a column containing the actual
timeseries.

One can only do that efficiently in PostgreSQL by using the `DATE_TRUNC`
function, and I looked in the source code, but it does not seem to be used
in anywhere that might resemble something like my excerpt. I think it is
possible to write a custom Transform for that, but it would be nice to
have it ready out of the box. At least in my case, projects always involve
aggregating and presenting data with a timeseries.

--
Ticket URL: <https://code.djangoproject.com/ticket/25339#comment:3>

Django

unread,
Sep 14, 2015, 8:54:34 PM9/14/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
-------------------------------------+-------------------------------------
Reporter: malefice | Owner: nobody

Type: New feature | Status: closed
Component: Database layer | Version: 1.8
(models, ORM) | Resolution:
Severity: Normal | worksforme
Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jarshwah):

Malefice, if you feel this would be something useful for Django, then
please feel welcome to open up a PR with an implementation that uses
Expressions, and we can consider the inclusion at that point. It will have
to work with the other supported backends too though in some way.

--
Ticket URL: <https://code.djangoproject.com/ticket/25339#comment:4>

Django

unread,
Sep 15, 2015, 3:04:47 AM9/15/15
to django-...@googlegroups.com
#25339: Aggregation and annotation by time period and intervals (by month, week,
day, hour, etc)
-------------------------------------+-------------------------------------
Reporter: malefice | Owner: nobody

Type: New feature | Status: closed
Component: Database layer | Version: 1.8
(models, ORM) | Resolution:
Severity: Normal | worksforme
Keywords: QuerySet.extra | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by akaariai):

I think DateTrunk expression makes perfect sense for Django. It is often
very useful for aggregation for example. Similarly date formatting and
date parsing functions would be useful.

Oracle and PostgreSQL have date truncation methods out of the box, for
sqlite and mysql it seems we have to use some tricks (format first the
date as string '%y-%m-01', and then read that back in as date for month
truncate for example).

--
Ticket URL: <https://code.djangoproject.com/ticket/25339#comment:5>

Reply all
Reply to author
Forward
0 new messages