Implement a custom range-based filter for date_hierarchy


Haki Benita

Dec 16, 2017, 11:01:28 AM
to Django developers (Contributions to Django itself)
Hey, I'm a new member of this group and I would like to suggest implementing a custom filter for date_hierarchy.

The date_hierarchy filter currently relies on the default filter implementation in Django Admin.
I think we can use the hierarchical nature of the date hierarchy to generate a better filter condition.

A URL generated by the date hierarchy template tag might look like this:
admin/app/model?created__year=2017&created__month=12&created__day=16

The query generated by this URL will look like this (PostgreSQL):
where created between '2017-01-01' and '2017-12-31' and extract(month from created) = 12 and extract(day from created) = 16;

The problem with this condition is that it uses database functions to filter the date, which makes it very difficult for the database to utilize range-based indexes (such as btree).
I'm sure many developers (me included) are adding btree indexes on the date hierarchy field to support such queries but the generated filter prevents the database from using them.
There are solutions outside of Django for this problem such as function based indexes but those come at a cost which can easily be avoided.

Another approach from within Django is to simplify the condition in a way that the database can better utilize range indexes:
where created >= '2017-12-16' and created < '2017-12-17';

I wrote about this problem in this blog post and implemented the above as a SimpleListFilter in this package.

To implement the simplified condition within Django I suggest adding the following to ChangeList:

- Identify date_hierarchy fields using the following pattern:
 re.compile(r'^{}__(day|month|year)$'.format(self.date_hierarchy_field))

- In ChangeList.get_filters, after applying the custom ListFilters and before applying the "default" filtering on what's left: if date_hierarchy is defined for the model, identify the date hierarchy parameters, apply a range-based filter, and remove those parameters from the parameter list.
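
To make the two steps above concrete, here is a minimal sketch in plain Python of how the year/month/day parameters could be turned into a half-open date range. It is not the code from the PR: the function name date_hierarchy_range and the plain dict of parameters are assumptions for illustration, and timezone handling for DateTimeField is left out.

```python
import datetime
import re


def date_hierarchy_range(params, field_name):
    """Hypothetical helper: pop <field>__year/month/day from ``params``
    and return a half-open (start, end) date range, or None when no
    year parameter is present."""
    pattern = re.compile(
        r'^{}__(day|month|year)$'.format(re.escape(field_name))
    )
    # Map 'year'/'month'/'day' to the matching parameter key.
    keys = {}
    for key in params:
        match = pattern.match(key)
        if match:
            keys[match.group(1)] = key
    if 'year' not in keys:
        return None  # nothing to do, leave params untouched

    # Remove the date hierarchy parameters so the default filtering
    # does not apply them a second time.
    parts = {name: int(params.pop(key)) for name, key in keys.items()}
    year = parts['year']
    month = parts.get('month')
    day = parts.get('day')

    if month is None:
        # Whole year (a day without a month is ignored in this sketch).
        start = datetime.date(year, 1, 1)
        end = datetime.date(year + 1, 1, 1)
    elif day is None:
        # Whole month: end is the first day of the following month.
        start = datetime.date(year, month, 1)
        end = datetime.date(year + (month == 12), month % 12 + 1, 1)
    else:
        # Single day.
        start = datetime.date(year, month, day)
        end = start + datetime.timedelta(days=1)
    return start, end
```

The resulting range could then be applied with the standard lookups, for example queryset.filter(created__gte=start, created__lt=end), which is a condition the database can serve from a btree index.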

Other considerations

- This change is backward compatible.
- Will most likely improve performance of large list views with date_hierarchy.
- The custom filter is applied after the ListFilters, so projects that implemented their own filters on date hierarchy fields will not be affected.

Please let me know what you think and if there are other things I haven't considered.

Haki Benita.


charettes

Dec 16, 2017, 11:18:49 AM
to Django developers (Contributions to Django itself)
Hello Haki,

I think the optimizations you are suggesting and the way you've implemented them
make a lot of sense.

I left a few comments on your PR[0] regarding the template tag in charge of
rendering the date hierarchy header in the list view which could benefit from
the same optimization but except for that I think your patch is complete.

Thanks,
Simon

[0] https://github.com/django/django/pull/9469

Aymeric Augustin

Dec 16, 2017, 11:28:51 AM
to django-d...@googlegroups.com
Hello Haki,

Yes, I always found that SQL query weird, I think we can do better.

I didn't review your implementation but I wanted to encourage you anyway :-)

This optimization matters for large tables. In that case, finding all the years / months / days for which objects exist in the database may be more expensive than selecting the objects to display. I believe others have tried optimizing this by:

- showing years between the min / max values, which are trivial to determine with an index; this works well when all your data was generated during the lifetime of your service, so you have at most a few decades to display, often less, and there's data for each year
- providing links for all possible months or days within a given year or month respectively.
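
As a rough illustration of the two approaches in the list above, the choices can be computed without querying for distinct dates. This is only a sketch with hypothetical function names; it assumes the min / max dates were fetched separately, for example with a single aggregate query.

```python
import calendar
import datetime


def year_choices(min_date, max_date):
    """Every year between the earliest and latest date, inclusive."""
    return list(range(min_date.year, max_date.year + 1))


def month_choices(year):
    """All twelve months of a year, whether or not data exists."""
    return [datetime.date(year, month, 1) for month in range(1, 13)]


def day_choices(year, month):
    """All days of a month, whether or not data exists."""
    # monthrange returns (weekday of the first day, number of days).
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        datetime.date(year, month, day)
        for day in range(1, days_in_month + 1)
    ]
```

The trade-off is that some links may lead to empty result pages, in exchange for skipping the potentially expensive distinct query.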

While that optimization is closely related to the one you're proposing, I believe they're independent; you don't need to do both at once.

Best regards,

-- 
Aymeric.




Haki Benita

Dec 17, 2017, 2:43:06 AM
to Django developers (Contributions to Django itself)
Hey Aymeric,

The performance issues I experienced with date hierarchy are:
1. The query generated by the filter for the list view does not utilize range indexes - I hope this PR will resolve this issue.
2. The date hierarchy template tag generates a list of distinct values for the hierarchy level - I solved this issue, as you mentioned above, by overriding the template tag and providing my own implementation (only past dates, all options etc). I found this old ticket which is still open - maybe we can revisit this as well.

Haki.