Django ORM query syntax enhancement

1,342 views
Skip to first unread message

Alexey Zankevich

unread,
Aug 16, 2015, 9:18:26 AM8/16/15
to Django developers (Contributions to Django itself)
Hi all,

This topic is related to the current ORM query syntax with underscores.
There are lots of arguing related to it, anyway it has pros and cons.

Let's take a concrete example of querying a model:

>>> GameSession.objects.filter(user__profile__last_login_date__gte=yesterday)


Pros:

1. The syntax is easy to understand
2. Can be extended with custom transforms and lookups

However, there are several cons:

1. Long strings is hard to read, especially if we have fields with underscores.
It's really easy to make a mistake by missing one:

>>> GameSession.objects.filter(user_profile__last_login_date__gte=yesterday)

Not easy to catch missing underscore between user and profile, is it? Even
though, it's not easy to say whether it should be "user_profile" attribute or
user.profile foreign key.

2. Query strings can't be reused, thus the approach violates DRY principle.
For example, we need to order results by last_login_date:

>>> GameSession.objects.filter(user__profile__last_login_date__gte=yesterday) \
.order_by('user__profile__last_login_date')

We can't keep user__profile_login_date as a variable as in the first part of the
expression we use a keyword argument, meanwhile in the second part - just a 
string. And thus we just have to type query path twice.

3. Lookup names not natural to Python language and require to be remembered or
looked up in documentation. For example, "__gte" or "__lte" lookups tend to be
confused with "ge" and "le" due to similarity to methods "__ge__" and "__le__".

4. Lookup keywords limited to a single argument only, very inconvenient when
necessary to filter objects by range.

I was thinking a lot trying to solve those issues, keeping in mind Django
approaches. Finally I came up with solution to extend Q objects with dot
expression syntax:

>>> GameSession.objecs.filter(Q.user.profile.last_login_date >= yesterday)

Q is a factory instance for old-style Q objects. Accessing attribute by dot
returns a child factory, calling factory will instantiate old-style Q object.

>>> Q
<QFactory object at 0x7f407298ee10>

>>> Q.user.profile
<QFactory object at 0x7f40765da310>

>>> Q(user__name='Bob')
<Q: (AND: ('user__name', 'Bob'))>

It overrides operators, so comparing factory with value returns a related Q
object:

>>> Q.user.name == 'Bob'
<Q: (AND: ('user__name', 'Bob'))>

Factory has several helper functions for lookups which aren't related to any
Python operators directly:

>>> Q.user.name.icontains('Bob')
<Q: (AND: ('user__name__icontains', 'Bob'))>

And helper to get query path as string, which requred by order_by or
select_related queryset methods: 

>>> Q.user.profile.last_login_date.get_path()
'user__profile__last_login_date'

You can check implementation and more examples here

How it solves issues:

#1. Dots hard to confuse with underscores
#2. Query paths can be reused:

>>> factory = Q.user.profile.last_login_date
>>> query = GameSession.objects.filter(factory >= yesterday)
>>> query = query.order_by(factory.get_path())

#3. Not neccessary to remember most of lookup names and use comparison operators
instead.
#4. Possible to use multiple keyword arguments:

>>> Q.user.profile.last_login_date.in_range(from_date, to_date)
<Q: (AND: ('user__profile__last_login_date__lte', from_date),
('user__profile__last_login_date__gte', to_date))>


This approach looked the best for me due to several reasons:

1. It's explicit - it doesn't do anything but generating appropriate Q object. 
The result of comparison can be saved as Q object variable.

2. It's short - variants with using model for that will look much longer, when
joining two or more filters:

>>> GameSession.objects.user.profile_last_login_date >= yesterday  # awkward

3. Implementation will not require to change querset manager or model classes

4. Will still allow to use filters and Q class in the old way:

>>> q = Q(user__profile__last_login_date__gte=yesterday)

or

>>> GameSession.objects.filter(user__profile__last_login_date__gte=yesterday)

I'd like to make it as a part of Django ORM syntax and it will not be hard to
do, especially taking into account the library is already done and working. 
Anyway, I need your thought about the idea in general, as well as about
particular things like chosen method names - "get_path", "in_range" and etc.
As next step I can create a ticket in the issue tracker, or prepare DEP first.
In latter case I need to find a shepherd to work with.

Best regards,
Alexey

Josh Smeaton

unread,
Aug 16, 2015, 8:33:12 PM8/16/15
to Django developers (Contributions to Django itself)
Hi Alexey,

I find this approach really interesting, and I don't mind the API you've created at all. Even if this doesn't make it into Django, I think there's enough utility in your library that other people would want to use it. I'm not going to comment on specifics just yet like method names, because I figure that if we can agree on an approach, the bike-shedding can come later.

So first things first. I think we can all, or at least mostly, agree that having an alternative syntax for building filters is a good idea. String based lookups are cool, and won't go away, but sometimes you want to have greater control over what kind of filter you want to build without having to register global lookups and transforms. I see similarities with prefetch_related. You can provide a string based field lookup, or use a Prefetch() object to give users more control over the behaviour of the prefetch.

I've been working on a patch (https://github.com/django/django/pull/5090) to unify Transforms and Expressions (Func specifically). The work behind that can be (and should be) extended to include Lookups -- making them fully fledged expression objects. That work will need to happen regardless. Once this patch is finished and lands, then we should be able to extend .filter() and .exclude() to allow expressions to be composed as filters:

GameSession.objects.filter(GreaterThan('user__profile__last_login_date', yesterday))
# or, if last_login was a datetime, and we wanted to compare the date part only
GameSession.objects.filter(Equal(Date('user__profile__last_login'), datetime.now().date))
# or if we wanted to implement __gt__ and __eq__ etc:
GameSession.objects.filter(Date('user__profile__last_login') == datetime.now().date))

Loic and Anssi have also been working on alternative syntaxes (https://github.com/django/django/pull/4953):

GameSession.objects.filter(E('user__profile__last_login').date() == datetime.now().date)


Article.objects.filter(E('headline').collate('fi') == 'Article1')

Both of these approaches still suffer from "the string problem" that you're trying to address, but minimises the final component by extracting the lookup and transform components as objects instead. So I think your idea here could nicely coexist:

GameSession.objects.filter(Equal(Date(Q.user.profile.last_login), datetime.now().date))
GameSession
.objects.filter(E(Q.
user.profile.last_login).date() == datetime.now().date)

Or, even building this into F expressions rather than Q expressions:

GameSession.objects.filter(Equal(Date(F.user.profile.last_login), datetime.now().date))
GameSession
.objects.filter(E(F.
user.profile.last_login).date() == datetime.now().date)

I think it may look better with F objects, considering they are Field references, and since the Lookups (Equal/GTE) parts accept F expressions anyway. I'm not too concerned about this particular detail though.

A DEP is probably the right way to go here but I wonder if we should expand the scope to include alternative filter syntax as a whole (expressions/callable transforms) as well as the dot field reference notation you've proposed above. Then we can consider how all these things might work together, and clearly document why we've gone one way and not another. Obviously, alternatives can exist outside of core where the API is available.

I'll be happy to work as the shepherd if needed. But I'd also like some input from Loic and Anssi especially, as well as others in the core team with some interest in the ORM.

Regards,

Anssi Kääriäinen

unread,
Aug 17, 2015, 5:26:45 AM8/17/15
to django-d...@googlegroups.com
I like this idea, too.

The work I and Loic have been doing has been about transforms that can
accept arguments. I don't see why the two styles couldn't be mixed
directly.

.filter(Q.user.profile.last_login.date(timezone='Europe/Helsinki') <
'2015-01-01')
.filter(Q.user.last_name.collate('fi').unaccent() == 'Kaariainen')

OTOH if you want to use expressions directly, that should be OK, too
(if you don't mind the non-optimal readability):
.filter(Exact(unaccent(collate(Q.user.last_name, 'fi'))), 'Kaariainen'))

All in all, +1 to working towards these goals. I'm not yet convinced
we need all this in Django core directly. Maybe we should first try
these ideas in 3rd party apps, and then pick a proven solution into
Django core?

- Anssi
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/db4003cb-f366-430c-92c1-8388ff35804b%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.

Alexey Zankevich

unread,
Aug 18, 2015, 8:00:17 AM8/18/15
to Django developers (Contributions to Django itself)
Hi all,

Thanks for detailed response. I thought over the described expressions/
transforms patches and here are my thoughts about the best way to
implement simplified lookups.

Firstly, I want to describe which properties of the new syntax seem to be
important:

1. Using Python operators to do basic lookups as it's a natural way in Python
for that.

2. Syntax should be minimalistic for basic lookups, the use of that approach
will be more noticeable on multiple filter conditions. In other words, next
syntax is better:

>>> GameSession.objects.filter(Q.user.profile.last_login.date() == datetime.now().date)

than this one

>>> GameSession.objects.filter(E(F.user.profile.last_login).date() == datetime.now().date)

as it requires additional calls, which makes expressions less readable.

3. I definitely like the idea of having explicit classes for lookups and
transforms and think it's worth to bundle dotted query syntax with the patch.
Explicit classes will separate functionality and simplify functions signature
checks.

I suggest a mixed approach, with next properties.

1. We will have a separate class to define query paths (let's call it P, we
can still use F as "field", but having F as multipurpose class may confuse
users, as well as implementation may become more complicated). And it will be
the only purpose of the class. Below I'll reference it as "P" no matter we call
it in future.

2. Any transforms and lookups will take query string or P class, as well as
existing methods "select_related", "prefetch_related" and "order_by". The
simplest implementation will be overriding __str__ method of the path class
to generate related strings.

>>> path = P.user.last_login_date
>>> GameSession.objects.all().order_by(path)[:10]

>>> print(path)
user__last_login_date

3. Implement transforms and lookups as classes or functions (not bound to P class):

>>> GameSession.objects.filter(Unaccent(P.user.location.name) == "Cote d'Ivoire")

It will simplify cases with parametrized transforms (ex. mentioned collate('fi')).
Also eliminate fields collision with util functions like "date", which may be a
so-called field.

4. Below is a table describing accepted passed and returned parameters:

+-------------------+---------------------------+----------------------+
|  Class/Function   |    Allowed Param Types    | Comparison Operators |
+-------------------+---------------------------+----------------------+
| Transform         | str, P, Transform, Lookup | Return lookup        |
| Lookup            | str, P, Transform         | Raise exception      |
| P                 |                           | Return lookup        |
| .order_by         | str, P                    |                      |
| .select_related   | str, P                    |                      |
| .prefetch_related | str, P, Prefetch          |                      |
+-------------------+---------------------------+----------------------+


Samples:

>>> P.user.name == 'Bob'
Equal('user__name', 'Bob')

>>> Unaccent(P.user.name)
Unaccent('user__name')

>>> Collate(P.user.name, 'fi')
Collate('user__name', 'fi')

>>> Unaccent(P.user.name) == 'Bob'
Equal(Unaccent('user__name'), 'Bob')

>>> Equal(P.user.name, 'Bob') == 'Bob'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: Lookups comparison is not allowed

>>> Contains(P.user.name, 'Bo')  # lookup
>>> Date(P.user.last_login_date, datetime.now().date)  # transform


Questions to discuss and possible pitfalls:

1. Handling descended ordering in order_by

>>> path = P.user.last_login_date
>>> -path
'-user__last_login_date'

or even

>>> -p
NegP('user__last_login_date')

Is it possible path subtraction is required in future by any reason? In that
case the given approach will be inconvenient. Or may be better to just let
users to format desc ordering themselves?

>>> '-{}'.format(path)

2. Don't think it's a big deal, but users might do next thing:

>>> path = P.user.last_login_date
>>> GameSession.objects.filter(path=today)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
FieldError: Cannot resolve keyword 'path' into field.

At least need to mention it in documentation

3. Classes vs factory functions:

>>> Equal(P.user.name, 'Bob')
Equal('user__name', 'Bob')

vs

>>> equal(P.user.name, 'Bob')
Equal('user__name', 'Bob')


If you are fine with described syntax I can start with DEP.

Regards,
Alexey

Collin Anderson

unread,
Aug 18, 2015, 10:54:53 AM8/18/15
to Django developers (Contributions to Django itself)
Just a quick thought: I could imagine some newbies could get confused by the python-like syntax and try to do:

Equal(P.user.get_full_name(), 'Bob Someone')

I don't think it's not a huge deal, but worth noting.

Anssi Kääriäinen

unread,
Aug 18, 2015, 12:49:57 PM8/18/15
to django-d...@googlegroups.com
I'm still thinking we shouldn't integrate any new query syntax into
1.9. Instead lets make it easy to create 3rd party apps that offer
different querying syntax, and then lets see which solution (if any)
gets popular enough to be integrated into Django.

- Anssi
> https://groups.google.com/d/msgid/django-developers/1aeba72e-a34a-4b92-8f74-55af0da2671b%40googlegroups.com.

Marc Tamlyn

unread,
Aug 18, 2015, 12:52:28 PM8/18/15
to django-d...@googlegroups.com

I strongly agree with the third party approach.

Michael Manfre

unread,
Aug 18, 2015, 12:54:48 PM8/18/15
to django-d...@googlegroups.com
+1 for making it doable for 3rd party apps.

Regards,
Michael Manfre


For more options, visit https://groups.google.com/d/optout.



--
GPG Fingerprint: 74DE D158 BAD0 EDF8

Alexey Zankevich

unread,
Aug 18, 2015, 2:55:21 PM8/18/15
to Django developers (Contributions to Django itself)
Once Josh completes this patch https://github.com/django/django/pull/5090
(.filter method accepting class-based lookups and transforms), it will be
almost everything required by third-party apps. Is it going to be a part of
Django 1.9, by the way?

Additionally, for pure flexibility, next method and classes need to accept
either a callable or an object supporting special path interface (for example,
just a single method get_path() returning string path).
They listed below:

1. The current methods.select_related, .prefetch_related, .order_by
2. Annotation classes Min, Max, Count, Sum
3. Future transforms and lookups classes


Examples:

>>> path = lambda x: 'user__last_login_date'
>>> GameSession.objects.all().order_by(path)

or

>>> class LoginDatePath(object):
...     def get_path(self):
...         return 'user__last_login_date'
>>> path = LoginDatePath()
>>> GameSession.objects.all().order_by(path)

Path generation is a critical part of external query syntax as it used in
almost all aspects of the ORM, meanwhile each related method accepts either a
string or (in some cases) a specific kind of class - OrderBy for order_by
method, Prefetch for prefetch_related and etc.

Josh Smeaton

unread,
Aug 19, 2015, 3:49:07 AM8/19/15
to Django developers (Contributions to Django itself)
If I finish the patch in time (I think I have about a month left), then it'll be included in 1.9. Review and comments on the PR will go a long way to helping me tidy it up sooner rather than later, so please feel free to review.

Regards,
...

Alexey Zankevich

unread,
Aug 20, 2015, 10:16:45 AM8/20/15
to Django developers (Contributions to Django itself)
What about the idea to add interface to specify paths with special class or callable?

Josh Smeaton

unread,
Aug 20, 2015, 9:26:34 PM8/20/15
to Django developers (Contributions to Django itself)
Most expressions already support strings, and most that support strings expect them to be field_paths that can be wrapped in F() expressions. So you have two options really, without built in support from other expressions.

1. P.field.other.some_field() returns an F()
2. P.field.other.some_field() returns a str that will be internally wrapped in an F().

The callable part of your syntax is just something I'm suggesting as an option. The requirement is that a string or F() be returned in some way though. I'm also using "P." as a stand-in as you used previously.

Supporting callables is an interesting idea, and I don't think that'll cause issues with existing expressions. The callable must return an expression though. If you wanted to write a patch to support callables, I'd be onboard to review ASAP.

Cheers

Alexey Zankevich

unread,
Aug 24, 2015, 2:13:32 PM8/24/15
to Django developers (Contributions to Django itself)
Hey Josh,

Here is the current state of F support by different functions:

* annotations (Count, Sum etc) - work
* order_by - works partially (only asc order supported)
* select_related, prefetch_related - don't work

So, F is not a universal interface for getting paths right now. Also, there is another problem with F - it overloads python operators already, so if we overload unary operation "-F", it will confuse users as in one case it will return CombinedExpression (ex. 0-F('fieldname')), in the other - OrderBy (ex. -F('fieldname')).
Since F originally designed to describe arithmetic upon fields during filtering, it will not be a perfect match to denote query paths.
As for accepting callable as a query path factory, I don't think it's a perfect match as well, as callable interface is too general, at least to be an unnamed argument.
For example, fields.CharField(default=callable) is a clear approach, as it's obvious what was implied. 
ModelClass.objects.all().order_by(callable) - still clear, but YearGte(callable, 40) isn't (since I'm talking about general interface to specify query paths in the whole project).
So, I suggest a separate class for that purposes, and it doesn't necessarily need to have short name like "P". Even better if it has meaningful name like "TreePath" or "FieldTree".

...

Josh Smeaton

unread,
Aug 24, 2015, 7:48:10 PM8/24/15
to Django developers (Contributions to Django itself)
Expressions have built in support for ordering with asc or desc, see here: https://docs.djangoproject.com/en/1.8/ref/models/expressions/#django.db.models.Expression.asc

Meaning that -F() is never ambiguous, which is the main reason we went for the F().desc() API rather than using the minus sign as an indicator of ordering.

To use the `P` example in an order by statement:

Model.objects.order_by(P.related.date_field().desc())

I've "called" date_field here, but that's just assuming an API. As long as an expression is being returned, you can call `desc` on it.

What some other ORMs do is reference fields directly on the model, though I'm not sure I like the verboseness (but it gives you safety):

Model.objects.order_by(Model.related.date_field.desc())

The extra `Model` is annoying, but it does mean that you can actually look up fields as you go rather than just crafting an `F()` that is eventually resolved into the `Col()`. This would also allow us to return expressions directly. Not sure how or if this API would work with the current metaclasses though.

For what it's worth, the names `TreePath` and `FieldTree` are too low level for a public facing API that is supposed to make it easier/safer to craft field paths. One letter class names are also a bad design in general, so we'd need something half way between both if going for a new class.

...

Alexey Zankevich

unread,
Aug 25, 2015, 6:12:56 AM8/25/15
to Django developers (Contributions to Django itself)
Gotcha.
So, F objects seem to be pretty good as generic interface to specify query paths, the only remaining patch - adding select_related and prefetch_related support. I'll preapare a preliminary pull request soon.
...

Alexey Zankevich

unread,
Aug 26, 2015, 11:04:02 AM8/26/15
to Django developers (Contributions to Django itself)
The patch allowing F objects for select_related and prefetch_related methods was trivial, unfortunately F uses "name" field as a part of Expression interface, so dotted querying will conflict with internal field (as "name" is quite a popular name for model field).

>>> Contains(F.user.name, "Bob")   # conflict
Will work on another patch with custom class.
...

Alexey Zankevich

unread,
Sep 22, 2015, 10:09:30 AM9/22/15
to Django developers (Contributions to Django itself)
Hi Josh,

As https://github.com/django/django/pull/5090 pull request merged into master, I wanted to extend django-orm-sugar library with some functionality, related to passing transforms or lookup objects. Currently it's not clear to me how lookups or transforms can be used in that way.
I'm talking about mentioned earlier syntax:

GameSession.objects.filter(GreaterThan('user__profile__last_login_date', yesterday))

Could you give me a starter point for that?

Thanks,
Alexey
...

Alexey Zankevich

unread,
Sep 29, 2015, 2:49:54 PM9/29/15
to Django developers (Contributions to Django itself)
Here is a list of issues to solve to support explicit transforms and lookups by filter (and exclude) methods.

1. Make Lookup.__init__ signature to support initialization with F objects or string path (e.g. GreaterThan(F('user__id'), 10) or GreaterThan('user__id', 10)), not sure it's possible to use simultaneously with the current approach with lhs, rhs initialization (even with moving it to a separate class method, e.g Lookup.build(lhs, rhs)), so I assume creating so-called util classes which will delegate SQL-related functionality to existing Lookup classes.

2. Chain transforms by passing them as argument:

Lower(Unaccent(F('user__name)))

3. Decide if Q objects shall support explicit lookups/transforms as argument as well - it's a kind of logical step, as without Q objects it will not be possible to perform complicated conditions (AND, OR, NOT).
In that case lookup/transform parsing should be moved from QuerySet object to Q object - filter will take already parsed lookup tree.
Example:

Q(user__name__lower__unaccent__icontains='Bob') will internally parse it and build next structure:

Q(Icontains(Lower(Unaccent(F('user__name')))), 'Bob')

Josh Smeaton

unread,
Sep 29, 2015, 8:46:35 PM9/29/15
to Django developers (Contributions to Django itself)
1. Lookups should become Expressions, just as Transforms have become Expressions. This will let us process Lookup arguments as Expressions all the way the way through. I think this should be a major goal for version 1.11.

2. Chaining transforms is now possible since they are just Func expressions. Func(Func(Func('field_name'))) is no issue.

3. Sounds like an OK idea, but I haven't looked into the details enough to really comment. I do think we should create the correct form as early as possible (parsing into a chain of Lookup/Transform expressions) so we don't have to do parsing in multiple places. The entry points to .filter() and .exclude(), or their direct counterparts in sql.query sound ideal. Anssi has mentioned elsewhere that WhereNode's should only contain fully resolved expressions, so resolving will need to be done directly after parsing (or during).

Part 1 above can be started now if you have the time or interest. We can nail down the particulars of part 3 while we're solving part 1. Part 1 may drive some of part 3.

Cheers

Anssi Kääriäinen

unread,
Sep 30, 2015, 12:33:14 AM9/30/15
to django-d...@googlegroups.com
On the core ORM side we need to make
.exclude(LessThan(F('friends__age'), 30)) do a subquery. This way
.exclude(friends__age__lt=30) does the same thing as the expression
version. This isn't that easy to do. If we just use
resolve_expression, then the friends relation will generate a join,
and then as second step do a negated filter on the joined value.
Instead we want to detect that the LessThan expression needs to be
pushed in to a subquery.

So, we need to solve:

A) A way to ask an expression if it is referencing a multijoin
(possible approach is to just have a method
"refs_multi_valued_relation(query)")
B) When the ORM sees an expression that is reffing a multijoin in an
exclude filter, then we need to push the expression in to a subquery.

A) requires some new work. This shouldn't be that hard to implement,
we just recursively ask subexpressions if they reference a multijoin.

Something like https://github.com/django/django/pull/4385 will make B)
much easier to implement.

I've been working on making Q-objects responsible for resolving
themselves. See https://github.com/django/django/pull/4801. This
should solve 3).

We don't seem to be missing any major parts. I (or some volunteer)
just need to finish the PRs, and then we should be really close to
full support for expressions in filter.

Josh: do you think we could get Lookup as expressions in to 1.10
instead of 1.11?

- Anssi
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/3bad8371-f9b4-47ff-a681-0108b320e9b5%40googlegroups.com.

Josh Smeaton

unread,
Sep 30, 2015, 1:31:24 AM9/30/15
to Django developers (Contributions to Django itself)
The alpha for 1.10 has already been cut, and I'm not sure that the kinds of changes needed here are appropriate to add now that the alpha is out. One could *maybe* make the argument that changing Lookup to an Expression now rather than later is the right move considering Transforms just underwent the same change for 1.10. Personally though, I don't think I have the time right now to do this change. I would support you if you were able, but we'd still be at the mercy of the technical board (I assume) for getting this change in for 1.10.

Do you think Lookup as Expressions requires the subquery/exclude fix you mention above? I would think not -- not until we were ready to document and support .filter(Lookup(F(), Value()). If it wasn't a requirement, it'd make the Lookup->Expression work much easier. It wouldn't even need to be documented (other than the release notes), as it'd just be an implementation change.

Cheers,

Josh Smeaton

unread,
Sep 30, 2015, 1:46:00 AM9/30/15
to Django developers (Contributions to Django itself)
I'm mixing my versions, sorry to those following along. 1.9 has just reached alpha. Lookups as Expressions should be doable for 1.10 which master is currently tracking.

Cheers

Anssi Kääriäinen

unread,
Sep 30, 2015, 2:19:51 AM9/30/15
to django-d...@googlegroups.com
I don't think we need split-to-subq support for Lookups before we make
them expressions. Lookups as expressions are usable outside .filter(),
and we need the split-to-subq support only in .filter(expression).

- Anssi
> https://groups.google.com/d/msgid/django-developers/5e7ee87b-6253-420b-9688-51c39d657cab%40googlegroups.com.

Alexey Zankevich

unread,
Sep 30, 2015, 12:13:41 PM9/30/15
to Django developers (Contributions to Django itself)
I'll try to turn lookups into expression (will use master branch).
I also have a question. Meanwhile the old query syntax with underscores is pretty good for simple queries (like Model.objects.filter(name='Bob'), it gets really ugly for parametrized calls, a new JSONField is a good example of that:

>>> Dog.objects.filter(data__owner__other_pets__0__name='Fishy')
[<Dog: Rufus>]

It will get even more messy if we want to pass a string as a second param of the func.

ex.:

1. filter(Decode(F('name'), 'utf-8'), Value(u'Бармаглот'))   # <- neat
2. filter(name__decode__utf8=u'Бармаглот')  # <- ?? ambiguous and not nice at all

So question - is it implied all the funcs, transforms and lookups to have underscore-based equivalent? It can affect the final implementation pretty much. In my opinion underscore-based equivalent should not be really required for funcs (the problem doesn't seem to affect transforms as they will not accept multiple params according to another thread).

Thanks,
Alexey

Marc Tamlyn

unread,
Sep 30, 2015, 2:06:57 PM9/30/15
to django-d...@googlegroups.com

I don't think complex cases need to have __ equivalents, but I also dispute that long chains like that are necessarily a bad API, I don't find your option 1 particularly neat compared to option 2. Worth noting though that you've used a 2-valued function there which is not a lookup in the same sense.

I think data__owner__other_pets__0__name is as nice to read as JSONExtract('data', path=['owner', 'other_pets', 0, 'name']) personally.

Marc

Alexey Zankevich

unread,
Sep 30, 2015, 2:47:27 PM9/30/15
to Django developers (Contributions to Django itself)
Hi Mark,

I can explain why long string is a bad design (other than anything I described earlier). In fact it assumes that the additional parameters can be serialized into alphanumeric string, which apply additional and unnecessary limitation. In my example above, it's not clear how to pass "utf-8" encoding, which contains a hyphen.
data__owner__other_pets__0__name stops working even if I want to have "other pets"  (let's say it's user-defined category).
Now about getitem operation - for JSONField it's no problem, as JSON itself doesn't support numeric-based keys in dictionaries, but Python does. So, it will not be possible to distinguish numeric zero from "0" key in PickleField, if I need to implement one by some reason.

Josh Smeaton

unread,
Sep 30, 2015, 9:00:41 PM9/30/15
to Django developers (Contributions to Django itself)
No, not all Lookups or Transforms are required to handle __underscore__ syntax. The entire point of supporting object based lookups is to handle cases that get more complex than a single argument transform or a left and right hand side lookup.

In particular, I think your Decode(utf8) example is a good one. It shows that you could maybe shoehorn multiple arg transforms into the __underscore__ api, but it wouldn't be entirely obvious how to do so. You'd probably need to register partial transformations for each encoding you wanted to support. The contrib.postgres module has (from memory) KeyTransform classes that do a similar thing.

Cheers

Anssi Kääriäinen

unread,
Oct 1, 2015, 2:38:18 AM10/1/15
to django-d...@googlegroups.com
+1 to not requiring all transforms to handle __underscore__ syntax.

I think what we want to do is allow users choose which syntax they
prefer. The idea is that Django will support both JSONExtract('data',
path=['owner', 'other_pets', 0, 'name']) and
data__owner__other_pets__0__name out of the box.

We will also make it possible to support callable transforms. For example:
filter(Exact(Decode(F('name'), 'utf-8'), Value(u'Бармаглот')))
is equivalent to
filter(F('name').decode('utf-8') == Value(u'Бармаглот'))

The callable transforms syntax will not be a part of Django, but it
will be possible to create an extension for this (it is actually
surprisingly easy to do once we have support for expressions in
filter).

- Anssi
> https://groups.google.com/d/msgid/django-developers/b97ea866-514c-4d03-bf60-e46a06e5fc5a%40googlegroups.com.

Loïc Bistuer

unread,
Oct 1, 2015, 3:18:40 AM10/1/15
to django-d...@googlegroups.com
> On Oct 1, 2015, at 13:38, Anssi Kääriäinen <akaa...@gmail.com> wrote:
>
> +1 to not requiring all transforms to handle __underscore__ syntax.

+1

> I think what we want to do is allow users choose which syntax they
> prefer. The idea is that Django will support both JSONExtract('data',
> path=['owner', 'other_pets', 0, 'name']) and
> data__owner__other_pets__0__name out of the box.
>
> We will also make it possible to support callable transforms. For example:
> filter(Exact(Decode(F('name'), 'utf-8'), Value(u'Бармаглот')))
> is equivalent to
> filter(F('name').decode('utf-8') == Value(u'Бармаглот'))
>
> The callable transforms syntax will not be a part of Django, but it
> will be possible to create an extension for this (it is actually
> surprisingly easy to do once we have support for expressions in
> filter).

I'm pretty convinced we need a better API / sugar as part of core in the long run.

The `filter(Exact(Decode(F('name'), 'utf-8'), Value(u'Бармаглот')))` syntax is not pythonic and very hard to decipher, and we've reached the limits of our historical double underscore syntax (doesn't support multiple arguments, limited to ascii, etc.).

I think the `F('name').decode('utf-8')` syntax is a good candidate for core:

- It's a small extension to the existing F objects, so it's easy to grasp for existing Django users.
- It's just a thin sugar on top of the canonical API.
- All the required machinery being already in place (lookup & transform registration) the change to Django is minimal.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CALMtK1H8OGkXJys4Z-XiZ9kFspZ_naqNS%2BbOigsHCK40Snhj1g%40mail.gmail.com.

Alexey Zankevich

unread,
Oct 1, 2015, 3:49:27 AM10/1/15
to Django developers (Contributions to Django itself)
> I think the `F('name').decode('utf-8')` syntax is a good candidate for core.

When Django supports expression lookups, it will be possible to build syntax sugar third-party library on top of it.
For example, it will be possible to implement dotted syntax for F objects:

UserModel.objects.filter(Decode(F.user.name, 'utf-8') == u'Бармаглот')

alternatively it can be even combined with your syntax

UserModel.objects.filter(F.user.name.decode('utf-8') == u'Бармаглот')

However, the second solution will be more complicated from technical point of view (as will require lazy object, collecting all the func or transform calls), also it will require to be really careful with transform names not to conflict with possible field names.
For example, it's hardly possible to have "decode" field in the model, but easily - "date".

Anssi Kääriäinen

unread,
Oct 1, 2015, 4:38:39 AM10/1/15
to django-d...@googlegroups.com
Lets concentrate on making expressions in filter calls and lookups as
expressions reality. When we have done that we can try out different
syntax approaches in 3rd party apps. Finally, after we have field
tested the approaches, we can merge something in to Django.

- Anssi
> https://groups.google.com/d/msgid/django-developers/9ea58a11-ce31-4d80-be45-913dff947878%40googlegroups.com.

Alexey Zankevich

unread,
Oct 19, 2015, 6:49:37 AM10/19/15
to Django developers (Contributions to Django itself)
Hi all,

Here is a raw pull request, allowing lookup instances passing to filter method 
There are two new test cases working:

1. Passing lookup with F object: Article.objects.filter(GreaterThan(F('id'), Value(5)))
2. And passing lookup with transforms: Article.objects.filter(Contains(Upper(Lower(Upper(F('headline')))), 'ARTICLE 5'))

All the existing tests are passing successfully, by the way (at least for SQLite backend).

Several comments:

  • F expressions became real expressions (Expression subclass).
  • Expression got a new method - get_source_f_object, which walks through children classes and find if it has F object passed. In that case output_field-related code should be postponed, until compiler resolves it (which cannot be done during lookup init time).
  • some code related to query.build_filter method (responsible to lookups/transform parsing, and calculating joins), extracted into separate helper classes - QueryKeywordLookupHelper and QueryObjectLookupHelper. They have common code base, but some of methods work in a different way.
Except review, it would be really great if someone help with writing test cases (and I'll get them working in case of any issues). Also, there is a workaround field=None -> isnull=True for oracle database, can anybody check if it is working correctly?

Regards,
Alexey

Josh Smeaton

unread,
Oct 19, 2015, 7:09:14 AM10/19/15
to Django developers (Contributions to Django itself)
I think you forgot the PR link Alexey. I'd like to have a look at the changes you've made. I'm not sure about the extensions to F() just yet, I'm hoping they aren't actually needed for your patch to work, but will need to review to be sure.

Cheers

Asif Saifuddin

unread,
Oct 19, 2015, 7:18:16 AM10/19/15
to Django developers (Contributions to Django itself)
...

Alexey Zankevich

unread,
Oct 19, 2015, 7:21:56 AM10/19/15
to Django developers (Contributions to Django itself)
Here is a link https://github.com/django/django/pull/5443, sorry :)

Making F object as expression wasn't required, but I still had to create FieldExpression wrapper, which wrapped F object. Eventually, I thought it would confuse people, especially taking in account F objects referenced in doc as "F Expression". So, I removed FieldExpression wrapper and made F object as real expression.
...

Anssi Kääriäinen

unread,
Oct 19, 2015, 7:28:42 AM10/19/15
to django-d...@googlegroups.com
I did similar work in https://github.com/django/django/pull/4801. I
aimed for 1.9, but had to defer the work due to problems with
.exclude() filtering.

The problem is that if you query with
.exclude(LessThan(F('friends__age'), 30)), you'll get different
results than with .exclude(friends__age__lt=30). The latter does a
subquery while the expression version isn't capable of that.
Unfortunately, fixing this is a bit nasty. I don't see any other way
than some solution like https://github.com/django/django/pull/4385 -
that is we prewalk the expression tree and decide which parts need to
be pushed in to a subquery. This should work really nice with
expressions, too.

I haven't taken a closer look on your approach, so I can't say which
one we should use as basis for future work once we have solved the
subquery problem.

- Anssi
> https://groups.google.com/d/msgid/django-developers/17bc7731-8668-4652-a4d3-442fcba48fe1%40googlegroups.com.

Alexey Zankevich

unread,
Oct 19, 2015, 7:47:05 AM10/19/15
to Django developers (Contributions to Django itself)
Hey Anssi,

Just pushed test_lookup_exclude and test_lookup_exclude2, which passed successfully. Can you please review if they cover exactly described case?

Thanks,
Alexey

Alexey Zankevich

unread,
Oct 6, 2016, 1:04:56 PM10/6/16
to Django developers (Contributions to Django itself)
Hey all,

Just want to announce recent changes in Django ORM Sugar library, which might be useful in future Django query syntax enhancement (if ever happens).
1. Now it supports indexes and slices:

>>> Q.data.owner.other_pets[0].name='Fishy'
Q(data__owner__other_pets__0__name='Fishy')

>>> Q.tags[0] == 'thoughts'
Q(tags__0='thoughts')

>>> Q.tags[0:2].contains(['thoughts']) 
Q(tags__0_2__contains=['thoughts'])

2. Previously, it was only possible to call defined method (like it is done for is_not_null() function) or registered with decorator. Now if you try to call unknown function of sugar Q, it will internally pass function name and append it as __functionname to lookup field:

>>> Q.user.username.contains('Rodriguez')
Q(user__username__contains='Rodriguez')

There is no such function "contains" anymore in sugar Q, however, it just adds __contains to lookup field and passes parameter to it.

3. It will pass a tuple to lookup if multiple arguments passed:
>>> Q.user.create_datetime.range(d1, d2)
Q(user__create_datetime__range=(d1, d2))

I think the library is at the final state now and isn't going to get new substantial features, but responses are highly appreciated.

Regards,
Alexey

Robert Roskam

unread,
Oct 7, 2016, 11:20:29 AM10/7/16
to Django developers (Contributions to Django itself)
+1 from me. I really like this approach to help making my queries more DRY.

I also like how lightweight the library is altogether. Well done!


For those looking for a direct link, here you go: https://github.com/Nepherhotep/django-orm-sugar

Anssi Kääriäinen

unread,
Oct 12, 2016, 8:42:00 AM10/12/16
to Django developers (Contributions to Django itself)
+1 from me, too.

I still think we should be able to get into a point where you can run queries like:
    Q.user.firstname.collate('fi').lower() == 'anssi'

So, not only can you call regular transforms (like lower), but also transforms that take arguments. Other use cases are for example substring:
    Q.user.firstname.substring(from=5) == 'akaariai'

And of course, this would be super useful for things like drilling into JSON data.

 - Anssi

Alexey Zankevich

unread,
Oct 12, 2016, 2:24:07 PM10/12/16
to django-d...@googlegroups.com
Agreed about concatenation functions calls. It would be easy to achieve once filter will accept object-based lookups (previously we discussed expression objects).
I think it would be even better to accept objects of expression builder instead (versus expressions directly) - it will separate builder interface from combinable expressions:

# interface
>>> class ExpressionBuilder:
...    def get_expression(self):
...         pass

# calling get_expression function of the instance below
# should raise error as right hand expression is not created yet
>>> Builder.user.first_name.lower()
<ExpressionBuilder object at 0x7f3605145080>

# calling get_esxpression method of the instance below
# will return a full expression tree 
>>> Builder.user.first_name.lower().collate('fi') == 'anssi'
<ExpressionBuilder object at 0x7f3604f96400>

# query itself
qs = MyModel.objects.filter(ThirdParty.user.first_name.collate('fi').lower() == 'anssi')

It's not a huge difference, but will make it easier to extend. Do you think it's worth to put as DEP?

Regards,
Alexey


--
You received this message because you are subscribed to a topic in the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/django-developers/W0OYXhavY68/unsubscribe.
To unsubscribe from this group and all its topics, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/b0e30b7f-524e-4cbf-9bba-37cbb8425edc%40googlegroups.com.

Aric Coady

unread,
Oct 16, 2016, 6:38:26 PM10/16/16
to Django developers (Contributions to Django itself)
+1.  I implemented much the same thing for part of django-model-values, but went with F expressions as the basis instead.  Primarily because F expressions already support operator overloading and are a natural intermediate object;  from there one can create Q, OrderBy, and Func objects.

In []: from model_values import F

In []: F.user.created
Out[]: FExpr(user__created)

In []: F.user.created >= 0
Out[]: <Q: (AND: ('user__created__gte', 0))>

In []: F.user.created.min()
Out[]: Min(FExpr(user__created))

In []: -F.user.created
Out[]: OrderBy(FExpr(user__created), descending=True)

In []: F.text.iexact('...')
Out[]: <Q: (AND: ('text__iexact', '...'))>

Asif Saifuddin

unread,
Mar 22, 2017, 3:07:51 AM3/22/17
to Django developers (Contributions to Django itself)
Hi Aric,

I checked your package. it's nice too. thanks for the work.

Asif

Alexey Zankevich

unread,
Apr 1, 2017, 2:55:27 AM4/1/17
to Django developers (Contributions to Django itself)
Hey all,

Please check a draft DEP related to external query language support by Django ORM https://github.com/django/deps/pull/40

Regards,
Alexey

Asif Saif Uddin

unread,
Oct 6, 2021, 5:13:23 AM10/6/21
to Django developers (Contributions to Django itself)
Hey all,

can we have some consensus on this?

Asif

Adam Johnson

unread,
Oct 6, 2021, 5:55:58 AM10/6/21
to django-d...@googlegroups.com
I would not be for merging anything into Django at this time.

There are several libraries providing "enhanced" query syntax: django-orm-sugar, django-model-values, and django-natural-query. None of them seems to be particularly popular (<100 github stars, minimal PyPI downloads), and only django-model-values is maintained. Moreover they all achieve their goals without any changes to Django, so there's not much of a compelling reason to merge anything to core.

To consider adding "another way to do things" in the ORM we'd want more evidence that users want it. I'd suggest further contributions and advocacy for django-model-values or similar. Its documentation could definitely be improved, and blog posts could promote its advantages.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.

To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.

Asif Saif Uddin

unread,
Oct 10, 2021, 2:08:20 AM10/10/21
to Django developers (Contributions to Django itself)

Hi Adam,

I agree with some of your points. however djngo orm query syntax is the main pain point for anyone new to django orm.  The reason those packages are not widely used is because most people dont know about them and not about this DEP. And most new users mainly end up learning built in query syntax. not many know much about these Deps and orm syntactic sugars.

If some movement were seen in this front and advocacy then probably there will be more attention to syntactic sugar packages. The orm sugar package is still works in production with django orm as well. So it would be great to see the Dep could be improved based on the both packages good parts.

I think I should try to  write some blogs as well using those packages.

Best,
Asif
Reply all
Reply to author
Forward
0 new messages