Support for function application in ORDER BY

391 views
Skip to first unread message

Tim Martin

unread,
May 31, 2014, 7:22:14 AM5/31/14
to django-d...@googlegroups.com
Hi all,

I was looking at implementing the behaviour proposed in #13006 (https://code.djangoproject.com/ticket/13006). In short, the idea is to allow decorating fields in order_by() with a LowerCase() modifier to apply LOWER() in SQL before ordering:

   Articles.objects.all().order_by(LowerCase('title'))

to yield

   ...
   ORDER BY LOWER("articles"."title")
   ...

Obviously it's sensible if we can generalise this to apply a wide range of SQL functions with the same infrastructure.

It's easy enough to make this work, but I'm having trouble with generalising the solution. The problem is that the argument to order_by() can be a compound expression like "-author__name" and we have to parse it out. Only the SQLCompiler object (I think) has the necessary information to correctly figure out the joins and turn this into valid SQL.

I can see a couple of different approaches:

1) Have these decorator classes be very simple containers and keep all the logic in the SQLCompiler. The compiler object will detect a FunctionCallDecorator, query it for the ordering string and combine them, something like:

   if isinstance(field, FunctionCallDecorator):
      table, cols, order = self.find_ordering_name(field.get_field(), ...)

      function = field.get_function()

      # other stuff....
      order_by = "%s(%s.%s)" % (function, qn(table), qn2(col))

I'm not that happy with this solution, since it binds the implementation of FunctionCallDecorator very closely to the workings of SQLCompiler and to ORDER BY in particular.

2) Make the argument to order_by into a tree structure that can be processed by the SQLCompiler using the visitor pattern. Leaf nodes (strings) get mapped into expressions by the SQLCompiler as normal, while internal nodes get to define their own logic for combining the output of their child nodes into an expression. The ASC / DESC direction is a bit awkward here, since it's meaningless to have two function arguments that order in different directions, for example.

This is more similar to how we do WHERE nodes, if I'm understanding the code correctly. This seems to better support multi-argument functions (e.g. COALESCE) which would be a horrible kludge in the first method, and to support multiple function application (e.g. LOWER(LTRIM(...))). However, it feels a bit over-engineered.

There are obviously lots of other variations here, but I figured I'd ask before I went too much further with the design process. Does anyone have any feelings about how best to approach this?

Tim

Marc Tamlyn

unread,
May 31, 2014, 8:01:19 AM5/31/14
to django-d...@googlegroups.com
With Custom transforms[1] for lookups, we ought to be using the same mechanisms in my opinion. They will also be used for functional indexes in contrib.postgres. We would need to add a __lower transform to all relevant fields of course. I am considering refactoring how the "standard" lookups work so that text-based lookups are only available on those fields which are string based, and there is a __text transform on all other fields. This certainly makes sense in postgres, I've not looked at other dbs.

The syntax would then become .order_by('title__lower'). I have to admit I'm not familiar with how we add ORDER BY clauses as yet, but ideally we would be using something similar to the query expression API[2] - i.e. something with an as_sql and an as_vendorname_sql.

If you've had a look at the code, does this kind of approach seem like it might work?

As an aside, I'd also like transforms to be available in a couple of other places - most notably in values() calls and F() objects. So although they are currently specific to .filter() and .exclude(), the concept does (and hopefully will in the future) translate.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/141c1486-ca35-497d-82f7-13ad83900b9d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Josh Smeaton

unread,
Jun 1, 2014, 4:46:02 AM6/1/14
to django-d...@googlegroups.com
I would like to point you towards a patch I'm currently working on which is essentially your option 2: https://github.com/django/django/pull/2496

It doesn't seem very relevant at first, since it only applies to .annotate and .aggregate, but the infrastructure behind it certainly does. In short, the patch expands on the concept of "Expressions". Expressions are nodes in a tree that output SQL. Functions can be expressions, a field (F() object) can be an expression, or the number 5 can be an expression. Funnily enough, I implemented the SQL `Lower` function as a test: https://github.com/jarshwah/django/commit/362fec1330f1f40c45ece5769f9e35047c87a010#diff-4c9f3e475cf7fa25e6d21c2fb7b8c30fR185

What I'd like to do if this patch is accepted (and I believe it will be for 1.8; it's a matter of getting the backwards-compat right), is to expand the concept of expressions into the other pieces of the API, especially `order_by`. Once order_by understands expressions, it's trivial to order the result set by any kind of sql you could imagine..

- .order_by(NullsLast('field'))
- .order_by(Lower('field'))
- .order_by(Coalesce(F('field'), F('otherfield')))
etc.

Marc is right that custom transforms could be used if order_by was taught the API. I would prefer to teach order_by about Expressions though, since F() supports custom lookups and transforms already. His version would then look like: .order_by(F('title__lower')). I can see the value in a keyword__based API that is similar to the rest of django (like .filter()), but I find the syntax limiting when trying to compose multiple expressions together.

If you were interested in branching off and working on bringing the same functionality to order_by, I'd be happy to provide information or support if you had any questions.

Regards,

Josh

Tim Martin

unread,
Jun 1, 2014, 5:01:09 AM6/1/14
to django-d...@googlegroups.com
Funnily enough, I'd already seen that patch but I hadn't figured out the full significance of it. I attempted to solve #22288 (https://code.djangoproject.com/ticket/22288), but spotted that your patch would possibly solve the problem entirely, and at the very least breaks my attempt to solve that specific case.

The expression concept seems like a good direction to be going in. I'd definitely be interested in seeing how to extend this to solve the two tickets I'm working on. I'll take a look at your code and see what I can do with it, and ask if I get stuck.

Tim

Tim Martin

unread,
Jun 8, 2014, 6:28:58 AM6/8/14
to django-d...@googlegroups.com
I've reworked my approach to ORDER BY to base it on Josh's patch, and it works very well. It's very straightforward to implement something like this:

Article.objects.order_by(LowerCase('title'))  --> ORDER BY LOWER("some_articles"."title")

This makes use of all the existing expression compiling logic, so there is no extra "ORDER BY" logic to worry about.

What I'm not sure about is how to handle sort order. We can either support:

Article.objects.order_by(LowerCase('-title'))

which will require some special cases in the ORDER BY compilation logic (possibly quite substantial), or we can support:

Article.objects.order_by(Desc(LowerCase('title'))

The latter is a lot easier to implement, and strikes me as a nicer API. However, it's a significant step away from the ordering API we have at the moment. What do people think?

Tim

Marc Tamlyn

unread,
Jun 8, 2014, 6:36:10 AM6/8/14
to django-d...@googlegroups.com
It might be nice (although possibly impractical) to have something like order_by(-LowerCase('title')) [using __neg__ on LowerCase]. That might put logic on the "wrong" objects though. In any case, LowerCase('-title') looks weird.

Naturally just passing in the string 'title' or '-title' should continue to work as before.

I've not looked at the code at all but this seems like a nice design.

Marc


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Josh Smeaton

unread,
Jun 8, 2014, 8:24:01 AM6/8/14
to django-d...@googlegroups.com
I'm glad it was easy to implement, I was hoping that would be the case. It also means that expressions are general enough that they can be used elsewhere in the code base. Thanks for following this through!

> It might be nice (although possibly impractical) to have something like order_by(-LowerCase('title')) [using __neg__ on LowerCase]. That might put logic on the "wrong" objects though

Ooh, that does look nice, and should be easy to implement. You can use brackets for order of precedence to ensure that the negation is applied to the correct object I guess.

> Article.objects.order_by(LowerCase('-title'))
> which will require some special cases in the ORDER BY compilation logic (possibly quite substantial), or we can support:
> Article.objects.order_by(Desc(LowerCase('title'))

I've thought about this previously, and what I had in mind was:

- Expressions should support an ordering parameter of some kind in the constructor, and save it to an ordering property. Default to ASC
- If we go with Marc's suggestion, the __neg__ can set the above property. This would look very consistent with the existing ordering api.
- F() objects should parse their field argument, and internally set their ordering attribute appropriately -> F('-title') should strip the `-` and set a DESC ordering.
- Since expressions can be nested, you can't return ASC/DESC from the as_sql() method, otherwise you'd end up with weirdness like: Coalesce(field_a ASC, field_b ASC) instead of Coalesce(field_a, field_b) ASC. Therefore, the order_by machinery should call a `get_ordering()` (or something named similarly) and generate the ASC/DESC token accordingly.

Usage:

Article.objects.order_by(LowerCase('title', ordering='-'))
Article.objects.order_by(-LowerCase('title')) # I prefer this form 
Article.objects.order_by('-title') and Article.objects.order_by(F('-title')) #  should be equivalent (a string argument should be converted to an F())

I'm unsure whether or not expressions should inspect their "children" for an ordering if one isn't supplied in the outer-level:

Article.objects.order_by(LowerCase('-title'))

Should the Lowercase try to resolve the ordering of `title` and use it as it's own? I don't think so, because it could lead to weird conflicts where multiple arguments define different ordering:

Article.objects.order_by(Coalesce('-title', '+description'))  # is ASC or DESC used? Possibly confusing

Thoughts? 

Tim Martin

unread,
Jun 8, 2014, 12:07:42 PM6/8/14
to django-d...@googlegroups.com

On Sunday, 8 June 2014 13:24:01 UTC+1, Josh Smeaton wrote:
I've thought about this previously, and what I had in mind was:

- Expressions should support an ordering parameter of some kind in the constructor, and save it to an ordering property. Default to ASC
- If we go with Marc's suggestion, the __neg__ can set the above property. This would look very consistent with the existing ordering api.
- F() objects should parse their field argument, and internally set their ordering attribute appropriately -> F('-title') should strip the `-` and set a DESC ordering.
- Since expressions can be nested, you can't return ASC/DESC from the as_sql() method, otherwise you'd end up with weirdness like: Coalesce(field_a ASC, field_b ASC) instead of Coalesce(field_a, field_b) ASC. Therefore, the order_by machinery should call a `get_ordering()` (or something named similarly) and generate the ASC/DESC token accordingly.

That sounds like a workable approach. If I'm understanding right, this means putting the ordering flag on the ExpressionNode class so that all nodes will have this whether or not they are to be used in an ordering context?

My only reservation about this when combined with __neg__ is that it means that certain mathematical expressions will behave oddly. For example, you might want to do something like

   Accounts.objects.filter(balance__lt=-F('overdraft_limit'))

but this will end up flipping the ordering rather than doing a mathematical negative. Is this dealt with by putting logic into the SQLEvaluator so that it knows whether it's evaluating in an ordering context or a SELECT or WHERE clause context and does the right thing?
 
Usage:

Article.objects.order_by(LowerCase('title', ordering='-'))
Article.objects.order_by(-LowerCase('title')) # I prefer this form 
Article.objects.order_by('-title') and Article.objects.order_by(F('-title')) #  should be equivalent (a string argument should be converted to an F())

I'm unsure whether or not expressions should inspect their "children" for an ordering if one isn't supplied in the outer-level:

Article.objects.order_by(LowerCase('-title'))

Should the Lowercase try to resolve the ordering of `title` and use it as it's own? I don't think so, because it could lead to weird conflicts where multiple arguments define different ordering:

Article.objects.order_by(Coalesce('-title', '+description'))  # is ASC or DESC used? Possibly confusing

Thoughts?


I think supporting this gives lots of potential problems without much in the way of gain. Even the "correct" form Coalesce('-title', '-description') just looks odd to me. The DESC modifier applies to the expression as a whole, not to any one field within it.

Tim

Marc Tamlyn

unread,
Jun 8, 2014, 12:56:18 PM6/8/14
to django-d...@googlegroups.com
In my opinion, Accounts.objects.filter(balance__lt=-F('overdraft_limit')) would not change the ordering like that. I guess that's an API that doesn't work at the moment anyway.

An alternative is to use ~ instead of - meaning inverse instead of negative. This might be more appropriate (but then is ~LowerCase not UpperCase?).

Marc


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Josh Smeaton

unread,
Jun 8, 2014, 8:53:17 PM6/8/14
to django-d...@googlegroups.com
> That sounds like a workable approach. If I'm understanding right, this means putting the ordering flag on the ExpressionNode class so that all nodes will have this whether or not they are to be used in an ordering context?

Yes, that had been what I was thinking, based upon the implementation of PeeWee ( https://github.com/coleifer/peewee/blob/master/peewee.py#L292 ). At first it doesn't seem very nice; having an expression containing properties regarding ordering or even aggregates (like the current is_aggregate). But in my experience it has turned out to be the nicest place to put that information, that doesn't infect a large number of components. "An expression can be ordered" makes sense.

> My only reservation about this when combined with __neg__ is that it means that certain mathematical expressions will behave oddly

__neg__ isn't currently supported on expressions, so it has no current meaning. It could be used in the mathematical context though in the future, and would make sense.

> I think supporting this gives lots of potential problems without much in the way of gain. Even the "correct" form Coalesce('-title', '-description') just looks odd to me. The DESC modifier applies to the expression as a whole, not to any one field within it

Totally agree. Ordering applies to the expression, not any one node within that expression. So we need to figure out how best to support:

.order_by( F('field_a')+F('field_b') )

You can't apply ordering to field_a or field_b, because the returned Expression is actually `+ (field_a, field_b)`. Considering that, a wrapping `Desc(expression)` probably makes a lot of sense, but has the following problems:

1. The syntax is somewhat ugly, but fits with the theme of composable expressions.
2. If `Desc()` is just another expression, you'd need to prevent it from being used in an `inner` context. It can only be a wrapper. I think documenting this limitation is fine - but if there was a way to throw a useful error (rather than the database error that would surely follow), it'd be nice.
3. order_by(F('-field_a')) wouldn't be possible - there's no way for the `F()` to push the `Desc()` to the top of the tree. Since this isn't already supported anyway, I don't think it's that big of a deal.
4. order_by('-field_a') should produce: Desc(F('field_a')), but I think that'd be very easy. Expressions should be the only supported API for order_by, except a string referencing a field (like it is now) that is internally converted to an Expression. I do the same thing with Aggregates right now.

> An alternative is to use ~ instead of - meaning inverse instead of negative. This might be more appropriate (but then is ~LowerCase not UpperCase?).

I don't think anyone would ever expect ~LowerCase to mean UpperCase, but it is still somewhat confusing syntax. ~ is mostly used in Q() objects to mean NOT, and could be very confusing. We also run into similar issues with the `order_by( F()+F() )` use case above - applying a unary operator to either of those objects won't actually result in any ordering being applied.

Josh

Marc Tamlyn

unread,
Jun 9, 2014, 2:34:01 AM6/9/14
to django-d...@googlegroups.com
Agreed. It's worth noting that we could have a `Desc` operator for awkward edge cases where __neg__ doesn't work well, and the default implementation of __neg__ just returns Desc(self). I've not completely considered all the possible implications here, just throwing the idea around.


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Anssi Kääriäinen

unread,
Jun 9, 2014, 6:01:18 AM6/9/14
to django-d...@googlegroups.com
Using __neg__ for ordering seems like a good approach to me. I don't like F('-field') that much, if -F('field') works, then F('-field') is duplicate API for the exact same thing.

The only problem with -F('field') is that there might be situations where ORDER BY -field ASC gives different results than ORDER BY field DESC. I am not sure if any such thing exists (maybe null handling?), but if so, and if we also add support for __neq__ for somecol__lt=-F('field'), then users might get unexpected results from .order_by(-F('field')). OTOH if we document that '-' means DESC ordering when used at outermost level in .order_by(), the API is hopefully clear enough.

 - Anssi

Josh Smeaton

unread,
Jun 9, 2014, 6:19:35 AM6/9/14
to django-d...@googlegroups.com
If __neg__ is the preferred method of applying ordering then I think we should prevent __neg__ from being used in the mathematical sense in the other methods (filter, annotate etc). There hasn't been a need for negating an expression/F() object before, and negation would only work with certain expressions anyway (what would -Lower('description') even mean?).

The problem with applying ordering to an expression still applies:

Model.objects.all().order_by( F('field')+F('other') )

Here, it's actually the connector (+) that is the outer-most level, so there's no opportunity to apply ordering to any of the expressions. I've found a similar problem with trying to apply an `output_type` to expressions that are joined by a connector. I think the solution is to introduce a new type that simply wraps another expression for the purposes of mutation.

Model.objects.all().order_by( -Wrap(F('field')+F('other')) )
Model.objects.annotate( Wrap(F('field')+5, output_type=IntegerField()) )

But I'd be interested in hearing alternatives.

Josh 

Marc Tamlyn

unread,
Jun 9, 2014, 6:24:11 AM6/9/14
to django-d...@googlegroups.com
For this example:
Model.objects.all().order_by( -Wrap(F('field')+F('other')) )

does just the brackets not work? -(f + g) should do f + g (returning some object) and then negate the result. This doesn't solve your output type issue. If it doesn't python's wrong ;)

Marc


Anssi Kääriäinen

unread,
Jun 9, 2014, 6:34:04 AM6/9/14
to django-d...@googlegroups.com
Could we use just F() for this, that is the examples would be:
-F(F('field')+F('other'))
F(F('field')+5, output_type=IntegerField())

As Marc said in another post plain parentheses should be enough for the first case.

 - Anssi

Josh Smeaton

unread,
Jun 9, 2014, 7:54:20 AM6/9/14
to django-d...@googlegroups.com
The parens definitely would work for __neg__, and as for the output_type problem, it's only an issue when mixing field types - but that can be solved by applying a matching output_type to one of the inner expressions. If a wrapping type isn't needed for neg, I won't introduce one to slightly improve the aesthetics of combinable expressions.

To answer your question about using F() as a wrapper - no that wouldn't work. F() must refer to an existing field by name - it doesn't wrap expressions.

Josh

Tim Martin

unread,
Jun 9, 2014, 3:44:08 PM6/9/14
to django-d...@googlegroups.com
My concern about the filter(foobar_lt=-F('baz')) issue is that it violates the principle of least surprise: it will be accepted silently and do something that nobody could expect unless they know a bit about the internals. However, I think having some special case code in filter(), annotate() and anything else that takes expressions would be OK. Provided it throws some kind of exception, I think it resolves my concern.

I'll go ahead and try to implement this using __neg__() to invert the ordering.

Tim

Daniel Moisset

unread,
Jun 9, 2014, 4:00:24 PM6/9/14
to django-d...@googlegroups.com
I see everyone trying to put the direction (ascending vs decending) inside the expression, while it's actually a property of the ordering clause. Wouldn't it make more sense and avoid ambiguities to do:

qs.order_by(..., descending=True) 

?

Then you can add functions (and even a __neg__ operator to fields mapped to SQL unary minus) and still have a clear separation of functions vs order direction.

D.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Michael Manfre

unread,
Jun 9, 2014, 4:08:51 PM6/9/14
to django-d...@googlegroups.com
Each expression must be able to define it's own ordering because order_by allows you to define a sort based upon multiple expressions.


Daniel Moisset

unread,
Jun 9, 2014, 4:12:18 PM6/9/14
to django-d...@googlegroups.com
oh, nevermind, I misread the syntax spec for the order by clause


Tim Martin

unread,
Jun 9, 2014, 4:28:51 PM6/9/14
to django-d...@googlegroups.com

On Monday, 9 June 2014 20:44:08 UTC+1, Tim Martin wrote:

I'll go ahead and try to implement this using __neg__() to invert the ordering.


It's still pretty rough, but there's some code in https://github.com/timmartin/django/tree/order_by_expressions that appears to work. In particular, I haven't yet implemented the LowerCase and related functions (I have one, but it's just a test hack), but that should just be boiler-plate.

Tim

Tim Martin

unread,
Jun 9, 2014, 4:31:28 PM6/9/14
to django-d...@googlegroups.com

And I should have noted that I haven't implemented detection of negation in contexts other than ORDER BY. It's not easy to do it cleanly with the simple approach I've got at the moment. The code could just check that get_ordering() returns ASC, but a) that's gross and b) it doesn't solve the problem of having unary negative on a subexpression, which would get ignored.

Tim

Josh Smeaton

unread,
Jun 9, 2014, 7:48:14 PM6/9/14
to django-d...@googlegroups.com
> However, I think having some special case code in filter(), annotate() and anything else that takes expressions would be OK

I strongly disagree with this. Expressions are context insensitive at the moment, which means they can be used anywhere. If we start externalising behaviour based on the context of use, it'll lead us to some very confusing code and a tonne of special cases. It would be slightly better if you could check the context from the `prepare()` method (are we preparing order_by or not), but then I think we'll run into issues where the prepare breaks for 3rd party backends - by checking private data. I really do think we'd be better off documenting that __neg__ is reserved by Expressions for Ordering, and leave it at that.

I'm surprised how easily expressions slotted in to the order_by - I thought it'd take a lot more work. Good stuff I think. Some questions:

- should expressions in order_by support random ordering (?) ? I don't think so, but I haven't ever had a need for random ordering so I'm not sure.
- could we reduce the complexity of the order_by machinery by forcing all arguments (internally) to be expressions? `order_by('field_a') -> order_by(F('field_a'))` etc

These are open questions, not necessarily directed at Tim.

> I haven't yet implemented the LowerCase and related functions

I think we can build out a handy little library of utility functions like Lower and Coalesce that could(should) be included directly in django. But it'll also open up, to library developers, the ability to easily develop backend specific structures, similar to parts of the django.contrib.postgres kickstarter.

Cheers

Justin Holmes

unread,
Jun 10, 2014, 1:23:51 PM6/10/14
to django-developers
I don't have strong opinions on either of the contentious issues in this discussion, but I do want to weigh-in and say that I think this is an important and exciting feature.


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

For more options, visit https://groups.google.com/d/optout.



--
Justin Holmes
Chief Chocobo Breeder, slashRoot

slashRoot: Coffee House and Tech Dojo
New Paltz, NY 12561
845.633.8330

Michael Manfre

unread,
Jun 10, 2014, 1:51:29 PM6/10/14
to django-d...@googlegroups.com
On Mon, Jun 9, 2014 at 7:48 PM, Josh Smeaton <josh.s...@gmail.com> wrote:
- should expressions in order_by support random ordering (?) ? I don't think so, but I haven't ever had a need for random ordering so I'm not sure.

It should be possible to create an expression for order_by that essentially does this, at least for simpler databases that support an ORDER BY RAND(). E.g. .order_by(Random())
 
> I haven't yet implemented the LowerCase and related functions

I think we can build out a handy little library of utility functions like Lower and Coalesce that could(should) be included directly in django. But it'll also open up, to library developers, the ability to easily develop backend specific structures, similar to parts of the django.contrib.postgres kickstarter.

Anything utility functions included in core need a way for 3rd party backends to flag as not supported or be able to tweak the underlying SQL. The most likely tweak would be renaming the function to match the specifics of the database. I have to do this for some of the aggregation functions, so it's safe to assume that some database might need to do the same for order_by functions.

Regards,
Michael Manfre

Tim Martin

unread,
Jun 10, 2014, 2:47:46 PM6/10/14
to django-d...@googlegroups.com
On Tuesday, 10 June 2014 00:48:14 UTC+1, Josh Smeaton wrote:
> However, I think having some special case code in filter(), annotate() and anything else that takes expressions would be OK

I strongly disagree with this. Expressions are context insensitive at the moment, which means they can be used anywhere. If we start externalising behaviour based on the context of use, it'll lead us to some very confusing code and a tonne of special cases.

I mean that we would verify __neg__ hadn't been used in the context of filter() etc. This doesn't change behaviour for filter(), it just provides additional error checking.
 
It would be slightly better if you could check the context from the `prepare()` method (are we preparing order_by or not),

That would be a good way to address it, I think. We are calling prepare() for order_by already. We could add a parameter prepare(..., allow_ordering=False) and throw an exception for non-default ordering unless allow_ordering was set to True.
 
but then I think we'll run into issues where the prepare breaks for 3rd party backends - by checking private data.

I'm not sure what kind of problem you're thinking about here (I'm pretty new to this code). Can you give an example?


- should expressions in order_by support random ordering (?) ? I don't think so, but I haven't ever had a need for random ordering so I'm not sure.

Assuming we're not deprecating the order_by('?') syntax (which nobody has suggested, I think) I don't see any benefit to integrating expressions with random ordering. Random ordering can't usefully be combined with function application.

Actually, the possibility of COALESCE(foobar, RANDOM()) just occurred to me - I don't know if that's legal.
 
- could we reduce the complexity of the order_by machinery by forcing all arguments (internally) to be expressions? `order_by('field_a') -> order_by(F('field_a'))` etc

I think you're right, though we'd have to be a bit careful that '?' was left alone and not expanded to F('?'). I don't have time to try it just now, but I'll look at it tomorrow. I'm a bit worried that there's a lot of special cases handled by the SQLCompiler.get_ordering() that might not work nicely as expressions, but I guess I'll find out.

Tim

Josh Smeaton

unread,
Jun 10, 2014, 7:57:22 PM6/10/14
to django-d...@googlegroups.com
> Anything utility functions included in core need a way for 3rd party backends to flag as not supported or be able to tweak the underlying SQL. The most likely tweak would be renaming the function to match the specifics of the database. I have to do this for some of the aggregation functions, so it's safe to assume that some database might need to do the same for order_by functions.

Expressions are compiled first by checking for an `as_{{vendor}}()` then fallback to using `as_sql()`. So to override Lower for mssql you'd do:

def lower_mssql(self, compiler, connection):
    self.function = 'ToLower'
    super(Lower, self).as_sql(compiler, connection)
Lower.as_mssql = lower_mssql

This is all documented in the nonaggregate-annotations patch that the above patch is based on. 

Josh Smeaton

unread,
Jun 10, 2014, 8:02:00 PM6/10/14
to django-d...@googlegroups.com
> That would be a good way to address it, I think. We are calling prepare() for order_by already. We could add a parameter prepare(..., allow_ordering=False) and throw an exception for non-default ordering unless allow_ordering was set to True.

I'd prefer not to include an extra argument to prepare, but to use prepare to check the state of the current query:

def prepare(self, query, ..):
    if query.ordering is None and self.has_ordering():
        # boom

This is what I meant about inspecting private data and perhaps breaking for 3rd party backends. The behaviour is nice and contained, but may rely on internal implementations that may not be consistent. I may be being overly pedantic though :)

> I think you're right, though we'd have to be a bit careful that '?' was left alone and not expanded to F('?'). I don't have time to try it just now, but I'll look at it tomorrow. I'm a bit worried that there's a lot of special cases handled by the SQLCompiler.get_ordering() that might not work nicely as expressions, but I guess I'll find out.

I took a look and you may be right. There is quite a lot of special casing in that function which may not translate very well. The cleanup here could be incremental though, so I wouldn't worry too much about it right yet.

Cheers

Shai Berger

unread,
Jun 11, 2014, 6:58:04 AM6/11/14
to django-d...@googlegroups.com
On Tuesday 10 June 2014 02:48:14 Josh Smeaton wrote:
> > However, I think having some special case code in filter(), annotate()
>
> and anything else that takes expressions would be OK
>
> I strongly disagree with this. Expressions are context insensitive at the
> moment, which means they can be used anywhere. If we start externalising
> behaviour based on the context of use, it'll lead us to some very confusing
> code and a tonne of special cases.

+1 Josh.

However, per the __neg__ issue, I think you're heading in the wrong direction:
ordering shouldn't be a feature of an expression, because an expression with
ordering cannot be used the same way: It cannot be combined with other
expressions (because the ordering of the result is intuitively ambiguous), it
shouldn't be used for comparisons etc.

So, I think, in terms of implementation (and I'm doing this in horrible style,
without looking at your actual patches -- sorry if I'm way off), once you apply
ordering to an expression, you should get a different-typed object. Indeed, a
wrapper would be suitable, but it should no longer be an expression IMO.

And in terms of API as well -- the hints should be clear that this is different
from normal expression creation. Both "-Expression" and "Desc(Expression)"
look like other expressions, so I think they're suboptimal. I'd prefer
something like "Expression.desc()" which reads nicely:

Model.objects.filter(...).order_by( F('fld_a').desc(), F('fld_b')).asc() )
Model.objects.filter(...).order_by( (F('fld_a')+F('fld_b')).desc() )

(desc as a property instead of method would read nicer but feels less "right"
as Python).

My 2 cents,

Shai.

Josh Smeaton

unread,
Jun 11, 2014, 7:28:20 AM6/11/14
to django-d...@googlegroups.com
 Model.objects.filter(...).order_by( F('fld_a').desc(), F('fld_b')).asc() ) 
>  Model.objects.filter(...).order_by( (F('fld_a')+F('fld_b')).desc() ) 

I actually really like this. It's simple, clear, and gets around the issues being discussed. It still requires Expressions to know about ordering (to implement the asc/desc methods) but I think that's fine.

The other problem with having ordering objects being of a different type means that you would have to call asc or desc. But now I think of it, the order_by method could just call asc, by default, on any expressions passed in.

Think you've come up with a winner here.

Cheers,

Tim Martin

unread,
Jun 15, 2014, 1:45:02 PM6/15/14
to django-d...@googlegroups.com
On Wednesday, 11 June 2014 12:28:20 UTC+1, Josh Smeaton wrote:
 Model.objects.filter(...).order_by( F('fld_a').desc(), F('fld_b')).asc() ) 
>  Model.objects.filter(...).order_by( (F('fld_a')+F('fld_b')).desc() ) 

I actually really like this. It's simple, clear, and gets around the issues being discussed. It still requires Expressions to know about ordering (to implement the asc/desc methods) but I think that's fine.

The other problem with having ordering objects being of a different type means that you would have to call asc or desc. But now I think of it, the order_by method could just call asc, by default, on any expressions passed in.

I agree, this looks like a good API and fixes a lot of my concerns. I've got this implemented in my branch now (https://github.com/timmartin/django/tree/order_by_expressions)

Other than some small bits of tidying, I think the only thing missing here is documentation and a bunch of instances of expressions.Func() for the SQL functions we want to support.

Josh, what's the likely timeline of getting your branch merged in? And will your change include any function definitions, or shall I do that as part of my work?

Tim

Josh Smeaton

unread,
Jun 15, 2014, 8:25:45 PM6/15/14
to django-d...@googlegroups.com
Nice work, I think that's a lot better. The only thing that now worries me is that order_by doesn't accept sql parameters for some reason. I'm not sure how many expressions produce sql_params and if they'd be useful in an ordering context. It might be worth looking into that a little more.

I'm not sure when this branch will get merged, but I'm fairly confident it will be this time around. Anssi did another review pass and contributed some nice fixes. He indicated that he'd try to do another pass next week (this week?), but that's all very dependent on the available time he has. I'm hoping the merge will be soon, as it gets quite annoying fixing the merge conflicts every time I rebase onto master :P

As for function definitions, I think that should be part of a separate PR. We don't want the implementations of various functions to hold up the merging of expressions or ordering expressions. There's also the small design question of where these functions should live. I was thinking of django.db.models.functions or django.db.models.expressions.functions, but I'm not married to that idea. We also need to determine a set of functions that we'd like to commit to supporting in core. Further, if there are differences in implementations per backend, should the differences be surfaced in each of the backends, or should they reside entirely within the expression using as_oracle and as_mysql? I would prefer the as_{vendor} approach, as it allows other backends to easily implement without worrying about connection.backend.yet_another_backend_method_to_support, but that'd be a fairly big deviation from the way things are typically done with supported backends. Perhaps we can start another discussion on the ML so we can address these concerns without it being hidden inside this thread?

Josh

Tim Martin

unread,
Jun 17, 2014, 5:51:31 PM6/17/14
to django-d...@googlegroups.com
On Monday, 16 June 2014 01:25:45 UTC+1, Josh Smeaton wrote:
Nice work, I think that's a lot better. The only thing that now worries me is that order_by doesn't accept sql parameters for some reason. I'm not sure how many expressions produce sql_params and if they'd be useful in an ordering context. It might be worth looking into that a little more.

Disallowing params was just a conservative default. I can't see any reason why this can't be made to work. However, I couldn't see any obvious expressions that would need to use parameters, and I was doubtful that it would be supported across all backends.

I guess COALESCE(some_field, %s) could be useful in certain circumstances, or SUBSTRING(some_field, %s, %s).

On the cross-platform issue, is the general pattern to allow params anywhere and just expose the error to the developer if they use an expression with parameters in a place that doesn't support it on their backend? I'm fine to follow that if it's the pattern.
 
[Support for platform-dependence in function expressions]
 Perhaps we can start another discussion on the ML so we can address these concerns without it being hidden inside this thread?

Sounds good. Do you wan to start a new topic, since you understand the issue better than I do at the moment?

Tim

Josh Smeaton

unread,
Dec 29, 2014, 6:36:35 AM12/29/14
to django-d...@googlegroups.com
Sorry to dig up an old thread, but I've just created a ticket to bring this functionality into django: https://code.djangoproject.com/ticket/24060

Tim Martin, are you interested in bringing this up to date? The alpha freeze for 1.8 is Jan 12th, and I'd really like to get this ready before then. I'm happy to pick up your work and finish it off if you haven't got the time to do it though. Just thought I'd give you the chance since you've put in the effort.

Josh
Reply all
Reply to author
Forward
0 new messages