ORM difficulties

114 views
Skip to first unread message

Raphael Gaschignard

unread,
Dec 5, 2015, 10:43:39 AM12/5/15
to Django developers (Contributions to Django itself)

Hi list, 

  I want to preface this by saying I’m really glad to see the ORM where it is in 1.8, it’s gotten really far and I now think it’s not hopeless to imagine writing complex things in the ORM…


   So earlier this week I wrote a bit of a rant in the bug tracker about how annotate is confusing (https://code.djangoproject.com/ticket/25834).


  A short summary is that annotate and values (with a spattering of order_by) are basically what decides grouping, but there’s a lot of trial and error involved in getting the right grouping with a tricky interaction between the two.


  A behaviour that exemplifies this is that queryset.annotate(foo=thing).annotate(bar=other_thing) is not the same as queryset.annotate(foo=thing, bar=other_thing) given certain things. This goes against the intuitive interpretation of the queryset API IMO.


  I think there should be some update to the API to render grouping more explicit. Absent that, given that the docs are 

now oriented in a sort of “You shouldn’t need extras anymore” fashion, there really should be a “SQL to ORM” migration guide/cookbook to point out explicitly how to go from SELECT stuff FROM table GROUP BY properties to the right annotate/values. I found what jarshwah pointed out in the ticket was really helpful.


  Also, some random feature requests:

  • .values(‘stuff’, my_thing=Coalesce(‘thing’, ‘stuff’)) should work
  • there should be a provided Year and Month functions to extract years/months from date fields

  Anyways I wanted to share this experience so that anyone who has the courage to right new docs/continue evolving things can know of at least one team’s difficulties.


 Raphael

Aymeric Augustin

unread,
Dec 5, 2015, 3:25:18 PM12/5/15
to django-d...@googlegroups.com
Hello,

> On 5 déc. 2015, at 10:28, Raphael Gaschignard <rap...@rtpg.co> wrote:
>
> A behaviour that exemplifies this is that queryset.annotate(foo=thing).annotate(bar=other_thing) is not the same as queryset.annotate(foo=thing, bar=other_thing) given certain things. This goes against the intuitive interpretation of the queryset API IMO.

Isn't this the same difference as between queryset.filter(foo__bar=bar).filter(foo__baz=baz) and queryset.filter(foo__bar=bar, foo__baz=baz)? I’m asking because if the answer is “yes”, then we want to keep the two APIs consistent and we can’t quite change how chained filter calls set up joins.

I don’t quite have a good answer about your other questions… I know that some people will say “stick to raw SQL”. That’s even what the documentation advises. However it appears that the ORM’s ability to compose querysets is so powerful that it outweighs the greater expressivity of SQL in many practical cases. When writing complex aggregations, I’ve had more luck trying to express the logic with Django’s APIs than trying to produce a particular SQL query. It was still very frustrating and mostly based on trial and error.

--
Aymeric.

Josh Smeaton

unread,
Dec 5, 2015, 9:16:21 PM12/5/15
to Django developers (Contributions to Django itself)
Also, some random feature requests:
  • .values(‘stuff’, my_thing=Coalesce(‘thing’, ‘stuff’)) should work
 I thought this was already tracked but I couldn't find an existing ticket. So I created it: https://code.djangoproject.com/ticket/25871
Feel free to leave comments there if you feel that's a good idea, and to discuss some API ideas (like how to preserve existing **kwargs in values()).
  • there should be a provided Year and Month functions to extract years/months from date fields
It's coming in 1.10 https://github.com/django/django/pull/5683, although you can already sort of use the private versions (which will be moved in 1.10 so be careful) https://github.com/django/django/blob/6f229048ddd8c7347ff60dddfb9121e6021c7b2e/django/db/models/lookups.py#L511

As far as grouping semantics go, I'm open to discussion! If someone was to put forward a proposal that could work and wouldn't unnecessarily burden users that aren't intimately familiar with SQL, then that would likely be welcome. Anssi mentioned exposing the underlying grouping method/dict which may be a good idea, but we'd like to avoid messing around with internal state too much.

Personally, I don't find grouping to be too much of a problem. Django generates the correct grouping based on what you've limited the columns to in your `values()` clause. There are outliers which we can maybe defend against with docs. The most unintuitive thing is adding `Meta: ordering` columns to the group by list, but I think that changed recently, or was just added?

Reply all
Reply to author
Forward
0 new messages