Django ORM - query help

63 views
Skip to first unread message

David

unread,
Apr 11, 2012, 10:28:11 AM4/11/12
to Django users
class Log(models.Model):
thing = models.ForeignKey(Thing)
context = models.CharField(max_length=255)
action = models.CharField(max_length=255)
modifier = models.ForeignKey(User, limit_choices_to={'groups__in':
[2]})
modified_on = models.DateTimeField(auto_now=True)

class Thing(models.Model):
prefix = models.CharField(max_length=50,choices=PREFIXES,
blank=True)
first_name = models.CharField(max_length=255)
last_name = models.CharField(max_length=255)

Hello,

Given the above models, how can I achieve the following results:

A unique "Thing" instance along with the last (in terms of date)
modified_on instance (ideally with access to the other related fields
such as modifier) from the "Log" model? I have tried using .annotate
but it seems to result in a very ugly left outer join which makes me
think I must be doing something wrong.

Such as:

last_modified =
Thing.objects.select_related().annotate(deletion_date=Max('log__modified_on')).order_by('-
deletion_date')

Any assistance would be much appreciated.

Jani Tiainen

unread,
Apr 11, 2012, 1:05:37 PM4/11/12
to django...@googlegroups.com
Hi,

You're not doing anything wrong. The catch is that since "Thing" can exist without Log you will get outer join. 

If you want to get along with inner join, you should turn query around and start querying from Log model. I'm just too tired to think how it should be done right now... =)



--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django...@googlegroups.com.
To unsubscribe from this group, send email to django-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.


David

unread,
Apr 11, 2012, 4:47:57 PM4/11/12
to django...@googlegroups.com
Thanks for your reply. I am grateful for your help. If you remember this question when you feel less sleepy I'd be very interested to see the inner join alternative :)

Thanks again

Jani Tiainen

unread,
Apr 12, 2012, 1:25:45 AM4/12/12
to django...@googlegroups.com
11.4.2012 23:47, David kirjoitti:
> Thanks for your reply. I am grateful for your help. If you remember this question when you feel less sleepy I'd be very interested to see the inner join alternative :)
>
> Thanks again
>

Well, after good night sleep I've some idea.

Something like should do the trick:

Log.objects.values('thing').annotate(deletion_date=Max('modified_on')).order_by('-deletion_date')

--

Jani Tiainen

Message has been deleted

David

unread,
Apr 12, 2012, 4:52:27 AM4/12/12
to django...@googlegroups.com

Hi Jani

That was very helpful. Is there a way to include select_related into that query? or do I have to list every single field I would like to return using values()?

last_deleted = ModificationLog.objects.values('thing__id', ' thing__prefix', ' thing __first_name', ' thing__last_name', ' thing__company_name', 'thing__creator', ' thing __created_on', 'modifier').annotate(deletion_date=Max('modified_on')).filter(thing__deleted=1).order_by('-deletion_date')[:20]

This for example works, but I don't have access to User. Thing contains a FK to the User model, but I'm unable to do in the template {{ object_list.creator.get_full_name }} 


But I am a lot lot closer than I was before, thank you.

akaariai

unread,
Apr 12, 2012, 6:34:57 AM4/12/12
to Django users
One reason why such queries are hard is that they aren't easy to
express using SQL. The left join is correct from ORM perspective: it
can't know that each entry will have at least one log entry attached.
I am not sure how to tell that to the ORM without doing ugly trick
using sql/query.py internals.

If you happen to be using PostgreSQL and Django 1.4 you could probably
do something like this (using DISTINCT ON support added in 1.4):

> ml_objs = ModificationLog.objects.distinct('thing__id').order_by('thing__id', 'modified_on').select_related('thing').

You could then swap the objects around:

> [setattr(ml.thing, 'last_mod', ml) for ml in ml_objs]
> things = [ml.thing for ml in ml_objs]

I am not sure at all the above will work, or that the .distinct() and
order_by() calls are correct for your use case. But DISTINCT ON could
work here...

In general fetching the latest object(s) per related category is a
common operation, and it would be extremely nice if prefetch_related
and/or select_related would support that out of the box. It is a very
commonly repeating pattern: show me the latest mail in each thread,
show me the latest edit for each article, show the highest rating
employee per project and so on. A modification of this theme is "show
me the N latest edits per category" which is pretty hard to do
efficiently using the current ORM APIs.

So, if somebody has ideas how to implement this into prefetch_related/
select_related I am all ears.

- Anssi

Jani Tiainen

unread,
Apr 12, 2012, 7:00:36 AM4/12/12
to django...@googlegroups.com
12.4.2012 11:51, David kirjoitti:
> Hi Jani
>
> That was very helpful. Is there a way to include select_related into
> that query? or do I have to list every single field I would like to
> return using values()?
>
> last_deleted = ModificationLog.objects.values('thing__id', ' thing
> __prefix', ' thing __first_name', ' thing__last_name', '
> thing__company_name', 'thing__creator', ' thing __created_on',
> 'modifier').annotate(deletion_date=Max('modified_on')).filter(appointment__deleted=1).order_by('-deletion_date')[:20]

>
> This for example works, but I don't have access to User. Thing contains
> a FK to the User model, but I'm unable to do in the template {{
> object_list.creator.get_full_name }}
>
> But I am a lot lot closer than I was before, thank you.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/django-users/-/XRx3bOc68p0J.

> To post to this group, send email to django...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.

I suppose that you have to add all fields. I suggest that you install
django-command-extensions and IPython. Run ./manage.py shell_plus and
start experimenting.

--

Jani Tiainen

David

unread,
Apr 12, 2012, 9:01:51 AM4/12/12
to django...@googlegroups.com
Thank you akaariai

That put me on the right track.

Log.objects.distinct('thing__id').order_by('thing__id', '-modified_on').select_related().filter(thing__deleted=0)[:20]

By avoiding the use of values() I was able to then use the result as an object and access everything I needed.

The above ORM statement however does not look as elegant to read as I have come to expect from Django though. The resulting SQL doesn't seem too shabby however.

akaariai

unread,
Apr 12, 2012, 9:58:26 AM4/12/12
to Django users
On Apr 12, 4:01 pm, David <cthl...@gmail.com> wrote:
> The above ORM statement however does not look as elegant to read as I have
> come to expect from Django though. The resulting SQL doesn't seem too
> shabby however.

.distinct(fields) + .order_by() is pretty low level stuff - that is
why it would be nice if there was a dedicated API for queries like
this. Something like:

> qs.first_related('log', order_by='-modified_on')
OUT: qs annotated with log entries.

Turning that into efficient and cross-DB compliant SQL isn't the
easiest thing to do. If you want N first objects instead of one, then
it is going to be even harder.

- Anssi

Andre Terra

unread,
Apr 12, 2012, 10:47:32 AM4/12/12
to django...@googlegroups.com
On Thu, Apr 12, 2012 at 10:01 AM, David <cth...@gmail.com> wrote:
Log.objects.distinct('thing__id').order_by('thing__id', '-modified_on').select_related().filter(thing__deleted=0)[:20]

By avoiding the use of values() I was able to then use the result as an object and access everything I needed.

The above ORM statement however does not look as elegant to read as I have come to expect from Django though. The resulting SQL doesn't seem too shabby however.


Django has a tendency of making you write really long lines, but I guess you could break it into smaller chunks to improve readability.

logs = Log.objects.distinct('thing__id').order_by('thing__id', '-modified_on')
related = logs.select_related().filter(thing__deleted=0)[:20]


Cheers,
AT

Russell Keith-Magee

unread,
Apr 12, 2012, 8:18:48 PM4/12/12
to django...@googlegroups.com

On Thursday, 12 April 2012 at 10:47 PM, Andre Terra wrote:

The other option is to find ways to break the line using Python's own syntax. e.g.:

Log.objects.distinct(
'thing__id'
).order_by(
'thing__id', '-modified_on'
).select_related(
).filter(
thing__deleted=0
)[:20]


Yours,
Russ Magee %-)


Reply all
Reply to author
Forward
0 new messages