JOIN instead of multiple SELECT


Federico Capoano

Mar 31, 2010, 4:00:38 PM
to Django users
Hello to all,

For a while I've been wondering how to optimize Django's database
queries, for example by making it use a JOIN to retrieve foreign keys
instead of issuing multiple SELECTs.

For example, if I have a blog object with a category foreign key and I
write this in a template:

{{ blog.category.slug }}

I get another query to select the category.

Is there a way to use a JOIN instead, to avoid the extra queries?

Does this make sense as a performance optimization?

Thanks
Federico

Rolando Espinoza La Fuente

Mar 31, 2010, 4:49:16 PM
to django...@googlegroups.com
On Wed, Mar 31, 2010 at 3:30 PM, Federico Capoano
<nemesis...@libero.it> wrote:
> Hello to all,
>
> For a while I've been wondering how to optimize Django's database
> queries, for example by making it use a JOIN to retrieve foreign keys
> instead of issuing multiple SELECTs.
>
> For example, if I have a blog object with a category foreign key and I
> write this in a template:
>
> {{ blog.category.slug }}
>
> I get another query to select the category.
>
> Is there a way to use a JOIN instead, to avoid the extra queries?

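Have you looked at select_related()? It follows foreign-key
relationships with a JOIN, so the related objects come back in the
same query.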

http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4

~Rolando

Federico Capoano

Mar 31, 2010, 5:52:11 PM
to Django users
Thanks,

I've learned so many things today about reducing the number and size
of queries, by using:

* select_related
* only
* extra

Do you think the performance gain is worth the work?

One more question, out of curiosity: if I use the cache framework,
will the database be hit again once the results are cached?

Best Regards
Federico


Rolando Espinoza La Fuente

Mar 31, 2010, 6:49:00 PM
to django...@googlegroups.com
On Wed, Mar 31, 2010 at 5:22 PM, Federico Capoano
<nemesis...@libero.it> wrote:
> Thanks,
>
> I've learned so many things today about reducing the number and size
> of queries, by using:
>
> * select_related
> * only
> * extra
>
> Do you think the performance gain is worth the work?

Yes, especially in loops like this:

# in your view
posts = Post.objects.all()[:20]

# in your template
{% for post in posts %}
    {{ post.title }}
    {{ post.category.name }}
    {{ post.owner.username }}
{% endfor %}

Without select_related(), post.category and post.owner each hit the
database on every iteration of the loop.
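
To make the difference concrete, here is a minimal sketch of what happens at the SQL level. It uses Python's built-in sqlite3 module rather than Django itself, and the table layout and the run() helper are assumptions made up for illustration, not the SQL Django actually generates. In Django, the single-query version would be requested with something like Post.objects.select_related('category', 'owner')[:20].

```python
import sqlite3

# Hypothetical schema standing in for the Post and Category models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category (id, name) VALUES (1, 'django'), (2, 'sql');
    INSERT INTO post (title, category_id) VALUES
        ('First', 1), ('Second', 2), ('Third', 1);
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute one statement and count it as one round trip to the database."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# Lazy loading: one query for the posts, then one more per post
# to fetch its category (the pattern the template loop triggers).
lazy_rows = []
for title, category_id in run("SELECT title, category_id FROM post ORDER BY id"):
    (name,) = run("SELECT name FROM category WHERE id = ?", (category_id,))[0]
    lazy_rows.append((title, name))
lazy_queries = counter["queries"]

# JOIN: the same data fetched in a single query.
counter["queries"] = 0
joined_rows = run(
    "SELECT post.title, category.name FROM post "
    "JOIN category ON category.id = post.category_id ORDER BY post.id"
)
joined_queries = counter["queries"]

print(lazy_queries, joined_queries)  # prints: 4 1
```

Scaled up to the view above, 20 posts with two foreign keys each would mean 1 + 20 × 2 = 41 queries when loaded lazily, against a single query with the JOIN.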

> One more question, out of curiosity: if I use the cache framework,
> will the database be hit again once the results are cached?

No, but you need to take care of invalidating the cache when the
underlying data changes.
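
A rough sketch of what "taking care of invalidation" means, in plain Python with a dict standing in for the cache backend. The function names, the key format, and the table are hypothetical; with Django's cache framework the same pattern would use cache.get(), cache.set(), and cache.delete() from django.core.cache.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO category (id, name) VALUES (1, 'django')")

cache = {}               # stand-in for a real cache backend (assumption)
stats = {"db_hits": 0}   # counts how often the database is actually read

def get_category_name(category_id):
    """Return the name from the cache, reading the database only on a miss."""
    key = f"category:{category_id}"
    if key in cache:
        return cache[key]
    stats["db_hits"] += 1
    row = conn.execute(
        "SELECT name FROM category WHERE id = ?", (category_id,)
    ).fetchone()
    cache[key] = row[0]
    return row[0]

def rename_category(category_id, new_name):
    """Write to the database, then drop the cached entry so it is not stale."""
    conn.execute(
        "UPDATE category SET name = ? WHERE id = ?", (new_name, category_id)
    )
    cache.pop(f"category:{category_id}", None)

first = get_category_name(1)    # cache miss: reads the database
second = get_category_name(1)   # cache hit: no database access
hits_before_rename = stats["db_hits"]

rename_category(1, "python")    # invalidates the cached entry
third = get_category_name(1)    # miss again: reads the new value
```

If rename_category() skipped the cache.pop() call, the third read would still return the old name, which is the staleness problem invalidation is meant to prevent.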

There are a few apps that provide drop-in solutions for caching at the ORM level:
* cache machine - http://jbalogh.me/2010/02/09/cache-machine/
* cachebot - http://blog.davidziegler.net/post/429237463/announcing-django-cachebot
* johnny cache - http://jmoiron.net/blog/is-johnny-cache-for-you/

Regards,

~Rolando

Federico Capoano

Apr 1, 2010, 7:18:51 AM
to Django users
Very cool. If the database isn't hit often and I manage to use the
new template caching functionality added in Django 1.2, the result
will be really fast.

Thank you very much, I'll save this info on caching for future
reference.


Federico Capoano

Apr 1, 2010, 8:59:38 AM
to Django users
What about this solution to generate static files?

http://superjared.com/projects/static-generator/


Alexander

Apr 1, 2010, 9:53:25 AM
to Django users
> * select_related
> * only
> * extra

You might also be interested in annotate() and aggregate():
http://docs.djangoproject.com/en/dev/topics/db/aggregation/
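
As a rough illustration of why these help, the sketch below computes per-category post counts and a grand total in SQL instead of in Python loops. It uses the sqlite3 module with a made-up schema, so the names are assumptions; in Django the two queries would correspond roughly to Category.objects.annotate(num_posts=Count('post')) and Post.objects.aggregate(Count('id')).

```python
import sqlite3

# Hypothetical schema standing in for the Category and Post models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category (id, name) VALUES (1, 'django'), (2, 'sql'), (3, 'empty');
    INSERT INTO post (title, category_id) VALUES
        ('First', 1), ('Second', 2), ('Third', 1);
""")

# "annotate"-style: one value per row, computed by the database.
# LEFT JOIN keeps categories that have no posts, with a count of 0.
posts_per_category = conn.execute(
    "SELECT category.name, COUNT(post.id) FROM category "
    "LEFT JOIN post ON post.category_id = category.id "
    "GROUP BY category.id ORDER BY category.id"
).fetchall()

# "aggregate"-style: a single summary value for the whole table.
total_posts = conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]

print(posts_per_category)  # [('django', 2), ('sql', 1), ('empty', 0)]
print(total_posts)         # 3
```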

bob84123

Apr 1, 2010, 7:04:50 PM
to Django users
You probably want to check out select_related:
http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4

bob84123

Apr 1, 2010, 7:08:46 PM
to Django users
grr @ google groups for not showing me that this had already been
answered...