Django beta slower with some queries?

julianb

Apr 24, 2009, 10:45:57 AM
to Django users
Hi,

I was using Django 1.0.2 but moved to the beta because I badly needed
some new features. I want to "report" that some queries now take
considerably longer, even though I have not changed anything in the
code at all. I am talking about a rather simple query, but one which
involves some joins with tables containing > 1 million rows. On the
"old" Django version everything was running absolutely smoothly.
However, once I switch to a later Django revision, the specific page
with its queries comes to a halt. It's like someone hit the brakes.
Within minutes the site is not usable at all and MySQL is still busy
with the initial queries. Just switching Django back to the old version
makes everything run fine again.
I'm using MySQL, and with the latest version the queries take seconds
and are displayed in the processlist as being prepared, sorted, sent
and so on, while on the old version they ran within milliseconds,
as I said.
Now, I don't know where I should start looking for what is going
wrong, or whether any db connection/MySQL/ORM things have been changed
lately. I would appreciate any help in solving this. Something that
definitely worked should work again with the latest Django,
too. Thanks!

Karen Tracey

Apr 24, 2009, 12:05:27 PM
to django...@googlegroups.com

I'd start by using a couple of Python shells (one using 1.0.2 and one using 1.1 beta) and connection.queries to see if you can see the difference in SQL generated for whatever model queries your view is using:

http://docs.djangoproject.com/en/dev/faq/models/

Karen

julianb

Apr 27, 2009, 4:11:37 PM
to Django users
On Apr 24, 6:05 pm, Karen Tracey <kmtra...@gmail.com> wrote:
> I'd start by using a couple of Python shells (one using 1.0.2 and one using
> 1.1 beta) and connection.queries to see if you can see the difference in SQL
> generated for whatever model queries your view is using:
>
> http://docs.djangoproject.com/en/dev/faq/models/
>
> Karen

Hi Karen,

I wouldn't have written the post if I had not already done that. At
least so I thought. Comparing the resulting queries once more, I found
the difference. Django is using subselects now.
http://docs.djangoproject.com/en/dev/ref/models/querysets/#in

In my code, I just used a queryset as parameter for __in and in
previous Django versions it would evaluate to a list of numbers
whereas now it does a subselect. I don't think that's very backwards
compatible.

Reading the part "Performance considerations" pointed me in the right
direction but didn't fully help, because if you do it the way it is
shown there, it will still do a subselect. I had to explicitly convert the
ValuesListQuerySet to a list to really get a list of IDs into my
query. Looks like a bug, but I'm too busy to confirm that right now.

Julian

Malcolm Tredinnick

Apr 27, 2009, 4:41:09 PM
to django...@googlegroups.com
On Mon, 2009-04-27 at 13:11 -0700, julianb wrote:
[...]

> In my code, I just used a queryset as parameter for __in and in
> previous Django versions it would evaluate to a list of numbers
> whereas now it does a subselect. I don't think that's very backwards
> compatible.

It's fully backwards compatible. Exactly the same results are returned;
the functionality is identical.

Further, on most database servers we support, the subselect approach
will be faster, since network roundtrips, C to Python conversions and
some data marshalling operations are all avoided. It also gives the
database server more opportunities to perform optimisations. You
mentioned you were using MySQL as the backend, which, sadly, has some
performance issues with (some) subselects and you will have to check the
various things you're doing there, hence the documentation note. It's
not correct to always avoid subselects with MySQL (which is why we
don't), but it has to be examined case-by-case if the query is
time-critical. Hopefully that will improve in the future, as the
MySQL developers enable the server and storage engine optimisation paths
to interact a bit more. Some of that work is happening already as part
of the Drizzle project.

> Reading the part "Performance considerations" got me in the right
> direction but didn't help me very well, because if you do it like
> that, it will still do a subselect.

That's a documentation bug. There should be a list() call wrapped around
the rhs of the first line in the fragment in "Performance
considerations". If you open a ticket for that, we'll fix it.

Regards,
Malcolm


julianb

Apr 28, 2009, 2:06:21 AM
to Django users
On Apr 27, 10:41 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:
> > Reading the part "Performance considerations" got me in the right
> > direction but didn't help me very well, because if you do it like
> > that, it will still do a subselect.
>
> That's a documentation bug. There should be a list() call wrapped around
> the rhs of the first line in the fragment in "Performance
> considerations". If you open a ticket for that, we'll fix it.

I'll do that. Thanks Malcolm!