Hi all,
Fairly new to Django. I ended up pulling out all of the ORM-generated queries and writing my own SQL directly (I got fed up trying to work out how to achieve the kind of things I needed without Django adding in extra joins or unintended WHERE clauses etc). All my app's SQL uses cursor.execute() and the dictfetchall() method as referenced here.
I've found that my app incurs a couple of seconds load time in production, with CPU time at 2532ms and overall time 4684ms (according to the debug toolbar). I'm seeing 8 SQL queries take 380ms, and each one seems to be several times slower when made by Django versus hitting the database through phpMyAdmin or something: eg, this query:
SELECT * FROM newsJOIN news_categories ON news.news_category_id = news_categories.idLEFT JOIN writers ON news.writer_id = writers.idLEFT JOIN images ON news.image_id = images.idORDER BY news.is_sticky DESC, news.date_posted DESCLIMIT 10
This takes 14.8ms when run in phpMyAdmin (against the production database) but Django reports it as 85.2ms. The same ratios are true for all my other queries.
All I can think of is the note on the dictfetchall() method in the Django docs which describes a "small performance hit". Is this it?!
I've profiled the app too, although I'm a bit hazy about what it all means. Here's a dump of the result: http://pastebin.com/raw.php?i=UHE9edVC (this is from running on my local server rather than production but performance is broadly similar).
Can anyone help me? I realise I've perhaps gone off-piste by writing raw SQL but I feel it was justified.
thanks,Matt
--
You received this message because you are subscribed to the Google Groups "Django users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/Qiley9RqfngJ.
To post to this group, send email to django...@googlegroups.com.
To unsubscribe from this group, send email to django-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/CxYZKPFVBQgJ.
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/NvJBbauReMUJ.
Yes, local app, local DB (which is an exact copy of the production DB). Interestingly, I just tried pointing the local app at the remote database and total query time (for 10 queries) went from 115ms to 3477ms! Really odd.
On Tuesday, January 22, 2013 7:54:53 PM UTC, Nikolas Stevenson-Molnar wrote:
When running locally, are you also using a local database instance? IIRC the time that phpMyAdmin reports is the time taken for the database to execute the query, whereas the time reported by the debug toolbar (I think) is complete round-trip (send query, execute, return results). So if there is latency between your application and your database, that could account for the discrepancy.
The debug toolbar does add overhead to your request (about doubles the page load time), but the numbers it reports should be more or less accurate (iow a page might take 1s to load with the toolbar, but the toolbar will report 500ms, which is accurate for load-time without the toolbar).
_Nik
On 1/22/2013 9:26 AM, Matt Andrews wrote:
Hi Nik,
I see the discrepancy both locally and when deployed, too -- that's what's so puzzling. If it helps, I'm using MySQL and running it on Windows locally and under Passenger on my remote Linux server (also on MySQL).
Only other thing I can think of is that this "overhead" might literally be the extra work Django Debug Toolbar does to show queries etc. Obviously when I turn the toolbar off then I can't see (directly) the SQL times Django is getting, but could be a factor? Either way, with the toolbar off on production, there's definitely a slight "lag" before page requests start to kick in -- maybe this is unrelated to the SQL, though.
On Tuesday, 22 January 2013 16:42:19 UTC, Nikolas Stevenson-Molnar wrote:
Hi Matt,
It's unlikely the problem lies with dictfetchall. The "small performance hit" mentioned by the docs probably refers to the fact that the function has a time complexity of (number of colulmns) * (number of rows). Since you're only getting 10 rows back, I can't see that having an appreciable impact.
I haven't seen this kind of variation between performing a direct query, and executing the query as part of a Django request� especially with such a basic query. Do you see the same difference in timing locally or is it only when deployed?
_Nik
> � � Hi,
>
> � � � From your raw SQL I saw you're doing few joins. So I suppose you do
> � � quite a few foreign key fetches.
>
> � � You didn't mention anything how you originally tried to solve case with
> � � ORM. Could you please publish what you had when things were slow?
>
> � � 22.1.2013 12:26, Matt Andrews kirjoitti:
> � � �> Hi Nik,
> � � �>
> � � �> Thanks - I do feel like by circumventing the ORM I've just "given
> � � up"
> � � �> and perhaps I'll reconsider -- none of my queries are particularly
> � � �> "specialist" (as the sample above indicates) - I just found Django
> � � �> generating odd things.
> � � �>
> � � �> To answer your questions:
> � � �>
> � � �> 1. Yes, reloading the page sees the same time for the queries (it
> � � just
> � � �> feels as though the entire process takes a second or two to
> � � start, which
> � � �> is perhaps not related to SQL itself).
> � � �>
> � � �> 2. I believe so, yes (it's shared hosting...).
> � � �>
> � � �> If you're curious, you can see a sample of the app at
> � � �> http://beta.scenepointblank.com <http://beta.scenepointblank.com>
> � � (obviously you won't see the SQL, but
> � � �> the "delay" between pages, even though these pages are all cached
> � � for
> � � �> 2hrs+, is partly my concern here).
> � � �>
> � � �> On Tuesday, January 22, 2013 4:24:09 AM UTC, Nikolas
> � � Stevenson-Molnar wrote:
> � � �>
> � � �> � � Hi Matt,
> � � �>
> � � �> � � Firstly, I encourage you to have another crack a the ORM. It can
> � � �> � � certainly seem a bit aggravating at times if you're coming
> � � from a
> � � �> � � SQL mindset, but really pays off down the road in terms of
> � � �> � � maintainability and readability. Typically you should only
> � � need raw
> � � �> � � queries in Django in cases where you have super-specialized
> � � (that
> � � �> � � uses views or non-standard functions) queries or need some
> � � specific
> � � �> � � optimization. If there's really no way to perform many of your
> � � �> � � "day-to-day" queries with the ORM then that's an indication
> � � that a
> � � �> � � different database design may fit your data model better. I
> � � �> � � understand that you may have a unique situation, but I just
> � � wanted
> � � �> � � to throw that out there as I personally find the ORM to be a
> � � huge
> � � �> � � time saver.
> � � �>
> � � �> � � Now, with that out of the way... a couple of considerations:
> � � 1) you
> � � �> � � say it's a slow "startup"; if you refresh the page do the
> � � queries
> � � �> � � run just as slow the second time around? and 2) are your
> � � Django app
> � � �> � � and phpMyAdmin running on the same machine? If not, could
> � � transit
> � � �> � � time be an issue?
> � � �>
> � � �> � � Finally, can you give an idea about the size of the tables in
> � � �> � � question? How many rows in each?
> � � �>
> � � �> � � _Nik
> � � �>
> � � �> � � On 1/21/2013 3:25 PM, Matt Andrews wrote:
> � � �>> � � Hi all,
> � � �>>
> � � �>> � � Fairly new to Django. I ended up pulling out all of the
> � � �>> � � ORM-generated queries and writing my own SQL directly (I got
> � � fed
> � � �>> � � up trying to work out how to achieve the kind of things I
> � � needed
> � � �>> � � without Django adding in extra joins or unintended WHERE
> � � clauses
> � � �>> � � etc). All my app's SQL uses cursor.execute() and the
> � � �>> � � dictfetchall() method as referenced here
> � � �>>
> � � <https://docs.djangoproject.com/en/dev/topics/db/sql/#django.db.models.Manager.raw
> � � <https://docs.djangoproject.com/en/dev/topics/db/sql/#django.db.models.Manager.raw>>.
>
> � � �>>
> � � �>>
> � � �>> � � I've found that my app incurs a couple of seconds load time in
> � � �>> � � production, with CPU time at 2532ms and overall time 4684ms
> � � �>> � � (according to the debug toolbar). I'm seeing 8 SQL queries take
> � � �>> � � 380ms, and each one seems to be several times slower when
> � � made by
> � � �>> � � Django versus hitting the database through phpMyAdmin or
> � � �>> � � something: eg, this query:
> � � �>>
> � � �>> � � � � SELECT * FROM news
> � � �>> � � � � JOIN news_categories ON news.news_category_id =
> � � �>> news_categories.id <http://news_categories.id>
> � � <http://news_categories.id>
> � � �>> � � � � LEFT JOIN writers ON news.writer_id = writers.id
> � � <http://writers.id>
> � � �>> � � � � <http://writers.id>
> � � �>> � � � � LEFT JOIN images ON news.image_id = images.id
> � � <http://images.id> <http://images.id>
> � � �>> � � � � ORDER BY news.is_sticky DESC, news.date_posted DESC
> � � �>> � � � � LIMIT 10
> � � �>>
> � � �>>
> � � �>> � � This takes 14.8ms when run in phpMyAdmin (against the
> � � production
> � � �>> � � database) but Django reports it as 85.2ms. The same ratios are
> � � �>> � � true for all my other queries.
> � � �>>
> � � �>> � � All I can think of is the note on the dictfetchall() method
> � � in the
> � � �>> � � Django docs which describes a "small performance hit". Is
> � � this it?!
> � � �>>
> � � �>> � � I've profiled the app too, although I'm a bit hazy about
> � � what it
> � � �>> � � all means. Here's a dump of the result:
> � � �>> http://pastebin.com/raw.php?i=UHE9edVC
> � � <http://pastebin.com/raw.php?i=UHE9edVC>
> � � �>> � � <http://pastebin.com/raw.php?i=UHE9edVC
> � � <http://pastebin.com/raw.php?i=UHE9edVC>> (this is from running on
> � � �>> � � my local server rather than production but performance is
> � � broadly
> � � �>> � � similar).
> � � �>>
> � � �>> � � Can anyone help me? I realise I've perhaps gone off-piste by
> � � �>> � � writing raw SQL but I feel it was justified.
> � � �>>
> � � �>> � � thanks,
> � � �>> � � Matt
> � � �>>
>
> � � --
> � � Jani Tiainen
>
> � � - Well planned is half done and a half done has been sufficient
> � � before...
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/UXh6v5jzI0gJ.