PostgreSQL, queries, speed

80 views
Skip to first unread message

Lachlan Musicman

unread,
Jun 17, 2014, 12:08:36 AM6/17/14
to django...@googlegroups.com
Hola,

In an inventory system, I've got a PartNumber model and a Buckets
model - each bucket holding one or more instances of a part number.

Each instance can have a separate price and currency - depending on
date and supplier these can fluctuate.

From this data we would like to extract total and average value, per
currency, per PartNumber.

The problem is that even with a relatively small number of parts
(<5000) and only 4 currencies (USD, AUD, EUR, GBP) we are seeing page
rendering slow down as each of those queries is sent off to be worked
out.

Caveat: The desire is for the page to list *all* parts, not a subset.

I've implemented solr/haystack for individual search, but sometimes
they want the overview.

I watched Andrew Godwin's presentation on Rapid Schema Development
http://vimeo.com/83756644 and thought that maybe what I need to do is
implement a small function within PostgreSQL rather than within my
view?

Another solution that had occurred to me was to have a PostgreSQL view
that was updated on Bucket insert to hold the total and average value
data. ( I read this post about django-postgres
http://blog.isotoma.com/2014/05/a-different-view-part-2/ )

Is there a preferred method, or a recommended solution?

cheers
L.



--
The idea is that a beautiful image is frameable. Everything you need
to see is there: It’s everything you want, and it’s very pleasing
because there’s no extra information that you don’t get to see.
Everything’s in a nice package for you. But sublime art is
unframeable: It’s an image or idea that implies that there’s a bigger
image or idea that you can’t see: You’re only getting to look at a
fraction of it, and in that way it’s both beautiful and scary, because
it’s reminding you that there’s more that you don’t have access to.
It’s now sort of left the piece itself and it’s become your own
invention, so it’s personal as well as being scary as well as being
beautiful, which is what I really like about art like that.
-----------------------------------------------------------------------------------------------------------
Adventure Time http://theholenearthecenteroftheworld.com/

Tim Chase

unread,
Jun 17, 2014, 12:22:12 AM6/17/14
to django...@googlegroups.com, dat...@gmail.com
On 2014-06-17 10:08, Lachlan Musicman wrote:
> The problem is that even with a relatively small number of parts
> (<5000) and only 4 currencies (USD, AUD, EUR, GBP) we are seeing
> page rendering slow down as each of those queries is sent off to be
> worked out.

A couple things occur to me:

- have you pulled a list of the queries that are being sent? That
should be a single query issued once. If for some reason you're
issuing the query multiple times for a single page-view, this would
be slow. Or you're pulling back all the data only to aggregate in
Django rather than in SQL (when you could be using Django's
aggregation[1]), you'd see similar slow-down, especially if your DB
server is on a different machine from your web-server

- have you created any indexes on the data that would cover the data
you're looking for?

- is your DB server severely underpowered or poorly tuned?

A mere 5000 rows should come back in almost a blink if properly
indexed.

-tkc


[1] https://docs.djangoproject.com/en/dev/topics/db/aggregation/




Lachlan Musicman

unread,
Jun 17, 2014, 12:42:56 AM6/17/14
to Tim Chase, django...@googlegroups.com
tkc,

You make good points, and I will need to look into better use of
Django first - aggregates and the like.

I have seen the queries sent, and there are plenty.

The problem isn't that we are sending 5000 rows - some parts have
hundreds of Buckets, each Bucket needs to be tested that it has qty>0
(old buckets need be kept for archival purposes), and then the buckets
value (qty for this bucket * unit price for this bucket), is added to
a currency:value dictionary for the four currencies.

The db is hosted on the same server, it's not underpowered, and it is
less poorly tuned so much as untuned.

No, there are no indexes atm, am still building this knowledge slowly.

Will look into aggregates, cheers. I think I avoided them at first
because of the complexity introduced by the multiple currencies -
essentially need to iterate over each part, then each currency. Or
each Part, then each bucket. It's that double iteration.

Glen Jungels

unread,
Jun 17, 2014, 1:17:15 AM6/17/14
to django...@googlegroups.com
You should definitely be able to accomplish what you are needing with a stored function in PostgreSQL.  By sending only a single request and letting the database server do the processing for you, you'll minimize how much work the web server has to do.  That said, writing a stored function/procedure isn't necessarily the most straight forward either.  From what you are describing, I would say it's the way to go though.


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/CAGBeqiNMYq48DpSdVBZRCCC8oUaU%2B%2BGDozfQtNJ7KEynENSztQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Lachlan Musicman

unread,
Jun 17, 2014, 1:25:56 AM6/17/14
to django...@googlegroups.com
Thank Glen, appreciated.

Now reading about indexes I can't believe I've not used them previously.

Does the addition of "db_index=True" to a field require a migration,
or is it acceptable to just add. I guess that it will require a
migration, since it's a DB level change. Or an instruction to the DB
to perform an action that will stay in the DB, rather than within the
app?

cheers
L.
> https://groups.google.com/d/msgid/django-users/CAA_%3DsbSdG9Z%3DgzLe63nFphv9_6hkVx6oG6kaRwkfb0CiEvaB%3Dg%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.



Russell Keith-Magee

unread,
Jun 17, 2014, 3:05:09 AM6/17/14
to Django Users
On Tue, Jun 17, 2014 at 9:25 AM, Lachlan Musicman <dat...@gmail.com> wrote:
Thank Glen, appreciated.

Now reading about indexes I can't believe I've not used them previously.

Indicies are definitely your friend - *especially* on PostgreSQL. :-)
 
Does the addition of "db_index=True" to a field require a migration,
or is it acceptable to just add. I guess that it will require a
migration, since it's a DB level change. Or an instruction to the DB
to perform an action that will stay in the DB, rather than within the
app?

Yes, a migration will be required. Once tables have been created, no change to the model file are reflected in the database unless you perform a migration.

In raw PostgreSQL compatible SQL, you're looking for:

CREATE INDEX "appname_tablename_columnname" ON "appname_tablename" ("columnname");

(substituting the app name, table name and column name as appropriate). This will create a BTree index. There are other index types and options for indicies, all of which have different search properties; however, if this is your first time really using indicies, a BTree is a decent place to start.

Yours,
Russ Magee %-)

Lachlan Musicman

unread,
Jun 17, 2014, 3:17:32 AM6/17/14
to django...@googlegroups.com
Thanks Russ, appreciated.

Out of interest, how do people use plpythonu in postgres when virtualenv'd?

I'm seeing a lot of "set the PYTHONPATH envvar in Postgres" answers is
this the way?

What if I have 3 virtualenvs (dev, stage, prod) on the same machine,
each grabbing a different DB (dev, stage, prod) from the same postgres
server? I guess I set it to the PATH to prod's virtualenv because it's
the most conservative but closest to dev?

L.
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-users...@googlegroups.com.
> To post to this group, send email to django...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/CAJxq848Mgm5z5vJtkez6u8THMknZOk6FO5joB0O-mDC0O3T4CQ%40mail.gmail.com.

Alex Mandel

unread,
Jun 17, 2014, 6:51:04 AM6/17/14
to django...@googlegroups.com
If you stick to stock python commands virtualenv shouldn't be involved
at all and you can just point to the system python.

Thanks,
Alex

Lachlan Musicman

unread,
Jun 17, 2014, 7:59:51 AM6/17/14
to django...@googlegroups.com
I discovered what was going wrong - I was configuring pg9.3 but my db
was in pg9.1 I think that's a hangover from when I updated from 12.04
to 14.04 or something. Anyway, thankful for backups! (redface)
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/539FE53B.1050100%40wildintellect.com.
> For more options, visit https://groups.google.com/d/optout.



Tim Chase

unread,
Jun 17, 2014, 12:52:13 PM6/17/14
to django...@googlegroups.com, rus...@keith-magee.com
On 2014-06-17 11:04, Russell Keith-Magee wrote:
> On Tue, Jun 17, 2014 at 9:25 AM, Lachlan Musicman wrote:
> > Now reading about indexes I can't believe I've not used them
> > previously.
>
> Indicies are definitely your friend - *especially* on
> PostgreSQL. :-)

I've found http://use-the-index-luke.com/ particularly useful in
learning to tune indexes for better performance.

> In raw PostgreSQL compatible SQL, you're looking for:
>
> CREATE INDEX "appname_tablename_columnname" ON "appname_tablename"
> ("columnname");
>
> (substituting the app name, table name and column name as
> appropriate).

As you (the OP) will learn at the above URL, you can index on
multiple columns, so you can have

CREATE INDEX "name_for_this_index" ON "appn ame_tablename" (
"colname1",
"colname2",
---
"colnameN"
)

for combined indexes which, if you regularly query those conditions
(particularly from first-to-last in terms of subsets), you'll see
notable performance improvements.

-tkc



Lachlan Musicman

unread,
Jun 17, 2014, 9:27:30 PM6/17/14
to django...@googlegroups.com
tkc,

Is that the default created when adding db_index=True of is that
something that I would do directly in postgres itself?

cheers
L.
> --
> You received this message because you are subscribed to the Google Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
> To post to this group, send email to django...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-users.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/20140617075049.0556ba33%40bigbox.christie.dr.

Tom Evans

unread,
Jun 17, 2014, 10:15:43 PM6/17/14
to django...@googlegroups.com
On Tue, Jun 17, 2014 at 10:26 PM, Lachlan Musicman <dat...@gmail.com> wrote:
> tkc,
>
> Is that the default created when adding db_index=True of is that
> something that I would do directly in postgres itself?
>
> cheers
> L.

Yep, although obviously that won't work when you want a multi-column
index. You want a multi column index when you want to speed up queries
involving all the columns in that query. You can get multi column
index by adding the appropriate Meta options on that model:

https://docs.djangoproject.com/en/1.6/ref/models/options/#index-together

There is also unique_together that allows UNIQUE indexes to be created.

Bear in mind that whilst indexes speed up specific queries, they also
slow down database modifications, as more indices have to be
manipulated to reflect the changes. Having said that, not having the
right indices can make your app blindingly slow, so it is important to
get the balance right!

Cheers

Tom

Tom Evans

unread,
Jun 17, 2014, 10:18:26 PM6/17/14
to django...@googlegroups.com
On Tue, Jun 17, 2014 at 11:14 PM, Tom Evans <teva...@googlemail.com> wrote:
> You want a multi column index when you want to speed up queries
> involving all the columns in that query.

Or, depending on database features, queries involving all the
columns[0:N] - eg an index on (first_name, last_name, sex) would still
optimise a query on first_name and last_name. See your database for
more details!

Cheers

Tom

Vladimir Chukharev

unread,
Jun 19, 2014, 8:59:15 AM6/19/14
to django...@googlegroups.com, django...@tim.thechases.com
On Tuesday, June 17, 2014 3:42:56 AM UTC+3, Lachlan Musicman wrote:
The db is hosted on the same server, it's not underpowered, and it is 
less poorly tuned so much as untuned.


Untuned is the same as poorly tuned when it comes to PostgreSQL. Or worse.
Pg has defaults from last millennium... See e.g.
for simple memory limits to increase 10...100 times or more. And make sure
that autovacuum works.

Vladimir
Reply all
Reply to author
Forward
0 new messages