Django and PostgreSQL/Slony for load-balancing?

Adam Seering

Nov 3, 2009, 10:19:52 PM
to django...@googlegroups.com
Hi,
Our website usually runs just fine on our server, but every now and
then we get a big load burst (thousands of simultaneous users in an
interactive Web 1.5-ish app), and our database server (PostgreSQL)
gets completely swamped.

We'd like to set up some form of load-balancing. The workload is very
SELECT-heavy, so this seems plausible. It looks like Slony is the
recommended package for doing this. However, if we set up a Slony
cluster and use pgpool to divide up queries among the nodes, the default
isolation level requested by psycopg forces all the queries to go to the
master database, which defeats the purpose of the cluster. If we force
the system to a lower isolation level, all kinds of things start
breaking, because data doesn't appear quickly enough in the slave
databases, and various chunks of Django code (and our code) seem to rely
on writing data and immediately reading it back.

Does anyone else do this type of load-balancing? Any tips? In
general, what (if anything) do folks here do for load-balancing?

Thanks,
Adam

Michael Price

Nov 4, 2009, 7:24:07 PM
to Django users
I've been working on the same project and figured I would chip in.

A compromise to avoid needing synchronous replication would be to
determine which functions in our code need to use "live" or recently
modified data, and ensure that queries pertaining to those function
calls get sent to the master database (where all INSERT and UPDATE
operations are performed). For other functions where a few seconds of
delay doesn't matter, the queries would be directed to a replicated
slave database.
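
To make the idea concrete, here is a rough sketch of how functions could be tagged as needing live data, with everything else free to read from a replica. The names (`needs_live_data`, `choose_database`, "master", "replica1", "replica2") are all made up for illustration; nothing here is an existing Django or pgpool API, and actually sending a query to the chosen database is left out.

```python
import functools
import random
import threading

# Hypothetical database names; substitute whatever your setup uses.
MASTER = "master"
REPLICAS = ["replica1", "replica2"]

# Per-thread state, so concurrent requests do not affect each other.
_state = threading.local()


def needs_live_data(func):
    """Mark a function whose queries must all go to the master."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        depth = getattr(_state, "depth", 0)
        _state.depth = depth + 1  # entering a "live data" section
        try:
            return func(*args, **kwargs)
        finally:
            _state.depth = depth  # restore on exit, even after an error
    return wrapper


def choose_database(is_write=False):
    """Return the name of the database a query should be sent to."""
    if is_write or getattr(_state, "depth", 0) > 0:
        return MASTER
    # A few seconds of replication lag is acceptable here.
    return random.choice(REPLICAS)


@needs_live_data
def show_just_saved_record():
    # Any query issued in here would be routed to the master.
    return choose_database()
```

Calling `show_just_saved_record()` picks the master, while a plain `choose_database()` call outside any tagged function picks one of the replicas. Writes always pick the master via `is_write=True`.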

It isn't clear how to achieve this. We have pgpool2 working in
master/slave mode, but it doesn't have a very fine level of control
over how queries get directed. The only way I could see to do it at the
application level would be to step in and out of the "read committed"
isolation level. According to the Postgres documentation this makes
no difference in behavior because "read committed" is actually the
minimum level of transaction isolation. However, pgpool2 isn't aware
of this; it directs all queries to the master at this isolation level
and uses load balancing when the isolation level is "read
uncommitted."

I was able to direct queries from individual functions to the master
database by wrapping them in a decorator that sets the connection's
isolation level to 1 and then back to 0. However, this seems way too
sketchy for us to be comfortable with it. I wonder if there is a
better way.
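
For reference, the decorator looks roughly like the sketch below. It is a reconstruction, not our exact code. It assumes a psycopg2-style connection, which exposes `set_isolation_level()` and an `isolation_level` attribute. The constants 0 and 1 are the values psycopg2 names `ISOLATION_LEVEL_AUTOCOMMIT` and `ISOLATION_LEVEL_READ_COMMITTED`; they are redefined locally so the sketch runs without psycopg2 installed. `on_master`, `get_connection`, and `FakeConnection` are hypothetical names, and commit/rollback handling is deliberately left out.

```python
import functools

# Same values as psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT and
# ISOLATION_LEVEL_READ_COMMITTED, copied here to keep the sketch standalone.
AUTOCOMMIT = 0
READ_COMMITTED = 1


def on_master(get_connection):
    """Build a decorator that raises the isolation level around a call.

    get_connection is a hypothetical zero-argument callable returning
    the connection the wrapped function will use.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            conn = get_connection()
            previous = conn.isolation_level
            # At this level pgpool2 sends every query to the master.
            conn.set_isolation_level(READ_COMMITTED)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore the earlier level so later queries are
                # eligible for load balancing again.
                conn.set_isolation_level(previous)
        return wrapper
    return decorator


class FakeConnection:
    """Stand-in for a real connection, for demonstration only."""

    def __init__(self):
        self.isolation_level = AUTOCOMMIT
        self.history = []

    def set_isolation_level(self, level):
        self.isolation_level = level
        self.history.append(level)


conn = FakeConnection()


@on_master(lambda: conn)
def read_fresh_data():
    # A real version would run its SELECTs here.
    return conn.isolation_level
```

Restoring the previous level in a `finally` block matters: if the wrapped function raises, the connection would otherwise stay pinned to the master for every later query.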

Michael