pycassa pool size - how to determine?


Cato Yeung

Oct 4, 2012, 11:44:28 PM
to pycassa...@googlegroups.com
Dear experts,
I know that pycassa can specify pool size like this:
pool = pycassa.ConnectionPool('Keyspace1', pool_size=20)

What value should I use for pool_size if I want to maximize stability? Speed is a secondary concern (but still important).
If it depends, what factors does it depend on?

Regards,
Cato

Tyler Hobbs

Oct 8, 2012, 4:22:09 PM
to pycassa...@googlegroups.com
The best size for your pool depends on two things:

a) How many threads will use the connection pool concurrently
b) How many nodes you want to spread your operations across

For consideration (a), your pool should be at least as large as the number of threads that will use it concurrently. If the number of threads is not fixed but may go up and down, make the pool large enough to handle all threads under normal circumstances, and rely on max_overflow to absorb bursts of larger numbers of threads.
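As a rough illustration of (a), here is a minimal sketch that turns a typical and a peak thread count into pool_size and max_overflow values. The helper name and the example numbers (20 threads normally, 30 in bursts) are made up for illustration, and the commented-out server list is hypothetical.

```python
def pool_kwargs_for_threads(typical_threads, peak_threads):
    """Map expected thread counts onto ConnectionPool keyword arguments.

    pool_size covers the normal number of concurrent threads;
    max_overflow covers the extra threads that show up during bursts.
    """
    if typical_threads < 1:
        raise ValueError("typical_threads must be at least 1")
    if peak_threads < typical_threads:
        raise ValueError("peak_threads must be >= typical_threads")
    return {
        "pool_size": typical_threads,
        "max_overflow": peak_threads - typical_threads,
    }


# Hypothetical workload: 20 threads normally, up to 30 during bursts.
kwargs = pool_kwargs_for_threads(typical_threads=20, peak_threads=30)
print(kwargs)

# With pycassa installed, the values could then be passed along like this
# (the server addresses below are placeholders):
# import pycassa
# pool = pycassa.ConnectionPool('Keyspace1',
#                               server_list=['node1:9160', 'node2:9160'],
#                               **kwargs)
```

Whether bursts are better absorbed by max_overflow or by a larger fixed pool probably depends on how often they happen, so treat the split as a starting point.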

It's trickier to handle (b). If your volume of operations is low (< 100/second), a small number of connections, like 3 to 5, is fine. If the volume is higher or your writes are large, you may want to use a larger pool to spread the operations out over more nodes. For this purpose, having more connections than the number of nodes in your cluster doesn't help anything, so the node count is the upper bound.
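To make the two considerations concrete, here is a small sketch that combines them into one number. The function names are hypothetical, the choice of 5 for the low-volume case is just one pick from the "3 to 5" range, and taking the larger of the two results is an assumption about how to reconcile (a) and (b), not something pycassa does for you.

```python
LOW_VOLUME_OPS_PER_SECOND = 100  # threshold mentioned for consideration (b)
SMALL_POOL = 5                   # assumed pick from the "3 to 5" range


def connections_for_spread(ops_per_second, num_nodes, large_writes=False):
    """Connections worth having purely to spread load across nodes.

    Never exceeds num_nodes, since extra connections beyond the node
    count do not spread operations any further.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if ops_per_second < LOW_VOLUME_OPS_PER_SECOND and not large_writes:
        return min(SMALL_POOL, num_nodes)
    return num_nodes


def suggest_pool_size(concurrent_threads, ops_per_second, num_nodes,
                      large_writes=False):
    """Take the larger of the thread requirement (a) and the spread target (b)."""
    spread = connections_for_spread(ops_per_second, num_nodes, large_writes)
    return max(concurrent_threads, spread)


# Hypothetical cluster of 12 nodes:
print(suggest_pool_size(concurrent_threads=4, ops_per_second=50, num_nodes=12))
print(suggest_pool_size(concurrent_threads=4, ops_per_second=500, num_nodes=12))
print(suggest_pool_size(concurrent_threads=20, ops_per_second=500, num_nodes=12))
```

The result would be passed as pool_size when constructing the pool. If the thread count is higher than the node count, the thread count wins here, because a pool smaller than the number of concurrent threads leaves threads waiting for a connection.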

Hope that helps,
- Tyler
--
Tyler Hobbs
DataStax
