Using simpledb with boto

Mark

May 3, 2012, 11:35:58 AM

to cherrypy-users

Anyone using Amazon's SimpleDB with boto in CherryPy?

I have to connect then get a domain:

import boto

sdb = boto.connect_sdb(...)
dom = sdb.get_domain('test_users')

Then in my page handlers I can look for a user 'shooter' with:

user2 = dom.get_item('shooter')
if user2 is None:
    print("no user shooter")

But I'm not sure where to make the connect and get_domain calls and
where to save the domain object. Can I just do:

root.py:

import boto
import cherrypy

sdb = boto.connect_sdb(...)
dom = sdb.get_domain('test_users')

class Root:
    @cherrypy.expose
    def page(self, username):
        user = dom.get_item(username)
        if user is not None:
            ...

Alan Pound

May 3, 2012, 11:46:35 AM

to cherryp...@googlegroups.com

What I do is, while starting up CP, I instantiate a class that builds a pool of SDB connection objects, and I pass a reference to that class to my CP classes.

When they want to access SDB (or, in a similar manner, S3), they obtain a connection object from that shared pool and return it after use. If the pool is empty, the class just creates a new connection, and that one goes into the pool when it is returned.

In fact, the class that maintains the pool also supplies a bunch of helper functions that use those connection objects.
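
A minimal sketch of the pool described above, assuming Python 3. The class name `ConnectionPool` and the zero-argument `factory` callable are hypothetical; `factory` stands in for whatever creates one connection (for example, a function wrapping `boto.connect_sdb(...)` and `get_domain`). This is an illustration of the idea, not the code from this post.

```python
import contextlib
import queue


class ConnectionPool:
    """Hands out idle connections and creates a new one when none are idle."""

    def __init__(self, factory, initial_size=2):
        # factory: hypothetical zero-argument callable returning a new connection.
        self._factory = factory
        # queue.Queue is thread-safe, so worker threads can share one pool.
        self._idle = queue.Queue()
        for _ in range(initial_size):
            self._idle.put(factory())

    def acquire(self):
        """Return an idle connection, or build a fresh one if the pool is empty."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return self._factory()

    def release(self, conn):
        """Put a connection back so a later acquire() can reuse it."""
        self._idle.put(conn)

    @contextlib.contextmanager
    def connection(self):
        """Borrow a connection and return it even if the caller raises."""
        conn = self.acquire()
        try:
            yield conn
        finally:
            self.release(conn)
```

A page handler would then use `with pool.connection() as conn:` so the connection goes back to the pool even when the handler raises.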

Hope this helps.

Alan



Mark

May 3, 2012, 12:27:21 PM

to cherrypy-users

I'm asking on boto-users about connection pooling (thanks), but I'm
still asking here because I'm not sure how threading works or
how to set up a class as you mentioned.

So say I have a class with get_conn and release_conn functions.

sdb = boto.connect_sdb(...)

creates a connection. Once release_conn is called, can I reuse it
by returning the same connection from the next get_conn call?

CherryPy is threaded, right? I don't understand this. So my connection
pooling class is created once (not per thread?) or uses classmethods?
and cherrypy kicks off a thread which will eventually call my page
handling function:

@cherrypy.expose
def page(self, username):
    conn = MyPool.get_conn()
    ...
    MyPool.release_conn(conn)

Now each thread will reuse a connection that has been previously
created and released...

Mark


Alan Pound

May 3, 2012, 12:47:01 PM

to cherryp...@googlegroups.com

The first *really important* thing to understand about boto is that under the hood it uses httplib, which isn't thread-safe. So you have to do something like:

sdb_con = boto.connect_sdb(credentials[0], credentials[1], region=sdbregion)
sdb_ptr = sdb_con.get_domain(storename)

for each connection object you want to use - you cannot get away with using just a single sdb_con...

>>> So my connection pooling class is created once (not per thread?) or uses classmethods?

Yes, it is neat to use a class to encapsulate all of this, but I'm sure there are other ways of doing it.

>>> and cherrypy kicks off a thread which will eventually call my page handling function

Yes, that's about it - pretty much as you describe.

I really believe it is worth making a very tidy, formal arrangement for accessing AWS via boto, as boto really doesn't do much by way of retries, and you will want to catch all sorts of low-level exceptions, log them and deal with them. For example, SDB has some real issues if you try to do a lot of writes in a short time. You pretty much *must* use batch writes (batch_put_attributes in boto), but there are limits, and SDB responds to exceeding them by failing the request. You will have to work out which exceptions can be retried, and use a back-off and retry algorithm to manage them. All the same, there are real limits to the performance you can get out of SDB (it isn't boto's fault).
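
A rough sketch of the back-off and retry idea, assuming Python 3. `RetryableError` is a hypothetical placeholder for whichever exceptions you decide are safe to retry (that still has to be worked out against real SDB failures), and `call_with_backoff` is an illustrative name, not a boto function.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for the exceptions judged safe to retry."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying with exponential back-off plus jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts - 1:
                # Out of attempts: let the caller log and handle the failure.
                raise
            # The base wait doubles on every retry; the random jitter keeps
            # many clients from retrying in lock-step.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

The batch write itself would be wrapped in a small function and passed in as `operation`; the `sleep` parameter exists only so the waiting can be replaced in tests.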

If your usage is light though, SDB is extremely flexible.

If your usage is likely to be heavy in terms of writes or deletes (say, more than about 50 writes/sec), then you should forget SDB and go to DynamoDB (that is what we are up to right now...)

Hope this helps


Scott Chapman

May 3, 2012, 7:05:09 PM

to cherryp...@googlegroups.com

When using non-thread-safe databases I make each thread keep its own connection to the database.
It's simpler than pooling, so long as you don't have so many threads open that the number of connections becomes a problem.

Set these up in your startup script:

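import cherrypy
import MySQLdb

def db_connect(thread_index):
    # Runs once in each worker thread as it starts; stores that thread's own connection.
    cherrypy.thread_data.db = MySQLdb.connect(foo)

cherrypy.engine.subscribe('start_thread', db_connect)
cherrypy.quickstart(root, '/', cp_config)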

Then you grab the connection any time you need it like this:

connection = cherrypy.thread_data.db
cursor = connection.cursor()
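
The same one-connection-per-thread idea can be sketched with only the standard library, assuming Python 3. Here `threading.local()` plays the role that `cherrypy.thread_data` plays above, and `get_thread_connection` and `factory` are hypothetical names; `factory` stands in for whatever opens one connection (for example, a function wrapping `boto.connect_sdb(...)`).

```python
import threading

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_thread_connection(factory):
    """Return this thread's connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        # factory: hypothetical zero-argument callable that opens a connection.
        conn = factory()
        _local.conn = conn
    return conn
```

Unlike the `start_thread` hook, this creates the connection lazily, the first time a given thread asks for it.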

Scott
