Tornado and SQLite


daniels

unread,
Dec 9, 2011, 6:28:50 AM12/9/11
to Tornado Web Server
I would like to use Tornado for a project, and I will need to use
SQLite as well.
What is the best way to connect to the database? Connect, query, and
disconnect on each request? Or a single application-wide connection
that is reused for all requests?
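Either pattern works with Python's built-in sqlite3 module. A minimal sketch of the connect-per-request approach (the file name, table, and function name here are placeholders, not from the thread):

```python
import sqlite3

DB_PATH = "app.db"  # placeholder database file

def get_user_name(user_id):
    # Connect, query, disconnect on each request: SQLite connections
    # are cheap to open, and nothing is shared between requests.
    conn = sqlite3.connect(DB_PATH)
    try:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

The alternative, one application-wide connection, is also workable in Tornado's single-threaded process; per-request connections just avoid long-held locks and stale transactions.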

aliane abdelouahab

unread,
Dec 9, 2011, 11:58:15 AM12/9/11
to python-...@googlegroups.com
Well, I'm new, but here are some suggestions:
Tornado is made for async, non-blocking I/O, which means it's made for
large applications. SQLite is made for small databases and is
single-host only (the DB is written to a single file rather than
following the client-server model).
The database module Tornado includes is made for MySQL, and even that
is not non-blocking for DB writes. You should look at other solutions
like MongoDB or Redis (both NoSQL)....

2011/12/9, daniels <danie...@gmail.com>:

jaceka...@gmail.com

unread,
Dec 9, 2011, 12:08:57 PM12/9/11
to python-...@googlegroups.com
The guys from the Plurk project have a nice solution based on Redis. It might be the thing you are looking for.
Sent from my BlackBerry® smartphone in Play

aliane abdelouahab

unread,
Dec 9, 2011, 12:13:39 PM12/9/11
to python-...@googlegroups.com
As a beginner in NoSQL (and the internet in general), there is a
hurdle when switching from tutorials written for RDBMSs (relational
database management systems) like MySQL or SQLite to NoSQL tutorials.
Redis is a NoSQL solution with a different approach, which is why I
suggest MongoDB: it is friendlier to RDBMS-style thinking, and it has
asyncmongo for use with Tornado:
https://github.com/bitly/asyncmongo

2011/12/9, jaceka...@gmail.com <jaceka...@gmail.com>:

Cliff Wells

unread,
Dec 9, 2011, 12:32:06 PM12/9/11
to python-...@googlegroups.com
On Fri, 2011-12-09 at 17:58 +0100, aliane abdelouahab wrote:
> well, am new, but here are some suggestions:
> tornado is made for asnych non-blocking, this means that it's made for
> large application, SQLite is made for small data bases, and it obeys
> only for monopost (the DB is written in a single file and not as the
> concept client-server)

This is a bit misleading. SQLite is fine for larger applications
(FreeSWITCH, for example, uses it heavily). It's not so fine for
distributed applications or applications with high concurrency levels.

The fact that Tornado is async doesn't mean it's "made for large
applications". It means it scales predictably from small to large. In
fact, Tornado is excellent for small applications, since it has a
relatively small memory footprint.

The fact that Tornado is async is actually something that can make
SQLite more appropriate, since you can be certain that there will only
be a single thread accessing the database at a time (running multiple
instances of Tornado and sharing the same SQLite database would bring
concerns).

> the module that tornado includes is made for MySql, but even this, i
> think it's not made for non-blocking writing on DB, you should look
> for other solutions like MongoDB or Redis (both are NoSql)....

NoSQL databases may or may not be more appropriate for the OP's
application, but this would have more to do with the type of data he is
storing and how it must be retrieved than with whether or not the
database is "non-blocking". NoSQL is *not* a replacement or alternative
for an RDBMS, any more than a filesystem is. You can sometimes use them
interchangeably (usually with some ugliness), but they are intended for
different purposes.

Cliff

Srini Kommoori

unread,
Dec 9, 2011, 2:18:51 PM12/9/11
to python-...@googlegroups.com
Good points. One more option to consider is leveldb. It is also as simple as sqlite and can work as a persistent cache.

For one of my sites, http://safebox.fabulasolutions.com/ I am using leveldb for the payment side. 

Some leveldb pointers

-Srini
full disclosure: leveldb-server is my github repo

daniels

unread,
Dec 9, 2011, 3:18:24 PM12/9/11
to Tornado Web Server
What I want to do is a web-based admin interface for a device that
runs a stripped-down version of Linux, so resources are limited;
that's why I thought of going with Tornado (as both app and HTTP
server) and a SQLite db.
This admin interface will be multiuser, but as with any admin
interface there shouldn't be many simultaneous users, which is why I
thought SQLite would be OK. After looking more into Tornado, though,
I see that I can't use SQLite asynchronously, so while I run a query
everything will be blocked, meaning no other user will be able to
access the site and no other (Ajax) requests can be served until the
query is done.
Am I understanding correctly how Tornado works?
Also, what would you suggest I use for such an application?
Can I adapt Tornado in any way to work for what I need?

Thanks.
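One way to adapt Tornado here is to push the blocking query onto a worker thread so the IOLoop keeps serving other requests. A sketch under that assumption, using only the stdlib (the helper names are mine, not a Tornado API; in a real handler you would bounce the callback back to the main thread with IOLoop.add_callback):

```python
import concurrent.futures
import sqlite3

# A single worker thread serializes all SQLite access, so the
# IOLoop (main thread) is never stalled by a slow query.
executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)

def blocking_query(db_path, sql, params=()):
    # Runs on the worker thread: connect, query, disconnect.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()

def query_async(db_path, sql, params, callback):
    # Schedule the query on the worker thread and invoke the
    # callback with the row list when it completes.
    future = executor.submit(blocking_query, db_path, sql, params)
    future.add_done_callback(lambda f: callback(f.result()))
```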

aliane abdelouahab

unread,
Dec 9, 2011, 3:18:38 PM12/9/11
to python-...@googlegroups.com
@Cliff Wells: but the problem with SQLite is that it blocks the
process, which goes against the main concept of Tornado. For example,
imagine a user who wants to register: he triggers a write, and then
the process waits until his DB operation finishes, no?

2011/12/9, Srini Kommoori <vas...@gmail.com>:

Didip Kerabat

unread,
Dec 9, 2011, 4:13:25 PM12/9/11
to python-...@googlegroups.com
Yeah, the problem with SQLite is its EXCLUSIVE lock. When that lock is triggered, all reads and writes are blocked.

And as you can imagine, a singleton ioloop blocked by SQLite is not a good thing.

The easiest cop-out way is to run multiple tornado instances behind a proxy (nginx/haproxy).

Hope that helps.

- Didip -

Didip Kerabat

unread,
Dec 9, 2011, 4:16:48 PM12/9/11
to python-...@googlegroups.com
Oh wait, then all of Tornado instances are still blocked by SQLite.

Erased everything I just said.

- Didip -

David Birdsong

unread,
Dec 9, 2011, 5:00:57 PM12/9/11
to python-...@googlegroups.com
On Fri, Dec 9, 2011 at 1:16 PM, Didip Kerabat <did...@gmail.com> wrote:
> Oh wait, then all of Tornado instances are still blocked by SQLite.
>
> Erased everything I just said.
>
> - Didip -

If you could reduce the listen queue down to 1, then nginx or haproxy
in front of multiple Tornados should load balance evenly. The key is
for a "full" Tornado to fail fast and send a TCP RST to the proxy,
which could then move the request to another Tornado backend.

Srini Kommoori

unread,
Dec 9, 2011, 6:24:58 PM12/9/11
to python-...@googlegroups.com
Daniels, 

if you are talking about an embedded device, you can use mongoose http://code.google.com/p/mongoose/ with Python bindings via CGI.

On the blocking question: SQLite reads are not blocked; multiple threads/processes can read the SQLite db, but writes are blocking. More info here: http://www.sqlite.org/faq.html#q5
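That FAQ point can be demonstrated with the stdlib sqlite3 module: while one connection holds the write lock, a second writer is refused, but readers still get through (the file name here is arbitrary):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None disables the module's implicit transactions,
# so the explicit BEGIN IMMEDIATE below takes effect directly.
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
other = sqlite3.connect(path, timeout=0, isolation_level=None)

writer.execute("CREATE TABLE t (x)")
writer.execute("BEGIN IMMEDIATE")           # take the write lock
writer.execute("INSERT INTO t VALUES (1)")

try:
    other.execute("BEGIN IMMEDIATE")        # second writer is refused
    locked = False
except sqlite3.OperationalError:            # "database is locked"
    locked = True

# Reads are still allowed while the write transaction is open,
# and they do not see the uncommitted row.
visible_rows = other.execute("SELECT count(*) FROM t").fetchone()[0]
writer.execute("COMMIT")
```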

If you are really concerned about the write locks, use py-leveldb; it's about as simple as using SQLite, and LevelDB doesn't lock even during writes (it's eventually consistent).

A web connection being async and a db connection being async are two different topics. Tornado is very good at the former; whether the db side is async depends on the particular db and its client libraries.

-Srini

aliane abdelouahab

unread,
Dec 10, 2011, 11:22:14 AM12/10/11
to python-...@googlegroups.com
@Srini Kommoori: but if the DB write isn't confirmed, the rest of the
code can't run, can it? If the registration fails, it can't show the
welcome page! :D

2011/12/10, Srini Kommoori <vas...@gmail.com>:

Srini

unread,
Dec 10, 2011, 11:38:40 AM12/10/11
to python-...@googlegroups.com
Alabdelouahab,

Sorry, I didn't get what you are talking about. Are you implying that during writes the db is locked and can't do reads?

That's an architectural decision that needs to be made based on the number of concurrent users, db latency, and any other constraints the project may have.

-Srini

aliane abdelouahab

unread,
Dec 10, 2011, 11:48:32 AM12/10/11
to python-...@googlegroups.com

I think (I don't really know, I'm new here) that, logically, non-blocking is needed so heavy writes can happen while other things get done too, without waiting in some FIFO queue, even other writes, no?
2011/12/10 Srini <vas...@gmail.com>

Cliff Wells

unread,
Dec 10, 2011, 12:14:35 PM12/10/11
to python-...@googlegroups.com
On Fri, 2011-12-09 at 21:18 +0100, aliane abdelouahab wrote:
> @Cliff Wells but the problem of the sqlite that it will block the
> process which is not the main concept of tornado, for example imagine
> that you've a user that wants to register, so he'll make a write, then
> the process will wait him till he finishes the db process no?

It isn't really necessary to be completely non-blocking. You just need
to be certain you don't block over some threshold. In other words, if
you have a MySQL query that takes 2s, then you should probably find a
way to call it in a non-blocking fashion. But if the query only
takes 0.01s, then the overhead involved in making it non-blocking
probably isn't worth it.

As I understand it, FriendFeed (which Tornado was written for) does
exactly this. Most of their MySQL queries are blocking and they just
make sure those queries return in a reasonably short amount of time.


Cliff


aliane abdelouahab

unread,
Dec 10, 2011, 3:02:51 PM12/10/11
to python-...@googlegroups.com
So the operations are just simple ones, and I guess they won't use
JOINs, since a JOIN would take more than 0.01s? And I think they'd use
memcached; async + an RDBMS is (in my opinion) only a solution when
the data is cached in RAM!

2011/12/10, Cliff Wells <cl...@develix.com>:

aliane abdelouahab

unread,
Dec 11, 2011, 9:59:27 AM12/11/11
to Tornado Web Server
It seems that it's better to use blocking for DB access?!
https://github.com/facebook/tornado/wiki/Threading-and-concurrency
What's that?!

Cliff Wells

unread,
Dec 12, 2011, 12:31:11 AM12/12/11
to python-...@googlegroups.com

This is what I've been attempting to explain to you. It is fine to
block so long as you are blocking for a very short time. If you read
and understand that entire page, what you will come away with is that
there is no "blocking is better" or "async is better". Rather, you
will need to assess your own application and decide what is "good
enough" for your particular circumstances (and in fact, for each
database query).

Cliff


aliane abdelouahab

unread,
Dec 12, 2011, 2:30:59 AM12/12/11
to Tornado Web Server
So, as I understand it, I'll use blocking only if the query takes very
little time, for example some basic queries or reads from memcached,
and if it will take much longer, like a JOIN, make it async. From
there, another small bug in my mind: since NoSQL is faster than an
RDBMS and doesn't use JOINs that take several seconds, why use
non-blocking at all? And an async library (for example asyncmongo)
uses the native driver underneath, meaning the code is interpreted
twice, and that takes time, no? So why not use pymongo directly (in
the case of MongoDB, because pymongo is the driver and asyncmongo is
built on pymongo)?

Cliff Wells

unread,
Dec 12, 2011, 4:47:23 PM12/12/11
to python-...@googlegroups.com
On Sun, 2011-12-11 at 23:30 -0800, aliane abdelouahab wrote:
> so, as i understood, i'll use the blocking only if it will get few
> seconds, for example some basic queries, or from memcached, if it will
> take much time like JOIN then, make it async, and from that, i've
> another small bug in my mind, since NoSql is faster than RDBMS,

I don't think there's been any empirical claim that NoSQL is faster than
an RDBMS except perhaps for very narrowly defined workloads. It's a
different data model that lends itself to particular types of
applications. Many queries cannot even be directly mapped between the
two systems. Comparing NoSQL to an RDBMS is comparing apples and apple
pie.

> so why
> the use of non-blocking since it dont use JOIN to take several
> seconds, and the use of another Async library (for example AsyncMongo)
> will use the native driver, mean there will be twice and
> interpretation of the code, and this will take time no? so why not
> using directly Pymongo (in the case of mongodb, because Pymongo is the
> driver, and asyncmongo is build on pymongo)

A NoSQL query might certainly take several seconds depending on many
variables such as size of the record, load on the server, latency of the
network, etc. You are attempting to make generalizations about systems
that defy generalization. Things such as benchmarks can help set
some baseline expectations, but at the end of the day, the performance
of your application will rest entirely upon things that are unique to
your application.

Cliff


Joe Bowman

unread,
Dec 12, 2011, 9:34:37 PM12/12/11
to python-...@googlegroups.com
As someone who wrote a sessions library using asyncmongo, I honestly think the time and effort was probably not worth the gain; if I were to do it again, I would use pymongo.

aliane abdelouahab

unread,
Dec 13, 2011, 3:02:17 AM12/13/11
to Tornado Web Server
@Cliff: but the benchmarks say NoSQL is faster, and from what I've
found on the internet, to use an RDBMS efficiently it has to be
combined with several other tools!
What's nice about NoSQL (MongoDB, for example) is that I can use
Python to "fill in" what MongoDB doesn't provide. For example,
intersection isn't available, but it can easily be done in Python:
1- pymongo returns documents as dictionaries
2- intersection is available on sets
3- just do set(someone.values()).intersection(otherone.values())
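Those three steps in runnable form, with two hypothetical documents standing in for real pymongo results:

```python
# Stand-ins for documents returned by pymongo (plain dicts)
someone = {"lang": "python", "db": "mongodb", "cache": "redis"}
otherone = {"language": "python", "store": "mongodb", "rdbms": "mysql"}

# Intersection of the two documents' values via Python sets
common = set(someone.values()).intersection(otherone.values())
```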

@Joe: so pymongo is faster?

Joe Bowman

unread,
Dec 13, 2011, 9:52:05 AM12/13/11
to python-...@googlegroups.com
No, pymongo is easier to develop with. 

Bit of history, and Ben or someone who worked for Friendfeed can correct anything I might state wrong.

Tornado is a release of a majority of the code that supported Friendfeed before the company was purchased by Facebook.

Friendfeed used the Tornado codebase along with MySQL to support the application. MySQL requests were synchronous. They came to the conclusion that if the DB request wasn't fast enough then you should speed up the DB. 

I, like other people, still pursued trying to get asynchronous connections to different storage mediums. However, especially after developing something with asyncmongo, it's really just creating a lot of work that doesn't make sense.

Look at it from a business sense. The product is there to support customers, right? Customers expect results quickly; it's the internet. So if your DB can't return a result extremely fast, you need to speed it up in order to support that customer anyway. So your Tornado application shouldn't be hampered by synchronous db requests, because your db should be returning those results fast anyway.

Mind you, this isn't a one size fits all solution. It could be you're building an app where it's ok for the customer to wait for a result. In that case you may want to look at asynchronous db connections in order to have each Tornado instance support more connections. However, if you're building a standard web application this is likely not the case.

Also, you want to consider the core way Tornado is built. Overall it fits the current hardware model of lots of cores on a single CPU and memory overall being cheap. So you scale Tornado by running more instances behind a load balancer; if you have db requests piling up, run more instances of Tornado on different cores.

Using pymongo instead of asyncmongo, for example, can lead to simpler to understand code. You end up with

make request
process request

instead of

request method
  make request, when it's done run this callback

callback method
  parse response


I came to this conclusion while writing the asyncmongo sessions library. I finished it only because I was stubborn. Was really tempted to move it to pymongo when I was over half done, but figured that would take more total time in the end. I'm pretty well stuck with asyncmongo for the prototype I'm writing now as a result. No biggie, I'm used to it at this point. But if I was just starting out, I'd go pymongo all the way. 

Another advantage is asyncmongo depends on pymongo. What if pymongo changes in a way that breaks asyncmongo? It's another item in a chain of dependencies. One thing I really like about Tornado is there are not a lot of dependencies. You need python and Tornado. That's it. 

aliane abdelouahab

unread,
Dec 13, 2011, 10:19:44 AM12/13/11
to Tornado Web Server
I'm sorry I can't write back as much information as you've given me;
you've given me everything I'll need for my thesis, and I'll talk
about MongoDB in a "professional" way! asyncmongo is excellent, since
there are people who liked it and still like it. By programming with
asyncmongo you give people another reason to switch to MongoDB; from
all the articles I've seen about asyncmongo, they say the Cassandra
people wish they had the same thing ;) I hope you won't stop
developing it! Sadly I'm new to languages like Python; if I can
contribute, it would be an honor :)
Thank you again for such valuable information ;)

Joe Bowman

unread,
Dec 13, 2011, 10:23:53 AM12/13/11
to python-...@googlegroups.com
I didn't develop asyncmongo. bit.ly did. I just used it :)

aliane abdelouahab

unread,
Dec 13, 2011, 10:42:28 AM12/13/11
to Tornado Web Server
But you contribute to it :D

Joe Bowman

unread,
Dec 13, 2011, 11:31:43 AM12/13/11
to python-...@googlegroups.com
Nope, my project is this:


It uses asyncmongo to power http sessions for tornado applications.

aliane abdelouahab

unread,
Dec 13, 2011, 12:51:11 PM12/13/11
to python-...@googlegroups.com
And I think it's the same kind of thing: something new for Tornado + MongoDB :D

2011/12/13 Joe Bowman <bowman...@gmail.com>

Chase Lee

unread,
Dec 13, 2011, 10:45:37 PM12/13/11
to python-...@googlegroups.com
I know it wouldn't have worked for the sessions library, but I find myself just using the gen module all the time now, which has addressed at least part of your case against asyncmongo.

aliane abdelouahab

unread,
Dec 14, 2011, 2:47:55 AM12/14/11
to python-...@googlegroups.com
I'm not against asyncmongo; I just want to understand why the library is there :)

2011/12/14 Chase Lee <umcha...@gmail.com>

Cliff Wells

unread,
Dec 14, 2011, 2:21:49 PM12/14/11
to python-...@googlegroups.com
On Wed, 2011-12-14 at 08:47 +0100, aliane abdelouahab wrote:
> am not against asyncmongo, but i just want to understand why the
> plugin is there :)

Because for some applications (not all), it may be needed. Also, some
people simply prefer to keep their async applications "pure", even if it
provides no tangible performance benefit.

Cliff

Chase Lee

unread,
Dec 14, 2011, 10:44:49 PM12/14/11
to python-...@googlegroups.com
Sorry, I meant that to be directed to Joe.

aliane abdelouahab

unread,
Dec 15, 2011, 7:54:55 AM12/15/11
to Tornado Web Server
no problem, it is also helpful :p

Joe Bowman

unread,
Dec 15, 2011, 12:50:20 PM12/15/11
to python-...@googlegroups.com
I'm not "against asyncmongo"; I just have the opinion that for most use cases pymongo is simpler to use and write code with, which should lead to lower development time. MongoDB for the most part should be more than fast enough that blocking for a couple of CPU cycles is a non-issue, especially with how easy it is to spin up more Tornado instances.

If the expectation is that MongoDB will be slower returning query results, then asyncmongo is the way to go. As it is, the app I'm working on uses asyncmongo for everything, because I had already put so much time into it. For future projects I'd go with pymongo first unless there's a specific reason to need asyncmongo.

aliane abdelouahab

unread,
Dec 15, 2011, 3:20:22 PM12/15/11
to python-...@googlegroups.com

NoSQL is there to make things fast, and I think when it was invented nobody had async requests in mind :D So I agree with you: MongoDB is fast, so why use async?
2011/12/15 Joe Bowman <bowman...@gmail.com>

Joe Bowman

unread,
Dec 16, 2011, 9:03:59 AM12/16/11
to python-...@googlegroups.com
Once again, it depends on your data model. MongoDB can be fast, but if you start to move away from the NoSQL philosophy and need more complex MongoDB operations, say a map-reduce over an extremely large data set, then that query could turn out not to be so fast. In that case asyncmongo would let the Tornado server avoid blocking while the query result is being built.

aliane abdelouahab

unread,
Dec 16, 2011, 12:10:58 PM12/16/11
to Tornado Web Server
Ah! Now I understand when I'd use each of them:
use pymongo for simple queries
use asyncmongo for map-reduce and complex queries
Thank you :D

Phil Whelan

unread,
Dec 17, 2011, 4:54:06 PM12/17/11
to python-...@googlegroups.com
> use pymongo for simple queries
> use asyncmongo for map-reduce and complex queries

Sorry to come into this post-conclusion, but with my usage of asyncmongo I usually fire off multiple queries in parallel. For instance, I calculate all the components needed for the webpage / ajax call, then fire off an asyncmongo call for each one in parallel and wait for all the results. I have no benchmarks to confirm this is quicker than doing it serially, but I'm assuming mongodb can parallelize these requests at its end. At an extreme, 20 "quick" queries done serially will add up to significant time. I guess this could come under the "complex mongodb operations" mentioned above.
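A sketch of that fan-out-and-wait pattern, using a thread pool as a stand-in for parallel asyncmongo calls (the component names and fetch function are invented for illustration):

```python
import concurrent.futures

def fetch_component(name):
    # Placeholder for one MongoDB query per page component
    return {"component": name, "data": name.upper()}

def fetch_page(components):
    # Fire one query per component in parallel, then wait for all
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {c: pool.submit(fetch_component, c) for c in components}
        return {c: f.result() for c, f in futures.items()}
```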

Also, I think the complexity of the query is not the only factor here. In MongoDB, if your data is 100% resident in memory (see FourSquare) then you will get lightning fast responses. If only your index data is in memory, and the disk needs to be read to retrieve the documents, you'll get 1 or 2 orders of magnitude slower response time (still fast). If even your index data does not completely fit into memory then responses will be noticeably slower, even for simple queries. 

aliane abdelouahab

unread,
Dec 18, 2011, 11:23:06 AM12/18/11
to python-...@googlegroups.com
But from what I've understood about MongoDB, it doesn't verify that an
operation succeeded once it's finished; that's a NoSQL trait. So when
serializing 20 operations, don't they go through without that check,
gaining speed? But that brings back the problem of losing data if, for
example, a transaction wasn't completed on the other side....
About memory: if the requests will be served from memory anyway, why
not use an RDBMS solution from the beginning, since the memory access
will be the same and the calculation is done before going to RAM?

2011/12/17, Phil Whelan <phi...@gmail.com>:
