I'm doing something similar with MongoDB for the realtime streaming search for
unscatter.com. It works basically like this:
1. The client makes a search for {query}.
2. The RequestHandler queries MongoDB to see if there's a topic for that query.
   - If there is no topic, it:
     - creates a topic, which is basically the query plus a last access time;
     - returns the base page that deploys the long-polling JavaScript.
   - If there is a topic, it:
     - uses the topic's ObjectId to query a queue collection for the most recent messages;
     - deploys the messages and the JavaScript to start long polling for more messages.
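A minimal sketch of that request flow, not the actual unscatter.com code. Plain Python dicts and lists stand in for the MongoDB topics and queue collections, an integer counter stands in for ObjectId, and the names `handle_search`, `last_access` and `limit` are made up for illustration. Refreshing the access time on an existing topic is also an assumption.

```python
import time
from itertools import count

# In-memory stand-ins for the MongoDB "topics" and "queue" collections (assumption).
topics = {}       # query string -> topic document
queue = []        # message documents, oldest first
_ids = count(1)   # stand-in for ObjectId generation


def handle_search(query, now=None, limit=20):
    """Return (topic, recent_messages) for a search request."""
    now = time.time() if now is None else now
    topic = topics.get(query)
    if topic is None:
        # No topic yet: create one (query + last access time); there are no messages to show.
        topic = {"_id": next(_ids), "query": query, "last_access": now}
        topics[query] = topic
        return topic, []
    # Existing topic: refresh its access time (assumption) and fetch its newest messages.
    topic["last_access"] = now
    recent = [m for m in queue if m["topic_id"] == topic["_id"]][-limit:]
    return topic, recent
```

In both branches the page returned to the browser would also carry the long-polling JavaScript; only the initial message list differs.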
I have a separate process, also written in Tornado, that polls looking for active topics.
- Active topics are determined by last access time.
- It checks the Twitter/Facebook nexturl fields in the topic to poll those sites for new messages.
- If there are new messages, it processes them into the queue and updates the topic's nexturl fields.
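One pass of that background poller could look roughly like the sketch below. The 300-second activity window, the single `next_url` field, and the injected `fetch` callable (which would wrap the real Twitter/Facebook request) are all assumptions for illustration; lists again stand in for the MongoDB collections.

```python
import time

ACTIVE_WINDOW = 300  # seconds; hypothetical cutoff for what counts as "active"


def active_topics(topics, now, window=ACTIVE_WINDOW):
    """Topics whose last access time falls inside the activity window."""
    return [t for t in topics if now - t["last_access"] <= window]


def poll_once(topics, queue, fetch, now=None):
    """Run one polling pass and return the number of messages queued.

    fetch(next_url) -> (messages, new_next_url) is a hypothetical callable
    standing in for the request to the remote site.
    """
    now = time.time() if now is None else now
    added = 0
    for topic in active_topics(topics, now):
        messages, new_next_url = fetch(topic["next_url"])
        for text in messages:
            queue.append({"topic_id": topic["_id"], "text": text})
            added += 1
        if messages:
            # Only advance the cursor when something new actually came back.
            topic["next_url"] = new_next_url
    return added
```

Passing `fetch` in as a parameter keeps the loop testable without touching the network.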
The long poll is basically a GET request for new messages, with a parameter that is the ObjectId of the last message the client received (if any). This kicks off polling in the RequestHandler, which polls MongoDB for new messages and returns them as soon as it gets them; the client then starts the next polling session.
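The server side of the long poll can be sketched as a loop that keeps checking for messages newer than the last one the client saw. This version uses plain asyncio rather than Tornado's handler API, a list instead of MongoDB, and integer ids in place of ObjectIds; the `interval` and `timeout` parameters are assumptions, since a real handler needs some way to give up on an idle connection.

```python
import asyncio


async def long_poll(queue, topic_id, last_id=None, interval=0.05, timeout=1.0):
    """Wait for messages newer than last_id on topic_id, or time out with []."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        new = [
            m for m in queue
            if m["topic_id"] == topic_id
            and (last_id is None or m["_id"] > last_id)
        ]
        if new or loop.time() >= deadline:
            return new
        # Yield to the event loop so other requests are served while waiting.
        await asyncio.sleep(interval)
```

The client would pass the id of the newest message it received as `last_id` on its next request, so each poll picks up where the previous one stopped.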
I used MongoDB because of its speed and because it provides built-in garbage collection: the topics and queue collections are capped, so they'll never exceed a certain size. Eventually I'm going to plug in
the asyncmongo library that bit.ly released; if I ever start getting real amounts of traffic, that should help as well.
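As a rough in-memory analogy for the "garbage collection" that capped collections give you, a bounded deque drops its oldest entries once it is full. This is an analogy only: a real capped collection is bounded by size in bytes (optionally also by document count), while the deque here is bounded by item count.

```python
from collections import deque

# Analogy only: keeps at most 3 items, evicting the oldest on overflow.
capped_queue = deque(maxlen=3)

for i in range(5):
    capped_queue.append({"_id": i})

# The two oldest messages (ids 0 and 1) have been discarded automatically.
remaining_ids = [m["_id"] for m in capped_queue]
```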
There's no navigation to the realtime filter yet; for now you have to add f=realtime to the search URL manually.