Scaling and failover with Tornado

wataka

Dec 10, 2011, 9:43:46 AM
to Tornado Web Server
I am not well versed in this as you can see :)

At the moment, I start several instances of my app on different ports
behind Cherokee. Cherokee acts as a reverse proxy to the tornado
backends and when one of them dies, Cherokee tries to restart it if
possible.

I read somewhere that one can use the multiprocessing module to
pre-fork several workers on one port/socket to scale evenly on
multi-core architectures.

1. Are there pros and cons to both methods?
2. Can one pre-fork the processes on different ports? I would like to
automate the process of restarting it

Thanks

aliane abdelouahab

Dec 10, 2011, 3:06:32 PM
to python-...@googlegroups.com
it seems that CPython is bad for that, dont know where i've already
found it, it's called GIT or something...


daniels

Dec 10, 2011, 5:08:50 PM
to Tornado Web Server
GIL


David Birdsong

Dec 10, 2011, 11:25:17 PM
to python-...@googlegroups.com
The GIL is a global lock that any single cpython interpreter
maintains. Multiple processes means multiple GILs--the GIL is not an
issue for multiprocessing.
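
A minimal sketch of that point, using only the standard library: every child started below is a separate OS process running its own CPython interpreter, so each one has its own GIL. The function name `child_pids` is made up for illustration.

```python
import os
import subprocess
import sys


def child_pids(count):
    """Start `count` separate Python interpreters and return their PIDs."""
    children = [
        subprocess.Popen(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(count)
    ]
    # Each child is its own process with its own interpreter and GIL;
    # none of them shares a lock with the parent or with each other.
    return [int(child.communicate()[0]) for child in children]
```

Calling `child_pids(4)` should return four distinct PIDs, none of them equal to the parent's `os.getpid()`.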

My preference for this sort of thing is to use an external process
manager to fork a tornado for a range of ports and use your load
balancer to spread load evenly across the ports. Supervisord is my
favorite for this.

- the code doesn't have to care for daemonization issues at all
- no changes for multiprocessing are needed in your code
- slight micro-optimization, but by listening on separate ports, the
python processes aren't all woken up by the OS to battle it out on
who's going to win at accept() (thundering herd)
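
As a rough sketch of what "no changes for multiprocessing" can look like, the backend below only needs to accept a port on the command line; the process manager starts one copy per port. The `--port` flag, the default of 8000, and the function names are assumptions for illustration, not something taken from this thread.

```python
import argparse


def parse_port(argv=None):
    """Read the listening port from the command line (default 8000)."""
    parser = argparse.ArgumentParser(description="One Tornado backend.")
    parser.add_argument("--port", type=int, default=8000)
    return parser.parse_args(argv).port


def serve(port):
    # Imported here so parse_port() can be used without Tornado installed.
    import tornado.ioloop
    import tornado.web

    class MainHandler(tornado.web.RequestHandler):
        def get(self):
            self.write("Hello, world")

    app = tornado.web.Application([(r"/", MainHandler)])
    app.listen(port)
    # No forking or daemonizing here: the process manager handles both.
    tornado.ioloop.IOLoop.current().start()
```

A script would end with `serve(parse_port())` and be started once per port by the process manager, for example as `python backend.py --port=8001` (the file name is hypothetical).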

Ben Darnell

Dec 11, 2011, 1:31:25 AM
to python-...@googlegroups.com
On Sat, Dec 10, 2011 at 8:25 PM, David Birdsong
<david.b...@gmail.com> wrote:
> The GIL is a global lock that any single cpython interpreter
> maintains. Multiple processes means multiple GILs--the GIL is not an
> issue for multiprocessing.
>
> My preference for this sort of thing is to use an external process
> manager to fork a tornado for a range of ports and use your load
> balancer to spread load evenly across the ports. Supervisord is my
> favorite for this.

I agree. External process managers are also very helpful in doing
rolling restarts for zero-downtime updates.
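
A rolling restart could be sketched roughly as follows: restart one backend at a time and wait until it accepts connections again before touching the next, so the load balancer always has live backends. The `restart` callable is a hypothetical hook into whatever process manager is in use, and the retry counts are arbitrary.

```python
import socket
import time


def port_is_open(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False


def rolling_restart(ports, restart, is_up=port_is_open, retries=50, delay=0.1):
    """Restart one backend at a time, waiting for each to come back."""
    for port in ports:
        restart(port)  # hypothetical hook, e.g. a call to the process manager
        for _ in range(retries):
            if is_up(port):
                break
            time.sleep(delay)
        else:
            # Stop here so the remaining backends keep serving traffic.
            raise RuntimeError(f"backend on port {port} did not come back")
```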

>
> - the code doesn't have to care for daemonization issues at all
> - no changes for multiprocessing are needed in your code
> - slight micro-optimization, but by listening on separate ports, the
> python processes aren't all woken up by the OS to battle it out on
> who's going to win at accept() (thundering herd)

IMO this is pretty small as an optimization since it's self-correcting
under heavy load (you only see thundering herds when multiple
processes are idling in epoll; if most of the processes are busy they
won't participate in the race). However, using separate ports has the
advantage that your load balancer can be intelligent about assigning
requests to backend processes instead of relying on chance (in
principle load should be self-balancing for the reason I just
described, but in practice I've seen load get imbalanced by as much as
a factor of 5 when using multiple processes sharing the same port).
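
One way to read "intelligent" here is a least-connections policy. The sketch below contrasts it with picking a backend by chance; it is an illustration only and not how any particular load balancer is implemented. The port numbers and load figures are made up.

```python
import random


def pick_random(in_flight):
    """Choose a backend port by chance, ignoring current load."""
    return random.choice(list(in_flight))


def pick_least_busy(in_flight):
    """Choose the backend port with the fewest requests in flight."""
    return min(in_flight, key=in_flight.get)


# Hypothetical snapshot: port -> number of requests currently being served.
in_flight = {8000: 5, 8001: 1, 8002: 3, 8003: 1}
print(pick_least_busy(in_flight))  # 8001: lowest load, first one on a tie
```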

-Ben

aliane abdelouahab

Dec 11, 2011, 2:20:04 AM
to Tornado Web Server
Sorry if this was off topic, I am new :D
Thank you for the explanations :)


wataka

Dec 11, 2011, 10:12:46 AM
to Tornado Web Server
Thanks for the replies.

So it's best to keep doing things the way I have been, namely starting
up a number of tornado backend processes on different ports. Like I
mentioned above, I start the backends manually and let Cherokee handle
the balancing and restarting if necessary; this works well sometimes.
Could I automate the starting of the backends this way?

import tornado.ioloop
import tornado.web

from multiprocessing import Process

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
    ])

def start_server(port):
    # Build the app and bind the port inside the child process, so each
    # backend has its own IOLoop and its own listening socket.
    application = make_app()
    application.listen(port, address="localhost")
    tornado.ioloop.IOLoop.instance().start()

def run_backends(ports):
    # One child process per port.
    for port in ports:
        Process(target=start_server, args=(port,)).start()

if __name__ == "__main__":
    PORTS = [8000, 8001, 8002, 8003]
    run_backends(PORTS)

wataka

Dec 15, 2011, 1:42:13 AM
to Tornado Web Server
Ok, since I had no replies to the code I posted above, I would like to
ask the forum how other users start their multiple tornado backends.
Maybe using the multiprocessing module is a bad idea?

Aleksandar Radulovic

Dec 15, 2011, 8:02:37 AM
to python-...@googlegroups.com
Hi,

My tornado apps are usually one process - I let supervisor start my
tornado instances, with configurable port - nginx in front manages
upstreams. Simple.

-alex

--
a lex 13 x
http://a13x.net | @a13xnet

Peter Bengtsson

Dec 15, 2011, 11:31:34 AM
to python-...@googlegroups.com
I just use supervisord. It's ugly and confusing, but when you get it right it really works well. For example, try to kill any of the processes supervisor manages (you might have, say, 4 running) and notice how it restarts them faster than you can measure.

Andrew Fort

Dec 15, 2011, 1:04:25 PM
to python-...@googlegroups.com
On Wed, Dec 14, 2011 at 10:42 PM, wataka <nhy...@googlemail.com> wrote:
> Ok, since I had no replies to the code I posted above, I would like to
> ask the forum how other users start their multiple tornado backends.
> Maybe using the multiprocessing module is a bad idea?

+1 for supervisord. It's Python, so you can read and understand it
easily. You can also enable an XML-RPC interface so you can remote
control it easily for useful things without too much work
(auto-scaling, fail jobs on unresponsive machines over to others,
etc).
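
A small sketch of that remote control, assuming supervisord's HTTP server is enabled (the `[inet_http_server]` section) and reachable at the URL below; the address and port 9001 are assumptions. `supervisor.getAllProcessInfo` and `supervisor.startProcess` are part of supervisord's XML-RPC API; the helper names are made up.

```python
from xmlrpc.client import ServerProxy

# States in which supervisord is not running (or starting) the process.
DOWN_STATES = {"STOPPED", "EXITED", "FATAL"}


def down_processes(process_infos):
    """Return "group:name" for every process reported in a down state."""
    return [
        f"{info['group']}:{info['name']}"
        for info in process_infos
        if info["statename"] in DOWN_STATES
    ]


def start_down_processes(url="http://localhost:9001/RPC2"):
    # Assumes supervisord's XML-RPC endpoint is reachable at `url`.
    server = ServerProxy(url)
    names = down_processes(server.supervisor.getAllProcessInfo())
    for name in names:
        server.supervisor.startProcess(name)
    return names
```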

-a

wataka

Dec 19, 2011, 12:08:58 AM
to Tornado Web Server
Thanks for the tips. I have decided to use monit or supervisord
instead of Cherokee's built-in monitoring.