fork_processes, sockets, what is tornado doing? Help me understand this example.

Alejandro Peralta Frías

Jul 13, 2015, 5:18:30 PM
to python-...@googlegroups.com
I found this example (see below) in an open source project. It forks some processes and sets up several HTTP servers.

As far as I understand, Tornado will fork 4 processes (I have 4 cores, and the argument to fork_processes is 0) and, as I read in the code, the parent that created the forks will exit. What I think the result is going to be
is that all 4 processes will be accepting connections on 8888, and one of the child processes (task_id == 0) will also have two more servers listening on 7777 and 9999. Is that right?

What I don't understand is what happens to the socket listening on port 8888. Are connections distributed across all processes? Does the kernel distribute them? Do they all go to one process? I thought that only one process can listen, and then you fork when accepting (of course this is not Tornado's model, that I understand, but I don't understand what _is_ Tornado doing?).

I've read on the list that it's better to use a load balancer and a supervisor, but I would like to understand what happens here.

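from tornado import httpserver, netutil, process
from tornado.ioloop import IOLoop
from tornado.web import Application, RequestHandler, url

# ZeroRequestHandler comes from the original project and is not shown here.

def tornado_run():
    # Bind the TCP port once, before forking, so every child inherits the sockets
    sockets = netutil.bind_sockets(8888)
    # Fork worker processes (0 means one per CPU core); only children return from this call
    process.fork_processes(0)
    # Tornado app implementation
    app = Application([url(r"/", RequestHandler)])
    # Start an HTTP server in each child and attach the inherited sockets to it
    server = httpserver.HTTPServer(app)
    server.add_sockets(sockets)
    # Perform the following actions only in the first child process (task_id 0)
    process_counter = process.task_id()
    if process_counter == 0:
        server_0 = Application([url(r"/", ZeroRequestHandler)])
        server_0.listen(7777)
        server_evt = Application([url(r"/", ZeroRequestHandler)])
        server_evt.listen(9999)

    # Main IO loop: blocks here, waiting for requests
    IOLoop.instance().start()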

Ben Darnell

Jul 13, 2015, 10:51:34 PM
to Tornado Mailing List
On Mon, Jul 13, 2015 at 5:18 PM, Alejandro Peralta Frías <aper...@machinalis.com> wrote:
I found this example (see below) in an open source project. It forks some processes and sets up several HTTP servers.

As far as I understand, Tornado will fork 4 processes (I have 4 cores, and the argument to fork_processes is 0) and, as I read in the code, the parent that created the forks will exit. What I think the result is going to be
is that all 4 processes will be accepting connections on 8888, and one of the child processes (task_id == 0) will also have two more servers listening on 7777 and 9999. Is that right?

Yes.
 

What I don't understand is what happens to the socket listening on port 8888. Are connections distributed across all processes? Does the kernel distribute them? Do they all go to one process?

The incoming connections are distributed across all the processes. The kernel has one queue of connections, and when a new connection arrives, any processes that are idle in their epoll() will wake up. The first of these processes to reach the accept() call wins the connection. This results in a very roughly even distribution of connections across the processes, since a busier process will be less likely to be idle when the new connection comes in (but it's very rough, and I've seen one process with twice as many connections as another when using this strategy).
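
To make the accept() race concrete, here is a minimal sketch using only the Python standard library. It is not Tornado's actual implementation: it assumes the listening socket has been set non-blocking, select() stands in for epoll(), and accept_ready_connections and its timeout parameter are hypothetical names made up for this example. A process that wakes up but finds the queue already drained by a sibling just gets BlockingIOError and goes back to waiting.

```python
import select
import socket


def accept_ready_connections(listener, timeout=1.0):
    """Wait until the (non-blocking) listener is readable, then accept
    everything currently queued.

    Returns an empty list if nothing arrived within the timeout or if
    another process emptied the queue before we reached accept().
    """
    readable, _, _ = select.select([listener], [], [], timeout)
    if not readable:
        return []

    accepted = []
    while True:
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            # Queue is empty: either we took every pending connection or a
            # sibling process sharing this socket won the race.
            break
        accepted.append(conn)
    return accepted
```

Each worker would call something like this from its event loop whenever the shared listening socket becomes readable. Which worker ends up with a given connection depends only on who gets to accept() first, which is why the distribution is only roughly even.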

 
I thought that only one process can listen, and then you fork when accepting (of course this is not Tornado's model, that I understand, but I don't understand what _is_ Tornado doing?).

Only one process can bind() to a port, but once the socket has been bound any process that obtains a copy of the file descriptor can accept() on it. When a process forks, the child process inherits a copy of its parent's file descriptors. (and there are other ways to copy file descriptors: https://gist.github.com/bdarnell/1073945)
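
As an illustration of this bind-then-fork pattern, the following stdlib-only sketch (Unix only, since it uses os.fork()) binds once in the parent and lets each forked child accept() on the inherited socket. The function name accept_in_forked_children and the use of an OS-assigned port are assumptions for the example. This is not Tornado's code, just the underlying mechanism.

```python
import os
import socket


def accept_in_forked_children(num_children=2):
    """Bind once in the parent, then let each forked child accept one
    connection on its inherited copy of the listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    child_pids = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: never called bind(), but accept() works because the
            # file descriptor was copied across fork().
            try:
                conn, _addr = listener.accept()
                conn.sendall(str(os.getpid()).encode())
                conn.close()
            finally:
                os._exit(0)
        child_pids.append(pid)

    # Parent: act as the client. Each connection is served by whichever
    # child reaches accept() first; the reply tells us which one it was.
    served_by = []
    for _ in range(num_children):
        with socket.create_connection(("127.0.0.1", port)) as client:
            data = b""
            while True:
                chunk = client.recv(64)
                if not chunk:
                    break
                data += chunk
            served_by.append(int(data.decode()))

    for pid in child_pids:
        os.waitpid(pid, 0)
    listener.close()
    return child_pids, served_by
```

Because each child here accepts exactly one connection and then exits, every child PID shows up once in served_by, though the order depends on which child wins each accept().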
 

I've read on the list that it's better to use a load balancer and a supervisor, but I would like to understand what happens here.

There are several reasons to prefer a load balancer. First, as noted above, multi-process mode doesn't do a great job with load balancing; a real load balancer will distribute the load more evenly. Second, and probably more important, is that with a load balancer/supervisor system, you can restart one process at a time to deploy new code with zero downtime. With fork_processes(), you must take all the processes down at the same time before you can start a new master process.

-Ben
 


Alejandro Peralta Frías

Jul 14, 2015, 11:54:02 AM
to python-...@googlegroups.com, b...@bendarnell.com
Thank you very much Ben!

Regards,
Ale