Pitfalls combining HTTP keepalive and reuse_port?


Adrian Flitcroft

Feb 4, 2021, 1:18:13 PM
to Tornado Web Server
Bear with me on this one, guys. I'm trying to wrap my head around the concepts and avoid mistakes.

I understand that running multiple separate processes on the same port - in combination with reuse_port - has superseded fork_processes.

Let's say I enable keepalive connections between my Nginx reverse proxy and each upstream Tornado server.

My vague concern at this point is that, as Nginx maintains its pool of open connections, it creates a kind of implicit binding between Nginx and particular Tornado processes, leading to a suboptimal distribution of requests.

For example, a certain pattern of light requests might cause Nginx to hold open connections to only a subset of the Tornado processes. A burst of heavy requests would then be distributed to that subset, leaving CPUs underutilized.

Before I possibly waste a ton of time researching this issue, can anyone tell me whether my concern is justified?

If it is, can the problem be mitigated by reverting to fork_processes? I don't see why it would, but I'm asking to make doubly sure.

Finally, is this justification for disabling keepalive connections between Nginx and Tornado, or is the trade-off considered worthwhile?
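
The concern about pooled connections pinning requests to a subset of processes can be illustrated with a toy model. This is only a sketch under loud assumptions: the kernel is modelled as assigning each new connection to a uniformly random process, and the proxy as cycling through its pooled connections in order. Neither is a claim about how Linux or nginx actually behave, and the function name and parameters are made up for illustration.

```python
import random
from collections import Counter


def simulate_pinned_requests(num_processes, pool_size, num_requests, seed=0):
    """Count requests per process when a fixed keepalive pool carries all traffic.

    Model assumptions (not measured behaviour): each pooled connection lands
    on a uniformly random process, and requests cycle through the pool in order.
    """
    rng = random.Random(seed)
    # Each pooled connection stays pinned to one process for its whole lifetime.
    pool = [rng.randrange(num_processes) for _ in range(pool_size)]
    counts = Counter({proc: 0 for proc in range(num_processes)})
    for i in range(num_requests):
        counts[pool[i % pool_size]] += 1
    return counts


if __name__ == "__main__":
    counts = simulate_pinned_requests(num_processes=8, pool_size=16, num_requests=16000)
    for proc in sorted(counts):
        print(f"process {proc}: {counts[proc]} requests")
```

In this model a process's share of requests is fixed by how many pooled connections happened to land on it, and that share persists for as long as the connections stay open.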

Ben Darnell

Feb 6, 2021, 9:09:43 PM
to Tornado Mailing List
Yes, this concern is justified. But it exists in exactly the same way for reuse_port and fork_processes, so switching between those two won't change much.

Disabling keepalive connections from nginx to its backends will help, but only somewhat. Without keepalive connections, each request will be distributed to the backends independently, so you don't get imbalances that persist for long periods due to long-lived connections. In practice, though, it's still easy to see imbalances with this approach. With the equivalent of fork_processes and non-keepalive connections in a real-world app, I saw the process with the heaviest load be assigned twice as many requests as the one with the lightest load. The reuse_port option is supposed to be a little better than fork_processes at this kind of balancing, but I haven't tested it myself.

Both fork_processes and reuse_port are intended for simple use *without* a separate load balancer. If you are running nginx or another proxy, it's best to give each process its own port and tell the proxy about them all. That way the proxy knows which process each request is going to and it can be intelligent about balancing them. This has other benefits including better monitorability for each process, although it's more setup work.

For a lightly-loaded app, you don't need to worry about this too much and can just use reuse_port. But if you get enough traffic that you care about how well the requests are balanced across processes, it's better to do the work to give each server process its own port.

-Ben

Adrian

Feb 7, 2021, 7:37:48 AM
to Tornado Web Server
>> Both fork_processes and reuse_port are intended for simple use *without* a separate load balancer. If you are running nginx or another proxy, it's best to give each process its own port and tell the proxy about them all.

Thanks for the thoughts, Ben. It looks like a local nginx instance is still the way to go.

Alternatively, I could specify every host:port combination in my remote nginx reverse proxy, but that has implications for health checks and failover that I'll need to explore.

Adrian

Feb 19, 2021, 4:50:25 PM
to Tornado Web Server
I've been pondering how to deploy Tornado for scale, resource utilisation and simplicity. 

I think it's clear from this discussion and further research that kernel-based "port sharding" is a lousy way to load balance. But I'm also not keen on an additional local nginx instance when I'm already running an nginx reverse proxy downstream of Tornado.

Clearly the way to go is to run one Tornado process per core, each on its own port, on each of your web servers, and then define them individually in your nginx upstream configuration, interleaving the hosts. That way, if a box fails, nginx will fail over to a working server rather than to another port on the broken one. For example (excluding the various passive health check parameters):

upstream tornado {
    server 1.2.3.4:8881;
    server 5.6.7.8:8881;
    server 1.2.3.4:8882;
    server 5.6.7.8:8882;
    server 1.2.3.4:8883;
    server 5.6.7.8:8883;
    server 1.2.3.4:8884;
    server 5.6.7.8:8884;

    keepalive 16;
}

The added benefit is you also get to use keepalive connections, without any drawbacks for load-balancing.

This will be a bit fiddly in production, when it comes to deploying new servers or fresh code, but if you're used to automating configuration management and have multiple web servers, then it's not too much of a concern.
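
Since the upstream list has to track every host and port, it may be easier to generate it than to maintain it by hand. The following is a minimal sketch, not a recommended tool: the function name `interleaved_upstream` and its parameters are hypothetical, and it emits only the `server` and `keepalive` lines shown in the example, with no health check parameters.

```python
def interleaved_upstream(name, hosts, ports, keepalive=None):
    """Return an nginx upstream block with hosts interleaved port by port."""
    lines = [f"upstream {name} {{"]
    # Looping over ports first means consecutive entries sit on different hosts.
    for port in ports:
        for host in hosts:
            lines.append(f"    server {host}:{port};")
    if keepalive is not None:
        lines.append("")
        lines.append(f"    keepalive {keepalive};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        interleaved_upstream(
            "tornado", ["1.2.3.4", "5.6.7.8"], range(8881, 8885), keepalive=16
        )
    )
```

Upstream keepalive has traditionally also needed `proxy_http_version 1.1;` and `proxy_set_header Connection "";` in the proxying location, so it is worth checking the nginx documentation for the version in use.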

Ben Darnell

Feb 28, 2021, 9:50:46 PM
to Tornado Mailing List
On Fri, Feb 19, 2021 at 4:50 PM Adrian <aflit...@gmail.com> wrote:
>> I've been pondering how to deploy Tornado for scale, resource utilisation and simplicity.
>>
>> I think it's clear from this discussion and further research that kernel-based "port sharding" is a lousy way to load balance. But I'm also not keen on an additional local nginx instance when I'm already running an nginx reverse proxy downstream of Tornado.
>>
>> Clearly the way to go is to run one Tornado process per core, each on its own port, on each of your web servers, and then define them individually in your nginx upstream configuration, interleaving the hosts. That way, if a box fails, nginx will fail over to a working server rather than to another port on the broken one. For example (excluding the various passive health check parameters):


Yes, a separate nginx (or haproxy) server balancing across all the processes on every machine has always been my recommendation once you have multiple machines involved. I don't think you need to interleave the hosts like that - with health checks set up correctly it should skip over all the ports on a downed box. (But I'm far from an nginx expert)
 
>> upstream tornado {
>>     server 1.2.3.4:8881;
>>     server 5.6.7.8:8881;
>>     server 1.2.3.4:8882;
>>     server 5.6.7.8:8882;
>>     server 1.2.3.4:8883;
>>     server 5.6.7.8:8883;
>>     server 1.2.3.4:8884;
>>     server 5.6.7.8:8884;
>>
>>     keepalive 16;
>> }
>>
>> The added benefit is you also get to use keepalive connections, without any drawbacks for load-balancing.
>>
>> This will be a bit fiddly in production, when it comes to deploying new servers or fresh code, but if you're used to automating configuration management and have multiple web servers, then it's not too much of a concern.

This is going to depend on how exactly you deploy, but it doesn't need to be too "fiddly" - I've found supervisord to work well at managing multiple processes like this. You just need to be careful to deploy new versions into fresh directories instead of updating existing directories in-place (which may be used by current processes).
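
One way to deploy into fresh directories is to copy each release into its own timestamped directory and then repoint a symlink at it. This is only a sketch of that idea, not something prescribed above: the symlink swap is a swapped-in technique, and `deploy_release`, `releases_root` and `current_link` are hypothetical names. It assumes a POSIX filesystem and Python 3.8 or later.

```python
import os
import shutil
import tempfile
import time


def deploy_release(source_dir, releases_root, current_link):
    """Copy source_dir into a new release directory and repoint current_link."""
    os.makedirs(releases_root, exist_ok=True)
    # A fresh directory per release: files used by running processes are untouched.
    release_dir = os.path.abspath(
        tempfile.mkdtemp(prefix=time.strftime("%Y%m%d-%H%M%S-"), dir=releases_root)
    )
    shutil.copytree(source_dir, release_dir, dirs_exist_ok=True)
    # Create the new symlink under a temporary name, then rename it over the
    # old one so the switch happens in a single step.
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    os.replace(tmp_link, current_link)
    return release_dir
```

Processes started from the old release keep running from their own directory until they are restarted, at which point they pick up whatever `current_link` points to.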

-Ben
 