help on implementing backpressure for a simple http server


Dimitar Georgiev

Nov 9, 2016, 10:58:55 AM
to Netty discussions
Hello, 

I'm trying to implement some backpressure support for an HTTP scenario.
The overall goal is that when the receiver processes data more slowly than the server produces it, or the network delivers data to the client more slowly than the server produces it, I can somehow limit the amount of (direct) memory used by the process.

My current idea was to take advantage of the write watermark settings: turn off auto-read when the channel becomes unwritable, and turn it back on when the channel becomes writable again.
The interesting part of the code in question is here:
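In outline, the mechanism looks roughly like this, written against the Netty 4.1 API (a sketch; the handler name is mine, not the actual code from the repo):

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

// Sketch: suspend reads while the outbound buffer sits above the high
// watermark, resume them once it has drained below the low watermark.
public class WritabilityBackpressureHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) {
        // isWritable() flips according to the WRITE_BUFFER_WATER_MARK
        // configured on the channel.
        ctx.channel().config().setAutoRead(ctx.channel().isWritable());
        ctx.fireChannelWritabilityChanged();
    }
}
```

The watermarks themselves would be set on the bootstrap, e.g. childOption(ChannelOption.WRITE_BUFFER_WATER_MARK, new WriteBufferWaterMark(32 * 1024, 64 * 1024)).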


There are two HTTP routes in the example:

- If you make an arbitrary HTTP request with a "size" query parameter, the server generates a list of Java objects (https://github.com/dimitarg/backpressure_test/blob/master/src/main/java/com/novarto/test/Bean.java) of the given size and writes it back as JSON using a direct buffer;
- If you make an HTTP request to the URI "/static", it returns a pre-generated JSON payload via Unpooled.wrappedBuffer (I wrote this one to make sure GC pressure and/or serialization cycles were not skewing the results). The number of JSON objects in this precached response is controlled by the system property "staticSize" and defaults to 5000.

As you can see, this "backpressure" mechanism in the example can be turned on/off via a system property. 

I then ran some load tests using h2load (https://nghttp2.org/documentation/h2load-howto.html) with "backpressure" on and off. The client and server were on two different physical machines, with one hop between them and a 1000 Mbps link, and the tests were in all cases network-bound (the server's TX throughput matched that of an iperf TCP test between the two machines).

It turned out that this backpressure mechanism of mine had no effect at all. In all tests the "backpressured" version of the server and the naive one performed on par, meaning that process reserved memory grew at the same rate, and to the same absolute size, plus or minus statistical error. In both cases the server's process memory grows almost monotonically over the lifetime of the load. If you let it run long enough, and make the responses big enough, it eventually goes down with OutOfDirectMemoryError.

So what I have written does not work at all. Can you help me figure out how to reach my goal of limiting the amount of (direct) memory used by the process in such a scenario?


Thanks.


1000 concurrent users, 100 000 total requests, dynamic route, each response contains 10 000 JSON objects
./h2load -B http://10.10.0.186:8081 -n 100000 -t 8 -c 1000 "/?size=10000" --h1
930 MB peak, backpressure enabled
926 MB peak, backpressure disabled



1000 concurrent users, 10 000 total requests, dynamic route, each response contains 50 000 JSON objects
./h2load -B http://10.10.0.186:8081 -n 10000 -t 8 -c 1000 "/?size=50000" --h1
2.7 GB peak, backpressure enabled, 127 sec
2.6 GB peak, backpressure disabled, 120 sec


1000 concurrent users, 3 000 total requests, dynamic route, each response contains 100 000 JSON objects
./h2load -B http://10.10.0.186:8081 -n 3000 -t 8 -c 1000 "/?size=100000" --h1
4.7 GB peak, backpressure enabled, 78 sec
4.6 GB peak, backpressure disabled, 79 sec



1000 concurrent users, 100 000 total requests, static precached response route, each response contains 10 000 JSON objects
./h2load -B http://10.10.0.186:8081 -n 100000 -t 4 -c 1000 "/static" --h1

backpressure: 614 MB peak, 213 sec
no backpressure: 590 MB peak, 213 sec



Eran Harel

Nov 13, 2016, 6:25:04 AM
to Netty discussions
In my experience, using writability events to implement back-pressure in such scenarios is ineffective.

To begin with, you only stop reading from the channel of the current request, not from *all* channels, and you definitely don't stop accepting new requests (you would need to close the parent channel, i.e. the server channel, to achieve that). Remember, this is HTTP.

Second, even in a TCP-based protocol it is better to implement back-pressure based on an outstanding-requests gauge instead.
This allows you to respond faster (before you're holding more "context" than you can actually manage), and gives you finer-grained control over the throttling.
Usually you add an atomic integer which gets incremented when you start handling a new request, and decremented when it has been completed (write-and-flush future completed).
You can set auto-read to false when you reach a "max" threshold, and set it back to true when you drop below a low threshold. I think there's an example in the netty examples repo.
Please note that it doesn't mean you need to remove what you already implemented.
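The counter-with-hysteresis part of that idea can be sketched in plain Java (class and method names are illustrative; in Netty you would call setAutoRead(false)/setAutoRead(true) on the channel config when these methods report a threshold crossing):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-flight request gauge with hysteresis: suspend reads at
// or above the high threshold, resume once the count drops to the low one.
public class InFlightGauge {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final int high;
    private final int low;

    public InFlightGauge(int high, int low) {
        this.high = high;
        this.low = low;
    }

    // Call when a request starts; returns true if reads should be suspended.
    public boolean requestStarted() {
        return inFlight.incrementAndGet() >= high;
    }

    // Call when the write-and-flush future completes;
    // returns true if reads should be resumed.
    public boolean requestCompleted() {
        return inFlight.decrementAndGet() <= low;
    }

    public int current() {
        return inFlight.get();
    }
}
```

The two thresholds are deliberately distinct so the server doesn't flap between suspended and resumed on every single request boundary.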