The issue I'm seeing is that with the old mochiweb I was able to handle well over 3000 req/sec on a single server.
With the new one, although during a test it was able to handle even more, after I reach 1000/1200 req/sec I start getting timeouts.
I could think that it's in my code but even the simple ping page fails to answer and there's nothing going on in it, it immediately returns pong.
I don't see anything particular in the erlang logs that could make me think about an internal timeout between erlang processes due to my code, I do see a lot of cases where the client disconnected from the server and when I try to send a response it fails in gen_tcp:send().
with a log like this in sasl: =CRASH REPORT==== 4-Nov-2011::00:00:03 === crasher: initial call: mochiweb_acceptor:init/3 pid: <0.12016.168> registered_name: [] exception exit: enotconn in function mochiweb_request:send/2 in call from mochiweb_request:start_raw_response/2 in call from mochiweb_request:respond/2 in call from dyno_web:loop/1 in call from mochiweb_http:headers/5 ancestors: [dyno_web,dyno_sup,<0.66.0>] messages: [] links: [<0.89.0>,#Port<0.4613065>] dictionary: [{random_seed,{1320,1117,9871}}, {mochiweb_request_path, "/view/5108159f8eebac4d22a074c1ca82f8b1"}, {mochiweb_request_cookie, [{"foobar","42a98512d10520496684b53c0b29a938"}]}] trap_exit: false status: running heap_size: 1597 stack_size: 24 reductions: 5122 neighbours:
This problem unfortunately seem to happen only in production so for now I've switched the load back to the older servers. Anyone can help me shed any light on this?
What was the old version of mochiweb? What is your loop actually doing? Have you tried using the default configuration options? What version of Erlang are you using?
> The issue I'm seeing is that with the old mochiweb I was able to handle well over 3000 req/sec on a single server.
> With the new one, although during a test it was able to handle even more, after I reach 1000/1200 req/sec I start getting timeouts.
> I could think that it's in my code but even the simple ping page fails to answer and there's nothing going on in it, it immediately returns pong.
> I don't see anything particular in the erlang logs that could make me think about an internal timeout between erlang processes due to my code, I do see a lot of cases where the client disconnected from the server and when I try to send a response it fails in gen_tcp:send().
> with a log like this in sasl: > =CRASH REPORT==== 4-Nov-2011::00:00:03 === > crasher: > initial call: mochiweb_acceptor:init/3 > pid: <0.12016.168> > registered_name: [] > exception exit: enotconn > in function mochiweb_request:send/2 > in call from mochiweb_request:start_raw_response/2 > in call from mochiweb_request:respond/2 > in call from dyno_web:loop/1 > in call from mochiweb_http:headers/5 > ancestors: [dyno_web,dyno_sup,<0.66.0>] > messages: [] > links: [<0.89.0>,#Port<0.4613065>] > dictionary: [{random_seed,{1320,1117,9871}}, > {mochiweb_request_path, > "/view/5108159f8eebac4d22a074c1ca82f8b1"}, > {mochiweb_request_cookie, > [{"foobar","42a98512d10520496684b53c0b29a938"}]}] > trap_exit: false > status: running > heap_size: 1597 > stack_size: 24 > reductions: 5122 > neighbours:
> This problem unfortunately seem to happen only in production so for now I've switched the load back to the older servers. Anyone can help me shed any light on this?
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
On Thu, Nov 3, 2011 at 6:58 PM, Bob Ippolito <b...@redivi.com> wrote: > What was the old version of mochiweb? What is your loop actually > doing? Have you tried using the default configuration options? What > version of Erlang are you using?
> On Thu, Nov 3, 2011 at 6:41 PM, Valentino Volonghi <dialt...@gmail.com> wrote: >> I'm using mochiweb at revision 62e460dbd42 from github.
>> The issue I'm seeing is that with the old mochiweb I was able to handle well over 3000 req/sec on a single server.
>> With the new one, although during a test it was able to handle even more, after I reach 1000/1200 req/sec I start getting timeouts.
>> I could think that it's in my code but even the simple ping page fails to answer and there's nothing going on in it, it immediately returns pong.
>> I don't see anything particular in the erlang logs that could make me think about an internal timeout between erlang processes due to my code, I do see a lot of cases where the client disconnected from the server and when I try to send a response it fails in gen_tcp:send().
>> with a log like this in sasl: >> =CRASH REPORT==== 4-Nov-2011::00:00:03 === >> crasher: >> initial call: mochiweb_acceptor:init/3 >> pid: <0.12016.168> >> registered_name: [] >> exception exit: enotconn >> in function mochiweb_request:send/2 >> in call from mochiweb_request:start_raw_response/2 >> in call from mochiweb_request:respond/2 >> in call from dyno_web:loop/1 >> in call from mochiweb_http:headers/5 >> ancestors: [dyno_web,dyno_sup,<0.66.0>] >> messages: [] >> links: [<0.89.0>,#Port<0.4613065>] >> dictionary: [{random_seed,{1320,1117,9871}}, >> {mochiweb_request_path, >> "/view/5108159f8eebac4d22a074c1ca82f8b1"}, >> {mochiweb_request_cookie, >> [{"foobar","42a98512d10520496684b53c0b29a938"}]}] >> trap_exit: false >> status: running >> heap_size: 1597 >> stack_size: 24 >> reductions: 5122 >> neighbours:
>> This problem unfortunately seem to happen only in production so for now I've switched the load back to the older servers. Anyone can help me shed any light on this?
>> -- >> You received this message because you are subscribed to the Google Groups "MochiWeb" group. >> To post to this group, send email to mochiweb@googlegroups.com. >> To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. >> For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
The "new" loop is at the bottom of the mail, the one running on the older mochiweb is a bit different and not practical to put it here, however it's essentially based on the same code, dispatch is simply using the erlang function dispatch (like I was doing in the older loop) and Path is a list of path segments.
I haven't tried with the default configuration options. I'm using the very latest version of erlang R14B04 and the platform is Ubuntu 11.10 on EC2.
loop(MochiReq) -> Req = dyno:create(mochiweb, MochiReq), Path = Req:path(), case dynospatch:dispatch(Path, Req) of no_dispatch_match -> Req:not_found(), stats:sample({unknown}, Req:start_time());
> What was the old version of mochiweb? What is your loop actually > doing? Have you tried using the default configuration options? What > version of Erlang are you using?
> On Thu, Nov 3, 2011 at 6:41 PM, Valentino Volonghi <dialt...@gmail.com> wrote: >> I'm using mochiweb at revision 62e460dbd42 from github.
>> The issue I'm seeing is that with the old mochiweb I was able to handle well over 3000 req/sec on a single server.
>> With the new one, although during a test it was able to handle even more, after I reach 1000/1200 req/sec I start getting timeouts.
>> I could think that it's in my code but even the simple ping page fails to answer and there's nothing going on in it, it immediately returns pong.
>> I don't see anything particular in the erlang logs that could make me think about an internal timeout between erlang processes due to my code, I do see a lot of cases where the client disconnected from the server and when I try to send a response it fails in gen_tcp:send().
>> with a log like this in sasl: >> =CRASH REPORT==== 4-Nov-2011::00:00:03 === >> crasher: >> initial call: mochiweb_acceptor:init/3 >> pid: <0.12016.168> >> registered_name: [] >> exception exit: enotconn >> in function mochiweb_request:send/2 >> in call from mochiweb_request:start_raw_response/2 >> in call from mochiweb_request:respond/2 >> in call from dyno_web:loop/1 >> in call from mochiweb_http:headers/5 >> ancestors: [dyno_web,dyno_sup,<0.66.0>] >> messages: [] >> links: [<0.89.0>,#Port<0.4613065>] >> dictionary: [{random_seed,{1320,1117,9871}}, >> {mochiweb_request_path, >> "/view/5108159f8eebac4d22a074c1ca82f8b1"}, >> {mochiweb_request_cookie, >> [{"foobar","42a98512d10520496684b53c0b29a938"}]}] >> trap_exit: false >> status: running >> heap_size: 1597 >> stack_size: 24 >> reductions: 5122 >> neighbours:
>> This problem unfortunately seem to happen only in production so for now I've switched the load back to the older servers. Anyone can help me shed any light on this?
>> -- >> You received this message because you are subscribed to the Google Groups "MochiWeb" group. >> To post to this group, send email to mochiweb@googlegroups.com. >> To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. >> For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
The first thing I would try is to see what happens if you don't tweak nodelay and the buffer sizes. We're not seeing anything like what you are, although our setup is a little different (CentOS, R14B01, 64-bit, not virtualized).
With the ancient mochiweb, are all settings and the environment exactly the same?
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
> The first thing I would try is to see what happens if you don't tweak > nodelay and the buffer sizes. We're not seeing anything like what you > are, although our setup is a little different (CentOS, R14B01, 64-bit, > not virtualized).
> With the ancient mochiweb, are all settings and the environment > exactly the same?
environment is ubuntu 10.10 and behaved smoothly since 8.04. The rest of the parameters to the erlang vm are the same.
I'm gonna try using the defaults and probably getting rid of one module of the server that could be responsible for extra load.
I think using {acceptor_pool_size, 256} is probably madness. The default of 16 is almost certainly more than enough. It doesn't control how many concurrent requests you can handle, just how many processes are blocking on gen_tcp:accept/2. The old version of mochiweb is more similar to having an acceptor_pool_size of 1 or maybe 2. It's a bit harder to compare because the new mochiweb uses the acceptor process to handle the request rather than starting a new process as soon as the accept happens.
On Fri, Nov 4, 2011 at 9:24 AM, Valentino Volonghi <dialt...@gmail.com> wrote:
> On 4 Nov 2011, at 09:21, Bob Ippolito wrote:
>> The first thing I would try is to see what happens if you don't tweak >> nodelay and the buffer sizes. We're not seeing anything like what you >> are, although our setup is a little different (CentOS, R14B01, 64-bit, >> not virtualized).
>> With the ancient mochiweb, are all settings and the environment >> exactly the same?
> environment is ubuntu 10.10 and behaved smoothly since 8.04. The rest of the parameters to the erlang vm are the same.
> I'm gonna try using the defaults and probably getting rid of one module of the server that could be responsible for extra load.
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
> I think using {acceptor_pool_size, 256} is probably madness. The > default of 16 is almost certainly more than enough. It doesn't control > how many concurrent requests you can handle, just how many processes > are blocking on gen_tcp:accept/2. The old version of mochiweb is more > similar to having an acceptor_pool_size of 1 or maybe 2. It's a bit > harder to compare because the new mochiweb uses the acceptor process > to handle the request rather than starting a new process as soon as > the accept happens.
Fair enough, I'll try to test under production dark load or something like that.
I'm also gonna remove the fixed buffer sizes and let linux determine that.
On 4 Nov 2011, at 10:59, Valentino Volonghi wrote:
> Fair enough, I'll try to test under production dark load or something like that.
> I'm also gonna remove the fixed buffer sizes and let linux determine that.
Ok… I've done a few more tests and found out something interesting…
Behind nginx 1.0.5 the system is a lot slower (half the speed), however without nginx in front, even though the system is faster, after a few rounds of testing I start getting timeouts and connection resets… These are of course symptoms of too many open sockets and so on. Although it would be interesting to understand if this is a problem that doesn't affect nginx because it's too slow or because it does something else.
I have all sorts of options set to avoid the exhaust of sockets and ports such as:
this in limits.conf: * hard nofile 999999 * hard memlock unlimited
I call ulimit -n before starting the server process and nginx is setup with
worker_rlimit_nofile 20000; events { worker_connections 26000; use epoll;
}
On one side nginx is definitely slowing down the system but allows the benchmark to finish, on the other side if I hit mochiweb directly the benchmark doesn't finish and starts timing out down the line.
For example the number of connections in TIME_WAIT is wildly different, under mochiweb it goes up to 11000 while under nginx it never really crosses 5000. However it's not really possible to lower the TIME_WAIT even more than 10 seconds.
I'm gonna try and test a different a nginx version and an older linux version or just a different one.
Here are some of the numbers:
Behind nginx: Server Software: nginx/1.0.5 Server Hostname: foobar.amazonaws.com Server Port: 80
Document Path: /perf Document Length: 35 bytes
Concurrency Level: 1000 Time taken for tests: 59.983 seconds Complete requests: 100000 Failed requests: 86 (Connect: 0, Receive: 0, Length: 86, Exceptions: 0) Write errors: 0 Non-2xx responses: 86 Total transferred: 45640768 bytes HTML transferred: 4061752 bytes Requests per second: 1667.14 [#/sec] (mean) Time per request: 599.829 [ms] (mean) Time per request: 0.600 [ms] (mean, across all concurrent requests) Transfer rate: 743.06 [Kbytes/sec] received
Connection Times (ms) min mean[+/-sd] median max Connect: 2 13 119.4 2 3035 Processing: 3 547 1296.4 166 15026 Waiting: 3 547 1296.3 166 15026 Total: 5 561 1306.8 173 18058
Percentage of the requests served within a certain time (ms) 50% 173 66% 190 75% 210 80% 229 90% 3041 95% 3188 98% 3479 99% 5420 100% 18058 (longest request)
Directly hitting mochiweb: This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/
Server Software: MochiWeb/1.0 Server Hostname: ec2-107-22-69-62.compute-1.amazonaws.com Server Port: 8000
Document Path: /perf Document Length: 35 bytes
Concurrency Level: 500 Time taken for tests: 33.848 seconds Complete requests: 100000 Failed requests: 0 Write errors: 0 Total transferred: 46600000 bytes HTML transferred: 3500000 bytes Requests per second: 2954.42 [#/sec] (mean) Time per request: 169.238 [ms] (mean) Time per request: 0.338 [ms] (mean, across all concurrent requests) Transfer rate: 1344.49 [Kbytes/sec] received
Connection Times (ms) min mean[+/-sd] median max Connect: 2 73 488.7 2 9085 Processing: 3 92 58.2 72 1770 Waiting: 2 92 58.1 71 1769 Total: 4 166 493.7 82 9345
Percentage of the requests served within a certain time (ms) 50% 82 66% 123 75% 128 80% 132 90% 147 95% 190 98% 3023 99% 3109 100% 9345 (longest request)
On Fri, Nov 4, 2011 at 8:01 PM, Valentino Volonghi <dialt...@gmail.com> wrote: > On 4 Nov 2011, at 10:59, Valentino Volonghi wrote:
>> Fair enough, I'll try to test under production dark load or something like that.
>> I'm also gonna remove the fixed buffer sizes and let linux determine that.
> Ok… I've done a few more tests and found out something interesting…
> Behind nginx 1.0.5 the system is a lot slower (half the speed), however without nginx in front, even though the system is faster, after a few rounds of testing I start getting timeouts and connection resets… These are of course symptoms of too many open sockets and so on. Although it would be interesting to understand if this is a problem that doesn't affect nginx because it's too slow or because it does something else.
> I have all sorts of options set to avoid the exhaust of sockets and ports such as:
> On one side nginx is definitely slowing down the system but allows the benchmark to finish, on the other side if I hit mochiweb directly the benchmark doesn't finish and starts timing out down the line.
> For example the number of connections in TIME_WAIT is wildly different, under mochiweb it goes up to 11000 while under nginx it never really crosses 5000. > However it's not really possible to lower the TIME_WAIT even more than 10 seconds.
> I'm gonna try and test a different a nginx version and an older linux version or just a different one.
> Here are some of the numbers:
> Behind nginx: > Server Software: nginx/1.0.5 > Server Hostname: foobar.amazonaws.com > Server Port: 80
> Concurrency Level: 1000 > Time taken for tests: 59.983 seconds > Complete requests: 100000 > Failed requests: 86 > (Connect: 0, Receive: 0, Length: 86, Exceptions: 0) > Write errors: 0 > Non-2xx responses: 86 > Total transferred: 45640768 bytes > HTML transferred: 4061752 bytes > Requests per second: 1667.14 [#/sec] (mean) > Time per request: 599.829 [ms] (mean) > Time per request: 0.600 [ms] (mean, across all concurrent requests) > Transfer rate: 743.06 [Kbytes/sec] received
> Connection Times (ms) > min mean[+/-sd] median max > Connect: 2 13 119.4 2 3035 > Processing: 3 547 1296.4 166 15026 > Waiting: 3 547 1296.3 166 15026 > Total: 5 561 1306.8 173 18058
> Percentage of the requests served within a certain time (ms) > 50% 173 > 66% 190 > 75% 210 > 80% 229 > 90% 3041 > 95% 3188 > 98% 3479 > 99% 5420 > 100% 18058 (longest request)
> Directly hitting mochiweb: > This is ApacheBench, Version 2.3 <$Revision: 655654 $> > Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ > Licensed to The Apache Software Foundation, http://www.apache.org/
> Concurrency Level: 500 > Time taken for tests: 33.848 seconds > Complete requests: 100000 > Failed requests: 0 > Write errors: 0 > Total transferred: 46600000 bytes > HTML transferred: 3500000 bytes > Requests per second: 2954.42 [#/sec] (mean) > Time per request: 169.238 [ms] (mean) > Time per request: 0.338 [ms] (mean, across all concurrent requests) > Transfer rate: 1344.49 [Kbytes/sec] received
> Connection Times (ms) > min mean[+/-sd] median max > Connect: 2 73 488.7 2 9085 > Processing: 3 92 58.2 72 1770 > Waiting: 2 92 58.1 71 1769 > Total: 4 166 493.7 82 9345
> Percentage of the requests served within a certain time (ms) > 50% 82 > 66% 123 > 75% 128 > 80% 132 > 90% 147 > 95% 190 > 98% 3023 > 99% 3109 > 100% 9345 (longest request)
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
Quick update because while I was doing performance testing before being side tracked…
I've tested a self-compiled version of nginx which didn't seem to show the slowness and managed to not slow down mochiweb 3000+ req/sec
Maybe Ubuntu 11.10 ships with a version of nginx compiled with too many plugins that slow down the server itself or maybe it's compiled with very conservative settings… Unfortunately while I was about to test this more deeply I got side tracked.
I still get a high 100% response time (timeout) basically in the last request, not sure what it could be but if it reaches at least 3000 req/sec on a large instance then it's good enough, it's at least as fast as the current system.
On 4 Nov 2011, at 20:05, Valentino Volonghi wrote:
> On 4 Nov 2011, at 20:03, Bob Ippolito wrote:
>> Is your benchmark using HTTP Keep-Alive? Nginx does not support it for >> talking to backends
I've also modified mochiweb a little bit to avoid sending on a nil socket or getpeer on a nil socket (The only 2 changes).
Without any other modification I could get 10000 calls per second on a very simple url.
Then I've run the function above under fprof:apply() with 500 reps.
In the results most of the functionality took around 100 ms cumulative within the profiler, except for one function:
string:to_lower/0 called by mochiweb_headers:normalize/1
Of course I don't know if this is related to the performance differences I'm seeing (it could be that in my new code I'm stressing this area more by fetching more headers and even though the code is not different it's affected) however this change alone affected about 60% of the performance.
Changing mochiweb_headers:normalize/1 to return just K instead of string:to_lower(K) I could run 16000 calls per second.
Another guilty part is lists:flatten although I doubt anything can be done about this.
Header parsing code is a "bottleneck" in these kinds of microbenchmarks but that part of mochiweb has not changed at all so can't possibly explain any of your problem.
On Monday, November 14, 2011, Valentino Volonghi <dialt...@gmail.com> wrote: > On 9 Nov 2011, at 19:26, Valentino Volonghi wrote:
> Digging a bit deeper I've tried to run mochiweb and my app syntetically with this function:
> I've also modified mochiweb a little bit to avoid sending on a nil socket
or getpeer on a nil socket (The only 2 changes).
> Without any other modification I could get 10000 calls per second on a very simple url.
> Then I've run the function above under fprof:apply() with 500 reps.
> In the results most of the functionality took around 100 ms cumulative
within the profiler, except for one function:
> string:to_lower/0 called by mochiweb_headers:normalize/1
> Of course I don't know if this is related to the performance differences
I'm seeing (it could be that in my new code I'm stressing this area more by fetching more headers and even though the code is not different it's affected) however this change alone affected about 60% of the performance.
> Changing mochiweb_headers:normalize/1 to return just K instead of
string:to_lower(K) I could run 16000 calls per second.
> Another guilty part is lists:flatten although I doubt anything can be done about this.
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to
> Header parsing code is a "bottleneck" in these kinds of microbenchmarks but that part of mochiweb has not changed at all so can't possibly explain any of your problem.
That's true and I still think my main bottleneck is between nginx and erlang, although I can't quite figure out why nginx goes from handling 20000 req/sec of the simple transparent gif that it serves to handle 1800 req/sec in front of a server that can handle 3100 req/sec by itself. It uses a lot of CPU to handle those 1800, the same it uses for the transparent gif (or even more).
However my older code did GeoIP within erlang while the new code lets nginx do GeoIP and pass everything as headers. This means that for each request my new code is doing 11 more header lookups for each request compared to just 3 or 4 total before, since this is a bottleneck I could speed up my code a little bit by changing this part.
It is an HTTP/1.0 proxy without the ability for keep-alive requests yet. (As a result, backend connections are created and destroyed on every request.) Nginx talks HTTP/1.1 to the browser and HTTP/1.0 to the backend server. As such it handles keep-alive to the browser.
On Tue, Nov 15, 2011 at 4:33 PM, Valentino Volonghi <dialt...@gmail.com> wrote:
> On 14 Nov 2011, at 22:07, Bob Ippolito wrote:
>> Header parsing code is a "bottleneck" in these kinds of microbenchmarks but that part of mochiweb has not changed at all so can't possibly explain any of your problem.
> That's true and I still think my main bottleneck is between nginx and erlang, although I can't quite figure out why nginx goes from handling 20000 req/sec of the simple transparent gif that it serves to handle 1800 req/sec in front of a server that can handle 3100 req/sec by itself. It uses a lot of CPU to handle those 1800, the same it uses for the transparent gif (or even more).
> However my older code did GeoIP within erlang while the new code lets nginx do GeoIP and pass everything as headers. This means that for each request my new code is doing 11 more header lookups for each request compared to just 3 or 4 total before, since this is a bottleneck I could speed up my code a little bit by changing this part.
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
> It is an HTTP/1.0 proxy without the ability for keep-alive requests > yet. (As a result, backend connections are created and destroyed on > every request.) Nginx talks HTTP/1.1 to the browser and HTTP/1.0 to > the backend server. As such it handles keep-alive to the browser.
I've run more performance tests between different versions of:
Ubuntu 11.04 and 11.10 Nginx 0.8.54 and 1.0.5 Erlang VM R14B03 and R14B04 (compiled both with same options)
This mixed in with my new code and my old code, just using ab, not a great tool to know actual performance but good to value differences.
I've also tried connecting directly to the mochiweb server and through nginx separately. I haven't run all of them yet but I have some numbers at the bottom.
With those numbers I don't think it's possible for nginx to be responsible for the slowdown, if at all considering that the new nginx has slightly smaller impact on performance when used even though it also does GeoIP. So I'd say nginx got slightly faster.
At this point it's either erlang that got slower between R14B03 and R14B04 or Ubuntu 11.10 that got slower than 11.04. The new mochiweb and my new code are considerably faster than my old code with old mochiweb even though the new code does more things by default, such as generating cookies, fetching more headers and so on. I'm gonna first try with a new erlang and if that proves still slow then I'll either stick with the old (but soon unsupported version of Ubuntu or change distribution entirely).
Given that Ubuntu 11.10 comes with Linux 3.0 which has several big changes among which Xen support, new process scheduler, big changes in memory management that maybe cause these slow downs.
All ran 100000 requests at 500 concurrency.
Erlang R14B03:
Ubuntu 11.04 - New Code - Direct Connection: - Requests per second: 4271.01 [#/sec] (mean) - Latency: 95% 112, 98% 262ms, 99% 3059, 100% 9286
Ubuntu 11.04 - New Code - Through Nginx 0.8.54: - Requests per second: 2460.42 [#/sec] (mean) - Latency: 95% 212, 98% 3091, 99% 3135, 100% 9214
Ubuntu 11.04 - Old Code - Direct Connection: [Required one retry to finish due to timeouts] - Requests per second: 3318.86 [#/sec] (mean) - LatencyL 95% 133, 98% 1063, 99% 3077, 100% 17182
Ubuntu 11.04 - Old Code - Through Nginx: - Requests per second: 1897.05 [#/sec] (mean) - Timeouts: 21 - Latency: 95% 191, 98% 3118, 99% 3163, 100% 15006
Erlang R14B04:
Ubuntu 11.10 - New Code - Direct Connection: - Requests per second: 3306.48 [#/sec] (mean) - Latency: 95% 158, 98% 404, 99% 3086, 100% 9875
Ubuntu 11.10 - New Code - Through Nginx 1.0.5: - Requests per second: 1977.81 [#/sec] (mean) - Latency: 95% 244, 98% 3100, 99% 3168, 100% 9317
Ubuntu 11.10 - Old Code - Direct Connection: [Required several retries to finish due to timeouts] - Requests per second: 2966.25 [#/sec] (mean) - Latency: 95% 123, 98% 783, 99% 3058, 100% 23976
Ubuntu 11.10 - Old Code - Through Nginx 1.0.5: - Requests per second: 1880.54 [#/sec] (mean) - Timeouts: 34 - Latency: 95% 215, 98% 3100, 99% 3288, 100% 15870
- Did you check file descriptor size and soft limits of processes by any chance?- - Also is this possible to dump tcp traffic with tcpdump to see latency values? - Do you use alogger/lager etc. type disk based loggers?
>> It is an HTTP/1.0 proxy without the ability for keep-alive requests >> yet. (As a result, backend connections are created and destroyed on >> every request.) Nginx talks HTTP/1.1 to the browser and HTTP/1.0 to >> the backend server. As such it handles keep-alive to the browser.
> I've run more performance tests between different versions of:
> Ubuntu 11.04 and 11.10 > Nginx 0.8.54 and 1.0.5 > Erlang VM R14B03 and R14B04 (compiled both with same options)
> This mixed in with my new code and my old code, just using ab, not a great tool to know actual performance but good to value differences.
> I've also tried connecting directly to the mochiweb server and through nginx separately. I haven't run all of them yet but I have some numbers at the bottom.
> With those numbers I don't think it's possible for nginx to be responsible for the slowdown, if at all considering that the new nginx has slightly smaller impact on performance when used even though it also does GeoIP. So I'd say nginx got slightly faster.
> At this point it's either erlang that got slower between R14B03 and R14B04 or Ubuntu 11.10 that got slower than 11.04. The new mochiweb and my new code are considerably faster than my old code with old mochiweb even though the new code does more things by default, such as generating cookies, fetching more headers and so on. I'm gonna first try with a new erlang and if that proves still slow then I'll either stick with the old (but soon unsupported version of Ubuntu or change distribution entirely).
> Given that Ubuntu 11.10 comes with Linux 3.0 which has several big changes among which Xen support, new process scheduler, big changes in memory management that maybe cause these slow downs.
> All ran 100000 requests at 500 concurrency.
> Erlang R14B03:
> Ubuntu 11.04 - New Code - Direct Connection: > - Requests per second: 4271.01 [#/sec] (mean) > - Latency: 95% 112, 98% 262ms, 99% 3059, 100% 9286
> Ubuntu 11.04 - New Code - Through Nginx 0.8.54: > - Requests per second: 2460.42 [#/sec] (mean) > - Latency: 95% 212, 98% 3091, 99% 3135, 100% 9214
> Ubuntu 11.04 - Old Code - Direct Connection: > [Required one retry to finish due to timeouts] > - Requests per second: 3318.86 [#/sec] (mean) > - LatencyL 95% 133, 98% 1063, 99% 3077, 100% 17182
> -- > You received this message because you are subscribed to the Google Groups "MochiWeb" group. > To post to this group, send email to mochiweb@googlegroups.com. > To unsubscribe from this group, send email to mochiweb+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.
One thing that I re-checked is the machine that is clearly faster than the others was using a different CPU.
Re-running tests with a new instance and same CPU showed roughly the same results as the others, slightly better than old code overall, not 30% more, more like a few percentage points only and this sent me back to square one or rather -1 since now it's becoming a general performance issue...
> - Did you check file descriptor size and soft limits of processes by > any chance?-
> - Also is this possible to dump tcp traffic with tcpdump to see latency values?
I guess I can give this a try.
> - Do you use alogger/lager etc. type disk based loggers?
nginx has logging disabled (except for timeouts, exceptions, error responses, and so on, if everything goes well it doesn't log anything). erlang for the url tested doesn't log anything at all, otherwise for actual production code it does have disk based logging, this is practically the same code as the old server and this is the code path for a logging call:
internal_write(State=#log{handle=H}, Data) -> ok = file:write(H, Data), State.
Data is usually around 600 bytes per call and by default delayed_write waits for 64K before flushing.
I've thought of other potential issues like wrong balancing across the cores but I've started a new instance with 8 CPUs and the load spreads evenly across all of them and can handle more data than apparently its network lets through, running even 2 ab locally on this machines yields more than 12000 req/sec (and 2 cores are busy with ab). Non locally instead I get high latency on the 100% and 8000 req/sec or so max even with 5 other instances stressing the machine. On one physical Intel(R) Core(TM)2 CPU 6600 @2.40GHz that I have here with 4GB of RAM running Ubuntu 10.10 and erlang R14B03 I get the new code to do about 6000 req/sec with 99% latency at 14 and 100% latency at 3010.
If that's the case then I'm simply reaching the limit of the underlying network capacity for the instance and I need to increase the number of instances to fix all these issues.