surprising web server performance

Sam Tobin-Hochstadt

Jul 13, 2015, 6:50:28 PM
to d...@racket-lang.org
Summary: `web-server` slows down a lot under high concurrency, which
the "More" tutorial web server doesn't. But "More" has bad 99th-percentile
latency, probably for GC reasons, which `web-server` doesn't. Any help
in understanding the behavior would be appreciated.

----

When I was at PolyConf two weeks ago, I ran a quick benchmark on a web
server created by `#lang web-server/insta`, and I discovered that it
did quite poorly under high concurrency. Here are some results from
`ab`:

10k requests, concurrency 2:

Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 1
95% 1
98% 2
99% 5
100% 9 (longest request)

concurrency 1000:

Percentage of the requests served within a certain time (ms)
50% 309
66% 661
75% 1114
80% 1240
90% 1648
95% 3333
98% 3864
99% 3996
100% 7575 (longest request)

We've gone from very fast to very slow here.

I then tried the web server constructed in the "More" tutorial, which
is much simpler:

concurrency 2:

Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 2
80% 2
90% 4
95% 5
98% 6
99% 6
100% 435 (longest request)

Very similar, but a big spike at the longest request.

concurrency 1000:

Percentage of the requests served within a certain time (ms)
50% 2
66% 3
75% 4
80% 4
90% 5
95% 6
98% 19
99% 748
100% 4646 (longest request)

Now this is the surprising one: the median has barely budged, and the
98th percentile is still quite good (19 ms vs. 3864 ms, about 200x better
than `web-server/servlet`). But the longest request is again very slow.
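
To make the comparison concrete, here is a small sketch that recomputes the ratios from the four `ab` tables above. The numbers are copied from this message; the variable and function names are just illustrative, and this is plain arithmetic on the reported percentiles, not a re-run of the benchmark.

```python
# Percentile latencies (ms) copied from the `ab` output in this message.
PERCENTILES = [50, 66, 75, 80, 90, 95, 98, 99, 100]

# Keyed by concurrency level.
web_server = {
    2:    [1, 1, 1, 1, 1, 1, 2, 5, 9],
    1000: [309, 661, 1114, 1240, 1648, 3333, 3864, 3996, 7575],
}
more = {
    2:    [1, 1, 2, 2, 4, 5, 6, 6, 435],
    1000: [2, 3, 4, 4, 5, 6, 19, 748, 4646],
}

def ratios(numer, denom):
    """Element-wise ratio of two latency lists, keyed by percentile."""
    return {p: n / d for p, n, d in zip(PERCENTILES, numer, denom)}

# How much slower is `web-server` than More at concurrency 1000?
vs_more = ratios(web_server[1000], more[1000])

# How much does each server slow down going from concurrency 2 to 1000?
ws_slowdown = ratios(web_server[1000], web_server[2])
more_slowdown = ratios(more[1000], more[2])

for p in PERCENTILES:
    print(f"{p:>3}%  web-server/More: {vs_more[p]:7.1f}x   "
          f"web-server 2->1000: {ws_slowdown[p]:7.1f}x   "
          f"More 2->1000: {more_slowdown[p]:6.1f}x")
```

At the 98th percentile the ratio comes out to roughly 203x, but it drops to about 5x at the 99th percentile and about 1.6x for the longest request, which is where More's tail shows up.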

Using the `gcstats` package, it looks like the `web-server` version is
allocating about 3.5 GB, while the More version allocates about 450
MB.

Strangely, the `web-server` version allocates much more in the
1000-concurrent setting than in the 2-concurrent setting.

Also, the longest request in More is significantly longer than the
_entire_ GC time for that program, and much longer than the max pause
time.

Overall, I find these results surprising in a bunch of ways, and can't
really explain them. But I think they indicate that a variety of
things could be improved.

Sam