It turns out that one million requests per second is easy with the right tools

Brian Hauer

Mar 4, 2014, 9:54:01 AM
to framework-...@googlegroups.com
As a bit of a teaser for Round 9 (coming soon, I promise!), I've written up a blog entry expressing my personal surprise at how easily achievable one million requests per second is with modern hardware and software.

http://www.techempower.com/blog/2014/03/04/one-million-http-rps-without-load-balancing-is-easy/

A short while back, my colleagues and I read with great interest Google's announcement that their load balancers were processing over a million requests per second.  That was fascinating and a serious achievement.  But it casts the whole scenario in a different light when you consider that a single server can service the same number of requests without any load balancing, and without any added system complexity at all.

I don't want to steal any thunder from Google's accomplishment with their load balancer.  But I do want to remind people that if you simply begin with higher-performance tools, you may find you don't need to invest time, effort, and money in complicated system architectures.  That is a principal reason for this project, after all: to remind everyone that performance varies widely across tools, and that a bit of up-front thought about it can save considerable effort and money later.

The hardware Peak has provided certainly makes our workstations seem quite inadequate.  I need to be able to process a million requests per second on my desktop!  :)

Adam Chlipala

Mar 4, 2014, 10:04:53 AM
to framework-...@googlegroups.com
On 03/04/2014 09:54 AM, Brian Hauer wrote:
> As a bit of a teaser for Round 9 (coming soon, I promise!), I've
> written up a blog entry expressing my personal surprise at how easily
> achievable one million requests per second is with modern hardware and
> software.

Here's a related question I've had for a while, in the context of these
benchmarks:

What are the latest figures on how many requests per second real web
sites receive in the first place? For each RPS requirement, how rare is
it to need that much performance, as a fraction of real sites out there?

In evaluating the TechEmpower benchmark results, can we somehow draw a
line of "good enough for almost everyone" throughput and, past that
point, focus on factors like how much programmer effort different
frameworks require?

Brian Hauer

Mar 4, 2014, 10:44:15 AM
to framework-...@googlegroups.com
Hi Adam,

Your question has the makings of a great conversation.  :)  Virtually no web sites in the world receive a million requests per second.  That kind of request load is orders of magnitude higher than the load seen by the vast majority of the web.

The most successful sites I have worked on have had millions of users and, across a collection of servers, might process several thousand (dynamic) requests per second at peak load times.  (It's worth noting that I'm virtually never concerned about the delivery of static assets on a high-traffic site, since those are handled by a CDN or by separate static servers.)

When we measure requests per second with a trivial response, such as that tested by Google and mimicked by us for the above blog entry, we intend for this to be a proxy for application performance.  Furthermore, with any realistic response payload it's all too easy to saturate gigabit Ethernet with a modestly high-performance server.  Even a fairly small response of about 1,000 bytes means you need only roughly 80,000 requests per second to totally saturate a gigabit Ethernet connection, once headers and protocol overhead are counted.
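
For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope sketch in Python.  The 1,500-byte wire size per response is my assumption (a 1,000-byte body plus headers and TCP/IP overhead), not a measured figure:

    # Back-of-the-envelope: requests per second needed to fill a 1 Gbit/s link.
    # bytes_on_wire is an assumed figure (1,000-byte body plus headers and
    # TCP/IP overhead); adjust it for your own payloads.
    gigabit_bytes_per_sec = 1_000_000_000 / 8   # = 125,000,000 bytes per second
    bytes_on_wire = 1_500
    saturation_rps = gigabit_bytes_per_sec / bytes_on_wire
    print(f"~{saturation_rps:,.0f} requests/second to saturate the link")
    # prints: ~83,333 requests/second to saturate the link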

The tests in our project are one or two steps closer to testing real-world application behaviors, but they too are proxies for real applications.  Even the Fortunes test, which exercises the widest spectrum of framework and platform functions in our project, remains a very simple workload compared to most real-world applications.
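
To give a sense of how little even Fortunes asks of a framework, here is an illustrative Python sketch of what a Fortunes-style handler does.  The db and render_template helpers are hypothetical stand-ins, not the actual test harness:

    # Illustrative sketch of a Fortunes-style request handler.
    # "db" and "render_template" are hypothetical helpers, not real APIs.
    def fortunes_handler(db, render_template):
        rows = db.query("SELECT id, message FROM fortune")      # fetch every row
        rows.append({"id": 0, "message": "Additional fortune added at request time."})
        rows.sort(key=lambda row: row["message"])               # sort in application code
        return render_template("fortunes.html", fortunes=rows)  # server-rendered HTML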

For that reason, when I've been asked to describe the value in this data, I routinely suggest that if we are comfortable using these data as a proxy, and we are comfortable applying a very rough coefficient to the numbers to represent the additional workload of our real-world applications (say, 0.01 or 0.001), we can begin to map the data to something realistic.

For example, imagine I am evaluating two framework options that I am comfortable with for all other reasons (language, expressiveness, community, developer efficiency).  Framework A hits 5,000 requests per second on Fortunes while Framework B reaches 500.  If I believe my application will be about 100 times more complex than Fortunes, I can roughly compute that my application may see either 50 requests per second (5,000 / 100) from Framework A or 5 requests per second (500 / 100) from Framework B.  50 might be acceptable for my use case, but 5 probably is not.  So I'd favor Framework A.
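
A minimal sketch of that estimate in Python, using the numbers from the example above; the complexity factor is a judgment call rather than a measured quantity:

    # Rough capacity estimate: benchmark throughput scaled by a complexity factor.
    # The factor of 100 ("my app is ~100x heavier than Fortunes") is a guess.
    complexity_factor = 100
    fortunes_rps = {"Framework A": 5_000, "Framework B": 500}

    for framework, rps in fortunes_rps.items():
        print(f"{framework}: ~{rps / complexity_factor:.0f} requests/second estimated")
    # Framework A: ~50 requests/second estimated
    # Framework B: ~5 requests/second estimated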

In practice, I may then invest the time to build out a proof of concept and benchmark my specific application on A and B to confirm my hunch.  The above assumes I'm evaluating only two options.  But the greater value comes from situations where I am fairly flexible on the other variables and am therefore open to considering a spread of options.  I can't reasonably implement my proof-of-concept code on a dozen or more frameworks, so the proxy data helps me narrow the field.

But to circle back to raw requests-per-second numbers measured in the millions: no, that alone is not interesting to most of the world, because without a bigger picture of how the system performs once your application code is in play, it's hard to know whether you're dealing with a highly optimized web server paired with a sluggish application stack.  Of course, in this case we know that Undertow runs on the JVM, so your application stack is going to be high-performance as well.

For this particular metric, it's a bit more like bragging rights.

Speaking big-picture, I feel we have more than adequate coverage of trivial tests in this project, and future test types should exercise more complex operations.  At some point, I want to have test types that compress even the top performers into the realm of hundreds of requests per second.