High latency when processing many concurrent connections

Travis Bell

Mar 10, 2014, 2:31:16 PM
to golia...@googlegroups.com
Hey guys,

I've been able to build a pretty decent little proxy thanks to both the examples by Ilya and the comments from some of you here on the mailing list. I've been testing it under a bit of load and have noticed that while the requests per second stay consistent, the latency of each request grows by quite a lot. What I'm wondering is whether this is normal. The problem is that it's a substantial amount of latency, and truthfully, too much to put into production. We have existing load, so I can predict and test what real-world usage looks like.

Let me elaborate with some numbers and data.

A single request with no load produces very nice numbers:

[41372:INFO] 2014-03-10 12:28:09 :: trace.start:3.54, pre_process_beg:0.08, received_usage_info:0.33, pre_process_end:0.07, received_downstream_resp:7.14, post_process_beg:0.03, post_process_end:0.04, total:11.22999

Then, I decided to run an ab test, hitting the backend server directly (ab -c200 -n1000 ...):

[41944:INFO] 2014-03-10 12:28:03 :: trace.start:2.69, pre_process_beg:0.04, received_usage_info:461.25, pre_process_end:0.03, received_downstream_resp:148.02, post_process_beg:0.02, post_process_end:0.02, total:612.0699999999999

Every number looks about like what I'd expect except for received_usage_info. 461ms! Whoa. The worst part is that this number grows essentially exponentially with more load. Testing with 300 concurrent users, it takes over a second to reply.

Here's a copy of Ilya's auth_and_rate_limit.rb demo, but with a small change we had to make ( https://gist.github.com/travisbell/ff3184fbdc2d82ba92e0 ). In Ilya's example, when doing lazy authorization, the request still hits the backend even if you're over the rate limit. That wasn't acceptable for us, so in my version I changed it to auth every single request. There might be another way around this (please share if there is), but if we don't auth every request before the proxy call, we're still sending thousands and thousands of requests per second to our backend, which is exactly what this barrier is being put in place to avoid.
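
In case it helps anyone skimming without opening the gist, here's a rough sketch of the shape of the change. This is simplified and not the exact code from the gist; the key name, the 1000-request limit, and the em-hiredis counter are just illustrative (em-mongo slots in the same way), and it assumes Goliath::Rack::Params has already parsed the api_key param.

require 'goliath'
require 'em-synchrony'
require 'em-hiredis'

class BlockEveryRequest < Goliath::Rack::AsyncMiddleware
  LIMIT = 1000  # illustrative per-key cap, not our real number

  def redis
    # Lazily connect inside the reactor.
    @redis ||= EM::Hiredis.connect
  end

  def call(env)
    key = "usage:#{env.params['api_key']}"

    # Fiber-blocking round trip to Redis on *every* request -- this is the
    # wait that shows up as received_usage_info in the traces above.
    count = EM::Synchrony.sync(redis.incr(key)).to_i

    if count > LIMIT
      # Over the limit: answer here and never make the proxy call downstream.
      [403, { 'Content-Type' => 'application/json' }, '{"error":"over rate limit"}']
    else
      super(env)  # under the limit: let the request proceed to the backend
    end
  end
end

The middleware gets pulled in with a plain "use BlockEveryRequest" in the API class, same as the other Goliath::Rack middleware.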

Also worth noting: I have working versions using both em-mongo and em-hiredis. The numbers are essentially identical with both, which tells me the datastore clients are not the issue.

Is this just the best we've got, or is there something critically wrong? Any insight would be super helpful! Thanks guys.
