-Ben
Maybe Amazon is throttling your IP for making that many parallel requests.
-Roberto.
There are at least two possible definitions of "performance" here:
1) how long it takes to complete the entire test
2) how long it takes for the longest request to complete
The key difference here is this: imagine these requests were made, not
by a single test program, but by 1000 users. With the synchronous
client, the 1000th user wouldn't even begin to see a result until the
previous 999 users had finished, whereas with the async client, he would
probably wait no longer than the average time.
Threading would reveal a similar pattern. Running tasks in serial will
always complete faster than running them in parallel if all you measure
is the overall time to complete all tasks. If you consider the time of
the *slowest* task (i.e. the 1000th task in your test), then suddenly
the async (or threaded) system will appear to have much better
performance. Standard deviation is the statistical tool that reveals
this spread between requests:
http://www.blackbeak.com/2008/04/16/using-standard-deviations-to-determine-web-analytics-benchmarks/
In general, for real-world systems, it's usually preferable to have
vastly smaller standard deviation at the expense of a slightly higher
average; all users wait 1s longer so that no user waits 1000s.
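
To make the two definitions concrete, here is a minimal sketch that
reports both from a list of per-request wait times. The helper names
and all of the numbers are made up for illustration; they are not
measurements from the test discussed in this thread.

```python
import statistics


def serial_waits(durations):
    """Wait seen by each user when requests run one after another.

    User k cannot finish until users 1..k-1 have finished, so each
    wait is the running total of the durations so far.
    """
    waits, clock = [], 0.0
    for d in durations:
        clock += d
        waits.append(clock)
    return waits


def summarize(waits):
    """Report overall completion time alongside per-request statistics."""
    return {
        "total": max(waits),                # definition 1: whole test done
        "worst": max(waits),                # definition 2: slowest request
        "mean": statistics.mean(waits),
        "stdev": statistics.pstdev(waits),  # spread between users
    }


# Hypothetical data: five requests of 1s each, run serially...
serial = summarize(serial_waits([1.0] * 5))
# ...versus five made-up concurrent waits that all land close together.
concurrent = summarize([4.8, 5.0, 5.1, 5.2, 5.3])

if __name__ == "__main__":
    for name, stats in (("serial", serial), ("concurrent", concurrent)):
        print(name, {k: round(v, 3) for k, v in stats.items()})
```

When every request starts at the same moment, "total" and "worst" are
the same number, so the mean and the standard deviation are what
separate the two runs.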
Regards,
Cliff
Because running tasks in parallel implies overhead and resource
sharing/consumption that running them in serial does not (e.g. context
switches, bandwidth limits, disk I/O limits, memory pressure, CPU cache
misses, etc). This overhead may exist in the client, on the remote
server, in the network, or more likely, be present in every part of the
stack.
Async applications suffer less from this than threaded equivalents
because they are serialized at the execution level (so context switches
aren't involved), but your particular test involves other services that
are probably not async, and some of the overhead (such as bandwidth,
memory, etc) still applies to async apps in any case.
Note that I'm not saying your conclusion is wrong (you may be right),
but rather that your testing methodology is inconclusive. You'd need a
much more isolated environment (read: not cloud-based, not testing
against a server with unknown performance properties, etc). Try your
same test on a LAN and test against something with a more predictable
performance baseline (e.g. Nginx serving a static file rather than
SDB).
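
As a rough sketch of what a more isolated test could look like, the
following uses only the Python standard library. The assumptions are
loud ones: a local http.server instance stands in for Nginx, a fixed
byte string stands in for the static file, and a thread pool stands in
for the concurrent client. It does not exercise AsyncHTTPClient; it
only shows the shape of a benchmark with a predictable server.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"hello\n"  # stand-in for a static file

# Ignore any proxy settings so requests really stay on this machine.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep benchmark output quiet


class LocalServer(ThreadingHTTPServer):
    request_queue_size = 128  # larger listen backlog for bursts of connects


def timed_fetch(url):
    """Fetch the URL once and return the latency in seconds."""
    start = time.perf_counter()
    with OPENER.open(url, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start


def run_benchmark(url, n, workers):
    """Return (wall_time, per-request latencies) for n requests."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(timed_fetch, [url] * n))
    return time.perf_counter() - start, latencies


def main(n=20):
    server = LocalServer(("127.0.0.1", 0), StaticHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        return {
            "serial": run_benchmark(url, n, workers=1),
            "threaded": run_benchmark(url, n, workers=n),
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for name, (wall, latencies) in main().items():
        print("%s: wall=%.4fs worst=%.4fs" % (name, wall, max(latencies)))
```

With workers=1 the pool degenerates to serial execution, so both runs
go through the same code path and differ only in concurrency.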
Cliff
It seems somewhat better than your first test, but I think I'd still
prefer a more controlled environment such as Nginx serving a static file
that lies on the same LAN as your client. The internet is an unreliable
test platform (unless you are testing your internet connection).
Cliff
What is the problem you are trying to solve, again?
Yes, but your benchmark won't be. You need to ask yourself: am I
benchmarking the internet, or am I benchmarking a particular piece of
software? There might be a real issue with AsyncHTTPClient, but until
you've actually demonstrated that, I don't think anyone will spend much
time trying to track it down.
The test you've configured has far too many variables to determine where
the bottleneck might be when making lots of concurrent requests: is it your
network throughput? Is it AsyncHTTPClient? Is it the database server?
Is it your cloud platform? Your test raises more questions than it
answers.
Cliff
More important than any of these variables is that you still haven't
told us much about what you're actually testing. You've said you're
sending 100 concurrent requests with the synchronous HTTP client, but
haven't specified whether you're using threads or processes, etc. If
you post your code we'll probably be able to tell what's causing the
difference, but without that we're just taking shots in the dark.
-Ben