Question about multi-core benchmark


James Cooper

Mar 29, 2011, 10:32:41 AM
to deft-we...@googlegroups.com
Hi there,

Very interesting project.  I stumbled on it from a StackOverflow answer yesterday.

Question about the benchmark at the bottom of: http://deftserver.appspot.com/

Deft appears to be ~2x faster running one instance than running 4 instances behind nginx.  Am I reading that correctly?  Is that due to the increased latency introduced by the nginx reverse proxy?

Was the deft server CPU bound in either benchmark?

Just curious, as I would think exploiting all 4 cores would increase performance.  Or perhaps I'm misunderstanding the benchmark.

thank you!

-- James

Roger Schildmeijer

Mar 29, 2011, 11:21:40 AM
to deft-we...@googlegroups.com
On Tue, Mar 29, 2011 at 4:32 PM, James Cooper <jamesp...@gmail.com> wrote:
Hi there,

Very interesting project.  I stumbled on it from a StackOverflow answer yesterday.

Thanks. 


Question about the benchmark at the bottom of: http://deftserver.appspot.com/

Deft appears to be ~2x faster running one instance than running 4 instances behind nginx.  Am I reading that correctly?  Is that due to the increased latency introduced by the nginx reverse proxy?

I will probably replace the second (rightmost) graph with a benchmark that illustrates how Deft, Tornado and node.js deal with requests when ~100k idle connections are hanging on the servers.

The reason for the non-intuitive benchmark result is that nginx (at least the version we used for the latest benchmark) does not support HTTP 1.1 keep-alive connections to its backend (i.e. there is no connection keep-alive between nginx and the four Deft/Tornado/node.js instances, and the TCP handshake procedure is pretty expensive relative to the rest of the work in the hello world example).
(N.B. nginx does support connection keep-alive between its clients and itself.)
 

Was the deft server CPU bound in either benchmark?

(Regarding the single Deft instance benchmark) IIRC we were pretty close to 100% CPU saturation (top -o cpu).

Just curious, as I would think exploiting all 4 cores would increase performance.  Or perhaps I'm misunderstanding the benchmark.

thank you!

-- James

// Roger Schildmeijer 

James Cooper

Mar 29, 2011, 11:59:27 AM
to deft-we...@googlegroups.com, Roger Schildmeijer
On Tuesday, March 29, 2011 8:21:40 AM UTC-7, Roger Schildmeijer wrote:

I will probably replace the second (rightmost) graph with a benchmark that illustrates how Deft, Tornado and node.js deal with requests when ~100k idle connections are hanging on the servers.

The reason for the non-intuitive benchmark result is that nginx (at least the version we used for the latest benchmark) does not support HTTP 1.1 keep-alive connections to its backend (i.e. there is no connection keep-alive between nginx and the four Deft/Tornado/node.js instances, and the TCP handshake procedure is pretty expensive relative to the rest of the work in the hello world example).
(N.B. nginx does support connection keep-alive between its clients and itself.)

Ah, that makes sense.  Thanks for the explanation.

-- James 

Bing Ran

Apr 18, 2011, 9:38:43 PM
to Deft Web Server
Then how would you improve the throughput by clustering multiple
instances of Deft?

On Mar 29, 11:21 pm, Roger Schildmeijer <schildmei...@gmail.com> wrote:

Roger Schildmeijer

Apr 19, 2011, 2:15:13 AM
to deft-we...@googlegroups.com
Hi Bing Ran,

There is no cluster support for Deft. It's not a distributed system. There are basically two ways to saturate more than one CPU core using Deft. The first one is to simply spawn more Deft instances on the same machine and e.g. use nginx as a reverse proxy/load balancer in front of the instances. It's the same problem/solution that the Tornado documentation talks about in the 'Performance' section (http://www.tornadoweb.org/documentation#performance).
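For reference, the nginx side of that setup could be sketched roughly like this (ports and upstream name are illustrative, not taken from the actual benchmark config):

```nginx
# Four Deft instances on one machine, load-balanced by nginx.
upstream deft_backends {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;
    server 127.0.0.1:8082;
    server 127.0.0.1:8083;
}

server {
    listen 80;
    location / {
        proxy_pass http://deft_backends;
    }
}
```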
The other one (purely theoretical as far as I'm concerned; I haven't tested it myself, so I don't know how this would impact performance generally) is to use e.g. thread pools in your RequestHandlers. But make sure that you return control to the ioloop through IOLoop.addCallback(..). Most of the methods, in fact all except IOLoop.addCallback, are _not_ thread-safe by design.
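As a rough illustration of that pattern (this is a plain-JDK sketch, not Deft's actual API: the single-threaded executor stands in for the ioloop, and submitting to it plays the role of IOLoop.addCallback, the one thread-safe entry point):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Offload blocking work to a pool, then marshal the result back onto the
// single "ioloop" thread so only that thread ever touches connection state.
public class ThreadPoolHandlerSketch {
    static final ExecutorService ioLoop = Executors.newSingleThreadExecutor();
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    static void handleRequest(String request, StringBuilder response,
                              CountDownLatch done) {
        // Run the (potentially expensive) work off the event loop...
        workers.submit(() -> {
            String result = request.toUpperCase(); // pretend this blocks
            // ...and hand the result back to the loop thread, analogous
            // to IOLoop.addCallback(..) in Deft.
            ioLoop.submit(() -> {
                response.append(result);
                done.countDown();
            });
        });
    }

    public static void main(String[] args) throws InterruptedException {
        StringBuilder response = new StringBuilder();
        CountDownLatch done = new CountDownLatch(1);
        handleRequest("hello world", response, done);
        done.await();
        System.out.println(response); // HELLO WORLD
        workers.shutdown();
        ioLoop.shutdown();
    }
}
```

The important property is that the worker threads never write to the response themselves; they only schedule a callback on the loop thread.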

// HTH Roger Schildmeijer


2011/4/19 Bing Ran <bing...@gmail.com>

Bing Ran

Apr 19, 2011, 3:53:15 AM
to deft-we...@googlegroups.com
Thanks, Roger, for your comment.
 
I guess I did not state my question clearly enough. Based on the benchmark you have published, 4 instances of Deft fronted with nginx actually performed worse than a single instance in terms of total throughput. So what's the point of this deployment model other than fault tolerance?
 
I have also noticed that load-balanced node.js and Tornado improved throughput, even if not by a lot, leaving Deft the only one that suffered from load balancing. Is that observation correct? So the real question is why the "competitors" can take advantage of load balancing to scale up but Deft cannot.
 
Adding a thread pool in the handler might work, but would it make the architecture look similar to Netty's, for example?
 
Keep up the nice work!
 
Bing

Roger Schildmeijer

Apr 19, 2011, 11:07:01 AM
to deft-we...@googlegroups.com
On Apr 19, 2011, at 9:53 AM, Bing Ran wrote:

Thanks, Roger, for your comment.
 
I guess I did not state my question clearly enough. Based on the benchmark you have published, 4 instances of Deft fronted with nginx actually performed worse than a single instance in terms of total throughput. So what's the point of this deployment model other than fault tolerance?

As stated earlier in this thread: 
"I will probably replace the second (rightmost graph) with a benchmark that illustrates how deft, tornado and node.js deals with request if ~100k idle connections are hanging on the servers".

The reason for this could be that nginx (at least that version) became the bottleneck in the benchmark. Also, there is no keep-alive between the Deft instances and nginx (no support for this in nginx), and the TCP handshake is pretty expensive relative to the work we are doing (a simple hello world). It could also be that Deft became the bottleneck because of the missing keep-alive. It would probably be easy to do a quick test with no keep-alive against a single Deft instance and compare the numbers with the previous ones.
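That quick test could be sanity-checked with nothing but the JDK. A sketch (this is not the actual benchmark setup; the local server and request count are illustrative, and sending "Connection: close" forces a fresh TCP handshake per request, like nginx talking to its backends):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class KeepAliveCheck {
    static HttpServer server;

    // Tiny hello-world server on an ephemeral port.
    static int startServer() throws IOException {
        server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", ex -> {
            byte[] body = "hello world".getBytes();
            ex.sendResponseHeaders(200, body.length);
            ex.getResponseBody().write(body);
            ex.close();
        });
        server.start();
        return server.getAddress().getPort();
    }

    // One GET; keepAlive=false adds "Connection: close", so every
    // request pays for a new TCP connection.
    static String fetch(int port, boolean keepAlive) throws IOException {
        HttpURLConnection c = (HttpURLConnection)
                new URL("http://127.0.0.1:" + port + "/").openConnection();
        if (!keepAlive) c.setRequestProperty("Connection", "close");
        try (InputStream in = c.getInputStream()) {
            return new String(in.readAllBytes());
        }
    }

    public static void main(String[] args) throws IOException {
        int port = startServer();
        for (boolean keepAlive : new boolean[] {true, false}) {
            long start = System.nanoTime();
            for (int i = 0; i < 200; i++) fetch(port, keepAlive);
            System.out.printf("keep-alive=%b: %d ms%n", keepAlive,
                    (System.nanoTime() - start) / 1_000_000);
        }
        server.stop(0);
    }
}
```

On loopback the handshake is cheap, so the gap understates what you would see over a real network, but the shape of the comparison is the same.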
  
 
I have also noticed that load-balanced node.js and Tornado improved throughput, even if not by a lot, leaving Deft the only one that suffered from load balancing. Is that observation correct? So the real question is why the "competitors" can take advantage of load balancing to scale up but Deft cannot.
 
Adding a thread pool in the handler might work, but would it make the architecture look similar to Netty's, for example?

I know too little about the Netty internals to give a correct answer. But maybe.

 
Keep up the nice work!

Thanks
 
Bing
 


// Roger

Bing Ran

Apr 19, 2011, 11:21:58 AM
to deft-we...@googlegroups.com
Hi,
 
Sorry for missing your answer in a previous message.
 
Yes, it would be interesting to see the result without the client-side keep-alive. I was just curious why the other two responded to nginx positively while Deft responded negatively.
 
Bing

Roger Schildmeijer

Apr 19, 2011, 11:51:53 AM
to deft-we...@googlegroups.com
On Apr 19, 2011, at 5:21 PM, Bing Ran wrote:

Hi,
 
Sorry for missing your answer in a previous message.

No problem :)

 
Yes, it would be interesting to see the result without the client-side keep-alive. I was just curious why the other two responded to nginx positively while Deft responded negatively.
 

Absolutely. I totally agree that the charts look a little bit weird the first time you see them.