golang helloworld 45% slower than node.js


ChrisLu

Jun 16, 2011, 2:43:09 PM
to golang-nuts
Kind of disappointing. Golang is supposedly "closer" to the metal, so
I didn't expect it to be merely comparable to node.js, let alone for
node.js to actually be 45% faster than golang.

In the test, GOMAXPROCS is set to 1. Setting it to higher numbers
actually does not have much effect.

For go:
Concurrency Level: 100
Time taken for tests: 152.330 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 110000000 bytes
HTML transferred: 14000000 bytes
Requests per second: 6564.69 [#/sec] (mean)
Time per request: 15.233 [ms] (mean)
Time per request: 0.152 [ms] (mean, across all concurrent requests)
Transfer rate: 705.19 [Kbytes/sec] received


For node.js:
Concurrency Level: 100
Time taken for tests: 104.538 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 78000000 bytes
HTML transferred: 14000000 bytes
Requests per second: 9565.93 [#/sec] (mean)
Time per request: 10.454 [ms] (mean)
Time per request: 0.105 [ms] (mean, across all concurrent requests)
Transfer rate: 728.66 [Kbytes/sec] received


Here is the code for go and node.js

go code:

package main

import (
	"http"
	"io"
	"runtime"
)

func HelloServer(w http.ResponseWriter, req *http.Request) {
	io.WriteString(w, "hello, world!\n")
}

func main() {
	runtime.GOMAXPROCS(1)
	http.HandleFunc("/", HelloServer)
	http.ListenAndServe(":8080", nil)
}

node.js code:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('hello, world!\n');
}).listen(8080, "127.0.0.1");
console.log('Server running at http://127.0.0.1:8080/');

ChrisLu

Jun 16, 2011, 2:44:57 PM
to golang-nuts
Sorry, correction: the title should be "node.js is 45% faster than
golang".

Brad Fitzpatrick

Jun 16, 2011, 2:50:10 PM
to ChrisLu, golang-nuts
Ryan & gang have spent a ton of time optimizing their http parser (in C, not JavaScript).  And the v8 team have spent a lot of time on their javascript -> native code compiler.

The Go team has spent no time optimizing the http package.  I added a benchmark a few weeks back but didn't start optimizing anything yet.
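
For anyone who wants to measure just the handler, here is a minimal sketch. It is not the benchmark mentioned above (that one is not shown in this thread), and it assumes a current Go toolchain where the package is net/http rather than the 2011-era http. The names helloServer and serveOnce are made up for illustration. It runs the handler against an in-memory recorder, so it leaves out the network stack and the request parser entirely.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

// helloServer mirrors the handler from the original post.
func helloServer(w http.ResponseWriter, req *http.Request) {
	io.WriteString(w, "hello, world!\n")
}

// serveOnce runs the handler against an in-memory recorder
// and returns the response body.
func serveOnce() string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/", nil)
	helloServer(rec, req)
	return rec.Body.String()
}

func main() {
	// testing.Benchmark lets a plain program run a benchmark function.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			serveOnce()
		}
	})
	fmt.Println("handler only:", res)
}
```

Because no socket is involved, the numbers from this sketch are not comparable to the ab results in this thread; it only gives a floor for the cost of the handler itself.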

How did you run your tests?  Is that 'ab'?

Brad Fitzpatrick

Jun 16, 2011, 2:51:42 PM
to ChrisLu, golang-nuts
Also, I'm not sure the tests are equivalent:  the Go version will be doing http chunking and I think the node.js one is (or could be) sending a Content-Length.
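
A minimal sketch of how a handler can declare its length up front so the response is not chunked. It assumes the current net/http package (not the 2011 http package used in this thread), and helloServer and fetch are illustrative names. The sketch starts a throwaway local server and reports how the response was framed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

const body = "hello, world!\n"

func helloServer(w http.ResponseWriter, req *http.Request) {
	// Declaring the length up front means the server has no
	// reason to fall back to chunked transfer encoding.
	w.Header().Set("Content-Type", "text/plain")
	w.Header().Set("Content-Length", strconv.Itoa(len(body)))
	io.WriteString(w, body)
}

// fetch starts a local test server, performs one GET and reports
// the Content-Length and Transfer-Encoding the client observed.
func fetch() (int64, []string, error) {
	srv := httptest.NewServer(http.HandlerFunc(helloServer))
	defer srv.Close()
	resp, err := http.Get(srv.URL)
	if err != nil {
		return 0, nil, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.ContentLength, resp.TransferEncoding, nil
}

func main() {
	n, te, err := fetch()
	if err != nil {
		panic(err)
	}
	fmt.Println("Content-Length:", n, "Transfer-Encoding:", te)
}
```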


bflm

Jun 16, 2011, 2:53:43 PM
to golang-nuts
On Jun 16, 8:44 pm, ChrisLu <chris...@gmail.com> wrote:

I wonder why the ratio of

Total transferred: 110000000 bytes
Total transferred: 78000000 bytes

is so close to 1.45?

Scott Lawrence

Jun 16, 2011, 2:54:02 PM
to golan...@googlegroups.com
It appears that the go version is sending significantly more data -
about 41% more. Probably not a fair test.
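
The arithmetic can be checked with a short sketch using only the figures from the two ab reports in the first post. The helper names perRequest and ratio are made up for illustration. It works out to 110 bytes per response for go against 78 for node.js (the body is 14 bytes in both), a byte ratio of roughly 1.41 and a requests-per-second ratio of roughly 1.46.

```go
package main

import "fmt"

// perRequest returns the average number of bytes sent per request.
func perRequest(totalBytes, requests float64) float64 {
	return totalBytes / requests
}

// ratio returns a divided by b.
func ratio(a, b float64) float64 {
	return a / b
}

func main() {
	const requests = 1000000
	goBytes := perRequest(110000000, requests)  // go: "Total transferred"
	nodeBytes := perRequest(78000000, requests) // node.js: "Total transferred"

	fmt.Printf("go: %.0f B/req, node.js: %.0f B/req\n", goBytes, nodeBytes)
	fmt.Printf("bytes ratio go/node.js: %.2f\n", ratio(goBytes, nodeBytes))
	fmt.Printf("req/s ratio node.js/go: %.2f\n", ratio(9565.93, 6564.69))
}
```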

--
Scott Lawrence


FanWall

Jun 16, 2011, 3:03:11 PM
to golan...@googlegroups.com
the response header:

go version:
Content-Type:text/html; charset=utf-8
Date:Thu, 16 Jun 2011 19:00:53 GMT
Transfer-Encoding:chunked

nodejs version:
Connection:keep-alive
Content-Type:text/plain
Transfer-Encoding:chunked


Gustavo Niemeyer

Jun 16, 2011, 3:16:12 PM
to ChrisLu, golang-nuts
> node.js code:
>
> var http = require('http');
> http.createServer(function (req, res) {
>  res.writeHead(200, {'Content-Type': 'text/plain'});
>  res.end('hello, world!\n');
> }).listen(8080, "127.0.0.1");
> console.log('Server running at http://127.0.0.1:8080/');

Besides the other comments made, one should note that even _bash_ is
faster than Go.. try this snippet, for instance:

$ nginx

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/twitter

Chris Lu

Jun 16, 2011, 3:19:50 PM
to Brad Fitzpatrick, golang-nuts
You are right. Go is doing something extra.

Total transferred:      110000000 bytes for go
Total transferred:      78000000 bytes for node.js

Factoring the difference in, Golang seems comparable to node.js.

But still, node.js is not really the fastest. I was hoping Golang could be orders of magnitude faster than node.js.

Chris

FanWall

Jun 16, 2011, 3:18:37 PM
to golan...@googlegroups.com
package main

import (
	"http"
	"log"
)

func HelloServer(w http.ResponseWriter, req *http.Request) {
	w.Header().Set("Content-Type", "text/plain")
	w.Header().Set("Connection", "keep-alive")
	w.Write([]byte("hello, world!\n"))
}

func main() {
	http.HandleFunc("/", HelloServer)
	log.Println("Serving at http://127.0.0.1:8080/")
	http.ListenAndServe(":8080", nil)
}



I don't know how to remove the "Date: Thu, 16 Jun 2011 19:00:53 GMT" header.
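
For readers on a current Go release, a minimal sketch of one way to drop the header: the net/http documentation for ResponseWriter says that setting an automatic header's value to nil suppresses it. This is an assumption about today's net/http, not the 2011 http package discussed in this thread, and helloServer and dateHeader are illustrative names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

func helloServer(w http.ResponseWriter, req *http.Request) {
	h := w.Header()
	h.Set("Content-Type", "text/plain")
	// A nil value tells the server not to add its automatic Date header.
	h["Date"] = nil
	io.WriteString(w, "hello, world!\n")
}

// dateHeader starts a local test server and returns the Date header
// a client sees, or "" if the header is absent.
func dateHeader() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(helloServer))
	defer srv.Close()
	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.Header.Get("Date"), nil
}

func main() {
	d, err := dateHeader()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Date header seen by client: %q\n", d)
}
```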

Brad Fitzpatrick

Jun 16, 2011, 3:19:23 PM
to golan...@googlegroups.com
Looking at a tcpdump, ab (apache bench) doesn't even use HTTP/1.1!  It's setting up a new connection for each request?
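
To see what a new connection per request means for the server, here is a minimal sketch that counts accepted TCP connections with and without keep-alive. It assumes the current net/http and httptest packages; countConns is a made-up helper and the request count of 20 is arbitrary. It says nothing about ab itself, it only shows the two client behaviours side by side.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countConns issues n sequential GETs against a local test server and
// returns how many TCP connections the server accepted.
func countConns(n int, keepAlive bool) (int64, error) {
	var conns int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, "hello, world!\n")
		}))
	// ConnState is called with StateNew once per accepted connection.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&conns, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := &http.Client{
		Transport: &http.Transport{DisableKeepAlives: !keepAlive},
	}
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
		resp.Body.Close()
	}
	return atomic.LoadInt64(&conns), nil
}

func main() {
	for _, ka := range []bool{false, true} {
		n, err := countConns(20, ka)
		if err != nil {
			panic(err)
		}
		fmt.Printf("keep-alive=%v: 20 requests used %d connection(s)\n", ka, n)
	}
}
```

Without keep-alive every request pays for a TCP handshake and teardown, so a benchmark run that way measures accept and close overhead as much as HTTP handling. With keep-alive the sequential requests normally share a single connection.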

Scott Lawrence

Jun 16, 2011, 3:19:53 PM
to golan...@googlegroups.com
On 06/16/2011 03:19 PM, Chris Lu wrote:
> You are right. Go is doing something extra.
>
> Total transferred: 110000000 bytes for go
> Total transferred: 78000000 bytes for node.js
>
> Factor the difference in, seems Golang is comparable to node.js.

Not necessarily.

It's not just that go is serving more data - it may take more time to
generate that data. With trivially small pages, it's hard to tell where
the time is going - for all we know, it's going into some trivial
function call as part of printing the date string.
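
One way to test a guess like that is to time the suspect step in isolation. This is a minimal sketch, assuming a current Go toolchain; httpDate and sampleDate are made-up helpers, and the formatting call is only a stand-in for whatever the 2011 server did internally, which is not shown in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"testing"
	"time"
)

// httpDate formats t the way an HTTP Date header is written.
func httpDate(t time.Time) string {
	return t.UTC().Format(http.TimeFormat)
}

// sampleDate formats the timestamp seen in the response headers
// posted earlier in this thread.
func sampleDate() string {
	return httpDate(time.Date(2011, time.June, 16, 19, 0, 53, 0, time.UTC))
}

func main() {
	now := time.Now()
	// Time only the date formatting, nothing else.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			httpDate(now)
		}
	})
	fmt.Println(sampleDate())
	fmt.Println("format Date header:", res)
}
```

Comparing the per-operation cost against the per-request time from ab gives a rough idea of whether the date string could matter at all.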

>
> But still, node.js is not really the fastest. I was hoping Golang can be
> orders of magnitude faster than node.js.


--
Scott Lawrence


ChrisLu

Jun 16, 2011, 6:21:45 PM
to golang-nuts
Thanks for all the quick replies! I admit this is not a scientific
way to compare golang with node.js.

Here are the results after trimming the fat from go's http response.
Now the go version transfers about 25% less data than the node.js
version, but the node.js version is still 38% faster.

I modified the go code to set the Content-Type just as node.js does,
and removed the line in http/server.go that prints the "Date ...."
header.
Now here is the transfer size:

Total transferred: 59000000 bytes (for go)
Requests per second: 6933.08 [#/sec] (mean)
Total transferred: 78000000 bytes (for node.js)
Requests per second: 9565.93 [#/sec] (mean)

However, go is still slower than node.js. The full ab output for go:

Concurrency Level: 100
Time taken for tests: 144.236 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 59000000 bytes
HTML transferred: 14000000 bytes
Requests per second: 6933.08 [#/sec] (mean)
Time per request: 14.424 [ms] (mean)
Time per request: 0.144 [ms] (mean, across all concurrent requests)
Transfer rate: 399.46 [Kbytes/sec] received

Chris

Here is the go code:

package main

import (
	"http"
	"log"
	"runtime"
)

func HelloServer(w http.ResponseWriter, req *http.Request) {
	w.Header().Set("Content-Type", "text/plain")
	w.Write([]byte("hello, world!\n"))
}

func main() {
	runtime.GOMAXPROCS(1)
	http.HandleFunc("/", HelloServer)
	// Log before ListenAndServe, which blocks while the server runs.
	log.Println("Serving at http://127.0.0.1:8080/")
	http.ListenAndServe(":8080", nil)
}

Andrew Gerrand

Jun 16, 2011, 6:41:37 PM
to Chris Lu, Brad Fitzpatrick, golang-nuts
On 17 June 2011 05:19, Chris Lu <chri...@gmail.com> wrote:
> You are right. Go is doing something extra.
>
> Total transferred:      110000000 bytes for go
> Total transferred:      78000000 bytes for node.js
>
> Factor the difference in, seems Golang is comparable to node.js.
>
> But still, node.js is not really the fastest. I was hoping Golang can be
> orders of magnitude faster than node.js.

Orders of magnitude faster? 10 to 100 times faster? Node is supposed
to be pretty fast. I assume the node guys aren't a bunch of idiots, so
I'd be surprised if any web server is that fast.

Andrew

Brad Fitzpatrick

Jun 16, 2011, 6:43:37 PM
to ChrisLu, golang-nuts
You're cutting some of the more interesting results, though.

I've reproduced this, including removing the Date field, and here are the results for Node vs. Go. Node first:

Concurrency Level:      20
Time taken for tests:   3.486 seconds
Complete requests:      20000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      1560000 bytes
HTML transferred:       280000 bytes
Requests per second:    5737.63 [#/sec] (mean)
Time per request:       3.486 [ms] (mean)
Time per request:       0.174 [ms] (mean, across all concurrent requests)
Transfer rate:          437.05 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     1    3   1.8      3      15
Waiting:        1    3   1.8      3      15
Total:          1    3   1.8      3      15

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%      5
  80%      5
  90%      6
  95%      6
  98%      7
  99%     10
 100%     15 (longest request)

And Go:

Concurrency Level:      20
Time taken for tests:   4.504 seconds
Complete requests:      20000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      1180000 bytes
HTML transferred:       280000 bytes
Requests per second:    4440.89 [#/sec] (mean)
Time per request:       4.504 [ms] (mean)
Time per request:       0.225 [ms] (mean, across all concurrent requests)
Transfer rate:          255.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     1    4   0.6      4       9
Waiting:        1    4   0.6      4       8
Total:          1    4   0.6      4       9

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      5
  75%      5
  80%      5
  90%      5
  95%      6
  98%      6
  99%      6
 100%      9 (longest request)


Go's median is about a millisecond slower, but much more consistent.  Standard Deviation of 0.6 ms over Node's 1.8 ms.

Node's total times run from a 3 ms median up to a 15 ms maximum.  Go's run from 4 ms up to 9.
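
For reference, a minimal sketch of how the mean, median and standard deviation in a report like this can be computed from raw latencies. The sample values are hypothetical, not taken from either run, and the sketch uses the population standard deviation, which may differ from the exact formula ab applies.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// sample holds hypothetical request latencies in milliseconds.
var sample = []float64{2, 4, 4, 4, 5, 5, 7, 9}

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of xs without modifying it.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// stddev returns the population standard deviation of xs.
func stddev(xs []float64) float64 {
	m := mean(xs)
	ss := 0.0
	for _, x := range xs {
		d := x - m
		ss += d * d
	}
	return math.Sqrt(ss / float64(len(xs)))
}

func main() {
	fmt.Printf("mean=%.1f median=%.1f sd=%.1f\n",
		mean(sample), median(sample), stddev(sample))
}
```

Two servers with the same mean can have very different spreads, which is why the standard deviation and the percentile table are worth reading alongside requests per second.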

This is a little silly of a benchmark, since you're comparing optimized C code against unoptimized Go code.  There's barely any JavaScript being run here.  In real life you'd be doing work in your handlers.  In real-life you'd have a bunch of persistent connections too.

But while silly, it's also fun, so I'm fairly confident I'll find some time to beat Node at this particular benchmark.  :-)

ChrisLu

Jun 16, 2011, 7:18:34 PM
to golang-nuts
This is actually very close to a real production server, which serves
JSON strings; the JSON data only needs to be refreshed every 1 to 5
minutes.

BTW: setting GOMAXPROCS to 2 yields 9287.10 RPS, very close to
node.js's 9565.93 RPS.
However, go is then using 2 cores while node.js is only using one.
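
A minimal sketch of querying and changing the setting at run time, using runtime.GOMAXPROCS and runtime.NumCPU from the standard library. setProcs is a made-up helper. Note that since Go 1.5 the default is the number of available CPUs, whereas the 2011 runtime used in this thread defaulted to 1.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to n and returns the previous
// and the resulting setting.
func setProcs(n int) (prev, cur int) {
	prev = runtime.GOMAXPROCS(n)
	cur = runtime.GOMAXPROCS(0) // an argument of 0 only queries
	return prev, cur
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	prev, cur := setProcs(2)
	fmt.Printf("GOMAXPROCS was %d, now %d\n", prev, cur)
}
```

The same value can be set without recompiling through the GOMAXPROCS environment variable.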

To prove golang users are not nuts, please beat this silly benchmark.

Chris

Concurrency Level: 100
Time taken for tests: 107.676 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 59000000 bytes
HTML transferred: 14000000 bytes
Requests per second: 9287.10 [#/sec] (mean)
Time per request: 10.768 [ms] (mean)
Time per request: 0.108 [ms] (mean, across all concurrent requests)
Transfer rate: 535.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.8      0      15
Processing:     1   10   1.9     10      38
Waiting:        0   10   1.8     10      35
Total:          4   11   1.8     11      39

Dave Cheney

Jun 16, 2011, 7:33:46 PM
to ChrisLu, golang-nuts
Hi Chris and Brad,

This sounds like a very exciting competition. For the benefit of people like me who would like to play along at home, could I suggest some ground rules.

1. The code to be benchmarked, both go and node, goes into your favourite DVCS repo.

2. Included in that repo is a script which will invoke (and compile if required) the relevant code then invoke the chosen benchmarking tool.

3. The benchmark script should assert the versions of the runtime and support libraries for consistency. Those can change over time, but should be identifiable as benchmarking parameters.

That way contributions from both sides of the fence could be incorporated in a consistent way.
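
As a sketch of rule 3 on the Go side, a benchmark program could print its own runtime parameters next to the results. The key names and the benchParams helper are made up for illustration; only the runtime package calls are real.

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
)

// benchParams collects the runtime details worth recording
// alongside every benchmark result.
func benchParams() map[string]string {
	return map[string]string{
		"go_version": runtime.Version(),
		"os_arch":    runtime.GOOS + "/" + runtime.GOARCH,
		"num_cpu":    fmt.Sprint(runtime.NumCPU()),
		"gomaxprocs": fmt.Sprint(runtime.GOMAXPROCS(0)),
	}
}

func main() {
	p := benchParams()
	keys := make([]string, 0, len(p))
	for k := range p {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output order for diffing runs
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, p[k])
	}
}
```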

Cheers

Dave

Sent from my C64

Michael Jones

Jun 16, 2011, 8:00:48 PM
to ChrisLu, golang-nuts
Many Go users are busy building flexible, modular components while the Go team refines the core language, tool chain, and documentation. Benchmarks are helpful to understand areas for development, particularly when they show odd and unexpected results, but they do not generally reflect on Go as a 'language' so much as on the implementation's present status and the developer community's focus in library development. The former is generally about correctness and the latter about breadth, though everyone enjoys a good benchmark battle.


On Thu, Jun 16, 2011 at 4:18 PM, ChrisLu <chris.lu@gmail.com> wrote:
To prove golang users are not nuts, please beat this silly benchmark.

--

Michael T. Jones

   Chief Technology Advocate, Google Inc.

   1600 Amphitheatre Parkway, Mountain View, California 94043

   Email: m...@google.com  Mobile: 650-335-5765  Fax: 650-649-1938

   Organizing the world's information to make it universally accessible and useful


Chris Lu

Jun 16, 2011, 8:15:11 PM
to Michael Jones, golang-nuts
Where to focus is a common problem at different stages of growth. I understand the community has its hands full.

However, I would argue right now if Golang shows some brilliant numbers, which I think it can, it will attract more users and talents to work on the components/tool chain/documentation.

Chris

Michael Lazarev

Jun 16, 2011, 8:15:53 PM
to golang-nuts

On Jun 17, 2:41 am, Andrew Gerrand <a...@golang.org> wrote:
> Orders of magnitude faster? 10 to 100 times faster? Node is supposed
> to be pretty fast. I assume the node guys aren't a bunch of idiots, so
> I'd be surprised if any web server is that fast.

Node.js authors are definitely not idiots. But anyway you might be
surprised by looking at this:
http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks

As Dave Cheney suggested, for the benefit of people who would like to
play along at home, there is a link in that post to a benchmarking
framework GitHub repository.

Of course, many people are obsessed with speed. Some care about the
speed of compilers, others about the speed of processing http
connections, and so on. I must admit that, among other things about
speed, I'm excited by the speed of the Go team itself, especially in
how they implement the language and its supporting infrastructure.
Keep up the great work!

Andrew Gerrand

Jun 16, 2011, 8:26:29 PM
to Michael Lazarev, golang-nuts
On 17 June 2011 10:15, Michael Lazarev <lazarev...@gmail.com> wrote:
>
> On Jun 17, 2:41 am, Andrew Gerrand <a...@golang.org> wrote:
>> Orders of magnitude faster? 10 to 100 times faster? Node is supposed
>> to be pretty fast. I assume the node guys aren't a bunch of idiots, so
>> I'd be surprised if any web server is that fast.
>
> Node.js authors are definitely not idiots.

I was definitely not implying that. (just to be explicit)

> But anyway you might be surprised by looking at this:
> http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks

Is that graph a quad core benchmark? If so, I'm not at all surprised
that an efficient Haskell web server running on 4 cores is roughly 4x
more efficient than an equivalent node.js server running on a single
core. I would expect Go to exhibit similar performance in future.

> As Dave Cheney suggested, for benefit of people who would like to play
> along at home,
> there is a link in that post to a benchmarking framework github
> repository.

Looks like a good place to start. Thanks!

Andrew

Aaron Blohowiak

Jun 16, 2011, 10:16:57 PM
to golang-nuts
After fixing the system ulimit and TIME_WAIT settings, running this
under OS X with 6prof consistently dies. It works when run without 6prof.

$ ab -n100000 -c100 http://127.0.0.1:8080/

# and over in the other terminal

mbp:gotest aaronblohowiak$ 6prof ws
mach error semaphore_wait: 16777252
throw: mach error

runtime.throw+0x40 /Users/aaronblohowiak/go/src/pkg/runtime/runtime.c:102
runtime.throw(0x215d3a, 0x215d82)
macherror+0x45 /Users/aaronblohowiak/go/src/pkg/runtime/darwin/thread.c:190
macherror(0xf801000024, 0x215d82, 0x0, 0x13db4)
runtime.mach_semacquire+0x3f /Users/aaronblohowiak/go/src/pkg/runtime/darwin/thread.c:443
runtime.mach_semacquire(0xf800001403, 0xffffffff)
runtime.usemacquire+0x60 /Users/aaronblohowiak/go/src/pkg/runtime/darwin/thread.c:104
runtime.usemacquire(0xf84002b69c, 0x0)
runtime.notesleep+0x33 /Users/aaronblohowiak/go/src/pkg/runtime/darwin/thread.c:130
runtime.notesleep(0xf84002b698, 0xe189)
nextgandunlock+0x134 /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:403
nextgandunlock()
schedule+0xbf /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:572
schedule(0xf840003000, 0xf840003410)
runtime.mcall+0x49 /Users/aaronblohowiak/go/src/pkg/runtime/amd64/asm.s:158
runtime.mcall(0xf840003410, 0x0)

goroutine 48138 [1]:
runtime.gosched+0x5c /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:603
runtime.gosched()
runtime.exitsyscall+0x73 /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:683
runtime.exitsyscall()
syscall.Syscall+0x61 /Users/aaronblohowiak/go/src/pkg/syscall/asm_darwin_amd64.s:34
syscall.Syscall()
syscall.Write+0x66 /Users/aaronblohowiak/go/src/pkg/syscall/zsyscall_darwin_amd64.go:920
syscall.Write(0xf800000008, 0xf8403fb000, 0x10000000006e, 0x1f519, 0xf84017f400, ...)
net.*netFD·Write+0x1e1 /Users/aaronblohowiak/go/src/pkg/net/fd.go:486
net.*netFD·Write(0xf840038820, 0xf8403fb000, 0x10000000006e, 0x0, 0x0, ...)
net.*TCPConn·Write+0x95 /Users/aaronblohowiak/go/src/pkg/net/tcpsock.go:102
net.*TCPConn·Write(0xf840000f48, 0xf8403fb000, 0x10000000006e, 0xa, 0x0, ...)
bufio.*Writer·Flush+0x104 /Users/aaronblohowiak/go/src/pkg/bufio/bufio.go:418
bufio.*Writer·Flush(0xf84017f400, 0xf84017f400, 0xf8400bd1e0, 0x1c00)
http.*response·finishRequest+0x1cc /Users/aaronblohowiak/go/src/pkg/http/server.go:424
http.*response·finishRequest(0xf84017f440, 0xf8400330c0)
http.*conn·serve+0x228 /Users/aaronblohowiak/go/src/pkg/http/server.go:503
http.*conn·serve(0xf840011320, 0x0)
runtime.goexit /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:178
runtime.goexit()
----- goroutine created by -----
http.*Server·Serve+0x203 /Users/aaronblohowiak/go/src/pkg/http/server.go:820

goroutine 13 [4]:
runtime.gosched+0x5c /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:603
runtime.gosched()
runfinq+0x50 /Users/aaronblohowiak/go/src/pkg/runtime/mgc0.c:671
runfinq()
runtime.goexit /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:178
runtime.goexit()
----- goroutine created by -----
runtime.gc /Users/aaronblohowiak/go/src/pkg/runtime/mgc0.c:547

goroutine 2 [3]:
runtime.entersyscall+0x78 /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:639
runtime.entersyscall()
syscall.Syscall6+0x5 /Users/aaronblohowiak/go/src/pkg/syscall/asm_darwin_amd64.s:38
syscall.Syscall6()
syscall.kevent+0x76 /Users/aaronblohowiak/go/src/pkg/syscall/zsyscall_darwin_amd64.go:158
syscall.kevent(0x7, 0x0, 0xf800000000, 0xf84002bd88, 0xf80000000a, ...)
syscall.Kevent+0x97 /Users/aaronblohowiak/go/src/pkg/syscall/syscall_bsd.go:441
syscall.Kevent(0xf800000007, 0x0, 0x0, 0xf84002bd88, 0xa0000000a, ...)
net.*pollster·WaitFD+0x130 /Users/aaronblohowiak/go/src/pkg/net/fd_darwin.go:95
net.*pollster·WaitFD(0xf84002bd80, 0xf84001c7c0, 0x0, 0x720000004e, 0x0, ...)
net.*pollServer·Run+0xe0 /Users/aaronblohowiak/go/src/pkg/net/fd.go:226
net.*pollServer·Run(0xf84001c7c0, 0x0)
runtime.goexit /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:178
runtime.goexit()
----- goroutine created by -----
net.newPollServer+0x33b /Users/aaronblohowiak/go/src/pkg/net/newpollserver.go:39

goroutine 1 [1]:
runtime.gosched+0x5c /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:603
runtime.gosched()
runtime.exitsyscall+0x73 /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:683
runtime.exitsyscall()
syscall.Syscall+0x61 /Users/aaronblohowiak/go/src/pkg/syscall/asm_darwin_amd64.s:34
syscall.Syscall()
syscall.fcntl+0x4a /Users/aaronblohowiak/go/src/pkg/syscall/zsyscall_darwin_amd64.go:197
syscall.fcntl(0x20000000a, 0xf800000001, 0x1, 0x6cece, 0xa, ...)
syscall.CloseOnExec+0x33 /Users/aaronblohowiak/go/src/pkg/syscall/exec_unix.go:74
syscall.CloseOnExec(0xa, 0xa)
net.*netFD·accept+0x213 /Users/aaronblohowiak/go/src/pkg/net/fd.go:612
net.*netFD·accept(0xf840038f00, 0x646ca, 0x0, 0x0, 0x0, ...)
net.*TCPListener·AcceptTCP+0x71 /Users/aaronblohowiak/go/src/pkg/net/tcpsock.go:262
net.*TCPListener·AcceptTCP(0xf840000178, 0x215b6, 0x0, 0x0, 0xf840000f48, ...)
net.*TCPListener·Accept+0x49 /Users/aaronblohowiak/go/src/pkg/net/tcpsock.go:272
net.*TCPListener·Accept(0xf840000178, 0x0, 0x0, 0x0, 0x0, ...)
http.*Server·Serve+0xd6 /Users/aaronblohowiak/go/src/pkg/http/server.go:806
http.*Server·Serve(0xf84000ede0, 0xf840033fc0, 0xf840000178, 0x0, 0x0, ...)
http.*Server·ListenAndServe+0xcc /Users/aaronblohowiak/go/src/pkg/http/server.go:793
http.*Server·ListenAndServe(0xf84000ede0, 0xf84000ede0, 0x150e54, 0x1)
http.ListenAndServe+0x68 /Users/aaronblohowiak/go/src/pkg/http/server.go:854
http.ListenAndServe(0x153e24, 0x3830383a00000005, 0x0, 0x0, 0x0, ...)
main.main+0x63 /Users/aaronblohowiak/gotest/ws.go:15
main.main()
runtime.mainstart+0xf /Users/aaronblohowiak/go/src/pkg/runtime/amd64/asm.s:77
runtime.mainstart()
runtime.goexit /Users/aaronblohowiak/go/src/pkg/runtime/proc.c:178
runtime.goexit()
----- goroutine created by -----
_rt0_amd64+0x8e /Users/aaronblohowiak/go/src/pkg/runtime/amd64/asm.s:64
threadstopped thread_info 0: mach: send invalid dest
threadstopped thread_info 0: mach: send invalid dest
131 samples (avg 6 threads)
378.63% runtime.mach_semaphore_wait
99.24% syscall.Syscall6
19.85% syscall.Syscall
13.74% scanblock
9.16% runtime.mach_semaphore_signal
2.29% runtime.cas
1.53% cmpstring
1.53% net.*netFD·accept
1.53% runtime.mallocgc
1.53% runtime.memclr
1.53% sweep
0.76% ReleaseN
0.76% asn1.*ObjectIdentifier·Equal
0.76% bytes.*Buffer·Truncate
0.76% http.Header·Get
0.76% io.Copy
0.76% memcopy
0.76% memhash
0.76% net.isZeros
0.76% net.sockaddrToTCP
0.76% runtime.MCache_Free
0.76% runtime.MSpanList_IsEmpty
0.76% runtime.cmpstring
0.76% runtime.gettime
0.76% runtime.ifacethash
0.76% runtime.mcpy
0.76% runtime.slicecopy
0.76% runtime.unlock
0.76% runtime.xadd
0.76% schedlock
0.76% syscall.RawSyscall
0.76% time.nextStdChunk

ptolomy23

Oct 20, 2011, 2:11:10 PM
to Brad Fitzpatrick, ChrisLu, golang-nuts
Any update on this? It's silly and fun, but this task is also a tempting proxy for Go's speed and suitability for implementing basic protocol libraries.
Mostly, I'm curious. 

Dave Cheney

Sep 26, 2012, 5:23:19 AM
to umu...@gmail.com, golan...@googlegroups.com
Problem solved :)

On Wed, Sep 26, 2012 at 6:13 PM, <umu...@gmail.com> wrote:
> I tested today; go and node.js perform nearly the same.

Dave Cheney

Feb 3, 2013, 1:39:51 AM
to gben...@gmail.com, golan...@googlegroups.com
> I'm also seeing that Node.js is much faster than Go in the hello world test.

Hello,

Could you post your benchmark tests and results? We've made a lot of
fixes in tip (not 1.0.3) recently which should have closed the gap.

> what is the explanation for this ?

Lack of unicorns

Dave

steve wang

Feb 3, 2013, 2:05:20 AM
to golan...@googlegroups.com, umu...@gmail.com
Shouldn't Go be much faster than Node.js, given its ability to run massively in parallel?

Gal Ben-Haim

Feb 3, 2013, 3:52:42 AM
to Dave Cheney, golan...@googlegroups.com

Dave Cheney

Feb 3, 2013, 4:07:09 AM
to Gal Ben-Haim, golan...@googlegroups.com
As mentioned offline, please try again with siege and tip. Also, which hardware and os are you running ?

Gal Ben-Haim

Feb 3, 2013, 4:12:59 AM
to Dave Cheney, golan...@googlegroups.com
Dell Inspiron N3010 i3 laptop with Ubuntu 12.10 (I see that there's no tip package for it).


Gal Ben-Haim

quarnster

Feb 3, 2013, 4:59:06 AM
to golan...@googlegroups.com, Dave Cheney, gben...@gmail.com
You mean a binary go tip package? Go's very easy (and quick) to build from sources: http://golang.org/doc/install/source

/f

Dave Cheney

Feb 3, 2013, 5:08:29 AM
to Gal Ben-Haim, golan...@googlegroups.com
Here are some results comparing tip to the current node.js release

hardware: Lenovo X220, Ubuntu 12.10, amd64

Node.js:

Server Hostname: localhost
Server Port: 1337

Document Path: /
Document Length: 12 bytes

Concurrency Level: 100
Time taken for tests: 9.523 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 11300000 bytes
HTML transferred: 1200000 bytes
Requests per second: 10501.06 [#/sec] (mean)
Time per request: 9.523 [ms] (mean)
Time per request: 0.095 [ms] (mean, across all concurrent requests)
Transfer rate: 1158.81 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       7
Processing:     1    9   4.9      9      31
Waiting:        1    9   4.9      9      31
Total:          1    9   4.9      9      31

Percentage of the requests served within a certain time (ms)
50% 9
66% 12
75% 13
80% 14
90% 16
95% 17
98% 20
99% 22
100% 31 (longest request)

lucky(~) % go version
go version devel +3a9a5d2901f7 Sun Feb 03 02:01:05 2013 -0500 linux/amd64

lucky(~) % ab -n 100000 -c 100 localhost:1337/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests


Server Software:
Server Hostname: localhost
Server Port: 1337

Document Path: /
Document Length: 14 bytes

Concurrency Level: 100
Time taken for tests: 10.173 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 13500000 bytes
HTML transferred: 1400000 bytes
Requests per second: 9830.25 [#/sec] (mean)
Time per request: 10.173 [ms] (mean)
Time per request: 0.102 [ms] (mean, across all concurrent requests)
Transfer rate: 1295.98 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       8
Processing:     2   10   3.8      9      29
Waiting:        1   10   3.8      9      28
Total:          6   10   3.8      9      29

Percentage of the requests served within a certain time (ms)
50% 9
66% 10
75% 11
80% 12
90% 14
95% 18
98% 25
99% 26
100% 29 (longest request)

Which is pretty damn close. However, if we compare with siege

node.js:

lucky(~) % siege -b -t 10s -c 100 localhost:1337/
** SIEGE 2.70
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege... done.
Transactions: 65944 hits
Availability: 100.00 %
Elapsed time: 9.17 secs
Data transferred: 0.75 MB
Response time: 0.01 secs
Transaction rate: 7191.28 trans/sec
Throughput: 0.08 MB/sec
Concurrency: 99.56
Successful transactions: 65944
Failed transactions: 0
Longest transaction: 0.05
Shortest transaction: 0.00

FILE: /var/log/siege.log
You can disable this annoying message by editing
the .siegerc file in your home directory; change
the directive 'show-logfile' to false.
[error] unable to create log file: Permission denied

go version devel +3a9a5d2901f7 Sun Feb 03 02:01:05 2013 -0500 linux/amd64

lucky(~) % siege -b -t 10s -c 100 localhost:1337/
** SIEGE 2.70
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege... done.

Transactions: 24215 hits
Availability: 100.00 %
Elapsed time: 9.93 secs
Data transferred: 0.32 MB
Response time: 0.04 secs
Transaction rate: 2438.57 trans/sec
Throughput: 0.03 MB/sec
Concurrency: 99.35
Successful transactions: 24215
Failed transactions: 0
Longest transaction: 0.06
Shortest transaction: 0.00

Benchmarks are hard.
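
Part of the difficulty is that every load tool makes its own choices about connections and concurrency. To make those choices visible, here is a minimal sketch of a load generator in Go. It is not a reimplementation of ab or siege; load and run are made-up helpers, the request counts are arbitrary, and it assumes the current net/http package. It uses a fixed pool of workers with keep-alive connections against a local test server.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// load sends total GET requests to url from c concurrent workers and
// returns the number of 200 responses and the elapsed time.
func load(url string, total, c int) (int64, time.Duration) {
	jobs := make(chan struct{}, total)
	for i := 0; i < total; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	// Allow one idle keep-alive connection per worker.
	client := &http.Client{Transport: &http.Transport{MaxIdleConnsPerHost: c}}
	var ok int64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < c; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := client.Get(url)
				if err != nil {
					continue // counted as a failure
				}
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&ok), time.Since(start)
}

// run starts a local hello-world server and loads it.
func run(total, c int) (int64, time.Duration) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, "hello, world!\n")
		}))
	defer srv.Close()
	return load(srv.URL, total, c)
}

func main() {
	ok, d := run(2000, 20)
	fmt.Printf("%d ok in %v (%.0f req/s)\n", ok, d, float64(ok)/d.Seconds())
}
```

Here the client and server share one process and one scheduler, so the figure it prints is only useful for comparing variations of the same setup, not as an absolute number.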

Dave

Liigo Zhuang

Feb 3, 2013, 6:50:38 AM
to Dave Cheney, golang-nuts, Gal Ben-Haim

golang is still slower than node.js, which comes from a script language.



Aram Hăvărneanu

Feb 3, 2013, 9:41:43 AM
to Liigo Zhuang, Dave Cheney, golang-nuts, Gal Ben-Haim
> golang is still slower than node.js, which comes from a script language.

Node.js HTTP server is written in C.

--
Aram Hăvărneanu

Dustin Sallings

Feb 3, 2013, 4:37:37 PM
to golan...@googlegroups.com
steve wang <steve....@gmail.com>
writes:

> Shouldn't Go be much faster than Node.js, given its ability to run
> massively in parallel?

There's not much of an advantage to being able to use multiple CPU
cores to spit out a tiny static series of bytes onto a network
interface.

--
dustin

Dave Cheney

Feb 3, 2013, 4:42:06 PM
to Dustin Sallings, golan...@googlegroups.com
Also please remember that the Go program was running with GOMAXPROCS unset, as node is single threaded.

Eli Janssen

Feb 3, 2013, 5:27:43 PM
to Dave Cheney, golan...@googlegroups.com
Unrelated, but I have found wrk[1] and weighttp[2] produce more consistent results than ab or siege.

[1]: https://github.com/wg/wrk
[2]: http://redmine.lighttpd.net/projects/weighttp/wiki

Dave Cheney

Feb 3, 2013, 5:36:28 PM
to Eli Janssen, golan...@googlegroups.com
Thanks for the suggestion, here are some sample numbers using wrk

Node:

lucky(~/devel/wrk) % ./wrk -t4 -c100 -r100k http://localhost:1337/
Making 100000 requests to http://localhost:1337/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.33ms  630.07us  11.09ms   95.71%
    Req/Sec     4.23k   421.42     5.00k    77.14%
100000 requests in 5.30s, 14.88MB read
Requests/sec: 18865.43
Transfer/sec: 2.81MB

Golang tip:

lucky(~/devel/wrk) % ./wrk -t4 -c100 -r100k http://localhost:1337/
Making 100000 requests to http://localhost:1337/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.02ms   13.92ms 272.66ms   98.60%
    Req/Sec     1.06k   257.03     3.00k    93.84%
100000 requests in 13.48s, 11.06MB read
Requests/sec: 7415.77
Transfer/sec: 840.07KB

Patrick Mylund Nielsen

Feb 3, 2013, 6:39:39 PM
to Aram Hăvărneanu, Liigo Zhuang, Dave Cheney, golang-nuts, Gal Ben-Haim
Yes, all the stuff that's actually in play here is written in C. Try actually doing something just mildly computationally heavy, and you'll see Go push ahead (especially with GOMAXPROCS > 1).
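
A minimal sketch of the kind of CPU-bound work where extra cores can help: counting primes with the range split across goroutines. The workload, isPrime and countPrimes are made up for illustration, and the sketch makes no claim about how node.js would fare on the same task.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// isPrime reports whether n is prime, by trial division.
func isPrime(n int) bool {
	if n < 2 {
		return false
	}
	for d := 2; d*d <= n; d++ {
		if n%d == 0 {
			return false
		}
	}
	return true
}

// countPrimes counts the primes in [2, limit], splitting the range
// into one contiguous chunk per worker goroutine.
func countPrimes(limit, workers int) int {
	counts := make([]int, workers)
	chunk := (limit - 1 + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			lo := 2 + w*chunk
			hi := lo + chunk - 1
			if hi > limit {
				hi = limit
			}
			for n := lo; n <= hi; n++ {
				if isPrime(n) {
					counts[w]++ // each worker writes only its own slot
				}
			}
		}(w)
	}
	wg.Wait()
	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

func main() {
	workers := runtime.NumCPU()
	fmt.Printf("%d workers found %d primes below 2,000,000\n",
		workers, countPrimes(2000000, workers))
}
```

With GOMAXPROCS at 1 the goroutines take turns on one core; with more, the chunks can run at the same time. The later chunks hold larger numbers and take longer, so the split is only roughly balanced.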


Patrick Mylund Nielsen

Feb 3, 2013, 6:40:06 PM
to Dave Cheney, Eli Janssen, golang-nuts
weighttp is by far the most reliable benchmarking tool, IMO.

Kevin Gillette

Feb 3, 2013, 6:45:12 PM
to golan...@googlegroups.com, Eli Janssen
I think this kind of benchmark will always be biased in Node's favor. We're not really comparing Go to JS, after all: we're comparing Go to V8. V8 compiles to machine code, like Go, but unlike current implementations of Go, V8 has an online optimizer that can make adjustments based on runtime performance, which can make the node version cut past a whole lot of sanity checks at runtime that a non-self-optimizing system would not be able to do. In this sense, Node is a bit more like some of the very advanced Java server VMs.

Furthermore, limiting GOMAXPROCS to 1 is not a means of slowing Go down to be fair to Node -- given the above considerations, it's probably slowing Go down to give Node an unfair advantage, since a system designed specifically for single-threaded operation is going to have less runtime overhead than a system, like Go, that can handle simultaneous execution.

A fair benchmark would involve a reasonably sized application with plenty of dynamism, such that there'd be too many execution paths (as there would be in a real app) for V8 to fully optimize. Having a variety of task characteristics would also be important: for example, static-content serving, template rendering, long-polling, and something cpu-heavy like large image rendering, would be useful, as well as a duplicate barrage of tests with GOMAXPROCS set to greater than 1. The last part is important because it can demonstrate, for example, bottlenecks that Node might have (e.g. multiple clients requesting large image renders) which Go could mitigate through simultaneous execution of requests -- Node developers may have to choose a forking model, or a different server/process topology just to adapt to these same needs. Having a larger GOMAXPROCS set of tests is also important because, in the real world, it's unlikely that Go would be intentionally deployed on a multi-proc server with GOMAXPROCS=1, and it's also reasonable that Node may be deployed on the same type of hardware (especially since it'll be harder as time goes by to find hardware with only one logical processor).
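
One way to run the same CPU-bound workload under several GOMAXPROCS settings is sketched below. This is only an illustration: `burn`, `runParallel`, `withMaxProcs` and `currentMaxProcs` are hypothetical helper names, and the busy loop stands in for real request work such as image rendering.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// currentMaxProcs reports the GOMAXPROCS setting without changing it
// (runtime.GOMAXPROCS leaves the value alone when its argument is < 1).
func currentMaxProcs() int { return runtime.GOMAXPROCS(0) }

// withMaxProcs runs fn with GOMAXPROCS temporarily set to n,
// then restores the previous setting.
func withMaxProcs(n int, fn func()) {
	prev := runtime.GOMAXPROCS(n)
	defer runtime.GOMAXPROCS(prev)
	fn()
}

// burn is a hypothetical CPU-bound task standing in for real work.
func burn(iterations int) int {
	x := 0
	for i := 0; i < iterations; i++ {
		x += i % 7
	}
	return x
}

// runParallel starts `tasks` goroutines that each call burn and
// returns the wall-clock time until all of them have finished.
func runParallel(tasks, iterations int) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			burn(iterations)
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	// Compare a single-threaded run against one using every logical CPU.
	for _, procs := range []int{1, runtime.NumCPU()} {
		withMaxProcs(procs, func() {
			fmt.Printf("GOMAXPROCS=%d: %v\n", currentMaxProcs(), runParallel(8, 20000000))
		})
	}
}
```

The timings depend entirely on the machine, so no expected output is given here.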

Patrick Mylund Nielsen

unread,
Feb 3, 2013, 6:51:21 PM2/3/13
to Kevin Gillette, golang-nuts, Eli Janssen
Not even... All of it is C, and writing a single line to the connection just uses libuv, so it has no significant impact on the eventing. You start doing stuff like building templates, you'll see node fall behind simply from being single-threaded.

For parsing HTTP requests and writing "hello world" in a response, node.js is very fast, because 99% of that is C. I'm not trying to dismiss that--it's great to have a very fast HTTP request parser in C--but be careful about coming to the conclusion that an application written in JavaScript running in/on node.js/V8 is faster than Go.

Ziad Hatahet

unread,
Feb 3, 2013, 6:56:25 PM2/3/13
to Dave Cheney, Gal Ben-Haim, golan...@googlegroups.com
What metrics are being looked at to determine which platform is "faster"? Unless I am missing anything, the numbers from 'ab' show that Node.js has ~10.5k reqs/second compared to Go's ~9.8k reqs/sec. However, note that the standard deviation for Node.js is higher than that of Go. Even the 99th percentile and 100th percentile of requests did better in Go than Node.js. Furthermore, the transfer rate of Go is higher than that of Node.js.
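
The metrics mentioned above (mean, standard deviation, percentiles) can be computed from raw per-request latencies roughly as follows. This is a sketch only: the sample latencies are made up for illustration and are not taken from any run in this thread, and the percentile uses the simple nearest-rank method, which may differ from what ab or wrk report.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic mean of xs (0 for an empty slice).
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// stddev returns the population standard deviation of xs.
func stddev(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	m := mean(xs)
	ss := 0.0
	for _, x := range xs {
		ss += (x - m) * (x - m)
	}
	return math.Sqrt(ss / float64(len(xs)))
}

// percentile returns the p-th percentile (0 < p <= 100) of xs
// using the nearest-rank method on a sorted copy.
func percentile(xs []float64, p float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical latencies in milliseconds; one slow outlier.
	lat := []float64{5.1, 4.9, 5.3, 11.0, 5.0}
	fmt.Printf("mean=%.2fms stddev=%.2fms p50=%.2fms p99=%.2fms\n",
		mean(lat), stddev(lat), percentile(lat, 50), percentile(lat, 99))
}
```

A single outlier moves the mean and the standard deviation noticeably while leaving the median alone, which is one reason to look at more than requests per second.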

--
Ziad


Patrick Mylund Nielsen

unread,
Feb 3, 2013, 7:01:45 PM2/3/13
to Ziad Hatahet, Dave Cheney, Gal Ben-Haim, golang-nuts
ab is so inaccurate that it's nearing the point of being totally irrelevant. siege and wrk are better, particularly if used from a box other than the one running the HTTP server. I've found that weighttp (lighttpd's testing tool) produces the most accurate results.

Andrew Gerrand

unread,
Feb 3, 2013, 7:27:51 PM2/3/13
to golang-nuts
I don't think we, as the Go community, should spend time trying to improve Go's performance on micro-benchmarks like this. They almost never test the important things, and rarely lead to useful performance improvements.

Instead we should improve Go's performance by optimizing for real programs. By "real" I mean programs that were designed to serve a purpose other than testing Go's performance characteristics.

So, if *you* have a real Go program whose performance you are dissatisfied with, please share it with us so we can help you analyze any performance bottlenecks. That way we might find real issues in our compiler, runtime, or libraries.

Andrew

Feng Shen

unread,
Feb 4, 2013, 4:07:25 AM2/4/13
to golan...@googlegroups.com

I did some tests for fun:

CPU: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
RAM: 16G

#  redis, SET: 217391.30 requests per second
# It's the upper limit, probably  
redis-benchmark -q 

# nginx, responding with a 1K file
# Requests per second:    148442.25 [#/sec] (mean), 
ab -n 300000 -c 100 -k http://lion.mei.fm/  

# http-kit (http://http-kit.org)  respond hello world
# code: https://gist.github.com/4704565
# Requests per second:    111179.02 [#/sec] (mean)
ab -n 400000 -c 100 -k http://lion.mei.fm:8080/

# The hello world go version, with modification of "http" => "net/http" 
# Requests per second:    17465.92 [#/sec] (mean)
ab -n 100000 -c 100 -k http://lion.mei.fm:8080/

# node v0.6.19 
# Requests per second:    12964.05 [#/sec] (mean)
ab -n 100000 -c 100 -k http://127.0.0.1:8080

Go is quite fast: 17465.92 requests per second is a big number, and it does not seem likely to be the bottleneck in real-life use.



On Friday, June 17, 2011 2:43:09 AM UTC+8, ChrisLu wrote:
Kind of disappointing. Golang is supposedly "closer" to the metal. But
I don't expect it only be comparable to node.js, and don't expect it
actually node.js is 45% faster than golang.

In the test, GOMAXPROCS is set to 1. Setting it to higher numbers
actually does not have much effect.

For go:
Concurrency Level:      100
Time taken for tests:   152.330 seconds
Complete requests:      1000000
Failed requests:        0
Write errors:           0
Total transferred:      110000000 bytes
HTML transferred:       14000000 bytes
Requests per second:    6564.69 [#/sec] (mean)
Time per request:       15.233 [ms] (mean)
Time per request:       0.152 [ms] (mean, across all concurrent
requests)
Transfer rate:          705.19 [Kbytes/sec] received


For node.js:
Concurrency Level:      100
Time taken for tests:   104.538 seconds
Complete requests:      1000000
Failed requests:        0
Write errors:           0
Total transferred:      78000000 bytes
HTML transferred:       14000000 bytes
Requests per second:    9565.93 [#/sec] (mean)
Time per request:       10.454 [ms] (mean)
Time per request:       0.105 [ms] (mean, across all concurrent
requests)

Feng Shen

unread,
Feb 4, 2013, 4:12:36 AM2/4/13
to golan...@googlegroups.com
Oops, I did the test on lion.mei.fm.   lion.mei.fm == localhost

ChrisLu

unread,
Feb 4, 2013, 1:24:32 PM2/4/13
to golan...@googlegroups.com
I do have a real Go program, Weed-FS, a distributed file system, which is basically to serve static files just like nginx.

Making the http package faster will benefit many applications. Even though http does not have "important" business logic, it exists in most applications.
Go is quite capable and should be able to optimize the http package. Don't turn people off when they later realize that "I should have written it in x language".

Maybe we should look into the nginx design, to see why it can get so much faster compared to Go http package? At least we should know the reason first instead of just guessing the reason and giving up quickly.

Chris

Jim Whitehead II

unread,
Feb 4, 2013, 1:40:47 PM2/4/13
to ChrisLu, golang-nuts
I don't think anyone is interested in 'giving up' without looking for a reason. Dave has done a remarkable job of responding to posts like this trying to help people get better information about the performance of their programs in order to improve the runtime, and the performance of the net/http package.

That being said, winning a 'micro-benchmark' like this is in no one's best interest, unless it comes as a side-effect of improvements to the Go ecosystem. We could spend days and weeks trying to improve our performance when writing a one line "Hello World" web server, but what purpose does that serve?

Now, if you want to compare the performance of Go serving templated content versus Node serving similar templated content then that might be interesting. Looking at the overhead of serving static files compared to nginx, I'd love to see numbers on that. I just don't think a "Hello World" web server is worth anyone's time. It shows an incredible lack of understanding of web benchmarking, as has been shown time and time again.



--

Nate Finch

unread,
Feb 4, 2013, 1:48:15 PM2/4/13
to golan...@googlegroups.com
Chris - the point was not that people don't have real applications, it's that you can't optimize a dumb benchmark.

You have an application, that's great.  Are you having performance issues? Do you think Go's http package is a bottleneck? If so, what makes you think that?

nginx is a hugely optimized unitasker. It does one thing and it does one thing very very well.  Beating the best existing application in any particular area is not Go's goal.  Is it possible that http could be faster? Of course it's possible. But unlike nginx, that can't be the Go team's first priority.  Go is a general-use programming language. It has to be pretty good at a lot of stuff. And I think Go succeeds at being pretty good at a lot of stuff.

steve wang

unread,
Feb 4, 2013, 2:29:27 PM2/4/13
to golan...@googlegroups.com
Is it possible that a http server written in Go beat nginx someday by taking advantage of the capability of massive concurrency?

steve wang

unread,
Feb 4, 2013, 2:29:38 PM2/4/13
to golan...@googlegroups.com
Is it possible that a http server written in Go beat nginx someday by taking advantage of the capability of massive concurrency?

On Tuesday, February 5, 2013 2:48:15 AM UTC+8, Nate Finch wrote:

bryanturley

unread,
Feb 4, 2013, 2:54:35 PM2/4/13
to golan...@googlegroups.com
On Monday, February 4, 2013 1:29:27 PM UTC-6, steve wang wrote:
Is it possible that a http server written in Go beat nginx someday by taking advantage of the capability of massive concurrency?


It is possible that this is happening now for some workloads, if not definitely in the future.

Kevin Gillette

unread,
Feb 4, 2013, 4:29:03 PM2/4/13
to golan...@googlegroups.com
Architecturally, nginx, lighttpd, and similar projects are so fast specifically because they leverage os facilities like epoll and kqueue while, importantly, avoiding the overhead of multithreading. The extensions and modules for those servers also have to be written with single-threaded operation in mind.

Go does not have the same goal; instead, go focuses on good speed, but not at the sacrifice of safety or concurrent language features. In this sense, go is not really as optimizable as c, but is considerably more convenient and expressive, and the sum of go qualities outweighs any one of its characteristics.

Patrick Mylund Nielsen

unread,
Feb 4, 2013, 5:06:14 PM2/4/13
to Felix Geisendoerfer, golang-nuts
Thanks for your input! That's great to hear.


On Mon, Feb 4, 2013 at 10:37 PM, Felix Geisendoerfer <haim...@gmail.com> wrote:
Former node core contributor here (who is in love with go now).

Node's http parser is indeed modeled after nginx / very fast. From the benchmarking I have done however, it is not necessarily the "magic piece" that makes things fast.

I have a benchmark running node's low level JS binding to the http parser against a 623 byte GET request, and it's able to parse ~130k requests / second using a single core (~600 MBit/sec). This is about ~10x faster than what you'll see in any full stack benchmarks, as discussed in this thread.

So where does the rest of the performance go to? It's hard to say for sure, but AFAIK the following are big factors:

* accept()ing new connections
* read() / write() on the sockets
* calling C/C++ functions from JS and vice versa (this is something the v8 team is working on optimizing, but last I checked it was a big cost factor)
* allocating the user-facing JS objects / garbage collecting them

I'm mentioning this because node.js, while being heavily optimized, is still far from being as fast in this benchmark as it could be. So I feel that given some time and effort, I'm not worried that go could catch up (I'm certainly going to look into it myself as time allows).

That being said, I feel that go is light years ahead of node.js when it comes to http processing. Node's http stack, starting with the http parser itself [1], is a mess [2] compared to go [3]. A lot of this is a great showcase for goroutines, as they allow keeping most of the parser state inside the functions, while node has to use a huge state machine and lots of book keeping to make the parser streaming/resumable.
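
As a rough illustration of the point about parser state living inside functions, here is a blocking request-head parser. It is a sketch under assumptions, not the actual net/http code: `request`, `parseRequest` and `parseRequestString` are hypothetical names, and it relies on the standard `net/textproto` reader. Because each connection can have its own goroutine, the reads may simply block, and no explicit state machine is needed to resume parsing.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/textproto"
	"strings"
)

// request holds the parsed request line and headers.
type request struct {
	Method string
	Path   string
	Proto  string
	Header textproto.MIMEHeader
}

// parseRequest reads one request head from r. All intermediate state
// lives in local variables; a blocked read just parks the goroutine.
func parseRequest(r io.Reader) (*request, error) {
	tp := textproto.NewReader(bufio.NewReader(r))

	// Request line, e.g. "GET /hello HTTP/1.1".
	line, err := tp.ReadLine()
	if err != nil {
		return nil, err
	}
	parts := strings.SplitN(line, " ", 3)
	if len(parts) != 3 {
		return nil, fmt.Errorf("malformed request line %q", line)
	}

	// Header block, terminated by an empty line.
	hdr, err := tp.ReadMIMEHeader()
	if err != nil {
		return nil, err
	}
	return &request{Method: parts[0], Path: parts[1], Proto: parts[2], Header: hdr}, nil
}

// parseRequestString is a convenience wrapper for in-memory input.
func parseRequestString(s string) (*request, error) {
	return parseRequest(strings.NewReader(s))
}

func main() {
	req, err := parseRequestString("GET /hello HTTP/1.1\r\nHost: example.com\r\n\r\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(req.Method, req.Path, req.Header.Get("Host"))
}
```

An event-driven parser has to be able to stop after any byte and pick up again later, which is where the large state machine comes from.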

So even if node was to dominate this vanity benchmark in the future, I'd still happily accept this in exchange for a clear and readable http stack.

--fg


On Monday, 4 February 2013 22:29:03 UTC+1, Kevin Gillette wrote:
Architecturally, nginx, lighttpd, and similar projects are so fast specifically because they leverage os facilities like epoll and kqueue while, importantly, avoiding the overhead of multithreading. The extensions and modules for those servers also have to be written with single-threaded operation in mind.

Go does not have the same goal; instead, go focuses on good speed, but not at the sacrifice of safety or concurrent language features. In this sense, go is not really as optimizable as c, but is considerably more convenient and expressive, and the sum of go qualities outweighs any one of its characteristics.

--

Andrew Gerrand

unread,
Feb 4, 2013, 5:27:00 PM2/4/13
to ChrisLu, golang-nuts
On 5 February 2013 05:24, ChrisLu <chri...@gmail.com> wrote:
I do have a real Go program, Weed-FS, a distributed file system, which is basically to serve static files just like nginx.

It is a lot more than just

func handler(w http.ResponseWriter, r *http.Request) { w.Write([]byte("Hi, I'm Weed-FS!")) }

right?

Making the http package faster will benefit many applications.

I agree. And Go's HTTP stack can certainly be faster; we've barely scratched the surface in terms of optimizing it.

But it's important to optimize these things in context. What's the point of making the header parsing code really really good when (pulling a fake example out of thin air) the scheduler dominates CPU time under real workloads?

We should aim to make Go generally efficient, rather than very good at just one thing.

Andrew

Felix Geisendoerfer

unread,
Feb 5, 2013, 5:55:07 AM2/5/13
to golan...@googlegroups.com
To make my previous post a little less handwavy, here is the actual code / detailed results: https://github.com/felixge/node-http-perf

--fg

On Monday, 4 February 2013 22:37:29 UTC+1, Felix Geisendoerfer wrote:
Former node core contributor here (who is in love with go now).

Node's http parser is indeed modeled after nginx / very fast. From the benchmarking I have done however, it is not necessarily the "magic piece" that makes things fast.

I have a benchmark running node's low level JS binding to the http parser against a 623 byte GET request, and it's able to parse ~130k requests / second using a single core (~600 MBit/sec). This is about ~10x faster than what you'll see in any full stack benchmarks, as discussed in this thread.

So where does the rest of the performance go to? It's hard to say for sure, but AFAIK the following are big factors:

* accept()ing new connections
* read() / write() on the sockets
* calling C/C++ functions from JS and vice versa (this is something the v8 team is working on optimizing, but last I checked it was a big cost factor)
* allocating the user-facing JS objects / garbage collecting them

I'm mentioning this because node.js, while being heavily optimized, is still far from being as fast in this benchmark as it could be. So I feel that given some time and effort, I'm not worried that go could catch up (I'm certainly going to look into it myself as time allows).

That being said, I feel that go is light years ahead of node.js when it comes to http processing. Node's http stack, starting with the http parser itself [1], is a mess [2] compared to go [3]. A lot of this is a great showcase for goroutines, as they allow keeping most of the parser state inside the functions, while node has to use a huge state machine and lots of book keeping to make the parser streaming/resumable.

So even if node was to dominate this vanity benchmark in the future, I'd still happily accept this in exchange for a clear and readable http stack.

--fg


On Monday, 4 February 2013 22:29:03 UTC+1, Kevin Gillette wrote:

Eli Janssen

unread,
Feb 5, 2013, 3:34:39 PM2/5/13
to songof...@gmail.com, golan...@googlegroups.com
I think your sample code has a race condition in it with GOMAXPROCS > 1 -- I think you may occasionally serve an empty date to a client.

It might be cleaner to have a goroutine (spawn on startup?) just update/set the formattedDate every second (or N seconds if you don't care about exact time), instead of unsetting it and having the request set it. That way the request can just read it -- no race and no locking required.

On Feb 5, 2013, at 8:08 AM, songof...@gmail.com wrote:

> Date formatting is one of CPU eater.
> Caching formatted date improves performance ~10%
>
> https://gist.github.com/methane/4715414

mattn

unread,
Feb 5, 2013, 10:19:56 PM2/5/13
to golan...@googlegroups.com, songof...@gmail.com
typo s/updateHttpDate/getDate/ in your patch?

mattn

unread,
Feb 5, 2013, 10:29:38 PM2/5/13
to golan...@googlegroups.com, songof...@gmail.com
ah, sorry. It seems I slept while ago.

Kevin Gillette

unread,
Feb 6, 2013, 3:48:55 AM2/6/13
to golan...@googlegroups.com, songof...@gmail.com
What you describe is also a race condition: regardless of whether there's (one writer, many readers) or (many writers, many readers), or even (many writers, one reader), any time you have a write occurring concurrently with other reads or writes, without synchronization, then it's safe to say there's a race condition. The simplest safe option would be to have a goroutine that constantly sends formatted date strings to a buffered channel, checking the time that has passed between each loop iteration (and if enough time has passed, then the string can be reformulated).
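
A synchronized version of the cached date could look roughly like the sketch below. Note that it uses a sync.RWMutex rather than the buffered channel described above, and that `dateCache`, `update`, `Get` and `run` are hypothetical names, not the code from the gist or from any patch in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

// dateCache holds a preformatted HTTP date string guarded by a RWMutex,
// so one writer and many readers never touch it unsynchronized.
type dateCache struct {
	mu  sync.RWMutex
	val string
}

// update formats the given Unix time (seconds) and stores the result.
// The formatting happens outside the lock to keep the critical section short.
func (c *dateCache) update(unixSec int64) {
	s := time.Unix(unixSec, 0).UTC().Format(http.TimeFormat)
	c.mu.Lock()
	c.val = s
	c.mu.Unlock()
}

// Get returns the most recently stored date string.
func (c *dateCache) Get() string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.val
}

// run refreshes the cache on every tick until stop is closed.
func (c *dateCache) run(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case now := <-t.C:
			c.update(now.Unix())
		case <-stop:
			return
		}
	}
}

func main() {
	c := &dateCache{}
	c.update(time.Now().Unix()) // fill once so readers never see an empty date

	stop := make(chan struct{})
	defer close(stop)
	go c.run(time.Second, stop)

	fmt.Println("Date:", c.Get())
}
```

Filling the cache before starting the refresher goroutine also avoids serving an empty date to the first requests.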

Additionally, why does the date string need to be regenerated in this case at all? If the content of the resource has not semantically changed, then the date should reflect the initial point when this incarnation of the resource has been valid. In other words, if the content of a resource won't change for 5 years, why update the date string? There's usually no value in generating a recent modification date just for its own sake (that doesn't magically make the resource "dynamic", and even if it did, there's no magic value in being "dynamic" when the content may be static).

Johann Höchtl

unread,
Feb 6, 2013, 4:05:15 AM2/6/13
to golan...@googlegroups.com
Using tip, this shouldn't be required. I can't find the commit message, but it must have been two or three weeks ago that this was changed for response sizes < 2k bytes, AFAIK.

On Tuesday, 5 February 2013 13:10:09 UTC+1, Naoki INADA wrote:
You can set Content-Length header explicitly to avoid chunked encoding.

package main

import (
    "io"
    "net/http"
    "strconv"
)

// HelloServer sets Content-Length itself, so the server
// does not need to fall back to chunked encoding.
func HelloServer(w http.ResponseWriter, req *http.Request) {
    msg := "Hello World!\n"
    w.Header().Add("Content-Type", "text/plain")
    w.Header().Add("Content-Length", strconv.Itoa(len(msg)))
    io.WriteString(w, msg)
}

func main() {
    http.HandleFunc("/", HelloServer)
    http.ListenAndServe(":8000", nil)
}
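
To check which headers such a handler actually sets without running a load test, the handler can be driven against an in-memory recorder. The following is a minimal sketch, assuming a reasonably recent Go release with `net/http/httptest`; `helloHandler` and `recordedContentLength` are hypothetical names. It only inspects the headers the handler sets, not what the real server would put on the wire.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// helloHandler mirrors the handler above: it declares the body length up front.
func helloHandler(w http.ResponseWriter, req *http.Request) {
	msg := "Hello World!\n"
	w.Header().Set("Content-Type", "text/plain")
	w.Header().Set("Content-Length", strconv.Itoa(len(msg)))
	io.WriteString(w, msg)
}

// recordedContentLength runs h against an in-memory recorder and
// returns the Content-Length header that the handler produced.
func recordedContentLength(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/", nil)
	h(rec, req)
	return rec.Result().Header.Get("Content-Length")
}

func main() {
	fmt.Println("Content-Length:", recordedContentLength(helloHandler))
}
```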

INADA Naoki

unread,
Feb 6, 2013, 4:05:45 AM2/6/13
to Kevin Gillette, golan...@googlegroups.com
I've rewritten my patch with a mutex.
See https://codereview.appspot.com/7315043/
--
INADA Naoki  <songof...@gmail.com>

minux

unread,
Feb 6, 2013, 7:22:12 AM2/6/13
to Kevin Gillette, golan...@googlegroups.com, songof...@gmail.com
On Wed, Feb 6, 2013 at 4:48 PM, Kevin Gillette <extempor...@gmail.com> wrote:
Additionally, why does the date string need to be regenerated in this case at all? If the content of the resource has not semantically changed, then the date should reflect the initial point when this incarnation of the resource has been valid. In other words, if the content of a resource won't change for 5 years, why update the date string? There's usually no value in generating a recent modification date just for its own sake (that doesn't magically make the resource "dynamic", and even if it did, there's no magic value in being "dynamic" when the content may be static).
We're talking about the mandatory Date header, not Last-Modified header.

The Date header must be set to "the date and time at which the message was originated".

This step (formatting the time) does use a somewhat large portion of time,
and in the past, Brad even suggested re-implement the required functionality
inside net/http package, but the proposal was rejected because we should
make time.Format faster rather than duplicate the code inside every client.
Then Russ said he could make time.Format at least 5x faster, and filed

Thus, we'd better not optimize this part of net/http and wait for issue 3679
to be fixed and then re-measure/profile.

minux

unread,
Feb 6, 2013, 7:23:48 AM2/6/13
to Johann Höchtl, golan...@googlegroups.com
On Wed, Feb 6, 2013 at 5:05 PM, Johann Höchtl <johann....@gmail.com> wrote:
Using tip, this shouldn't be required. I can't find the commit message, but it must have been two or three weeks ago that this was changed for response sizes < 2k bytes, AFAIK.
seems you're talking about https://codereview.appspot.com/6964043

On Tuesday, 5 February 2013 13:10:09 UTC+1, Naoki INADA wrote:
You can set Content-Length header explicitly to avoid chunked encoding.

package main

import (
    "io"
    "net/http"
    "strconv"
)

// HelloServer sets Content-Length itself, so the server
// does not need to fall back to chunked encoding.
func HelloServer(w http.ResponseWriter, req *http.Request) {
    msg := "Hello World!\n"
    w.Header().Add("Content-Type", "text/plain")
    w.Header().Add("Content-Length", strconv.Itoa(len(msg)))
    io.WriteString(w, msg)
}

func main() {
    http.HandleFunc("/", HelloServer)
    http.ListenAndServe(":8000", nil)
}

--

Naoki INADA

unread,
Feb 6, 2013, 7:35:38 AM2/6/13
to golan...@googlegroups.com, Kevin Gillette, songof...@gmail.com

My benchmark also shows readNB/writeNB has significant gain.
I hope Go 1.2 is better for high performance middleware with HTTP API than Go 1.0.3.
(I'm too late to Go 1.1, maybe.)

minux

unread,
Feb 6, 2013, 7:47:59 AM2/6/13
to Naoki INADA, golan...@googlegroups.com, Kevin Gillette
it seems we've decided not to introduce exposed ReadNB/WriteNB in Go 1.1 [1],
and instead wait for scheduler enhancements (and probably automatic blocking
syscall detection, Dmitry Vyukov is working on this)

I'm not sure whether we should provide internal nonblocking syscall API just for
net/http, at least it doesn't feel right to me.

I hope Go 1.2 is better for high performance middleware with HTTP API than Go 1.0.3.
yeah, i think lots of people (if not all) have the same wish for Go. 
(I'm too late to Go 1.1, maybe.)

Mikio Hara

unread,
Feb 6, 2013, 8:20:58 AM2/6/13
to minux, Naoki INADA, golang-nuts, Kevin Gillette
On Wed, Feb 6, 2013 at 9:47 PM, minux <minu...@gmail.com> wrote:

> it seems we've decided not to introduce exposed ReadNB/WriteNB in Go 1.1 [1],
> and instead wait for scheduler enhancements (and probably automatic blocking
> syscall detection, Dmitry Vyukov is working on this)

fyi: https://codereview.appspot.com/6813046/.

Naoki INADA

unread,
Feb 6, 2013, 10:58:37 AM2/6/13
to golan...@googlegroups.com, minux, Naoki INADA, Kevin Gillette
I found the thread about the new scheduler and am very excited.
https://groups.google.com/forum/#!msg/golang-dev/_H9nXe7jG2U/QEjSCEVB3SMJ

Daniel Bryan

unread,
Feb 6, 2013, 4:34:21 PM2/6/13
to golan...@googlegroups.com, minux, Naoki INADA, Kevin Gillette
Just to add to the flood of things to consider: as mentioned, node.js have heavily optimised their HTTP implementation. What that basically means is that a lot of it is written in native code (C++, if I'm not mistaken). I wouldn't be surprised if node can equal or beat Go in benchmarking a simple application like that.

When you start to actually write code and manipulate data structures in ways that haven't been pre-optimised, I suspect you'd see Go coming out heavily on top - or at least, a Go program written by a moderate programmer would tend to be faster than the equivalent program by a moderate node programmer; if you're very careful when targeting V8 you can make some reasonable guarantees about the data structures used by the compiler, but it's not easy.

And then there's multi-core parallelism.. but that's being unfair on poor old JavaScript.

Dave Cheney

unread,
Feb 6, 2013, 4:46:08 PM2/6/13
to Mikio Hara, minux, Naoki INADA, golang-nuts, Kevin Gillette
As the author of that final message I want to remind those following
this thread that while WriteNB showed promise on paper -- it did what
it said, moved the profile from syscall.Syscall6 to
syscall.RawSyscall6. It showed only sporadic improvement in benchmarks
at the cost of new eternal symbols that may be obsoleted shortly.

Importantly, while circumstances or benchmarks could be arranged to
show a 10% improvement in some cases, when those benchmarks were
applied to a wider set of conditions; varying response size, varying
GOMAXPROCS, even varying benchmarking tool, the results were at best a
wash.

The CL is always available for those that want to patch their systems
themselves.

Dave

Niklas Schnelle

unread,
Feb 6, 2013, 6:04:12 PM2/6/13
to golan...@googlegroups.com, Mikio Hara, minux, Naoki INADA, Kevin Gillette
So I find that thread about scheduler rework also quite interesting, it hasn't been updated in quite some time
so I wonder whether there has been any progress? Looks like quite some task, that
will heavily influence Go performance, especially for compute heavy loads.
Also I kind of disagree with the generalisation that one can generally assume Go to
be faster than node.js in real applications, at least not for now.
It's just that V8 has probably seen many times the man hours as the relevant parts of Go.

Robotic Artichoke

unread,
Feb 8, 2013, 6:43:45 PM2/8/13
to golan...@googlegroups.com
I still think we need to put our brains together and create more meaningful benchmarks. Hello worlding with the vanilla node server is close to pointless IMO.

Your computer seems to be really good. When you ran the node test with multiple cores, did you use the cluster module, or did you load balance a few separate processes behind a proxy? If I put together real Node code, could you write the equivalent in Go and then bench it?

On Friday, February 8, 2013 4:08:41 PM UTC-5, Kenneth Jonsson wrote:
It's been a while since this was first posted, but this is something that isn't true at all on my system

On Thursday, 16 June 2011 20:43:09 UTC+2, ChrisLu wrote:
In the test, GOMAXPROCS is set to 1. Setting it to higher numbers
actually does not have much effect.


I made the following changes to the Go program source

package main
import ("net/http";"io";"runtime";"fmt")

func HelloServer(w http.ResponseWriter, req *http.Request) {
    io.WriteString(w, "hello, world!\n")
}
func main() {
    fmt.Println(runtime.GOMAXPROCS(0))
    http.HandleFunc("/", HelloServer)
    http.ListenAndServe(":8080", nil)
}

My machine is a Core i7-2600 running 64-bit Ubuntu 12.04LTS and I'm using Go 1.0.3 and node.js is the version shipped with Ubuntu, v0.6.12

The node.js result is

Concurrency Level:      100
Time taken for tests:   7.340 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      7600000 bytes
HTML transferred:       1200000 bytes
Requests per second:    13624.73 [#/sec] (mean)
Time per request:       7.340 [ms] (mean)
Time per request:       0.073 [ms] (mean, across all concurrent requests)
Transfer rate:          1011.21 [Kbytes/sec] received

which is slightly higher compared to the first node.js result posted in this thread. I tried to vary the number of CPU cores node.js is allowed to use, but it does not really affect the end result. Running strace on the node.js binary suggests that it is really a single-threaded application using epoll to track all active sockets.

Go result when using GOMAXPROCS=1

Concurrency Level:      100
Time taken for tests:   6.742 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      11100000 bytes
HTML transferred:       1400000 bytes
Requests per second:    14832.05 [#/sec] (mean)
Time per request:       6.742 [ms] (mean)
Time per request:       0.067 [ms] (mean, across all concurrent requests)
Transfer rate:          1607.77 [Kbytes/sec] received

which is very close to the node.js number, but actually slightly better.
But look at the result when using GOMAXPROCS=2

Concurrency Level:      100
Time taken for tests:   3.760 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      11100000 bytes
HTML transferred:       1400000 bytes
Requests per second:    26595.79 [#/sec] (mean)
Time per request:       3.760 [ms] (mean)
Time per request:       0.038 [ms] (mean, across all concurrent requests)
Transfer rate:          2882.94 [Kbytes/sec] received

80% more connections per unit of time, which is quite impressive considering the small amount of work being done on a per session basis!
It doesn't really scale beyond that, using GOMAXPROCS=3 or GOMAXPROCS=4 yield roughly the same result.
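
For reference, the arithmetic behind figures like "80% more" can be written as a small helper. This is just a sketch; `percentGain` is a hypothetical name, and the inputs are the requests-per-second numbers quoted in this thread.

```go
package main

import "fmt"

// percentGain returns how much larger improved is than base, in percent.
func percentGain(base, improved float64) float64 {
	return (improved/base - 1) * 100
}

func main() {
	// Go with GOMAXPROCS=1 vs GOMAXPROCS=2 (figures from the ab runs above).
	fmt.Printf("GOMAXPROCS 1 -> 2: %.1f%%\n", percentGain(14832.05, 26595.79))
	// Original post: Go vs node.js.
	fmt.Printf("go -> node.js (2011): %.1f%%\n", percentGain(6564.69, 9565.93))
}
```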

So Go really is an order of magnitude faster as I assume that all programmers count in base 2 ;)

Kenneth Jonsson

unread,
Feb 10, 2013, 11:31:14 AM2/10/13
to golan...@googlegroups.com
I'm a total novice with node.js; I've just figured out enough to be able to run the example at the top of this thread. It seems like an interesting technique, so I read the documentation for the cluster module and found an example that eventually became this

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    for (var i = 0; i < numCPUs; i++) {
        console.log('Forking child ' + i);
        cluster.fork();
    }
} else {
    console.log('Client ' + process.pid + ' online');
    http.createServer(function(req, res) {
        res.writeHead(200, {'Content-Type': 'text/plain'});
        res.end("hello world!\n");
    }).listen(8080, "127.0.0.1");
}

I then ran into another problem, the ab program became the bottleneck which I realize was also the reason why Go didn't scale beyond GOMAXPROCS=2.

The wrk tool (https://github.com/wg/wrk) seems to be able to generate connections at a higher rate; both node.js and Go reached 92k served requests per second. This test is so simple that we are probably bottlenecked by the Linux kernel at this point, plus the fact that the system "only" has 4 cores.

But this is still not comparing apples to apples. Node.js builds its scaling on forking workers, while Go keeps a goroutine scheduler that multiplexes goroutines on top of up to GOMAXPROCS POSIX threads.

Just by looking at that tells us that Node.js will most likely scale better on non-uniform memory (NUMA) systems as each worker runs in a separate OS process, which make it possible for the OS to keep memory allocation to a single NUMA-zone and make sure that the process stays on that CPU-socket.

So node.js is better than Go then? Well, only as long as you don't need to communicate between workers and "hello world" is very important to you. Communicating between goroutines is both very cheap and very easy in Go by using channels. I'm not even sure how one communicates between node.js workers, but communicating between processes means IPC, which is far more expensive than using Go channels.
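
A minimal sketch of goroutines communicating over channels inside one process is shown below. `sumSquares` is a hypothetical example, not code from this thread: work items go out over one channel and results come back over another, with no IPC involved.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares fans nums out to `workers` goroutines over the jobs channel
// and adds up the squared values they send back on the results channel.
func sumSquares(nums []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	results := make(chan int)

	// Workers: receive a number, send back its square.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// Producer: feed the jobs, then signal that no more are coming.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 3))
}
```

The values sent are typed (`int` here), so a worker cannot receive something of the wrong shape, which is the type-safety point made later in the thread.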

The conclusion is probably that both techniques scale very well and it will come down to personal (or more likely, corporate) preferences which one that is "best".

Kevin Gillette

unread,
Feb 10, 2013, 12:37:12 PM2/10/13
to golan...@googlegroups.com
I honestly can't think of any serious web app where I didn't need to do something outside of the normal request flow -- this has normally involved serializing tasks/communications to the database, or using another external system like beanstalk. Some frameworks for some languages accept this and provide, e.g. specialized cron facilities.

Go retains full flexibility to do _anything_ needed and is painless at the same time. In this respect, I don't see how it can be said that node is "better", since Go didn't have to fork to reach this level of parallelism, and if you're really arguing that forking is better, well, Go can fork too.

Kenneth Jonsson

Feb 10, 2013, 1:43:43 PM2/10/13
to golan...@googlegroups.com
I'm not saying forking is better, I'm just saying it does solve one very hard problem: how to deal with memory when running on NUMA-systems.

I did say that Go does retain a number of advantages; cheap (and type-safe) communication between goroutines is definitely one of those.

You can run multiple instances of the same Go program; one instance per NUMA zone is probably a good idea if you have a multi-socket system. There are a number of ways one could implement IPC between instances in Go.

However, you cannot do exactly the same thing as node.js, that is, open a single socket, bind it, put it into the listen state, and then share that socket between multiple instances where each instance sits in accept() (well, they probably sit in an epoll_wait() call waiting on a single descriptor in the specific example above). Go doesn't have a fork() call, and using cgo to get hold of fork() in unistd.h does not work for a number of reasons (https://groups.google.com/forum/#!topic/golang-nuts/KynZO5BQGks/discussion).

One could probably start multiple instances of a Go program and write a wrapper around SCM_RIGHTS (http://stackoverflow.com/questions/2358684/can-i-share-a-file-descriptor-to-another-process-on-linux-or-are-they-local-to-t) to share the listen socket between all instances. But I guess that isn't what people would call "idiomatic Go" or the best way to solve this particular problem :)
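A rough sketch of what such an SCM_RIGHTS wrapper could look like in Go, using only the standard syscall and net packages. Everything here is an assumption for illustration: the helper names sendFD, recvFD, unixPair and passListener are invented, it targets Linux/Unix only, and both ends of the Unix socket live in one process (via socketpair) purely so the example can run on its own. In a real setup the two ends would belong to different instances of the program.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// sendFD passes the descriptor behind f over a Unix-domain socket as
// SCM_RIGHTS ancillary data. One dummy byte carries the message.
func sendFD(via *net.UnixConn, f *os.File) error {
	rights := syscall.UnixRights(int(f.Fd()))
	_, _, err := via.WriteMsgUnix([]byte{0}, rights, nil)
	return err
}

// recvFD reads one message and returns the descriptor it carried.
func recvFD(via *net.UnixConn) (*os.File, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit fd
	_, oobn, _, _, err := via.ReadMsgUnix(buf, oob)
	if err != nil {
		return nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return nil, err
	}
	if len(msgs) != 1 {
		return nil, errors.New("expected exactly one control message")
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return nil, err
	}
	if len(fds) != 1 {
		return nil, errors.New("expected exactly one descriptor")
	}
	return os.NewFile(uintptr(fds[0]), "passed-listener"), nil
}

// unixPair returns two connected Unix-domain sockets (stand-ins for the
// sockets that two separate processes would share).
func unixPair() (*net.UnixConn, *net.UnixConn, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	var conns [2]*net.UnixConn
	for i, fd := range fds {
		f := os.NewFile(uintptr(fd), "socketpair")
		c, err := net.FileConn(f) // duplicates the descriptor
		f.Close()
		if err != nil {
			return nil, nil, err
		}
		uc, ok := c.(*net.UnixConn)
		if !ok {
			c.Close()
			return nil, nil, errors.New("not a Unix connection")
		}
		conns[i] = uc
	}
	return conns[0], conns[1], nil
}

// passListener opens a TCP listener, ships its descriptor across the
// socket pair, rebuilds a listener on the far side, and reports whether
// both listeners refer to the same address.
func passListener() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()
	lf, err := ln.(*net.TCPListener).File()
	if err != nil {
		return false, err
	}
	defer lf.Close()
	a, b, err := unixPair()
	if err != nil {
		return false, err
	}
	defer a.Close()
	defer b.Close()
	if err := sendFD(a, lf); err != nil {
		return false, err
	}
	got, err := recvFD(b)
	if err != nil {
		return false, err
	}
	defer got.Close()
	ln2, err := net.FileListener(got)
	if err != nil {
		return false, err
	}
	defer ln2.Close()
	return ln.Addr().String() == ln2.Addr().String(), nil
}

func main() {
	same, err := passListener()
	fmt.Println(same, err)
}
```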

And just to straighten out the personal preferences/bias here: I personally would select Go over node.js if both of them could solve the problem as I really like the goroutines, channels, static typing etc.

Robotic Artichoke

Feb 10, 2013, 2:22:55 PM2/10/13
to golan...@googlegroups.com
Your cluster implementation is pretty basic; it is basically a best-case implementation with the least overhead but the highest chance of things going wrong.

I never looked at cluster's source but I'm pretty sure it shares the listen socket between all of the forked processes it spawns. Also the cluster API makes it really simple to send messages around between all of the workers and more.

Another popular multi-core scaling solution for Node is to just run different processes on different ports and then load balance them round robin style. Then you could keep all state out of your app and use something like Redis to hold onto your app state. I imagine this could be done with Go too. I like this approach because now you can scale past 1 physical machine and it's dead simple.
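For the "I imagine this could be done with Go too" part, here is a minimal sketch of a round-robin front end in Go built on net/http/httputil. The type roundRobin and the helper demo are hypothetical names, and the two in-process test servers stand in for separate processes listening on different ports.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

// roundRobin forwards each incoming request to the next backend in turn.
type roundRobin struct {
	proxies []*httputil.ReverseProxy
	next    uint64
}

func newRoundRobin(backends ...string) (*roundRobin, error) {
	if len(backends) == 0 {
		return nil, errors.New("need at least one backend")
	}
	rr := &roundRobin{}
	for _, b := range backends {
		u, err := url.Parse(b)
		if err != nil {
			return nil, err
		}
		rr.proxies = append(rr.proxies, httputil.NewSingleHostReverseProxy(u))
	}
	return rr, nil
}

func (rr *roundRobin) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// The atomic counter keeps the rotation safe under concurrent requests.
	i := atomic.AddUint64(&rr.next, 1) - 1
	rr.proxies[i%uint64(len(rr.proxies))].ServeHTTP(w, r)
}

// demo starts two backends that answer "a" and "b", puts the balancer in
// front of them, and returns the concatenated responses.
func demo(requests int) (string, error) {
	named := func(name string) *httptest.Server {
		return httptest.NewServer(http.HandlerFunc(
			func(w http.ResponseWriter, _ *http.Request) {
				io.WriteString(w, name)
			}))
	}
	a, b := named("a"), named("b")
	defer a.Close()
	defer b.Close()

	rr, err := newRoundRobin(a.URL, b.URL)
	if err != nil {
		return "", err
	}
	front := httptest.NewServer(rr)
	defer front.Close()

	out := ""
	for i := 0; i < requests; i++ {
		resp, err := http.Get(front.URL)
		if err != nil {
			return "", err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return "", err
		}
		out += string(body)
	}
	return out, nil
}

func main() {
	out, err := demo(4)
	fmt.Println(out, err)
}
```

Because the backends hold no shared state in this sketch, the same front end could just as well point at processes on other machines.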

Kevin Gillette

Feb 10, 2013, 2:38:04 PM2/10/13
to golan...@googlegroups.com
I'm not sure your forking-node code does what you described: if the pid is the "cluster master", then you fork -- otherwise if it's already a forked child, you listen. Unless there are magic unicorns in there somewhere, or I'm interpreting it in as wrong a fashion as possible, what would happen is that the initial invocation would fork numCPUs children and then die, and each of those children would attempt to listen, only one of which would succeed (the rest failing silently [or failing loudly to nowhere, perhaps]). If that's so, you're getting 92k rps on node via only one pid (which means you should be getting 92k rps without any of the forking code).

You can fork in Go; it's just not clean (which is why it's not provided by the syscall package directly, though you could use syscall.RawSyscall). Along the lines of what you described, all of the current "graceful restart" implementations for Go set up a socket and then exec the same os.Args[0], with an environment variable (or whatever) communicating the not-closed socket fd to reuse. That's not fundamentally different from the same processes after a previous fork. Granted, all of that just for the potential NUMA gains is not the right approach -- if that much scale is needed, then multiple hosts would likely be needed soon thereafter, and as you say, a communication system between hosts, or some app-specific sharding mechanism, would need to be employed. In either of those cases, fork would be unnecessary (though no heap page sharing could occur as would be the case with fork).
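The exec-with-inherited-fd approach described above might look roughly like this. It is a sketch under assumptions, not any particular graceful-restart library: the environment variable name HELLO_LISTEN_FD and the helper names are invented, and the in-process check at the end only demonstrates that a listener rebuilt from the duplicated descriptor refers to the same socket (Unix-like systems only).

```go
package main

import (
	"fmt"
	"net"
	"os"
	"os/exec"
)

// inheritEnv is a hypothetical variable name telling the child which
// descriptor number holds the already-listening socket.
const inheritEnv = "HELLO_LISTEN_FD"

// listenerFile returns a duplicate of the listener's socket as an *os.File.
func listenerFile(ln net.Listener) (*os.File, error) {
	tcp, ok := ln.(*net.TCPListener)
	if !ok {
		return nil, fmt.Errorf("not a TCP listener: %T", ln)
	}
	return tcp.File()
}

// childCommand prepares (but does not start) a re-exec of the current
// binary. Entries in ExtraFiles start at descriptor 3 in the child.
func childCommand(f *os.File) *exec.Cmd {
	cmd := exec.Command(os.Args[0], os.Args[1:]...)
	cmd.ExtraFiles = []*os.File{f}
	cmd.Env = append(os.Environ(), inheritEnv+"=3")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd
}

// listenerFromFD is what the child would call with the number it found
// in the environment variable.
func listenerFromFD(fd uintptr) (net.Listener, error) {
	f := os.NewFile(fd, "inherited-listener")
	defer f.Close() // FileListener works on its own duplicate
	return net.FileListener(f)
}

// handoffPreservesAddr rebuilds a listener from the duplicated descriptor
// inside this process and reports whether both share one address.
func handoffPreservesAddr() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()
	f, err := listenerFile(ln)
	if err != nil {
		return false, err
	}
	defer f.Close()
	ln2, err := net.FileListener(f)
	if err != nil {
		return false, err
	}
	defer ln2.Close()
	return ln.Addr().String() == ln2.Addr().String(), nil
}

func main() {
	same, err := handoffPreservesAddr()
	fmt.Println(same, err)
}
```

A parent would call childCommand(f).Start() once per extra instance; each child then serves on the listener returned by listenerFromFD, so all instances accept from the same socket.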

There are a number of compiler/runtime enhancements that could be (or have already been) made for Go to take better advantage of NUMA. Dmitry's scheduler redesign (https://groups.google.com/d/topic/golang-dev/_H9nXe7jG2U/discussion) has a number of benefits in this area, including the policy of remuxing a goroutine onto the thread it was most recently muxed to, and extra provisions for thread-local storage. The runtime could mark unchanging pages as readonly at an execution point when a to-be-enhanced compiler could guarantee that no changes would occur -- that's not much different from the property of fork that allows most process pages to be (at least temporarily) localized, with the exception that with fork, those pages could change later (resulting in the kernel trapping the fault and producing a copy) -- such a disruption would not need to occur for pages that are voluntarily flagged as immutable.

These differences are untested, however -- for argument's sake, it would be worth running equivalent node (with and without forking) and go (with and without GOMAXPROCS > 1) through numastat, or something equivalent, to gauge the benefit each gets. Also, tracking shared page counts for a forked invocation of a real node app, after some burn-in, would be useful to determine how well node and V8 play with forking -- if a mixture of changing and unchanging data comprise every runtime-allocated page, then the benefit of fork over separate invocations reusing the same fd will be negligible.

Kenneth Jonsson

Feb 10, 2013, 4:11:16 PM2/10/13
to golan...@googlegroups.com
On Sunday, February 10, 2013 at 20:38:04 UTC+1, Kevin Gillette wrote:
I'm not sure your forking-node code does what you described: if the pid is the "cluster master", then you fork -- otherwise if it's already a forked child, you listen. Unless there are magic unicorns in there somewhere, or I'm interpreting it in as wrong a fashion as possible, what would happen is that the initial invocation would fork numCPUs children and then die, and each of those children would attempt to listen, only one of which would succeed (the rest failing silently [or failing loudly to nowhere, perhaps]). If that's so, you're getting 92k rps on node via only one pid (which means you should be getting 92k rps without any of the forking code).



As I stated above, I'm a node.js novice, so I must trust that the documentation is correct. This is how it describes the behaviour of workers when they are used with the cluster module:
When you call server.listen(...) in a worker, it serializes the arguments and passes the request to the master process. If the master process already has a listening server matching the worker's requirements, then it passes the handle to the worker. If it does not already have a listening server matching that requirement, then it will create one, and pass the handle to the child. 

i.e. it is the master that calls listen() and all workers use the same TCP-listen socket.

The 92k was the aggregated performance, and the number was almost identical when using the Go example, which kind of suggests that the bottleneck now lies in the kernel and/or the hardware (may need more CPU cores) and/or in the application generating traffic.

Brad Fitzpatrick

Mar 7, 2013, 2:20:49 PM3/7/13
to jftu...@gmail.com, golan...@googlegroups.com
Why was that patch never mailed out for review?


On Sat, Feb 16, 2013 at 2:58 PM, <jftu...@gmail.com> wrote:
This was what happened when I took some time to invalidate an article and boundary that was simply sporting unrealistic numbers:


See the benchmark results from the bottom of the README. This particular benchmark was between machines on a LAN (some details higher up in the readme), had basic network tuning, and was hitting Postgres for data.

The database/sql concurrency patch mentioned in the readme is here: https://codereview.appspot.com/6855102/

On Friday, February 8, 2013 3:43:45 PM UTC-8, Robotic Artichoke wrote:
I still think we need to put our brains together and create more meaningful benchmarks. Hello worlding with the vanilla node server is close to pointless IMO.

In short, I strongly agree with you that more meaningful benchmarks are critical, as they turn up patches like the above.

--

bronz...@gmail.com

Aug 1, 2013, 11:04:21 PM8/1/13
to golan...@googlegroups.com
I just updated golang to v1.1.1 and node to v0.10.15.
I use one thread for golang and one thread for nodejs.

It seems like golang did some work to improve performance.

For golang:
Concurrency Level:      100
Time taken for tests:   38.396 seconds
Complete requests:      1000000
Failed requests:        0
Write errors:           0
Total transferred:      149000000 bytes
HTML transferred:       13000000 bytes
Requests per second:    26044.54 [#/sec] (mean)
Time per request:       3.840 [ms] (mean)
Time per request:       0.038 [ms] (mean, across all concurrent requests)
Transfer rate:          3789.68 [Kbytes/sec] received

for nodejs:
Concurrency Level:      100
Time taken for tests:   113.247 seconds
Complete requests:      1000000
Failed requests:        0
Write errors:           0
Total transferred:      115000000 bytes
HTML transferred:       14000000 bytes
Requests per second:    8830.28 [#/sec] (mean)
Time per request:       11.325 [ms] (mean)
Time per request:       0.113 [ms] (mean, across all concurrent requests)
Transfer rate:          991.68 [Kbytes/sec] received



gauravk

Aug 2, 2013, 10:58:16 AM8/2/13
to golan...@googlegroups.com, bronz...@gmail.com
If this is correct, then it seems node.js has become a little slower in the meantime:

------old---------
Time taken for tests:   104.538 seconds 
Requests per second:    9565.93 [#/sec] (mean) 
Time per request:       10.454 [ms] (mean) 
Time per request:       0.105 [ms] (mean, across all concurrent requests) 

------new--------
Time taken for tests:   113.247 seconds
Requests per second:    8830.28 [#/sec] (mean)
Time per request:       11.325 [ms] (mean)
Time per request:       0.113 [ms] (mean, across all concurrent requests)

Nguyên Nguyễn Văn Cao

Aug 2, 2013, 11:55:06 AM8/2/13
to golan...@googlegroups.com, bronz...@gmail.com
So, can we say "golang helloworld 300% faster than node.js...for now"?

On Friday, August 2, 2013 at 10:04:21 UTC+7, bronz...@gmail.com wrote:

bronz...@gmail.com

Aug 3, 2013, 12:20:57 AM8/3/13
to golan...@googlegroups.com
I dug a bit more; here are the results:

Single thread, ab test, with keep-alive
ab -k -n 100000 -c 100 http://127.0.0.1:3000
node.js v0.10.15
Requests per second:    9132.65 [#/sec] (mean)
golang v1.1.1
Requests per second:    45553.94 [#/sec] (mean)

Single thread, ab test, without keep-alive
ab -n 100000 -c 100 http://127.0.0.1:3000
node.js v0.10.15
Requests per second:    9209.95 [#/sec] (mean)
golang v1.1.1
Requests per second:    24134.45 [#/sec] (mean)

Single thread, wrk test
wrk -c100 -d5s http://127.0.0.1:3000
node.js v0.10.15
Requests/sec:  17899.85
golang v1.1.1
Requests/sec:  51828.99

Multi thread, wrk test (4-core CPU, with 4 threads or 4 processes)
wrk -c100 -d5s http://127.0.0.1:3000
node.js v0.10.15
Requests/sec:  65564.01
golang v1.1.1
Requests/sec:  92608.54

So right now, in the multi-threaded wrk test, golang helloworld is 41% faster than node.js.
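The 41% figure matches the last (4-core wrk) pair of numbers. A small sketch of the arithmetic, applied to all four pairs from this post; the function name percentFaster is just an illustrative choice:

```go
package main

import "fmt"

// percentFaster reports by how many percent the first throughput
// exceeds the second.
func percentFaster(fast, slow float64) float64 {
	return (fast/slow - 1) * 100
}

func main() {
	// Requests per second copied from the results above.
	rows := []struct {
		name           string
		goRPS, nodeRPS float64
	}{
		{"ab, keep-alive, 1 thread", 45553.94, 9132.65},
		{"ab, no keep-alive, 1 thread", 24134.45, 9209.95},
		{"wrk, 1 thread", 51828.99, 17899.85},
		{"wrk, 4 threads/processes", 92608.54, 65564.01},
	}
	for _, r := range rows {
		fmt.Printf("%-28s Go is %.0f%% faster\n",
			r.name, percentFaster(r.goRPS, r.nodeRPS))
	}
}
```

The single-threaded rows show a much larger gap than the 4-core row.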

Alexandre Fiori

Aug 3, 2013, 6:53:33 PM8/3/13
to golan...@googlegroups.com
1. Benchmarking on localhost is pointless; and you should tune your system for networking, etc
2. Your go and node servers deliver different content (look at the total bytes transferred as someone else already mentioned); yet, transfer rates are pretty close
3. Without keep-alive you're also measuring the time taken by the OS to set up connections  

For the sake of simplicity and better comparison I've managed to get both servers to deliver the exact same amount of bytes - exact same headers, except for the content of the Date header of course.

go server:
package main

import (
	"io"
	"net/http"
	"runtime"
)

func HelloServer(w http.ResponseWriter, req *http.Request) {
	w.Header().Set("Content-Type", "text/plain") // otherwise it'll add charset=utf-8
	w.Header().Set("Connection", "keep-alive")   // because node adds it too
	io.WriteString(w, "hello, world!\n")
}

func main() {
	runtime.GOMAXPROCS(1)
	http.HandleFunc("/", HelloServer)
	http.ListenAndServe(":8080", nil)
}

node server:
var http = require('http'); 
http.createServer(function (req, res) { 
  res.writeHead(200, {'Content-Type': 'text/plain', 'Content-Length': 14}); // to avoid chunked responses
  res.end('hello, world!\n'); 
}).listen(8080, "0.0.0.0"); 


These are my results using "ab -n 100000 -c 100 -k server:8080/" on a 64bit quad-core i5 (Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz), with go v1.1.1 and node v0.10.15: 

go server:
Concurrency Level:      100
Time taken for tests:   1.955 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      14000000 bytes
HTML transferred:       1400000 bytes
Requests per second:    51138.91 [#/sec] (mean)
Time per request:       1.955 [ms] (mean)
Time per request:       0.020 [ms] (mean, across all concurrent requests)
Transfer rate:          6991.65 [Kbytes/sec] received

node server:
Concurrency Level:      100
Time taken for tests:   5.050 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    100000
Total transferred:      14000000 bytes
HTML transferred:       1400000 bytes
Requests per second:    19801.76 [#/sec] (mean)
Time per request:       5.050 [ms] (mean)
Time per request:       0.051 [ms] (mean, across all concurrent requests)
Transfer rate:          2707.27 [Kbytes/sec] received


And it gets more interesting with more concurrent requests.
Turning off keep-alive brings go down to ~18k rps and node to ~8k rps.
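To see the keep-alive effect without ab, here is a small sketch of a load loop in Go that can run with connection reuse on or off. It is not a replacement for ab or wrk: the helpers runLoad and compare are invented for this example, the server is an in-process httptest server, and the request counts are deliberately tiny, so the printed rates only illustrate the direction of the difference.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// runLoad sends total GET requests to url from the given number of
// concurrent workers and returns how many got a 200 plus the elapsed time.
func runLoad(url string, total, workers int, keepAlive bool) (int, time.Duration) {
	client := &http.Client{Transport: &http.Transport{
		DisableKeepAlives:   !keepAlive, // false: reuse connections
		MaxIdleConnsPerHost: workers,
	}}
	defer client.CloseIdleConnections()

	jobs := make(chan struct{}, total)
	for i := 0; i < total; i++ {
		jobs <- struct{}{}
	}
	close(jobs)

	var ok int64
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := client.Get(url)
				if err != nil {
					continue
				}
				io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt64(&ok)), time.Since(start)
}

// compare runs the same load with and without keep-alive against a
// throwaway hello-world server and returns both success counts.
func compare(total, workers int) (int, int) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, _ *http.Request) {
			io.WriteString(w, "hello, world!\n")
		}))
	defer srv.Close()

	kaOK, kaTime := runLoad(srv.URL, total, workers, true)
	noOK, noTime := runLoad(srv.URL, total, workers, false)
	fmt.Printf("keep-alive:    %d ok, %.0f req/s\n", kaOK, float64(kaOK)/kaTime.Seconds())
	fmt.Printf("no keep-alive: %d ok, %.0f req/s\n", noOK, float64(noOK)/noTime.Seconds())
	return kaOK, noOK
}

func main() {
	compare(2000, 20)
}
```

Without keep-alive every request pays for a fresh TCP connection, which is the OS cost mentioned in point 3.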

Alexandre Fiori

Aug 3, 2013, 6:57:18 PM8/3/13
to golan...@googlegroups.com, Dave Cheney
Absolutely. But it shouldn't really matter for single-threaded servers as long as the OS is fine-tuned. For multi-core servers ab just doesn't work, and weighttp is definitely the best option.

On Sunday, February 3, 2013 5:27:43 PM UTC-5, elij wrote:
Unrelated, but I have found wrk[1] and weighttp[2] produce more consistent results than ab or siege.

[1]: https://github.com/wg/wrk
[2]: http://redmine.lighttpd.net/projects/weighttp/wiki

On Feb 3, 2013, at 2:08 AM, Dave Cheney <da...@cheney.net> wrote:

> Here are some results comparing tip to the current node.js release
>
> hardware: Lenovo X220, Ubuntu 12.10, amd64
>
> Node.js:
>
> Server Hostname:        localhost
> Server Port:            1337
>
> Document Path:          /
> Document Length:        12 bytes
>
> Concurrency Level:      100
> Time taken for tests:   9.523 seconds
> Complete requests:      100000
> Failed requests:        0
> Write errors:           0
> Total transferred:      11300000 bytes
> HTML transferred:       1200000 bytes
> Requests per second:    10501.06 [#/sec] (mean)
> Time per request:       9.523 [ms] (mean)
> Time per request:       0.095 [ms] (mean, across all concurrent requests)
> Transfer rate:          1158.81 [Kbytes/sec] received
>
> Connection Times (ms)
>              min  mean[+/-sd] median   max
> Connect:        0    0   0.1      0       7
> Processing:     1    9   4.9      9      31
> Waiting:        1    9   4.9      9      31
> Total:          1    9   4.9      9      31
>
> Percentage of the requests served within a certain time (ms)
>  50%      9
>  66%     12
>  75%     13
>  80%     14
>  90%     16
>  95%     17
>  98%     20
>  99%     22
> 100%     31 (longest request)
>
> lucky(~) %  go version
> go version devel +3a9a5d2901f7 Sun Feb 03 02:01:05 2013 -0500 linux/amd64
>
> lucky(~) % ab -n 100000 -c 100 localhost:1337/
> This is ApacheBench, Version 2.3 <$Revision: 655654 $>
> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
> Licensed to The Apache Software Foundation, http://www.apache.org/
>
> Benchmarking localhost (be patient)
> Completed 10000 requests
> Completed 20000 requests
> Completed 30000 requests
> Completed 40000 requests
> Completed 50000 requests
> Completed 60000 requests
> Completed 70000 requests
> Completed 80000 requests
> Completed 90000 requests
> Completed 100000 requests
> Finished 100000 requests
>
>
> Server Software:
> Server Hostname:        localhost
> Server Port:            1337
>
> Document Path:          /
> Document Length:        14 bytes
>
> Concurrency Level:      100
> Time taken for tests:   10.173 seconds
> Complete requests:      100000
> Failed requests:        0
> Write errors:           0
> Total transferred:      13500000 bytes
> HTML transferred:       1400000 bytes
> Requests per second:    9830.25 [#/sec] (mean)
> Time per request:       10.173 [ms] (mean)
> Time per request:       0.102 [ms] (mean, across all concurrent requests)
> Transfer rate:          1295.98 [Kbytes/sec] received
>
> Connection Times (ms)
>              min  mean[+/-sd] median   max
> Connect:        0    0   0.2      0       8
> Processing:     2   10   3.8      9      29
> Waiting:        1   10   3.8      9      28
> Total:          6   10   3.8      9      29
>
> Percentage of the requests served within a certain time (ms)
>  50%      9
>  66%     10
>  75%     11
>  80%     12
>  90%     14
>  95%     18
>  98%     25
>  99%     26
> 100%     29 (longest request)
>
> Which is pretty damn close. However, if we compare with siege
>
> node.js:
>
> lucky(~) % siege -b -t 10s -c 100 localhost:1337/
> ** SIEGE 2.70
> ** Preparing 100 concurrent users for battle.
> The server is now under siege...
> Lifting the server siege...      done.
> Transactions:                   65944 hits
> Availability:                 100.00 %
> Elapsed time:                   9.17 secs
> Data transferred:               0.75 MB
> Response time:                  0.01 secs
> Transaction rate:            7191.28 trans/sec
> Throughput:                     0.08 MB/sec
> Concurrency:                   99.56
> Successful transactions:       65944
> Failed transactions:               0
> Longest transaction:            0.05
> Shortest transaction:           0.00
>
> FILE: /var/log/siege.log
> You can disable this annoying message by editing
> the .siegerc file in your home directory; change
> the directive 'show-logfile' to false.
> [error] unable to create log file: Permission denied
>
> go version devel +3a9a5d2901f7 Sun Feb 03 02:01:05 2013 -0500 linux/amd64
>
> lucky(~) % siege -b -t 10s -c 100 localhost:1337/
> ** SIEGE 2.70
> ** Preparing 100 concurrent users for battle.
> The server is now under siege...
> Lifting the server siege...      done.
>
> Transactions:                   24215 hits
> Availability:                 100.00 %
> Elapsed time:                   9.93 secs
> Data transferred:               0.32 MB
> Response time:                  0.04 secs
> Transaction rate:            2438.57 trans/sec
> Throughput:                     0.03 MB/sec
> Concurrency:                   99.35
> Successful transactions:       24215
> Failed transactions:               0
> Longest transaction:            0.06
> Shortest transaction:           0.00
>
> Benchmarks are hard.
>
> Dave
>
> On Sun, Feb 3, 2013 at 7:52 PM, Gal Ben-Haim <gben...@gmail.com> wrote:
>> https://gist.github.com/4700801
>>
>> Gal Ben-Haim
>>
>>
>> On Sun, Feb 3, 2013 at 8:39 AM, Dave Cheney <da...@cheney.net> wrote:
>>>
>>>> I'm also seeing that Node.js is much faster than Go in the hello world
>>>> test.
>>>
>>> Hello,
>>>
>>> Could you post your benchmark tests, and your benchmark results, we've
>>> made a lot of fixes in tip (not 1.0.3) recently which should have
>>> closed the gap.
>>>
>>>> what is the explanation for this ?
>>>
>>> Lack of unicorns
>>>
>>> Dave

kubaja...@gmail.com

Jan 2, 2014, 2:47:44 AM1/2/14
to golan...@googlegroups.com
Now, the go program is faster than node.js. Used go 1.2 for compiling. 

for golang:

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            8080

Document Path:          /
Document Length:        14 bytes

Concurrency Level:      100
Time taken for tests:   9.873 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      13100000 bytes
HTML transferred:       1400000 bytes
Requests per second:    10128.16 [#/sec] (mean)
Time per request:       9.873 [ms] (mean)
Time per request:       0.099 [ms] (mean, across all concurrent requests)
Transfer rate:          1295.69 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    4   0.9      4       8
Processing:     1    6   1.2      5      11
Waiting:        1    4   1.2      4      10
Total:          4   10   1.1     10      16

Percentage of the requests served within a certain time (ms)
  50%     10
  66%     10
  75%     11
  80%     11
  90%     11
  95%     12
  98%     12
  99%     13
 100%     16 (longest request)

for node.js:

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            8080

Document Path:          /
Document Length:        14 bytes

Concurrency Level:      100
Time taken for tests:   13.268 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      11500000 bytes
HTML transferred:       1400000 bytes
Requests per second:    7536.74 [#/sec] (mean)
Time per request:       13.268 [ms] (mean)
Time per request:       0.133 [ms] (mean, across all concurrent requests)
Transfer rate:          846.41 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       3
Processing:     2   13   1.7     13      23
Waiting:        2   13   1.7     13      22
Total:          5   13   1.7     13      23

Percentage of the requests served within a certain time (ms)
  50%     13
  66%     13
  75%     13
  80%     14
  90%     15
  95%     17
  98%     19
  99%     20
 100%     23 (longest request)

Dave Cheney

Jan 2, 2014, 4:52:02 PM1/2/14
to kubaja...@gmail.com, golan...@googlegroups.com
Excellent news. Now this thread need never be commented in again. 

wari....@gmail.com

Feb 20, 2014, 3:29:35 AM2/20/14
to golan...@googlegroups.com, kubaja...@gmail.com
Uhm, it has to be commented on. I ran this on a Raspberry Pi: a basic node server with hello world, and a counter incrementer in golang (using a goroutine and a channel). Go is about 32% faster than node (214.05 vs. 161.55 requests per second), not that it really matters:

Go:
Requests per second:    214.05 [#/sec] (mean)

Node:
Requests per second:    161.55 [#/sec] (mean)

https://gist.github.com/wari/9109081

I found this thread to be interesting though.. :)

ChrisLu

Feb 21, 2014, 6:24:08 PM2/21/14
to golan...@googlegroups.com, kubaja...@gmail.com, wari....@gmail.com
Your test does not seem to have concurrency enabled; it is just single-threaded.

I am actually the original poster. What I found interesting about this unscientific test is that, at the beginning, many people called it a wrong test, unrealistic, a bad testing tool, bad code, etc., or even said that node.js is a C implementation and would for sure outperform Go in this test. Every doubt has its own valid reasoning. But few accepted the fact that there was a problem in the Go implementation.

And after 2.5 years with some bug fixes, maybe due to better goroutine reuse and/or some other performance improvements, Go can now consistently beat node.js in this test. Everybody just became quiet about this, accepting that Golang is of course faster than node.js.

What can we learn from it? Maybe the obvious: always create a benchmark for your own use case, don't trust other people's numbers.

Chris

minux

Feb 21, 2014, 6:36:27 PM2/21/14
to ChrisLu, wari....@gmail.com, kubaja...@gmail.com, golan...@googlegroups.com


On Feb 21, 2014 6:24 PM, "ChrisLu" <chri...@gmail.com> wrote:
>
> Your test seems not with concurrency enabled. Just single threaded.
>
> I am actually the original poster. What I found interesting from this unscientific test is that, at the beginning, many people will call it wrong test, not realistic, bad testing tool, bad code, etc, or even saying node.js is C implementation and it for sure outperforms Go in this test. Every doubt has its own valid reasoning. But few accepts the fact that there is a problem in Go implementation. 

No matter how both implementations perform, the test is still wrong and invalid. This fact simply won't change.

> And after 2.5 years with some bug fixes, maybe due to better goroutine reuse, or/and some other performance improvements, Go can now consistently beat node.js in this test. Everybody just become quiet on this, accepting that Golang is of course faster than node.js.

It's quiet on this topic because this topic has been beaten to death and we finally reached the consensus that we should just leave it alone.

> What can we learn from it? Maybe the obvious: always create a benchmark for your own use case, don't trust other people's numbers.

What I learned from this huge thread is that people like flame wars, especially using contrived and invalid benchmarks to prove their point with numbers.

ChrisLu

Feb 21, 2014, 8:31:40 PM2/21/14
to golan...@googlegroups.com, ChrisLu, wari....@gmail.com, kubaja...@gmail.com
This is a simplified test to serve static files. Think about CDN where you want to serve lots of static files fast. Just calling it "invalid" was not valid.

Of course, what's valid for some specific purpose may be invalid for others.

Chris

minux

Feb 21, 2014, 9:40:20 PM2/21/14
to ChrisLu, golang-nuts, Wari Wahab, kubaja...@gmail.com
On Fri, Feb 21, 2014 at 8:31 PM, ChrisLu <chri...@gmail.com> wrote:
This is a simplified test to serve static files. Think about CDN where you want to serve lots of static files fast. Just calling it "invalid" was not valid.
So your static HTTP server hard-codes the file content into the source code?
If so, fine: it's a valid benchmark for your specific use case.

minux

Feb 21, 2014, 9:50:58 PM2/21/14
to ChrisLu, golang-nuts, Wari Wahab, kubaja...@gmail.com
On Fri, Feb 21, 2014 at 9:40 PM, minux <minu...@gmail.com> wrote:
On Fri, Feb 21, 2014 at 8:31 PM, ChrisLu <chri...@gmail.com> wrote:
This is a simplified test to serve static files. Think about CDN where you want to serve lots of static files fast. Just calling it "invalid" was not valid.
so your static http server hard-code the file content into the source code?
A further question: Why not just use a TCP listener on port 80 and send out
hard-coded HTTP response for each request (the HTTP response is static
except for the Date header)?

The benchmark is invalid precisely because it could very easily (even mechanically)
be optimized to bypass net/http, which is the very package that it's intended to
benchmark.

Consider, for example, what if some advanced compiler toolchain could detect
the pattern, and "optimize" it into a simple TCP server that plays back the
(mostly) hard coded HTTP response.

Put it another way, we can throw away 99% of code in net/http and still make
this benchmark functional. can we do that to get a better result? What in the
benchmark prevents that kind of "optimization"?
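To illustrate the thought experiment (not as a recommendation), here is a sketch of such a playback server: a bare TCP listener that ignores the request contents and writes back one hard-coded response. The names canned, handleConn, serveCanned and fetchOnce are invented for this example, the Date header is simply omitted, and net/http is used only on the client side to check that the canned bytes parse as a valid response.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
)

// canned is the complete response; the body is exactly 14 bytes.
const canned = "HTTP/1.1 200 OK\r\n" +
	"Content-Type: text/plain\r\n" +
	"Content-Length: 14\r\n" +
	"\r\n" +
	"hello, world!\n"

// handleConn answers every request on the connection with the canned
// bytes. It assumes requests without a body (plain GETs).
func handleConn(c net.Conn) {
	defer c.Close()
	r := bufio.NewReader(c)
	for {
		// Skip the request line and headers up to the blank line.
		for {
			line, err := r.ReadString('\n')
			if err != nil {
				return
			}
			if line == "\r\n" {
				break
			}
		}
		if _, err := io.WriteString(c, canned); err != nil {
			return
		}
	}
}

// serveCanned accepts connections until the listener is closed.
func serveCanned(ln net.Listener) {
	for {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		go handleConn(c)
	}
}

// fetchOnce starts the playback server on a free local port and fetches
// one response with an ordinary HTTP client.
func fetchOnce() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serveCanned(ln)

	resp, err := http.Get("http://" + ln.Addr().String() + "/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetchOnce()
	fmt.Printf("%q %v\n", body, err)
}
```

Nothing in the hello-world benchmark distinguishes this from a real HTTP server, which is the objection being raised.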

Chris Lu

Feb 22, 2014, 4:39:48 AM2/22/14
to minux, golang-nuts, Wari Wahab, kubaja...@gmail.com
You are assuming the http package was the problem. But that's not correct.

Last time I profiled this, using an earlier version of Go, most of the time was spent in goroutine scheduling. It seemed the http package simply allocated one goroutine for each http request, and there was some semaphore waiting when allocating those goroutines: it tried to reuse free goroutines and, if there were none, created a new one. (This is from my shallow understanding of the goroutine scheduling code at that time; it could be wrong and is very likely outdated.)

So even if most net/http code can be optimized away, the goroutine scheduling could not be avoided.
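A rough way to look at the raw cost of starting and finishing goroutines, separate from any HTTP work, is sketched below. The helper spawnAndWait is a made-up name, and the timing is only indicative since it depends on the machine and the Go version. For context, net/http starts one goroutine per accepted connection, so with ab running without keep-alive that amounts to one goroutine per request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// spawnAndWait starts n goroutines that each bump a counter, waits for
// all of them, and returns the final count and the elapsed time.
func spawnAndWait(n int) (int64, time.Duration) {
	var done int64
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&done, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&done), time.Since(start)
}

func main() {
	n, d := spawnAndWait(1000000)
	fmt.Printf("%d goroutines in %v (%.0f ns each)\n",
		n, d, float64(d.Nanoseconds())/float64(n))
}
```

Running the same program under different Go releases is one simple way to compare how scheduler changes affect this path.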

I don't know the exact fix, but this particular test did show big performance improvements with recent Go. So the fix is either in the Go runtime or in the default net/http package. I am curious to know what the real issue was. Did the net/http package add something to reuse goroutines? Or are goroutine allocations much more efficient?

Chris

Dmitry Vyukov

Feb 22, 2014, 5:26:35 AM2/22/14
to Chris Lu, minux, golang-nuts, Wari Wahab, kubaja...@gmail.com
On Sat, Feb 22, 2014 at 1:39 PM, Chris Lu <chri...@gmail.com> wrote:
> You are assuming the http package was the problem. But that's not correct.
>
> Last time when I profiled this using earlier version of Go, most of the time
> was busy with goroutine scheduling. Seems the http package simply allocated
> one goroutine for each http request, and there was some semaphore waiting
> when allocating those goroutines, trying to reuse some free goroutines, and
> if no free ones, creating one goroutine. (These are from my shallow
> understanding of the goroutine scheduling code at that time, could be wrong
> and very likely outdated).
>
> So even if most net/http code can be optimized away, the goroutine
> scheduling could not be avoided.
>
> Not knowing the exact fix, this particular test did show big performance
> improvements with recent Go. So the fix is either in Go virtual machine, or
> the default net/http package. I am curious to know what was the real issue.
> Did net/http package add something to re-use goroutines? or the goroutines
> allocations are much more efficient?

Goroutine creation, finishing and scheduling became faster and more
scalable in 1.1, as did network system calls (accept/read/write).