Go will shine on huge web projects, but how about simple ones?

Tong Sun

unread,
Jun 7, 2019, 9:36:49 AM6/7/19
to golang-nuts
I had always believed that web projects built with Go should be much faster than Perl ones, since Go is a compiled language.

However, that belief was brutally crushed last night when I did a comparison -- the Go implementation is 8 times worse than the Perl one! -- the mean response time jumped from 6ms to 48ms.

I know this is the simplest possible web server, but still, when it comes to simple web servers like this, I have to say that Perl performs much better than Go.

I don't think there is much I can tweak on the Go side, since it can't get any simpler than that. However, I also believe it won't hurt to ask and confirm. So,

Have I missed anything? Is it possible for me to get my Go implementation anywhere near the Perl version's performance?

Thanks


Ronny Bangsund

unread,
Jun 7, 2019, 11:00:41 AM6/7/19
to golang-nuts
Yes, the built-in net/http server is pretty awful in many ways. Fortunately, there are lots of alternatives, all with roughly an order of magnitude better performance. I inspected this list yesterday to finally make a choice of packages to use: https://github.com/smallnest/go-web-framework-benchmark

I ended up with the following as my weapons of choice:
- https://github.com/valyala/fasthttp (the core HTTP server)
- https://github.com/fasthttp/router (paths)
- https://github.com/phachon/fasthttpsession (sessions/non-REST stuff)

There are other packages near the performance of fasthttp, and sometimes you just want "good enough" performance to gain some conveniences. Most, if not all, are likely to be better choices for performance than the Go http package. The ones above just looked most agreeable to me :)
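For orientation, a minimal fasthttp handler looks roughly like this (a sketch of the basic API only, not a tuned production setup; the port and response body are placeholders):

package main

import (
    "log"

    "github.com/valyala/fasthttp"
)

func main() {
    // fasthttp hands each request a reusable RequestCtx instead of
    // separate Request/ResponseWriter values.
    handler := func(ctx *fasthttp.RequestCtx) {
        ctx.SetContentType("text/plain")
        ctx.SetBodyString("ok\n")
    }
    log.Fatal(fasthttp.ListenAndServe(":8888", handler))
}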

For template engines, it depends on your needs. If you're preparing pages to serve them statically the standard packages are fine (html/template, or even text/template for certain uses). If you're constantly rebuilding pages you'll want to look into faster template engines.
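For the static-page case, a minimal html/template sketch (the template text and field name are made up for illustration): parse once at startup, execute per request.

package main

import (
    "html/template"
    "log"
    "net/http"
)

// Parse once at startup so request handlers only execute the template.
var page = template.Must(template.New("page").Parse(
    `<html><body><h1>{{.Title}}</h1></body></html>`))

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        if err := page.Execute(w, struct{ Title string }{"Hello"}); err != nil {
            log.Println("render:", err)
        }
    })
    log.Fatal(http.ListenAndServe(":8080", nil))
}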

Burak Serdar

unread,
Jun 7, 2019, 11:12:17 AM6/7/19
to Ronny Bangsund, golang-nuts
On Fri, Jun 7, 2019 at 9:01 AM Ronny Bangsund <ronny.b...@gmail.com> wrote:
>
> Yes, the built-in is pretty awful in many ways. There are fortunately lots of alternatives, all with roughly an order of magnitude better performance. I inspected this list yesterday to finally make a choice of packages to use: https://github.com/smallnest/go-web-framework-benchmark
>
> I ended up with the following as my weapons of choice:
> - https://github.com/valyala/fasthttp (the core HTTP server)
> - https://github.com/fasthttp/router (paths)
> - https://github.com/phachon/fasthttpsession (sessions/non-REST stuff)

Some time ago somebody posted a memory corruption problem which turned
out to be a race condition in one of these fasthttp libraries. Some of
the optimizations rely heavily on use of unsafe.




Anger

unread,
Jun 7, 2019, 7:35:18 PM6/7/19
to golang-nuts
Yeah, you probably want to avoid fasthttp.
and
are both better than the base httpserver, and I've been using them in large-scale adtech deployments without issue.

Also you can get BETTER performance out of the base http client (over fasthttp's) if you just tweak the settings a bit to do a lot of the stuff fasthttp also does under the hood.
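For anyone wondering what "tweak the settings a bit" might look like, here is a sketch of tuning the standard client's http.Transport; the knob values are illustrative guesses, not the exact settings Matt used:

package main

import (
    "log"
    "net/http"
    "time"
)

// newTunedClient returns a client whose Transport keeps idle connections
// around for reuse instead of re-dialing on every request.
func newTunedClient() *http.Client {
    tr := &http.Transport{
        MaxIdleConns:        1000,
        MaxIdleConnsPerHost: 100,
        IdleConnTimeout:     90 * time.Second,
        DisableCompression:  true, // tiny responses gain nothing from gzip
    }
    return &http.Client{Transport: tr, Timeout: 10 * time.Second}
}

func main() {
    c := newTunedClient()
    resp, err := c.Get("http://127.0.0.1:8888/") // the test server used in this thread
    if err != nil {
        log.Fatal(err)
    }
    resp.Body.Close()
}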

-Matt


Iwan Budi Kusnanto

unread,
Jun 7, 2019, 7:47:47 PM6/7/19
to Tong Sun, golang-nuts
Can you share your Go code?


Ivan Bertona

unread,
Jun 7, 2019, 8:08:19 PM6/7/19
to golang-nuts
Looking at the two code samples, I wouldn't say this is an apples-to-apples comparison... The Perl script seems to be a simple single-threaded loop that understands a tiny subset of HTTP vs. a fully-fledged (and secure) web server from the Go standard library. I would definitely not run that Perl script in production, even for a simple project. My bet is that if you actually port the Perl script to a Go program that does more or less the same thing, you'll see more or less the same performance (because the example is fundamentally I/O-bound).

Best,
Ivan

Tong Sun

unread,
Jun 8, 2019, 9:56:37 AM6/8/19
to Ivan Bertona, golang-nuts, Iwan Budi Kusnanto
Agree that it was not an apples to apples comparison. So please check
out my 2nd blog:

https://dev.to/suntong/simple-web-server-in-perl-and-go-revisit-5d82

> thanks to Axel Wagner, who replaced the net/http.Server layer with a direct translation of the Perl code, the code is now reading and writing directly to a socket, just as the Perl code does.

@Budi, links to my Go code are available in both of my posts.

Steven Hartland

unread,
Jun 8, 2019, 11:56:51 AM6/8/19
to Tong Sun, Ivan Bertona, golang-nuts, Iwan Budi Kusnanto
Couple of things that you might want to investigate:
1. Is SetReadDeadline the same as SO_RCVTIMEO (vm vs socket)?
2. Is c.Close()  the same as shutdown (flushes vs doesn't)?
3. Is print the same as fmt.Fprintf / c.Write (buffered vs unbuffered)?

With the Go version I'd be tempted to put everything from the successful accept
to the socket close in a goroutine.
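A minimal sketch of that shape, in the direct-socket style discussed in the thread (handleConn is a hypothetical stand-in for the existing read/match/respond logic):

package main

import (
    "bufio"
    "fmt"
    "log"
    "net"
)

// handleConn stands in for the existing request-reading and response logic.
func handleConn(c net.Conn) {
    defer c.Close()
    r := bufio.NewReader(c)
    if _, err := r.ReadString('\n'); err != nil { // request line
        return
    }
    fmt.Fprint(c, "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nok\n")
}

func main() {
    ln, err := net.Listen("tcp", ":8888")
    if err != nil {
        log.Fatal(err)
    }
    for {
        c, err := ln.Accept()
        if err != nil {
            log.Println("accept:", err)
            continue
        }
        // Everything from the successful accept to the close runs in its own
        // goroutine, so one slow client never blocks the accept loop.
        go handleConn(c)
    }
}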

    Regards
    Steve

Wojciech S. Czarnecki

unread,
Jun 8, 2019, 3:01:16 PM6/8/19
to golan...@googlegroups.com
On Sat, 8 Jun 2019 09:56:05 -0400
Tong Sun <sunto...@gmail.com> wrote:

> Agree that it was not an apples to apples comparison. So please check
> out my 2nd blog:
>
> https://dev.to/suntong/simple-web-server-in-perl-and-go-revisit-5d82
>

Trying to make sense of your measurements...

...Still apples to oranges, due to testing on a loopback interface. Both ab and Perl are wired up to the kernel's short paths,
so in fact their packets were exchanged via a simple ownership change of the block of memory.
AFAIK the bytes of packets destined for the Go server were copied, as Go does not link to glibc for network support
(it does so only for name-lookup services).

Try to benchmark **over real wire**.

Also use more tools: https://github.com/httperf/httperf, or even old Siege.
Read https://www.nginx.com/blog/testing-the-performance-of-nginx-and-nginx-plus-web-servers/
to see how their tests were done.


Hope this helps,

--
Wojciech S. Czarnecki
<< ^oo^ >> OHIR-RIPE

Tong Sun

unread,
Jun 8, 2019, 9:12:51 PM6/8/19
to Wojciech S. Czarnecki, golang-nuts
Thanks.

So I benchmarked over a real wire, using httperf. But it seems to me
httperf is not reporting much, right?
At least not to the level you referred to as "how their tests were done".

Anyway, apart from the following, what else can I get out of httperf?


$ httperf --server 192.168.0.11 --port 8888 --uri / --num-conns 1500 --hog
httperf --hog --client=0/1 --server=192.168.0.11 --port=8888 --uri=/
--send-buffer=4096 --recv-buffer=16384 --num-conns=1500 --num-calls=1
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
open files to FD_SETSIZE
Maximum connect burst length: 1

Total: connections 1500 requests 1500 replies 1500 test-duration 1.325 s

Connection rate: 1132.2 conn/s (0.9 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 0.4 avg 0.9 max 3.5 median 0.5 stddev 0.2
Connection time [ms]: connect 0.4
Connection length [replies/conn]: 1.000

Request rate: 1132.2 req/s (0.9 ms/req)
Request size [B]: 65.0

Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 0.4 transfer 0.1
Reply size [B]: header 136.0 content 43.0 footer 0.0 (total 179.0)
Reply status: 1xx=0 2xx=1500 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.32 system 1.00 (user 24.4% system 75.5% total 99.8%)
Net I/O: 269.8 KB/s (2.2*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0


The above is Go.
The following is Perl.

$ httperf --server 192.168.0.10 --uri / --num-conns 1500 --hog
httperf --hog --client=0/1 --server=192.168.0.10 --port=80 --uri=/
--send-buffer=4096 --recv-buffer=16384 --num-conns=1500 --num-calls=1
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
open files to FD_SETSIZE
Maximum connect burst length: 1

Total: connections 1500 requests 1500 replies 1500 test-duration 1.285 s

Connection rate: 1167.6 conn/s (0.9 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 0.3 avg 0.9 max 1.8 median 0.5 stddev 0.2
Connection time [ms]: connect 0.4
Connection length [replies/conn]: 1.000

Request rate: 1167.6 req/s (0.9 ms/req)
Request size [B]: 65.0

Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 0.4 transfer 0.1
Reply size [B]: header 124.0 content 43.0 footer 0.0 (total 167.0)
Reply status: 1xx=0 2xx=1500 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.34 system 0.94 (user 26.7% system 73.2% total 99.9%)
Net I/O: 264.5 KB/s (2.2*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0


I.e., I can't even figure out what the end-to-end response time is.
I presume it is *not* the connection time.

Justin Israel

unread,
Jun 8, 2019, 10:54:23 PM6/8/19
to golang-nuts
I'm wondering about a couple factors in this comparison that seem to make a difference in my local test:
  1. I think Perl sockets are write-buffered. So would the equivalent be to wrap the net.Conn in bufio.NewWriter(c) and flush before the Close? (See the sketch after this list.)
  2. Since this is a straight-line test where both servers are not using concurrent handling of connections (uncommon for a Go server even on 1 core), would it not make sense to run the Go server with GOMAXPROCS=1?
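A minimal sketch of both points, assuming the direct-socket style server from the post (port and response body are placeholders):

package main

import (
    "bufio"
    "fmt"
    "log"
    "net"
    "runtime"
)

// respond mirrors Perl's buffered socket writes: build the reply through a
// bufio.Writer and flush it explicitly before closing the connection.
func respond(c net.Conn) {
    defer c.Close()
    w := bufio.NewWriter(c)
    fmt.Fprint(w, "HTTP/1.0 200 OK\r\nContent-Length: 3\r\n\r\nok\n")
    w.Flush() // point 1: without the flush, buffered bytes never reach the socket
}

func main() {
    runtime.GOMAXPROCS(1) // point 2: single-proc run for a like-for-like comparison
    ln, err := net.Listen("tcp", ":8888")
    if err != nil {
        log.Fatal(err)
    }
    for {
        c, err := ln.Accept()
        if err != nil {
            continue
        }
        respond(c) // handled sequentially, matching the straight-line Perl loop
    }
}
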
- Justin

Axel Wagner

unread,
Jun 9, 2019, 11:09:12 AM6/9/19
to golang-nuts
As I've also mentioned: I don't think this test is meaningful.

First, as it has been pointed out, your Perl program isn't actually a web server. It only understands ridiculously simple requests and as such violates the spec left and right. It's also super naive in how it treats malformed input or actively malicious clients - all of which are handled by the Go http package, so of course it's going to have some overhead.

In its most generous form, you translate the programs faithfully to do the same syscalls and implement the same logic - and at that point, you are essentially benchmarking the Perl regular expression engine against the Go regular expression engine, as that's the only thing the program really is doing: Branching on a regexp-match. And the Perl RE-engine is famously optimized and the Go RE-engine famously is not. In fact, you aren't even benchmarking Perl against Go, you are benchmarking *C* against Go, as Perl's RE-engine is written in C, AFAIK.

Lastly, as I read the results I get with either net/http *or* the naive regexp-parsing, I just… disagree with your conclusions. All the numbers I've seen seem to imply that Go responded *on average* a lot faster and *might* have higher variance and a slightly higher tail latency (though, FTR, the measurements I got also suggest that the data is mostly noise). And I'm struggling to find an application, where that would matter a lot. Like, yeah, tail latency matters, but so does average latency. In basically all applications I can think of, you a) want tail latencies to not be catastrophic, especially when considering fan-out, but b) lower averages are just as important anyway. So, I think you should, for a decent test, formulate your criteria beforehand. Currently, there seems to be a lot of reading in tea-leafs involved.


Anyway, all of that being said: While I don't think the test you devised allows the broad generalizations you are making, ultimately I don't have any stakes in this (which is why I haven't responded further on the blog post). If you like Perl and think it performs good enough or better for your use case - then go ahead and use Perl. No one here will begrudge you for it.

All of that being said

IMO this is just not a meaningful test. Or its results are totally unsurprising.


Tong Sun

unread,
Jun 10, 2019, 5:28:07 PM6/10/19
to golang-nuts, Damian Gryski
Just to clarify some facts.

On Sun, Jun 9, 2019 at 11:09 AM 'Axel Wagner' wrote:
>
> As I've also mentioned: I don't think this test is meaningful.
>
> First, as it has been pointed out, your Perl program isn't actually a web server. It only understands ridiculously simple requests and as such violates the spec left and right. It's also super naive in how it treats malformed input or actively malicious clients - all of which are handled by the Go http package, so of course it's going to have some overhead.

There is a second test, well before this post of yours, which is a direct
translation of the Perl code, so that it is now reading and writing directly to
a socket. Hanging on to the first test method and not referring to the
second test is not a very constructive way to hold a discussion, let alone
using words like "ridiculously ...".

> In its most generous form, you translate the programs faithfully to do the same syscalls and implement the same logic - and at that point, you are essentially benchmarking the Perl regular expression engine against the Go regular expression engine, as that's the only thing the program really is doing: Branching on a regexp-match. And the Perl RE-engine is famously optimized and the Go RE-engine famously is not. In fact, you aren't even benchmarking Perl against Go, you are benchmarking *C* against Go, as Perls RE-engine is written in C, AFAIK.

Over 90% of the code is NOT doing regular expression matching.
Focusing *only* on the regular expression, and not on the >90% of the rest, is not
very convincing and misses the elephant in the room, at least it seems to
me. Many people have given valid input on where things might be
improved, including the one you are quoting.

> Lastly, as I read the results I get with either net/http *or* the naive regexp-parsing, I just… disagree with your conclusions. All the numbers I've seen seem to imply that Go responded *on average* a lot faster and *might* have higher variance and a slightly higher tail latency (though, FTR, the measurements I got also suggest that the data is mostly noise). And I'm struggling to find an application, where that would matter a lot. Like, yeah, tail latency matters, but so does average latency. In basically all applications I can think of, you a) want tail latencies to not be catastrophic, especially when considering fan-out, but b) lower averages are just as important anyway. So, I think you should, for a decent test, formulate your criteria beforehand. Currently, there seems to be a lot of reading in tea-leafs involved.

Please don't get personal and emotional over technical discussions,
and please refrain from using terms like "ridiculously simple" or
"reading in tea-leafs" in future discussions. It is inappropriately
condescending, and thus not the right tone for communication. It
does not align with the code of conduct of this mailing list, nor with
Google's, I believe.

If you have test results that contradict mine, please show me
yours -- let's talk about the data, not let emotion get in the way
of technical discussion, and face the facts, whatever they are.

> Anyway, all of that being said: While I don't think the test you devised allows the broad generalizations you are making, ultimately I don't have any stakes in this (which is why I haven't responded further on the blog post). If you like Perl and think it performs good enough or better for your use case - then go ahead and use Perl. No one here will begrudge you for it.

As I've commented in the blog many times, I'm not trying to prove Perl
performs better than Go. On the contrary, I was trying to improve on Perl's
performance with Go; that's why the whole thing got started, as I have
replied to your comment in the blog: "that's why I was rewriting the
Perl code to Go".

It is still the case, and I want to try out everyone's suggestion to
see how things can improve.

Moreover, if you have taken a look at my second test, you will
understand my other goal is to make it clear "who and what to trust".
Because if you had taken a look at the httperf test results I posted
to this mailing list before your reply, you may have realized that, of all the
performance testing tools I've used so far, all suggest Perl
performs better than Go, except for httperf. So maybe the performance
testing tools are biased toward Go, and I made it clear in my second
blog that I want to get to the bottom of it.

> All of that being said
>
> IMO this is just not a meaningful test. Or its results are totally unsurprising.

Again, I was trying to improve on Perl's performance with Go. That
"ridiculously simple" Perl code is the foundation of the Debian dbab
package, and I was trying to improve it.

It might be meaningless to you but it is perfectly meaningful to me.
Please don't be so judgmental.


Marcin Romaszewicz

unread,
Jun 10, 2019, 7:07:42 PM6/10/19
to Tong Sun, golang-nuts, Damian Gryski
I think the others were correct in pointing the finger at the RegEx engine in Go. It is quite slow. I hacked your inner loop, which checks the request, so it doesn't use regular expressions, and it's tons faster. You can't say that something can't be responsible for that much slowdown because it is "1 line", since that one line has a lot of weight behind it. Using regular expressions, a benchmark showed 1782 requests per second. Using my simple hack, it's 18827 per second. Let's call it 10x faster.

if strings.Contains(line, "HTTP") {
    // Split the request line ("GET / HTTP/1.1") on spaces instead of
    // matching it with a regular expression.
    parts := strings.Split(line, " ")
    req.Method = strings.ToUpper(strings.TrimSpace(parts[0]))
    req.URL = strings.ToUpper(strings.TrimSpace(parts[1]))
    req.Version = strings.ToUpper(strings.TrimSpace(parts[2]))
    continue
}

For this benchmark it behaves correctly, I realize this is fragile. I ran `ab -n 10000 -c 100...` to run the tests.

Benchmarks using regular expressions:
Concurrency Level:      100
Time taken for tests:   5.610 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1790000 bytes
HTML transferred:       430000 bytes
Requests per second:    1782.41 [#/sec] (mean)
Time per request:       56.104 [ms] (mean)
Time per request:       0.561 [ms] (mean, across all concurrent requests)
Transfer rate:          311.57 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   46 352.7      8    3557
Processing:     0   10  10.6      8     158
Waiting:        0   10  10.6      8     158
Total:          0   56 352.5     21    3565

Percentage of the requests served within a certain time (ms)
  50%     21
  66%     24
  75%     24
  80%     24
  90%     25
  95%     26
  98%    164
  99%   3549
 100%   3565 (longest request)


Benchmarks using my hack:

Concurrency Level:      100
Time taken for tests:   0.531 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1790000 bytes
HTML transferred:       430000 bytes
Requests per second:    18827.39 [#/sec] (mean)
Time per request:       5.311 [ms] (mean)
Time per request:       0.053 [ms] (mean, across all concurrent requests)
Transfer rate:          3291.12 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   0.3      3       4
Processing:     1    3   0.4      3       5
Waiting:        0    3   0.4      3       5
Total:          3    5   0.5      5       8

Percentage of the requests served within a certain time (ms)
  50%      5
  66%      5
  75%      5
  80%      5
  90%      6
  95%      6
  98%      7
  99%      7
 100%      8 (longest request)



Marcin Romaszewicz

unread,
Jun 10, 2019, 7:12:56 PM6/10/19
to Tong Sun, golang-nuts, Damian Gryski
One more followup.

Here's an example using an HTTP router named Echo, which I use in production. With proper HTTP parsing and validation, and no regular expressions involved in routing, it still does about 14,000 requests per second. I stubbed some of your stuff which doesn't affect the result. This is a much better implementation than speaking HTTP yourself over sockets and it's stupendously fast.

Concurrency Level:      100
Time taken for tests:   0.717 seconds

Complete requests:      10000
Failed requests:        0
Total transferred:      2160000 bytes
HTML transferred:       430000 bytes
Requests per second:    13940.29 [#/sec] (mean)
Time per request:       7.173 [ms] (mean)
Time per request:       0.072 [ms] (mean, across all concurrent requests)
Transfer rate:          2940.53 [Kbytes/sec] received


Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   1.2      3       8
Processing:     1    4   1.2      4       9
Waiting:        0    3   1.2      3       8
Total:          2    7   1.3      7      13


Percentage of the requests served within a certain time (ms)
  50%      7
  66%      7
  75%      8
  80%      8
  90%      9
  95%     10
  98%     11
  99%     11
 100%     13 (longest request)


package main

import (
    "flag"
    "fmt"
    "net/http"
    "strconv"

    "github.com/labstack/echo"
)

const pixel = "\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3B"

func main() {
    var port = flag.Int("port", 8080, "service port")
    flag.Parse()

    autoProxy := fmt.Sprintf(
        "function FindProxyForURL(url, host) { return \"PROXY %s:3128; DIRECT\"; }",
        "stubbed.host")
    autoProxyBuf := []byte(autoProxy)

    e := echo.New()

    serveAutoProxy := func(c echo.Context) error {
        response := c.Response()
        response.Header().Add("Connection", "close")
        return c.Blob(http.StatusOK, "application/octet-stream", autoProxyBuf)
    }
    e.GET("/proxy.pac", serveAutoProxy)
    e.GET("/wpad.dat", serveAutoProxy)

    pixelBuf := []byte(pixel)
    servePixel := func(c echo.Context) error {
        response := c.Response()
        response.Header().Add("Connection", "close")
        response.Header().Add("ETag", "dbab")
        response.Header().Add("Cache-Control", "public, max-age=31536000")
        response.Header().Add("Content-length", strconv.Itoa(len(pixelBuf)))
        return c.Blob(http.StatusOK, "image/gif", pixelBuf)
    }
    e.GET("*", servePixel)

    // Start server
    e.Logger.Fatal(e.Start(fmt.Sprintf("0.0.0.0:%d", *port)))
}


Marcin Romaszewicz

unread,
Jun 10, 2019, 7:46:36 PM6/10/19
to Tong Sun, golang-nuts, Damian Gryski
In my example in the previous email, I accidentally used a very old version of echo. If you use the latest ("github.com/labstack/echo/v4"), then it's a lot faster than using simple string splitting, resulting in about 22,000 requests per second.

Concurrency Level:      100
Time taken for tests:   0.443 seconds

Complete requests:      10000
Failed requests:        0
Total transferred:      2160000 bytes
HTML transferred:       430000 bytes
Requests per second:    22565.77 [#/sec] (mean)
Time per request:       4.431 [ms] (mean)
Time per request:       0.044 [ms] (mean, across all concurrent requests)
Transfer rate:          4759.97 [Kbytes/sec] received


Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    2   0.5      2       9
Processing:     1    2   0.6      2       9
Waiting:        0    2   0.6      2       9
Total:          3    4   0.8      4      11


Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      5
  80%      5
  90%      5
  95%      5
  98%      6
  99%      6
 100%     11 (longest request)

Axel Wagner

unread,
Jun 10, 2019, 7:50:38 PM6/10/19
to Tong Sun, golang-nuts, Damian Gryski
On Mon, Jun 10, 2019 at 11:28 PM Tong Sun <sunto...@gmail.com> wrote:
There is a second test well before your this post, which is a direct
translation of Perl code, that is now reading and writing directly to
a socket. Hanging on to the first test method and not referring to the
second test is not a very constructive way of discussion, let alone
using words like "ridiculously ...".

The updated test only changes it so that *both* programs don't implement a web server. My point was that this is not a realistic test case.
 
Over 90% of the code are NOT doing regular expression matching.
Focusing *only* on regular expression, not >90% of the rest, is not
very convincing but miss the elephant in the room, at least seems to
me.

The amount of SLOC is not important, however, but where time is spent. When adding some basic CPU profiling to the Go code, it spent ~10% of its time RE-matching, ~10% in various allocation-related activities, ~75% blocking in syscalls and the rest in other various network related runtime code. That's pretty much what I meant - in terms of actual CPU time used, RE-matching (and some allocation) is pretty much all that's happening. Now, I don't know how to profile Perl code and I'll admit that the netpoller etc. are part of the language and as such matter for the comparison. But the more direct the translation becomes, the less these code-paths will differ in performance and the more realistic your scenario gets (in terms of concurrency used and required), the more I'd expect Go to be favored by it.
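(For reference, a standard way to get that kind of CPU breakdown is runtime/pprof; a minimal sketch, with runServer as a hypothetical stand-in for the benchmarked server loop:)

package main

import (
    "log"
    "os"
    "runtime/pprof"
)

// runServer stands in for the socket-handling loop being benchmarked.
func runServer() {}

func main() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // Profile the whole run; inspect afterwards with: go tool pprof cpu.prof
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()

    runServer()
}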

Please don't get personal and emotional over technical discussions,
and please refrain from using terms like "ridiculously simple" or
"reading in tea-leafs" in future discussions. It is inappropriately
condescending, thus not the correct attitude of communication.  It
does not align with the code of conduct of this mlist, and neither of
google's, I believe.

If you have test results that contradict with mine, please show me
yours -- Let's talk the data, and not let the emotion get in the way
of technical discussion, and fact the fact, whatever it is.

I am not contradicting your data, I'm contradicting your conclusions. That's why I said "reading tea-leafs": You seem to look at the same data and come to conclusions that just don't connect for me. For example, your original post says that the mean response time for your Perl code is 110.875ms resp. 0.222ms and for your Go code it's 83.229ms resp. 0.166ms - and yet you claim that your test shows that the mean response time is 8x higher for your Go code than your Perl code.

And in your second post, you use `hey`, but then just… throw the results out, because you don't understand them (AIUI).

I just don't know how I'm supposed to handle that discrepancy. All of this just looks like really bad science to me. ISTM that defining the criteria beforehand and designing the test to measure exactly those criteria is the most reasonable way to reconcile that. If we use "mean response time over X", it seems that so far your Go code is coming out ahead. And note that *I* didn't make any claims about the relative strengths of programming languages here. IMO the onus is clearly on you to show your work when making these claims. I don't actually care. 

Tong Sun

unread,
Jun 10, 2019, 8:28:32 PM6/10/19
to Axel Wagner, golang-nuts, Damian Gryski
I think you should at least try to apologize for your rude behavior,
but again you are ignoring all the others' valid reasoning and concluding
that you make sense while others don't.

That's a very good point to end the discussion.
I'll not be responding to your trolls.

Tong Sun

unread,
Jun 10, 2019, 8:35:02 PM6/10/19
to Marcin Romaszewicz, golang-nuts
Thanks a lot for all your help Marcin! Your expertise makes a total
difference here!

Double-thumbs up!

Amnon Baron Cohen

unread,
Jun 11, 2019, 7:44:37 AM6/11/19
to golang-nuts
For those interested, you can find some web benchmarks here https://www.techempower.com/benchmarks/

For the trivial http ping-type server, you would expect latency to be dominated by system time.
strace'ing your code will give you an idea of what it is doing. Go allows you to have multiple goroutines
serving responses. This does carry a certain overhead, as any network reads will have to go through
netpoll to demux and schedule goroutines when the packets they are waiting for arrive.
So we should not be too dismayed if Go code shows higher latencies than a single-threaded script.

Tong Sun

unread,
Jun 11, 2019, 9:25:06 AM6/11/19
to golang-nuts
Thanks Amnon for the insight, which confirmed my guess when
answering the following question about ab in my blog:

> I'm unclear how to interpret the results from ab. From the top chunk of the results (time taken for tests, requests per second, time per request, etc.), it would appear that Go is about 33% faster than Perl. But the bit at the bottom about "percentage of the requests served..." seems wildly incongruent and makes it look like Perl is orders of magnitude faster. It doesn't add up. It also doesn't logically add up for Perl to be faster than Go (which, I get, is your point). It makes me suspicious of ab's results.

Tong Sun

unread,
Jun 12, 2019, 11:27:03 PM6/12/19
to golang-nuts
On Mon, Jun 10, 2019 at 5:27 PM Tong Sun wrote:

> > IMO this is just not a meaningful test. Or its results are totally unsurprising.
>
> Again, I was trying to improve Perl performs with Go. That
> "ridiculously simple" Perl code is the foundation of the Debian dbab
> package, and I was trying to improve it.
>
> It might be meaningless to you but it is perfectly meaningful to me.

For those who care ... long story short, I finally made it, with help
I got from various sources -- I finally made Go faster than Perl!
https://dev.to/suntong/simple-web-server-in-perl-and-go-finally-1kgh

Tong Sun

unread,
Jun 13, 2019, 12:05:27 AM6/13/19
to Justin Israel, golang-nuts, Ronny Bangsund
Yep, thanks Justin, both your points really made the difference,
because I do believe that the buffered socket write is the key
component of the 3x improvement that I'm getting (from using
FastHTTP), IMHO. And FastHTTP recommends GOMAXPROCS=1 too, which I
used this time as well.

So thanks again, Justin & Ronny!

Justin Israel

unread,
Jun 13, 2019, 12:10:09 AM6/13/19
to Tong Sun, golang-nuts, Ronny Bangsund
On Thu, Jun 13, 2019 at 4:05 PM Tong Sun <sunto...@gmail.com> wrote:
Yep, thanks Justin, both your points really made the difference,
because I do believe that the buffered socket write is the key
component for the 3-time improvement that I'm getting (from using
FastHTTP), IMHO. And FastHTTP recommends GOMAXPROCS=1 too, which I
used this time as well.

So thanks again, Justin & Ronny!

Cool. Glad that actually helped. I do feel, with your results and in agreement with others, that you really are comparing Go to the underlying C implementation of Perl, with everything stripped back so far. But at least you are happy with your results!

Tong Sun

unread,
Jun 13, 2019, 12:43:23 AM6/13/19
to Justin Israel, golang-nuts
Yeah, no doubt, I totally agree with that as well.

But now, I can proudly say ...

My Go beats C :-)

Tong Sun

unread,
Jun 13, 2019, 10:47:45 AM6/13/19
to Damian Gryski, golang-nuts
Hi Damian & all. 

I think I need to set the record straight, as most of the technical discussion has taken place elsewhere, so I feel that I was treated unfairly because of that. If you only see one side of the story, then it is very easy to do that; I get it and understand it, however.

So first of all, is it that I "like Perl and think it performs good enough or better for your use"? No. As I have expressed again and again, I don't think Go should lag so far behind Perl. How bad? See for yourself:

[inline image: response-time comparison chart]

There have been some discrepancies and discussions about Apache Bench's statistics, but my conclusion was that I trust the measured & reported response times more than the statistical numbers.

This is the fundamental point of the discussion. And as I have expressed again and again, I want to improve my Go code, because I don't think that Go taking 8 times longer than (that "ridiculously simple") Perl on average means "its results are totally unsurprising" and should be accepted as an unchangeable fact.

So if you had thought I should be the one who apologizes first, whether expressed on or off list, please read on...

On Mon, Jun 10, 2019 at 7:50 PM Axel Wagner wrote:

On Mon, Jun 10, 2019 at 11:28 PM Tong Sun wrote:
There is a second test well before your this post, which is a direct
translation of Perl code, that is now reading and writing directly to
a socket. Hanging on to the first test method and not referring to the
second test is not a very constructive way of discussion, let alone
using words like "ridiculously ...".

The updated test only changes *both* not to implement a web server. My point was, that this is not a realistic test case.
 
Over 90% of the code are NOT doing regular expression matching.
Focusing *only* on regular expression, not >90% of the rest, is not
very convincing but miss the elephant in the room, at least seems to
me.

The amount of SLOC is not important, however, but where time is spent. When adding some basic CPU profiling to the Go code, it spent ~10% of its time RE-matching, ~10% in various allocation-related activities, ~75% blocking in syscalls and the rest in other various network related runtime code. That's pretty much what I meant - in terms of actual CPU time used, RE-matching (and some allocation) is pretty much all that's happening. Now, I don't know how to profile Perl code and I'll admit that the netpoller etc. are part of the language and as such matter for the comparison. But the more direct the translation becomes, the less these code-paths will differ in performance and the more realistic your scenario gets (in terms of concurrency used and required), the more I'd expect Go to be favored by it.

Please don't get personal and emotional over technical discussions,
and please refrain from using terms like "ridiculously simple" or
"reading in tea-leafs" in future discussions. It is inappropriately
condescending, thus not the correct attitude of communication.  It
does not align with the code of conduct of this mlist, and neither of
google's, I believe.

If you have test results that contradict with mine, please show me
yours -- Let's talk the data, and not let the emotion get in the way
of technical discussion, and fact the fact, whatever it is.
 
There is a reason I'm saying the above. Here I quote, word for word, from Axel Wagner:

In any case, in this direct translation, I would've actually assumed perl to be faster - because literally all we do is a regexp-match and the perl RE-engine is famously optimized and the Go RE-engine is famously meh. That perl still ends up with significantly higher average response times is… honestly, a confirmation that it's not a very fast language.
 
BTW, I also used a different tool to benchmark (github.com/rakyll/hey). AIUI, it uses multiple cores more efficiently. With that tool, the perl implementation gets absolutely demolished, with an average response time ~40x higher than the Go implementation on my machine.

I tried out his implementation eagerly (after thanking him publicly, of course), but my results could not replicate his findings/claims. So I asked him a follow-up question in the blog. He never replied to my inquiry, and now this is the only reply I get to my simple technical question, "As you can see, the test result from my machine is much different from yours. Would you blog how you tested and your results please?"

I am not contradicting your data, I'm contradicting your conclusions.

And now he has changed his tune. I never accused him of "reading tea-leafs" for coming up with the ungrounded conclusion of an "average response time ~40x higher" than Go, but he started to blame me for "reading tea-leafs", unprovoked:
 
That's why I said "reading tea-leafs": You seem to look at the same data and come to conclusions that just don't connect for me. For example, your original post says that the mean response time for your Perl code is 110.875ms resp. 0.222ms and for your Go code it's 83.229ms resp. 0.166ms - and yet you claim that your test shows that the mean response time is 8x higher for your Go code than your Perl code.

And in your second post, you use `hey`, but then just… throw the results out, because you don't understand them (AIUI).

Is it really so? Here is my full reasoning (on the hey part only, taken from here):

...when I thought more about it, it doesn't make sense to me any more.

The hey's default settings are:

Usage: hey [options...] <url>

Options:
  -n  Number of requests to run. Default is 200.
  -c  Number of requests to run concurrently. Total number of requests cannot
      be smaller than the concurrency level. Default is 50.

As we can see in the previous result, the Perl server survives the 500 concurrent requests gracefully, with a 95th-percentile reading of only 7ms. However, hey is telling a different story: that with only 200 concurrent requests, the Perl server's response time is mostly 0.105 seconds.

That is 15 times ab's reading.

Hey's inability to get accurate readings at the ms level makes me doubt its own accuracy, and I have to suspect that hey's 15-times-higher response time reading is due to its own slack.

Moreover, if you take a closer look at hey's latency distribution readings, we can see that Perl at 75% is still 0.0181 secs, while Go at 75% is already 0.0222 secs. Let alone at 10%, where Perl is 0.0045 secs while Go is 0.0150 secs (>3 times more). From this we can see that hey's readings contradict themselves, and that Perl clearly performs much better than Go according to hey's latency distribution readings.

Furthermore, those 13 long 1.043 sec response times from the Perl server are only observed in hey; all the other performance testing software, including the not-documented-here httperf, has exhibited no such behavior. So it is fair to conclude that the long 1.043 sec response times are caused by hey itself, not by the Perl server.

So, up to now, for all three reasons above, I have to conclude that hey is not suitable for this test due to its own limitations, and thus cannot stack up against the most popular Apache Bench.

Did I just "throw the results out"? I don't personally think so. 
And my whole reasoning and discussion were rejected simply with a claim: "throw the results out because you don't understand them (AIUI)."

I just don't know how I'm supposed to handle that discrepancy. All of this just looks like really bad science to me. ISTM that defining the criteria beforehand and designing the test to measure exactly those criteria is the most reasonable way to reconcile that. If we use "mean response time over X", it seems that so far your Go code is coming out ahead. And note that *I* didn't make any claims about the relative strengths of programming languages here. IMO the onus is clearly on you to show your work when making these claims. I don't actually care. 

> Anyway, all of that being said: While I don't think the test you devised allows the broad generalizations you are making, ultimately I don't have any stakes in this (which is why I haven't responded further on the blog post). If you like Perl and think it performs good enough or better for your use case - then go ahead and use Perl. No one here will begrudge you for it.

As I've comment in the blog many times, I'm not trying to prove Perl
performs better than Go. On the contrary, I was trying to improve Perl
performs with Go, that's why the whole thing get started, as I have
replied to your comment in the blog:"that's why I was rewriting the
Perl code to Go".

It is still the case, and I want to try out everyone's suggestion to
see how things can improve.

Moreover, if you have taken a look at my second test, you will
understand my other goal is to make it clear "who and what to trust".
Because if you have taken a look at the httperf test result I posted
to this mlist before you reply, you may have realized that, of all the
performance testing tools I've used so far, all are suggesting Perl
performs better than Go, except for httperf. So maybe the performance
testing tools are biased toward Go, and I made it clear in my second
blog that I want to get to the bottom of it.

> All of that being said
>
> IMO this is just not a meaningful test. Or its results are totally unsurprising.

Again, I was trying to improve Perl performs with Go. That
"ridiculously simple" Perl code is the foundation of the Debian dbab
package, and I was trying to improve it.

It might be meaningless to you but it is perfectly meaningful to me.
Please don't be so judgmental.

As I have promised, I will try out everyone's suggestions to see how things can improve, and I did, and I finally made my Go faster than my Perl.

I still think somebody owes me a public apology, but I don't have any hope for that, and I don't care either.

Jan Mercl

unread,
Jun 13, 2019, 10:59:33 AM6/13/19
to Tong Sun, Damian Gryski, golang-nuts
On Thu, Jun 13, 2019 at 4:47 PM Tong Sun <sunto...@gmail.com> wrote:

If you want or need to discuss personal feelings, please do it off-list.

Thank you.

Axel Wagner

unread,
Jun 14, 2019, 5:37:05 AM6/14/19
to Tong Sun, golang-nuts
I just want to re-emphasize one specific thing: Neither I, nor anyone else, actually owes you any of their time. I don't know you and I don't feel like it is a good investment of my time to convince strangers on the internet. So yes. You asked me to blog about things. I won't and I also won't spend a lot of time and effort trying to dissect your reasoning. I don't feel like that is something I need to apologize for. My choice of wording could have been better, probably, but I still don't feel you have actually answered the question of why the statistical results are not conclusive.

I can only tell you and others why I find your results unconvincing and ultimately meaningless. And you can decide whether you're happy with being unconvincing to me or not (I would've assumed you are fine with that).


Axel Wagner

unread,
Jun 14, 2019, 6:44:18 AM6/14/19
to Tong Sun, golang-nuts
So, I *did* invest a little more time, and ISTM that the data you're getting are mostly a result of Go handling requests in parallel and perl not doing that.
The statistics on top of the ab-output average over all requests by total time taken. Whereas the data you dumped and plotted contains the timings for individual requests. So, individually, the Go requests are slower, but because Go handles a lot more requests in parallel than perl, the *total* request rate is faster.
If you run your Go code with GOMAXPROCS=1, the individual requests get handled a lot faster too - most likely, because they have a larger share of the CPU available. They are still individually slower though - but they are *also* in total still faster, so they are still handling more requests in parallel. This makes some sense, as GOMAXPROCS only limits the number of threads running Go code, not the threads spawned to block on syscalls, so you still have more total threads blocked reading requests (and also, I'd assume that there are still kernel threads for each parallel request using CPU).
Now, even with my naive, single goroutine Go code, you *still* have that effect (the individual requests are slower, but *overall* the benchmark is quicker, meaning more requests are handled in parallel).

Anyway. I still strongly feel, that you need to define what you want to measure carefully.
You are setting high values for concurrent connects, but then effectively penalize Go for actually handling them concurrently (which has to share resources and thus naturally creates slower requests). Even though the Go code has much higher throughput (as given by the top statistics), you are interpreting it to be worse (to illustrate: Say Go would handle each request exactly as quickly as Perl, but would actually handle two requests concurrently. You would claim it's exactly as fast as Perl, even though it actually is twice as fast).

If you take throughput as the primary performance measure, Go dominates. But if you take *latency*, you need to actually take care to prevent Go from concurrently handling requests to prevent cross-contamination to make it an apples-to-apples comparison. The simplest way to do that would be to just set -c 1. When I do that and run with GOMAXPROCS=1, I get pretty much the same latency statistics (which means, essentially 0) for both my Go code and your Perl code (with the Go code still having higher throughput).

Anyway. I understand the discrepancy between the statistics better now. I still feel (or rather: I feel even more), that your test is meaningless, however. It seems clear to me, that with such a tiny request handler, the noise introduced even by the methodology of ab itself is causing more effects than the actual language.

Tong Sun

unread,
Jun 14, 2019, 10:43:50 AM6/14/19
to Axel Wagner, golang-nuts
Thanks for conducting the technical discussion in a normal way, and for the detailed explanation of the statistics on top of the ab output. That concurs with Amnon's insight, and also confirms my guess when answering the question about ab in my blog, which I quoted in an earlier message.

Amnon Baron Cohen

unread,
Jun 15, 2019, 4:40:10 AM6/15/19
to golang-nuts
I would add that engineering is the art of tradeoffs.

The language authors decided to route all network scheduling
through netpoll, and launch a new goroutine for every incoming http 
request. These decisions carry a significant overhead. But they mean 
that users of the language can effortlessly write production ready web apps which magically
scale to utilise all the cores at our disposal, handle concurrent
and long-running requests with grace, handle orders of magnitude more
requests than python or ruby, and which do not require us to become
experts in tuning Apache/Nginx configs.

So the authors have decided to favour making our lives easier over 
trying to get absolutely optimal performance in benchmarks.
Personally I appreciate their choices and enjoy using Go.
Those who have other priorities are more than welcome to write their web apps in
Perl, Rust, C++, assembler etc...
Go was never meant to be all things to all people.