GoProxy unable to handle more than 500 TPS (Transactions Per Second)

subrata das

Nov 25, 2016, 6:02:56 AM
to goproxy-dev

I have a system with 32 cores and a 10 Gb network card.
I'm using the Spirent tool to test the HTTP CONNECT feature of goproxy.
The Spirent tool shows that goproxy is unable to handle more than 500 TPS.

It throws the errors below:
==================================
2016/11/25 12:42:52 [113] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51055: write tcp 127.0.0.1:8081->127.0.0.1:51055: write: broken pipe
2016/11/25 12:42:52 [115] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51067: write tcp 127.0.0.1:8081->127.0.0.1:51067: write: broken pipe
2016/11/25 12:42:52 [118] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51100: write tcp 127.0.0.1:8081->127.0.0.1:51100: write: broken pipe
2016/11/25 12:42:52 [117] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51088: read tcp 173.25.1.1:7381->173.25.1.2:443: read: connection reset by peer
2016/11/25 12:42:52 [116] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51080: write tcp 127.0.0.1:8081->127.0.0.1:51080: write: broken pipe
2016/11/25 12:42:52 [122] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51137: read tcp 173.25.1.1:7430->173.25.1.2:443: read: connection reset by peer
2016/11/25 12:42:52 [120] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51114: read tcp 173.25.1.1:7407->173.25.1.2:443: read: connection reset by peer
2016/11/25 12:42:52 [119] WARN: Error copying to client: read tcp 127.0.0.1:8081->127.0.0.1:51110: read tcp 173.25.1.1:7403->173.25.1.2:443: read: connection reset by peer
==================================
These errors look like the client is forcibly closing the connection because of a delay introduced by the proxy.
I checked CPU and memory usage at the time:
CPU = 10% and MEM = 8%

I ran goproxy with both runtime.GOMAXPROCS(1) and runtime.GOMAXPROCS(runtime.NumCPU()), but found no improvement.
I also ran goproxy with verbose logging disabled (-Verbose=false), but again found no improvement.
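For reference, a minimal standalone sketch of the two GOMAXPROCS settings I tried (note that runtime.NumCPU() is a function call; the values below are only for illustration, not our full program):
=============================
package main

import (
        "fmt"
        "runtime"
)

func main() {
        // Variant 1: restrict the Go scheduler to a single logical CPU.
        runtime.GOMAXPROCS(1)

        // Variant 2: use all available cores (already the default since Go 1.5).
        runtime.GOMAXPROCS(runtime.NumCPU())

        // GOMAXPROCS(0) reports the current setting without changing it.
        fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
=============================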

Here is the code which uses goproxy ---
=============================
package main

import (
        "flag"
        "log"
        "net/http"

        "github.com/elazarl/goproxy" // vendored in our tree as thirdparty/goproxy
)

func main() {
        verbose := flag.Bool("v", false, "should every proxy request be logged to stdout")
        addr := flag.String("addr", ":80", "proxy listen address")
        flag.Parse()
        // runtime.GOMAXPROCS(1)
        proxy := goproxy.NewProxyHttpServer()
        proxy.Verbose = *verbose
        log.Fatal(http.ListenAndServe(*addr, proxy))
}
=============================

Without goproxy in the path, direct client-server interaction goes beyond 5000 TPS.
With HAProxy between the client and server instead of goproxy, it goes beyond 1300 TPS.

I then ran 4 instances of goproxy load-balanced behind HAProxy; even with that configuration I got only about 600 TPS.

We are running goproxy on Ubuntu 14.04.
For TCP tuning on Linux we set the following kernel parameters:
========================================

kernel.core_pattern = /var/cores/core.%e.%p.%h.%t
net.core.rmem_default = 131072
net.core.rmem_max = 262144
net.core.wmem_max = 2097152
net.ipv4.tcp_rmem = 4096 131072 262144
net.ipv4.tcp_wmem = 10240 524288 2097152
net.ipv4.tcp_congestion_control = cubic
net.core.somaxconn = 65535
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_max_syn_backlog = 3240000
net.ipv4.tcp_window_scaling = 1
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_orphan_retries = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_orphans = 400000
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_low_latency = 1
net.core.netdev_max_backlog = 250000
net.core.wmem_default = 524288
net.core.optmem_max = 16777216
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.default.use_tempaddr = 0
fs.mqueue.msg_max = 8192
kernel.panic_on_oops = 1
kernel.panic = 1

# Cause OOM conditions to panic
vm.panic_on_oom = 1

#net.ipv6.conf.all.disable_ipv6=1
#net.ipv6.conf.default.disable_ipv6=1
#net.ipv6.conf.lo.disable_ipv6=1
#fs.file-max=1000000

# avoid kernel panics - other side effects to be determined
net.ipv4.tcp_retrans_collapse = 0

# BUG 2009 - HTTP IPv6 Traffic dies when it reached Maximum Route
net.ipv6.route.max_size = 8388608
net.ipv6.route.gc_thresh = 8388608
net.ipv6.route.gc_elasticity = 11

# 1.1.2
net.ipv4.tcp_retries2 = 8
net.ipv4.tcp_no_metrics_save = 1

# EndCustomSettingsForWASP

#conntrack max value
net.ipv4.netfilter.ip_conntrack_max = 10000000

fs.file-max = 10000000
fs.nr_open = 10000000
============================================================

We set the ulimit to 10000000


Here are some performance numbers for goproxy, HAProxy, and TinyProxy on the same machine:
===============================================================
GoProxy = 450 TPS
TinyProxy = 500 TPS
HaProxy = 1300 TPS


Can you suggest how to improve the performance of goproxy? Do we need to set any Linux parameters to handle more connections?



Thanks & Regards,
Subrata Das




Elazar Leibovich

Nov 27, 2016, 6:08:17 AM
to subrata das, goproxy-dev
Hi,
Did you get this result with verbose off?
Can you share a profile of goproxy during the run, e.g., https://blog.golang.org/profiling-go-programs?
What's the bottleneck on the system during the time the benchmark runs?
Is network utilized?
Is CPU utilized?
The system is not swapping, right? Just making sure.

I'll try to have a look and see if I can spot the bottleneck.
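If it helps, a minimal sketch of exposing a profiling endpoint in the proxy process (this assumes the standard net/http/pprof package; port 6060 is just an example):
=============================
package main

import (
        "flag"
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux

        "github.com/elazarl/goproxy"
)

func main() {
        addr := flag.String("addr", ":8081", "proxy listen address")
        flag.Parse()

        // Serve the pprof endpoints on a separate port, outside the proxy itself.
        go func() {
                log.Println(http.ListenAndServe("localhost:6060", nil))
        }()

        proxy := goproxy.NewProxyHttpServer()
        log.Fatal(http.ListenAndServe(*addr, proxy))
}
=============================
While the benchmark is running, a CPU profile can then be fetched with something like: go tool pprof http://localhost:6060/debug/pprof/profile (per the net/http/pprof package docs).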

Thanks,


subrata das

Nov 29, 2016, 12:04:37 PM
to Elazar Leibovich, goproxy-dev
Hi Elazar,

Thank you very much for taking an interest in my mail.
Here are the answers to your questions:


Did you get this result with verbose off?
=== Yes. Verbose is off


Can you share a profile of goproxy during the run, e.g., https://blog.golang.org/profiling-go-programs?

=== I did the profiling, and it shows that the io.Copy function is where the time goes.
Here is the profiler output:

(pprof) top15 -cum
2.31s of 12.04s total (19.19%)
Dropped 209 nodes (cum <= 0.06s)
Showing top 15 nodes out of 206 (cum >= 1.37s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%      6.98s 57.97%  runtime.goexit
         0     0%     0%      3.62s 30.07%  runtime.mcall
     0.04s  0.33%  0.33%      3.42s 28.41%  runtime.schedule
     0.58s  4.82%  5.15%         3s 24.92%  runtime.findrunnable
         0     0%  5.15%      2.68s 22.26%  thirdparty/goproxy.copyAndClose
     0.05s  0.42%  5.56%      2.41s 20.02%  runtime.systemstack
     0.01s 0.083%  5.65%      2.20s 18.27%  runtime.park_m
     0.01s 0.083%  5.73%      2.11s 17.52%  net/http.(*conn).serve
         0     0%  5.73%      1.77s 14.70%  io.Copy
         0     0%  5.73%      1.77s 14.70%  io.copyBuffer
     0.01s 0.083%  5.81%      1.73s 14.37%  net.(*TCPConn).ReadFrom

         0     0%  5.81%      1.69s 14.04%  net.genericReadFrom
     0.23s  1.91%  7.72%      1.58s 13.12%  runtime.mallocgc
     1.38s 11.46% 19.19%      1.45s 12.04%  syscall.Syscall
         0     0% 19.19%      1.37s 11.38%  net/http.serverHandler.ServeHTTP
         0     0% 19.19%      1.37s 11.38%  thirdparty/goproxy.(*ProxyHttpServer).ServeHTTP
     0.02s  0.17% 19.44%      1.30s 10.80%  thirdparty/goproxy.(*ProxyHttpServer).handleHttps
         0     0% 29.40%      0.91s  7.56%  net.(*Dialer).Dial
         0     0% 29.40%      0.91s  7.56%  net.Dial
         0     0% 29.40%      0.91s  7.56%  thirdparty/goproxy.(*ProxyHttpServer).connectDial
         0     0% 29.40%      0.91s  7.56%  thirdparty/goproxy.(*ProxyHttpServer).dial
(pprof) list copyAndClose
Total: 12.04s
ROUTINE ======================== thirdparty/goproxy.copyAndClose in /home/visp-verizon/repos/new/VISP/goobee/src/thirdparty/goproxy/https.go
         0      2.68s (flat, cum) 22.26% of Total
         .          .    273:    }
         .          .    274:}
         .          .    275:
         .          .    276:func copyAndClose(ctx *ProxyCtx, w, r net.Conn) {
         .          .    277:    connOk := true
         .      1.79s    278:    if _, err := io.Copy(w, r); err != nil {
         .          .    279:        connOk = false
         .      110ms    280:        ctx.Warnf("Error copying to client: %s", err)
         .          .    281:    }
         .      780ms    282:    if err := r.Close(); err != nil && connOk {
         .          .    283:        ctx.Warnf("Error closing: %s", err)
         .          .    284:    }
         .          .    285:}
         .          .    286:
         .          .    287:func dialerFromEnv(proxy *ProxyHttpServer) func(network, addr string) (net.Conn, error) {

(pprof) list connectDial
Total: 12.04s
ROUTINE ======================== thirdparty/goproxy.(*ProxyHttpServer).connectDial in /home/visp-verizon/repos/new/VISP/goobee/src/thirdparty/goproxy/https.go
         0      910ms (flat, cum)  7.56% of Total
         .          .     53:    return net.Dial(network, addr)
         .          .     54:}
         .          .     55:
         .          .     56:func (proxy *ProxyHttpServer) connectDial(network, addr string) (c net.Conn, err error) {
         .          .     57:    if proxy.ConnectDial == nil {
         .      910ms     58:        return proxy.dial(network, addr)
         .          .     59:    }
         .          .     60:    return proxy.ConnectDial(network, addr)
         .          .     61:}
         .          .     62:
         .          .     63:func (proxy *ProxyHttpServer) handleHttps(w http.ResponseWriter, r *http.Request) {


(pprof) list handleHttps    
Total: 12.04s
ROUTINE ======================== thirdparty/goproxy.(*ProxyHttpServer).handleHttps in /home/visp-verizon/repos/new/VISP/goobee/src/thirdparty/goproxy/https.go
      20ms      1.30s (flat, cum) 10.80% of Total
         .          .     59:    }
         .          .     60:    return proxy.ConnectDial(network, addr)
         .          .     61:}
         .          .     62:
         .          .     63:func (proxy *ProxyHttpServer) handleHttps(w http.ResponseWriter, r *http.Request) {
         .       30ms     64:    ctx := &ProxyCtx{Req: r, Session: atomic.AddUint64(&proxy.sess, 1), proxy: proxy}
         .          .     65:    if !proxy.IsNonProxyReq(r, ctx) {
         .          .     66:        Increment(proxy.St, "NumCorruptedRequests")
         .          .     67:        proxy.NonproxyHandler.ServeHTTP(w, r)
         .          .     68:        proxy.RspSentCallback(ctx)
         .          .     69:        return
         .          .     70:    }
         .          .     71:
         .          .     72:    hij, ok := w.(http.Hijacker)
         .          .     73:    if !ok {
         .          .     74:        panic("httpserver does not support hijacking")
         .          .     75:    }
         .          .     76:
         .       10ms     77:    proxyClient, _, e := hij.Hijack()
         .          .     78:    if e != nil {
         .          .     79:        panic("Cannot hijack connection " + e.Error())
         .          .     80:    }
         .          .     81:
         .       20ms     82:    ctx.proxyListenAddr = proxyClient.LocalAddr().String()
         .       20ms     83:    ctx.Logf("Running %d CONNECT handlers", len(proxy.httpsHandlers))
         .          .     84:    todo, host := OkConnect, r.URL.Host
         .          .     85:    for i, h := range proxy.httpsHandlers {
         .          .     86:        newtodo, newhost := h.HandleConnect(host, ctx)
         .          .     87:
         .          .     88:        // If found a result, break the loop immediately
         .          .     89:        if newtodo != nil {
         .          .     90:            todo, host = newtodo, newhost
         .       20ms     91:            ctx.Logf("on %dth handler: %v %s", i, todo, host)
         .          .     92:            break
         .          .     93:        }
         .          .     94:    }
         .          .     95:    switch todo.Action {
         .          .     96:    case ConnectAccept:
         .       60ms     97:        if !hasPort.MatchString(host) {
         .          .     98:            host += ":80"
         .          .     99:        }
         .      910ms    100:        targetSiteCon, err := proxy.connectDial("tcp", host)
         .          .    101:        if err != nil {
         .          .    102:            Increment(proxy.St, "NumDNSFailed")
         .          .    103:            httpError(proxyClient, ctx, err)
         .          .    104:            return
         .          .    105:        }
         .       20ms    106:        ctx.Logf("Accepting CONNECT to %s", host)
         .      160ms    107:        proxyClient.Write([]byte("HTTP/1.0 200 OK\r\n\r\n"))
         .       10ms    108:        Increment(proxy.St, "NumConnectServed")
         .          .    109:      
      10ms       30ms    117:            go copyAndClose(ctx, targetSiteCon, proxyClient)
      10ms       10ms    118:            go copyAndClose(ctx, proxyClient, targetSiteCon)
         .          .    120:    case ConnectHijack:
         .          .    121:        ctx.Logf("Hijacking CONNECT to %s", host)
         .          .    122:        proxyClient.Write([]byte("HTTP/1.0 200 OK\r\n\r\n"))
         .          .    123:        todo.Hijack(r, proxyClient, ctx)

==============================================================================

What's the bottleneck on the system during the time the benchmark runs?
== CPU and memory usage are minimal: CPU = 20% and MEM = 10%

Is network utilized?
== I think it's not properly utilizing the network.

Is CPU utilized?
== No, it's not properly utilizing the CPU either.

The system is not swapping, right? Just making sure.
== How do I check that?


From the profiler, it looks like io.Copy is where most of the time goes while copying data from the server.
The size of the data is around 200 KB.

Can you suggest how to improve the performance of io.Copy()?
Do we need to set any TCP connection related flags or options to speed up the copy?
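For example, would plugging in a custom dialer that enlarges the per-connection socket buffers be the right direction? A rough, untested sketch (proxy.ConnectDial is the hook visible in the listing above; SetReadBuffer/SetWriteBuffer are the standard *net.TCPConn methods; the 512 KB value is only an example):
=============================
package main

import (
        "log"
        "net"
        "net/http"

        "github.com/elazarl/goproxy" // vendored in our tree as thirdparty/goproxy
)

func main() {
        proxy := goproxy.NewProxyHttpServer()

        // Dial the target ourselves so the socket buffers can be enlarged
        // before copyAndClose starts shuttling data between the two connections.
        proxy.ConnectDial = func(network, addr string) (net.Conn, error) {
                conn, err := net.Dial(network, addr)
                if err != nil {
                        return nil, err
                }
                if tcp, ok := conn.(*net.TCPConn); ok {
                        tcp.SetReadBuffer(512 * 1024)  // example value, not a recommendation
                        tcp.SetWriteBuffer(512 * 1024) // example value, not a recommendation
                }
                return conn, nil
        }

        log.Fatal(http.ListenAndServe(":8081", proxy))
}
=============================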


Thanks & Regards,
Subrata







