dial io timeout and tcp short connection performance bottleneck


刘桂祥

Jun 10, 2017, 9:48:30 PM
to golang-nuts
//question
>         I wrote a simple Go TCP server
>         and benchmarked it with a client that uses short-lived TCP connections (one connection per request).
>         When the load reaches about 28K QPS, the client gets many "dial i/o timeout" errors.
>         I don't know where the performance bottleneck is.
>         
>         I modified these settings on the server:
>         cat /proc/sys/net/ipv4/tcp_max_syn_backlog    1024
>         cat /proc/sys/net/core/somaxconn              1024

//a simple golang tcp server
        
        package main

        import (
            "io"
            "log"
            "net"
        )

        func init() {
            log.SetFlags(log.LstdFlags | log.Lshortfile)
        }

        func main() {
            tcpAddr, err := net.ResolveTCPAddr("tcp", ":9999")
            if err != nil {
                log.Println(err)
                return
            }
            tcpListener, err := net.ListenTCP("tcp", tcpAddr)
            if err != nil {
                log.Println(err)
                return
            }
            defer tcpListener.Close()

            // Accept connections and serve each one in its own goroutine.
            for {
                tcpConn, err := tcpListener.AcceptTCP()
                if err != nil {
                    log.Println(err)
                    continue
                }
                go handle(tcpConn)
            }
        }

        // handle answers every read with "hello" until the client disconnects.
        func handle(conn *net.TCPConn) {
            // Close the socket when done; otherwise it may linger in
            // CLOSE_WAIT until the garbage collector finalizes it.
            defer conn.Close()
            //log.Println(conn.RemoteAddr().String())

            buf := make([]byte, 1024)
            for {
                _, err := conn.Read(buf)
                if err != nil {
                    if err != io.EOF {
                        log.Println(err)
                    }
                    return
                }

                _, err = conn.Write([]byte("hello"))
                if err != nil {
                    log.Println(err)
                    return
                }
            }
        }
    
// tcp short connection bench

    package main

    import (
        "log"
        "net"
        "sync"
        "time"
    )

    var (
        wg sync.WaitGroup
    )

    func init() {
        log.SetFlags(log.LstdFlags | log.Lshortfile)
    }

    func main() {
        // 150 workers, each opening a new connection roughly every 10 ms.
        for i := 0; i < 150; i++ {
            wg.Add(1)
            go func() {
                // Never reached in practice: the loop runs until the process is killed.
                defer wg.Done()
                for {
                    req()
                    time.Sleep(10 * time.Millisecond)
                }
            }()
        }
        wg.Wait()
    }

    // req opens a short-lived connection, sends one request, and reads one reply.
    func req() {
        conn, err := net.DialTimeout("tcp", "192.168.67.133:9999", 300*time.Millisecond)
        if err != nil {
            log.Println(err)
            return
        }
        // conn.SetDeadline(time.Now().Add(300 * time.Millisecond))
        defer conn.Close()
        _, err = conn.Write([]byte("client"))
        if err != nil {
            log.Println(err)
            return
        }
        buf := make([]byte, 1024)
        _, err = conn.Read(buf)
        if err != nil {
            log.Println(err)
        }
    }

mjste...@gmail.com

Jun 11, 2017, 7:03:58 AM
to golang-nuts
Are you exhausting the local TCP ports? Are some of them (50k?) sitting in a wait state, for example TIME_WAIT? Try using a few IP addresses on the test box. Also increase /proc/sys/net/core/somaxconn to a larger value: if the server pauses for 20 milliseconds at 28k reqs/sec, that's 500-ish connections queued (28,000 × 0.02 s = 560), so 1024 may not be enough.

刘桂祥

Jun 11, 2017, 9:41:58 AM
to golang-nuts
Hi Matthew,
       I don't know where the performance bottleneck is. Do you mean that the TCP SYN queue (the half-open queue) is full? I don't see any SYN flood warning in /var/log/messages, though.

On Sunday, June 11, 2017 at 7:03:58 PM UTC+8, Matthew Stevenson wrote:

mjste...@gmail.com

Jun 11, 2017, 2:07:27 PM
to golang-nuts
I meant the fully open connections waiting to be handled by the application, so the listen queue (somaxconn). On both the client and the server I'd check that you aren't exhausting any of the more obvious limits, such as ports and listen queues. Adding more virtual IPs is an easy way to test the ports. Listen queue overflows should show up in "netstat -s" as TCP drops. Increasing somaxconn should help unless you are CPU/network limited. It's very easy to hit a system limit and then think your programming is at fault.
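On Linux, the listen queue counters behind "netstat -s" can also be read from /proc/net/netstat, where the `TcpExt` section contains `ListenOverflows` and `ListenDrops`. The sketch below assumes that file's usual layout (a `TcpExt:` line of names followed by a `TcpExt:` line of values); `tcpExtCounter` is a hypothetical helper name, not something from the thread.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// tcpExtCounter returns the named TcpExt counter from text laid out like
// Linux's /proc/net/netstat: one "TcpExt:" line of field names followed by
// one "TcpExt:" line of values. Hypothetical helper, for illustration only.
func tcpExtCounter(netstat, name string) (uint64, bool) {
	lines := strings.Split(netstat, "\n")
	for i := 0; i+1 < len(lines); i++ {
		names := strings.Fields(lines[i])
		values := strings.Fields(lines[i+1])
		if len(names) == 0 || names[0] != "TcpExt:" ||
			len(values) != len(names) || values[0] != "TcpExt:" {
			continue
		}
		for j, n := range names {
			if n == name {
				v, err := strconv.ParseUint(values[j], 10, 64)
				return v, err == nil
			}
		}
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/net/netstat") // Linux only
	if err != nil {
		fmt.Println("cannot read counters:", err)
		return
	}
	for _, name := range []string{"ListenOverflows", "ListenDrops"} {
		if v, ok := tcpExtCounter(string(data), name); ok {
			fmt.Printf("%s = %d\n", name, v)
		}
	}
}
```

Running this on the server before and after a benchmark and comparing the two readings would show whether the accept queue overflowed during the run.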

hong...@gmail.com

Jun 11, 2017, 8:54:20 PM
to golang-nuts
Maybe your client has run out of local ports. There are only about 65,000 ports you can use to make connections from one client IP to a server. You can check the output of netstat to confirm that.
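Besides netstat, one way to check for port exhaustion on a Linux client is to count sockets per state in /proc/net/tcp, where the `st` column holds a hex state code (06 is TIME_WAIT). The sketch below covers IPv4 only and maps just a few states; `countStates` is a hypothetical helper name. Note that the usable ephemeral range is set by /proc/sys/net/ipv4/ip_local_port_range and is usually narrower than the full 65,000.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// A few of the hex state codes used in the "st" column of /proc/net/tcp.
var stateNames = map[string]string{
	"01": "ESTABLISHED",
	"02": "SYN_SENT",
	"06": "TIME_WAIT",
	"08": "CLOSE_WAIT",
	"0A": "LISTEN",
}

// countStates tallies sockets per TCP state from text laid out like
// /proc/net/tcp. Hypothetical helper, for illustration only.
func countStates(procNetTCP string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(procNetTCP, "\n") {
		f := strings.Fields(line)
		// Skip the header and short lines; field 3 is the state code.
		if len(f) < 4 || f[0] == "sl" {
			continue
		}
		name, ok := stateNames[f[3]]
		if !ok {
			name = "OTHER"
		}
		counts[name]++
	}
	return counts
}

func main() {
	data, err := os.ReadFile("/proc/net/tcp") // Linux only, IPv4 sockets
	if err != nil {
		fmt.Println("cannot read socket table:", err)
		return
	}
	fmt.Println(countStates(string(data)))
}
```

A TIME_WAIT count close to the size of the local port range on the client would support the port exhaustion theory; a small count would point elsewhere.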

刘桂祥

Jun 11, 2017, 10:53:03 PM
to golang-nuts
Hi Darkofday,
       I doubt that: if the client's local ports were used up, dial shouldn't return an i/o timeout.
       With one client and fewer workers there are no dial timeouts, but when I start another client with the same worker count against the same server, both clients get dial i/o timeouts.
       So I guess the performance bottleneck is in the server.

On Monday, June 12, 2017 at 8:54:20 AM UTC+8, Darkofday Darkofday wrote: