unexpected stuck in sync.(*Pool).Get()


Peter Z

Jun 17, 2021, 12:19:45 PM
to golang-nuts

Golang ENV: 
go1.14.3 linux/amd64

Description:
We have about half a million agents running on each of our machines. The agent is written in Go. Recently we found that the agent may get stuck, with no response to the sent requests. The metrics exported from the agent show that a channel in the agent (caching the requests) is full. Digging into the goroutine stacks, we found that the goroutines consuming messages from the channel are all waiting for a lock. The goroutine stack details are shown below.
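Before the dumps, for context: the consuming side looks roughly like this hypothetical sketch (the names, channel size, and worker count are invented; only the shape matters):

    // Hypothetical reconstruction of the agent's consumer pattern.
    package main

    import "go.uber.org/zap"

    func main() {
        logger, _ := zap.NewProduction()
        sugar := logger.Sugar()

        // The request-caching channel that the metrics show as full.
        reqCh := make(chan string, 1024)

        // Consumer workers: every Infof call goes through sync.Pool
        // inside zap (entry/encoder pools) and fmt (printer pool).
        for i := 0; i < 8; i++ {
            go func() {
                for msg := range reqCh {
                    sugar.Infof("processing %s", msg)
                }
            }()
        }
        select {} // the real agent serves requests here
    }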

    // Consuming Goroutine
    166 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x520ebd 0x51fdff 0x51fcd0 0x737fb4 0x73a836 0x73a813 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    #       0x449546        sync.runtime_SemacquireMutex+0x46                                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    #       0x481c1b        sync.(*Mutex).lockSlow+0xfb                                                     /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    #       0x482791        sync.(*Mutex).Lock+0x271                                                        /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    #       0x482792        sync.(*Pool).pinSlow+0x272                                                      /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    #       0x4824ed        sync.(*Pool).pin+0x5d                                                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    #       0x4821ae        sync.(*Pool).Get+0x2e                                                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    #       0x520ebc        go.uber.org/zap/zapcore.getCheckedEntry+0x2c                                    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/entry.go:45
    #       0x51fdfe        go.uber.org/zap/zapcore.(*CheckedEntry).AddCore+0x18e                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/entry.go:240
    #       0x51fccf        go.uber.org/zap/zapcore.(*ioCore).Check+0x5f                                    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/core.go:80
    #       0x737fb3        go.uber.org/zap.(*Logger).check+0x153                                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/logger.go:269
    #       0x73a835        go.uber.org/zap.(*Logger).Check+0x85                                            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/logger.go:172
    #       0x73a812        go.uber.org/zap.(*SugaredLogger).log+0x62                                       /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/sugar.go:233
    #       0x97d65f        go.uber.org/zap.(*SugaredLogger).Infof+0x30f                                    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/sugar.go:138
    #       0x97d609        ******/log.Infof+0x2b9                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/******/com...@v0.0.0-20210323102343-4a6074e63e74/log/log.go:71
    #       0x97d5e8        ******/api.(*xxHeadServer).workerProcMsg+0x298   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:350
    
    // Consuming Goroutine
    119 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x5269e1 0x5269d2 0x526892 0x51f6cd 0x51f116 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    #       0x449546        sync.runtime_SemacquireMutex+0x46                                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    #       0x481c1b        sync.(*Mutex).lockSlow+0xfb                                                     /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    #       0x482791        sync.(*Mutex).Lock+0x271                                                        /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    #       0x482792        sync.(*Pool).pinSlow+0x272                                                      /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    #       0x4824ed        sync.(*Pool).pin+0x5d                                                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    #       0x4821ae        sync.(*Pool).Get+0x2e                                                           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    #       0x5269e0        go.uber.org/zap/zapcore.getJSONEncoder+0x30                                     /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/json_encoder.go:43
    #       0x5269d1        go.uber.org/zap/zapcore.(*jsonEncoder).clone+0x21                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/json_encoder.go:300
    #       0x526891        go.uber.org/zap/zapcore.(*jsonEncoder).Clone+0x31                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/json_encoder.go:294
    #       0x51f6cc        go.uber.org/zap/zapcore.consoleEncoder.writeContext+0x4c                        /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/console_encoder.go:128
    #       0x51f115        go.uber.org/zap/zapcore.consoleEncoder.EncodeEntry+0x3d5                        /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/console_encoder.go:110
    #       0x51ff38        go.uber.org/zap/zapcore.(*ioCore).Write+0xa8                                    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/core.go:86
    #       0x5218e6        go.uber.org/zap/zapcore.(*CheckedEntry).Write+0x116                             /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/zapcore/entry.go:215
    #       0x73a8af        go.uber.org/zap.(*SugaredLogger).log+0xff                                       /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/sugar.go:234
    #       0x97d65f        go.uber.org/zap.(*SugaredLogger).Infof+0x30f                                    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/go.uber.org/z...@v1.9.1/sugar.go:138
    #       0x97d609        ******/log.Infof+0x2b9                               /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/pkg/mod/******/com...@v0.0.0-20210323102343-4a6074e63e74/log/log.go:71
    #       0x97d5e8        ******/api.(*xxHeadServer).workerProcMsg+0x298   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:350
    
    // Consuming Goroutine
    59 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d5726 0x9857f7 0x9804ca 0x97d5d5 0x4689e1
    #       0x449546        sync.runtime_SemacquireMutex+0x46                                                       /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    #       0x481c1b        sync.(*Mutex).lockSlow+0xfb                                                             /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    #       0x482791        sync.(*Mutex).Lock+0x271                                                                /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    #       0x482792        sync.(*Pool).pinSlow+0x272                                                              /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    #       0x4824ed        sync.(*Pool).pin+0x5d                                                                   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    #       0x4821ae        sync.(*Pool).Get+0x2e                                                                   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    #       0x4d8290        fmt.newPrinter+0x30                                                                     /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/print.go:137
    #       0x4d5725        fmt.Errorf+0x25                                                                         /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/errors.go:18
    #       0x9857f6        ******/api.(*xxHeadServer).parseDataName+0xc36           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:1731
    #       0x9804c9        ******/api.(*xxHeadServer).procLocalServiceConfReq+0xc9  /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:742
    #       0x97d5d4        ******/api.(*xxHeadServer).workerProcMsg+0x284           /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:349

    // Another 10 Goroutines not consuming the messages in channel, 
    // but also get stuck in the sync.(*Pool).Get method
    10 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d8856 0x9761b6 0x4689e1
        #       0x449546        sync.runtime_SemacquireMutex+0x46                                                       /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
        #       0x481c1b        sync.(*Mutex).lockSlow+0xfb                                                             /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
        #       0x482791        sync.(*Mutex).Lock+0x271                                                                /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
        #       0x482792        sync.(*Pool).pinSlow+0x272                                                              /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
        #       0x4824ed        sync.(*Pool).pin+0x5d                                                                   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
        #       0x4821ae        sync.(*Pool).Get+0x2e                                                                   /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
        #       0x4d8290        fmt.newPrinter+0x30                                                                     /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/print.go:137
        #       0x4d8855        fmt.Sprintf+0x25                                                                        /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/print.go:218
        #       0x9761b5        ******/datainfect.(*Infector).workerPushVersion+0x225    /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/datainfect.go:93

The detail of one goroutine is shown here; oddly, it has been waiting for 4581 minutes.

    goroutine 6295 [semacquire, 4581 minutes]:
    sync.runtime_SemacquireMutex(0x107aa64, 0x0, 0x1)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71 +0x47
    sync.(*Mutex).lockSlow(0x107aa60)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138 +0xfc
    sync.(*Mutex).Lock(...)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    sync.(*Pool).pinSlow(0x103dea0, 0x0, 0x0)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213 +0x272
    sync.(*Pool).pin(0x103dea0, 0xc00a258020, 0x10)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:206 +0x5e
    sync.(*Pool).Get(0x103dea0, 0x20, 0x9f61e0)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:128 +0x2f
    fmt.newPrinter(0xc002b18900)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/print.go:137 +0x31
    fmt.Errorf(0xb00213, 0x18, 0xc007629d98, 0x1, 0x1, 0x1, 0xc00a39e020)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/fmt/errors.go:18 +0x26
    ******/api.(*xxHeadServer).parseDataName(0xc0006ae080, 0xc00a244000, 0xc00a200060, 0x1f)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:1731 +0xc37
    ******/api.(*xxHeadServer).procLocalServiceConfReq(0xc0006ae080, 0xc00a244000, 0x0, 0x0)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:742 +0xca
    ******/api.(*xxHeadServer).workerProcMsg(0xc0006ae080)
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:349 +0x285
    created by ******/api.(*xxHeadServer).Init
            /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/xxhead_server.go:224 +0x339

The stack shows that all of the goroutines are waiting for the global lock in sync.Pool. But I can't figure out which goroutine is holding the lock. There should be a goroutine that has `sync.runtime_SemacquireMutex` somewhere in its stack but not at the top, yet there isn't one.
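A goroutine actually holding allPoolsMu would show pinSlow on its stack without SemacquireMutex above it. For reference, the debug=2 form of the profile prints each goroutine separately, together with its wait time, which may make that pattern easier to search for:

    curl ******:795/debug/pprof/goroutine?debug=2 2>/dev/null | grep -B2 -A12 'pinSlow'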

    [****** ~]$ curl ******:795/debug/pprof/goroutine?debug=1 2>/dev/null | grep 'sync.runtime_SemacquireMutex' -A5 -B1
    166 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x520ebd 0x51fdff 0x51fcd0 0x737fb4 0x73a836 0x73a813 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    120 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4f646f 0x51ed7b 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    119 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x5269e1 0x5269d2 0x526892 0x51f6cd 0x51f116 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    59 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d5726 0x9857f7 0x9804ca 0x97d5d5 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    36 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x51ed98 0x51ed88 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    10 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d8856 0x9761b6 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128
    --
    2 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x7c6ddc 0x7c6dc3 0x7c8cdf 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    # 0x4824ed sync.(*Pool).pin+0x5d /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:206
    # 0x4821ae sync.(*Pool).Get+0x2e /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/    ******/go-env/go1-14-linux-amd64/src/sync/pool.go:128

I walked through the code in Go's runtime but can't find any clue to this phenomenon. Can anyone who is an expert in Go give me a hand?

    func (p *Pool) pinSlow() (*poolLocal, int) {
        // Retry under the mutex.
        // Can not lock the mutex while pinned.
        runtime_procUnpin()
        allPoolsMu.Lock() //--------> HERE, all of the goroutines are waiting here
        defer allPoolsMu.Unlock()
        pid := runtime_procPin()
        // poolCleanup won't be called while we are pinned.
        s := p.localSize
        l := p.local
        if uintptr(pid) < s {
            return indexLocal(l, pid), pid
        }
        if p.local == nil {
            allPools = append(allPools, p)
        }
        // If GOMAXPROCS changes between GCs, we re-allocate the array and lose the old one.
        size := runtime.GOMAXPROCS(0)
        local := make([]poolLocal, size)
        atomic.StorePointer(&p.local, unsafe.Pointer(&local[0])) // store-release
        runtime_StoreReluintptr(&p.localSize, uintptr(size))     // store-release
        return &local[pid], pid
    }

Reproduce:
Can't find a way to reproduce this problem for now.

Robert Engels

Jun 17, 2021, 1:12:02 PM
to Peter Z, golang-nuts
You probably need multiple pools, and to partition access across them. 500k accessors of a shared lock is going to have contention.
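For illustration, that partitioning might look like the sketch below (hypothetical names and shard count; note Ian's reply, which points out that sync.Pool is already partitioned per P internally, so this only helps if a single shared mutex really is the bottleneck):

    // A sharded pool that spreads accessors across several sync.Pools.
    package shardedpool

    import (
        "sync"
        "sync/atomic"
    )

    const shards = 16

    type ShardedPool struct {
        pools [shards]sync.Pool
        next  uint32
    }

    func New(newFn func() interface{}) *ShardedPool {
        sp := &ShardedPool{}
        for i := range sp.pools {
            sp.pools[i].New = newFn
        }
        return sp
    }

    func (sp *ShardedPool) Get() interface{} {
        // Round-robin across shards to spread lock traffic.
        i := atomic.AddUint32(&sp.next, 1) % shards
        return sp.pools[i].Get()
    }

    func (sp *ShardedPool) Put(x interface{}) {
        i := atomic.AddUint32(&sp.next, 1) % shards
        sp.pools[i].Put(x)
    }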



Ian Lance Taylor

Jun 17, 2021, 1:19:05 PM
to Peter Z, golang-nuts
On Thu, Jun 17, 2021 at 9:19 AM Peter Z <zjy19...@gmail.com> wrote:
>
> The original post is on stackoverflow https://stackoverflow.com/questions/67999117/unexpected-stuck-in-sync-pool-get
>
> Golang ENV:
> go1.14.3 linux/amd64
>
> Description:
> We have about half a million agents running on each of our machines. The agent is written in Go. Recently we found that the agent may get stuck, with no response to the sent requests. The metrics exported from the agent show that a channel in the agent (caching the requests) is full. Digging into the goroutine stacks, we found that the goroutines consuming messages from the channel are all waiting for a lock. The goroutine stack details are shown below.

That is peculiar. What is happening under the lock is that the pool
is allocating a slice that is GOMAXPROCS in length. This shouldn't
take long, obviously. And it only needs to happen when the pool is
first created, or when GOMAXPROCS changes. So: how often do you
create this pool? Is it the case that you create the pool and then
have a large number of goroutines try to Get a value simultaneously?
Or, how often do you change GOMAXPROCS? (And, if you do change
GOMAXPROCS, why?)
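To make the first question concrete, here is a hypothetical sketch (not the poster's actual code) of the pattern that would funnel every goroutine through allPoolsMu, versus the intended usage:

    package bufs

    import "sync"

    // Anti-pattern: a new Pool per call. The first Get on every fresh
    // Pool goes through pinSlow and takes the global allPoolsMu.
    func perCall() {
        p := &sync.Pool{New: func() interface{} { return new([]byte) }}
        buf := p.Get().(*[]byte)
        defer p.Put(buf)
        // ... use buf ...
    }

    // Intended usage: one long-lived, package-level Pool. Once it has
    // been pinned, Get stays on the per-P fast path with no global lock.
    var bufPool = sync.Pool{New: func() interface{} { return new([]byte) }}

    func perCallFixed() {
        buf := bufPool.Get().(*[]byte)
        defer bufPool.Put(buf)
        // ... use buf ...
    }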


> The stack shows that all of the goroutines are waiting for the global lock in sync.Pool. But I can't figure out which goroutine is holding the lock. There should be a goroutine that has `sync.runtime_SemacquireMutex` somewhere in its stack but not at the top, yet there isn't one.

I don't think that is what you would see. I think you would see a
goroutine with pinSlow in the stack but with SemacquireMutex not in the
stack.


> Reproduce:
> Can't find a way to reproduce this problem for now.

It's going to be pretty hard for us to solve the problem without a reproducer.

Ian

Ian Lance Taylor

Jun 17, 2021, 1:20:21 PM
to Robert Engels, Peter Z, golang-nuts
On Thu, Jun 17, 2021 at 10:11 AM Robert Engels <ren...@ix.netcom.com> wrote:
>
> You probably need multiple pools in and partition them. 500k accessors of a shared lock is going to have contention.

That might well help, but note that sync.Pool does not have a shared
lock in general use. The shared lock is only used when the pool is
first created, and each time that GOMAXPROCS changes.

Ian

Robert Engels

Jun 17, 2021, 4:09:18 PM
to Ian Lance Taylor, Peter Z, golang-nuts
You’re right. Inspecting the code, it is internally partitioned by P.

I agree that it looks like the pool is being continually created.


Peter Z

Jun 20, 2021, 10:48:20 PM
to golang-nuts
On Thu, Jun 17, 2021 at 9:19 AM Peter Z  wrote:
>
> The original post is on stackoverflow https://stackoverflow.com/questions/67999117/unexpected-stuck-in-sync-pool-get
>
> Golang ENV:
> go1.14.3 linux/amd64
>
> Description:
> We have about half a million agents running on each of our machines. The agent is written in Go. Recently we found that the agent may get stuck, with no response to the sent requests. The metrics exported from the agent show that a channel in the agent (caching the requests) is full. Digging into the goroutine stacks, we found that the goroutines consuming messages from the channel are all waiting for a lock. The goroutine stack details are shown below.

> That is peculiar. What is happening under the lock is that the pool
> is allocating a slice that is GOMAXPROCS in length. This shouldn't
> take long, obviously. And it only needs to happen when the pool is
> first created, or when GOMAXPROCS changes. So: how often do you
> create this pool? Is it the case that you create the pool and then
> have a large number of goroutines try to Get a value simultaneously?
> Or, how often do you change GOMAXPROCS? (And, if you do change
> GOMAXPROCS, why?)
1) GOMAXPROCS is not manually changed; it's initialized to the default.
2) sync.Pool is not explicitly used here; we use the fmt package and the log package
(zap from go.uber.org/zap), which use sync.Pool internally.
3) There are 7000 goroutines in total, and about 500 goroutines waiting for the
lock. That's not many, but 'create a pool and have a number of goroutines
try to Get a value simultaneously' may really happen at program startup.
4) Does the 'taskset' command have any effect? The program is running
with 'taskset -c $last_2nd_core,$last_3rd_core,$last_4th_core' (see the note below).
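On point 4, a hedged note: taskset by itself does not change GOMAXPROCS while the process runs, but on Linux the Go runtime reads the CPU affinity mask once at startup, so under a 3-core taskset the default GOMAXPROCS would be 3 rather than the machine's core count. That is easy to confirm:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        // Run as: taskset -c 1,2,3 ./checkprocs
        // NumCPU reflects the affinity mask read at startup, and the
        // default GOMAXPROCS equals NumCPU.
        fmt.Println("NumCPU:", runtime.NumCPU())
        fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // 0 = query only
    }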

> The stack shows that all of the goroutines are waiting for the global lock in sync.Pool. But I can't figure out which goroutine is holding the lock. There should be a goroutine that has `sync.runtime_SemacquireMutex` somewhere in its stack but not at the top, yet there isn't one.

> I don't think that is what you would see. I think you would see a
> goroutine with pinSlow in the stack but with SemacquireMutex not in the
> stack.
 
As the grep result shows, every goroutine with pinSlow on its stack also has SemacquireMutex:

    [******@****** ~]$ curl ******795/debug/pprof/goroutine?debug=1 2>/dev/null | grep pinSlow -B4
    166 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x520ebd 0x51fdff 0x51fcd0 0x737fb4 0x73a836 0x73a813 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    120 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4f646f 0x51ed7b 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    119 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x5269e1 0x5269d2 0x526892 0x51f6cd 0x51f116 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    59 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d5726 0x9857f7 0x9804ca 0x97d5d5 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    36 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x51ed98 0x51ed88 0x51ff39 0x5218e7 0x73a8b0 0x97d660 0x97d60a 0x97d5e9 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    10 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x4d8291 0x4d8856 0x9761b6 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213
    --
    2 @ 0x438cd0 0x4497e0 0x4497cb 0x449547 0x481c1c 0x482792 0x482793 0x4824ee 0x4821af 0x7c6ddc 0x7c6dc3 0x7c8cdf 0x4689e1
    # 0x449546 sync.runtime_SemacquireMutex+0x46 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/runtime/sema.go:71
    # 0x481c1b sync.(*Mutex).lockSlow+0xfb /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:138
    # 0x482791 sync.(*Mutex).Lock+0x271 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/mutex.go:81
    # 0x482792 sync.(*Pool).pinSlow+0x272 /home/ferry/ONLINE_SERVICE/other/ferry/task_workspace/gopath/src/******/go-env/go1-14-linux-amd64/src/sync/pool.go:213

> Reproduce:
> Can't find a way to reproduce this problem for now.

> It's going to be pretty hard for us to solve the problem without a reproducer.
 
We are now trying to reproduce this problem, but haven't caught the bug yet.
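For reference, the kind of reproducer being tried looks like the hypothetical sketch below; it only demonstrates brief startup contention on fmt's internal pool, not the permanent hang observed in production:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var wg sync.WaitGroup
        start := make(chan struct{})
        // Release many goroutines at once so they all hit fmt's internal
        // sync.Pool (ppFree) simultaneously, piling up on the first
        // Get's pinSlow path.
        for i := 0; i < 5000; i++ {
            wg.Add(1)
            go func(n int) {
                defer wg.Done()
                <-start
                _ = fmt.Sprintf("msg-%d", n)
            }(i)
        }
        close(start)
        wg.Wait()
    }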

Thanks for the comment.

jake...@gmail.com

Jun 21, 2021, 7:17:40 AM
to golang-nuts
Could you clarify something? You say:
" We have about half a million agents running on each of our machines"
in your initial message. I thought maybe it was a language thing, and you meant 500,000 goroutines. But then you said:
"There are 7000 goroutines total"

So, you have about 500,000 processes running this agent on each machine, and each process has around 7,000 goroutines? Is that correct?

Peter Z

Jun 21, 2021, 7:30:58 AM
to golang-nuts
> So, you have about 500,000 processes running this agent on each machine, and each process has around 7,000 goroutines? Is that correct?

Yes, that's exactly what I mean.

Robert Engels

Jun 21, 2021, 12:56:13 PM
to Peter Z, golang-nuts
How many processes per machine? It seems like scheduling latency to me. 

On Jun 21, 2021, at 6:31 AM, Peter Z <zjy19...@gmail.com> wrote:


> So, you have about 500,000 processes running this agent on each machine, and each process has around 7,000 goroutines? Is that correct?
>
> Yes, that's exactly what I mean.


Peter Z

Jun 22, 2021, 5:24:45 AM
to golang-nuts
Only one process per machine. We use 'taskset -c $last_2nd_core,$last_3rd_core,$last_4th_core ./agent -c ../conf/agent.toml' to start the agent. I wonder if it has any relationship with this problem?

jake...@gmail.com

Jun 22, 2021, 8:06:55 AM
to golang-nuts
Sorry, now I am completely confused.

> So, you have about 500,000 processes running this agent on each machine, and each process has around 7,000 goroutines? Is that correct?
>
> Yes, that's exactly what I mean.

but then you say: "Only one process per machine".

Is there a language barrier, or am I missing something?

Robert Engels

Jun 22, 2021, 8:43:18 AM
to jake...@gmail.com, golang-nuts
He is stating he has a cloud cluster consisting of 500k machines - each machine runs an agent process - each agent has 7000 goroutines.


Peter Z

Jun 22, 2021, 9:16:12 AM
to golang-nuts
> He is stating he has a cloud cluster consisting of 500k machines - each machine runs an agent process - each agent has 7000 goroutines.

Aha. Yes, this is what I mean.

> Sorry, now I am completely confused.
>
> So, you have about 500,000 processes running this agent on each machine, and each process has around 7,000 goroutines? Is that correct?
>
> Yes, that's exactly what I mean.
>
> but then you say: "Only one process per machine".
>
> Is there a language barrier, or am I missing something?

Sorry, I didn't quite get you in the previous post. Now I see what you meant: how many processes are there on the machine with this problem? About 1000 processes.
It's a physical machine that runs CentOS with kernel 3.10.x on Intel Xeon CPUs.

Peter Z

Jun 22, 2021, 10:01:48 AM
to golang-nuts
I just checked the monitoring data and found that the machine suffered from a high load average (about 30+) at approximately the time the agent got stuck.
A 24-core machine (2 CPUs * 14 cores, hyperthreading off) with a load average over 30 seems bad. But after the load average got back down below 1, the
agent process still hung there.


Peter Z

Jun 22, 2021, 10:21:27 AM
to golang-nuts
Sorry for a mistake: I said 'hyperthread closed' above, but hyperthreading is actually on.

Robert Engels

Jun 22, 2021, 10:56:59 AM
to Peter Z, golang-nuts
With a 500k machine cluster I suggest getting professional Go support - someone experienced in troubleshooting who can sit with you and review the code and configuration to diagnose the issue.

Personally, it sounds like overallocated machines causing thrashing delays in the context switching.

On Jun 22, 2021, at 9:21 AM, Peter Z <zjy19...@gmail.com> wrote:

> Sorry for a mistake: I said 'hyperthread closed' above, but hyperthreading is actually on.