The VM on which I am running has 40 vCPUs.
I lowered GOMAXPROCS to 8, and I changed my client to use 90 HTTP/2 connections towards the HTTP/2 server.
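For context, the relevant client setup looks roughly like this (a simplified sketch; the round-robin pool, the TLS config, and the names `clientPool`/`newClientPool` are illustrative, not my exact code):

```go
package main

import (
	"crypto/tls"
	"net/http"
	"runtime"
	"sync/atomic"

	"golang.org/x/net/http2"
)

// clientPool round-robins requests across n independent clients.
// Each http2.Transport owns its own TCP connection, so n clients
// gives roughly n HTTP/2 connections to the server.
type clientPool struct {
	clients []*http.Client
	next    atomic.Uint64
}

func newClientPool(n int) *clientPool {
	p := &clientPool{}
	for i := 0; i < n; i++ {
		p.clients = append(p.clients, &http.Client{
			Transport: &http2.Transport{
				// TLS config is illustrative; my real client verifies certs.
				TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
			},
		})
	}
	return p
}

func (p *clientPool) get() *http.Client {
	return p.clients[p.next.Add(1)%uint64(len(p.clients))]
}

func main() {
	runtime.GOMAXPROCS(8) // cap the Go scheduler at 8 OS threads
	pool := newClientPool(90)
	_ = pool // request loop omitted
}
```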
With this setup, I can now achieve about 9000 transactions per second.
However, if I go any higher than that, my client starts throwing "context deadline exceeded" errors, even though I am sure the responses are received by the client well within the 7-second timeout.
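Each request carries its own 7-second deadline, roughly like this (building on the `clientPool` sketch above; `sendOne` is an illustrative name, not my exact code):

```go
// (additional imports on top of the sketch above: "context", "io", "time")

// sendOne issues a single request with the 7-second deadline applied
// via a per-request context.
func sendOne(pool *clientPool, req *http.Request) error {
	ctx, cancel := context.WithTimeout(context.Background(), 7*time.Second)
	defer cancel()

	resp, err := pool.get().Do(req.WithContext(ctx))
	if err != nil {
		return err // this is where "context deadline exceeded" surfaces
	}
	defer resp.Body.Close()

	// Drain the body so the HTTP/2 stream is released promptly.
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}
```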
When I observe the CPU utilization of my client process, it uses only about 300% vCPU, i.e., 3 of the 40 vCPUs.
Still, I don't understand why I cannot push the throughput beyond 9000 per second.
Here is a sample output from the atop utility. As seen below, the process shows only 257% CPU utilization.
Is this because goroutines are not getting context-switched onto a logical processor (OS thread) fast enough, thus leading to the timeouts? I thought that with so much CPU available, context switching should be very fast and not an issue at all.
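One way I can think of to check this theory is to sample the Go scheduler-latency histogram, which records how long goroutines sit runnable before actually getting a P (a diagnostic sketch using the /sched/latencies:seconds metric from runtime/metrics, available since Go 1.17):

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// Prints the distribution of time goroutines spend runnable but not
// running. A heavy tail here would support the scheduling-delay theory.
func main() {
	samples := []metrics.Sample{{Name: "/sched/latencies:seconds"}}
	metrics.Read(samples)
	if samples[0].Value.Kind() != metrics.KindFloat64Histogram {
		fmt.Println("metric not supported on this Go version")
		return
	}
	h := samples[0].Value.Float64Histogram()
	for i, count := range h.Counts {
		if count == 0 {
			continue
		}
		// Buckets[i] and Buckets[i+1] bound the latency for this bucket.
		fmt.Printf("%12.6fs - %12.6fs : %d\n", h.Buckets[i], h.Buckets[i+1], count)
	}
}
```

If the tail of this histogram reaches into seconds, scheduler queuing could plausibly explain the timeouts; if it stays in the microsecond range, the bottleneck is presumably elsewhere.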
ATOP - myserver 2023/05/05 22:16:19 y--------P---- 10s elapsed
PRC | sys 6.63s | user 21.56s | | #proc 543 | #trun 4 | #tslpi 639 | | #tslpu 0 | #zombie 0 | clones 11 | | no procacct |
CPU | sys 61% | user 213% | irq 4% | idle 3724% | wait 0% | steal 0% | guest 0% | | ipc notavail | cycl unknown | curf 3.00GHz | curscal ?% |
CPL | avg1 2.28 | avg5 2.40 | | avg15 2.45 | | | csw 656705 | intr 515996 | | | numcpu 40 | |
MEM | tot 503.7G | free 468.6G | cache 26.9G | dirty 0.1M | buff 1.0G | slab 1.4G | slrec 1.2G | shmem 4.1G | shrss 117.0M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M |
SWP | tot 4.0G | free 4.0G | | | | | | | | | vmcom 6.0G | vmlim 255.8G |
LVM | g_vd0-lv_var | busy 0% | read 0 | | write 158 | KiB/r 0 | KiB/w 4 | MBr/s 0.0 | MBw/s 0.1 | | avq 5.00 | avio 0.01 ms |
DSK | sda | busy 0% | read 0 | | write 122 | KiB/r 0 | KiB/w 5 | MBr/s 0.0 | MBw/s 0.1 | | avq 2.00 | avio 0.01 ms |
NET | transport | tcpi 181070 | tcpo 289884 | udpi 18 | udpo 18 | tcpao 0 | tcppo 26 | tcprs 556 | tcpie 0 | tcpor 1 | udpnp 0 | udpie 0 |
NET | network | ipi 181092 | ipo 201839 | | ipfrw 0 | deliv 181091 | | | | | icmpi 0 | icmpo 0 |
NET | bond1 0% | pcki 181057 | pcko 290688 | sp 20 Gbps | si 16 Mbps | so 190 Mbps | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
NET | net2 0% | pcki 181056 | pcko 290688 | sp 20 Gbps | si 16 Mbps | so 190 Mbps | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
PID TID SYSCPU USRCPU VGROW RGROW RUID EUID ST EXC THR S CPUNR CPU CMD 1/1
40753 - 4.08s 21.52s 0K -11.7M mtx mtx -- - 15 S 20 257% 5gclient
40753 40753 0.00s 0.00s 0K -11.7M mtx mtx -- - 1 S 20 0% 5gclient
40753 40754 0.00s 0.00s 0K -11.7M mtx mtx -- - 1 S 5 0% 5gclient
40753 40755 0.00s 0.00s 0K -11.7M mtx mtx -- - 1 S 3 0% 5gclient
40753 40756 0.00s 0.00s 0K -11.7M mtx mtx -- - 1 S 21 0% 5gclient
40753 40757 0.45s 2.35s 0K -11.7M mtx mtx -- - 1 S 6 28% 5gclient
40753 40758 0.44s 2.31s 0K -11.7M mtx mtx -- - 1 S 20 28% 5gclient
40753 40759 0.44s 2.52s 0K -11.7M mtx mtx -- - 1 S 0 30% 5gclient
40753 40760 0.36s 1.80s 0K -11.7M mtx mtx -- - 1 S 28 22% 5gclient
40753 40761 0.41s 2.04s 0K -11.7M mtx mtx -- - 1 S 9 25% 5gclient
40753 40762 0.48s 2.46s 0K -11.7M mtx mtx -- - 1 R 1 30% 5gclient
40753 40763 0.00s 0.00s 0K -11.7M mtx mtx -- - 1 S 23 0% 5gclient
40753 40764 0.45s 2.48s 0K -11.7M mtx mtx -- - 1 S 2 29% 5gclient
40753 40765 0.21s 1.17s 0K -11.7M mtx mtx -- - 1 S 1 14% 5gclient
40753 40766 0.45s 2.46s 0K -11.7M mtx mtx -- - 1 S 5 29% 5gclient
40753 41140 0.39s 1.90s 0K -11.7M mtx mtx -- - 1 R 29 23% 5gclient