Hi Brian, thanks for your help.
I run the tests in a laptop with linux as the main OS, no emulation.
I just spent 5 hours obsesively trying to reproduce it and I know it sounds very strange. The 100% CPU spikes is how I started noticing this, because the laptop fan would start spinning at full speed. Since then, I am using go1.16 from
&& go1.16 download" and I haven't experienced it again.
I checked the issue links and it seems similar to what I am seeing, but only from version 1.17 and on. Also they say you can dissable it with GODEBUG=asyncpreemptoff=1 but I see the problem in the build version. Maybe the difference with go run is that it probably adds debug flags? I expected it to just compile the program in a temp location and run it, but this is a consistent difference I just tested:
$ go run test.go
And in another console:
$ ps aux | grep test.go
bill 143464 2.5 0.1 1897584 16480 pts/3 Sl+ 09:54 0:00 go run test.go
bill 143604 0.0 0.0 11568 2764 pts/4 S+ 09:54 0:00 grep --color=auto test.go
$ sudo strace -f -tt -p 143464
strace: Process 143464 attached with 17 threads
[pid 143544] 09:54:42.564230 futex(0xc000224148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143485] 09:54:42.564321 futex(0xc00056a148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143478] 09:54:42.564340 epoll_pwait(4, <unfinished ...>
[pid 143477] 09:54:42.564366 futex(0xc000580548, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143476] 09:54:42.564386 futex(0xc000680148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143475] 09:54:42.564407 futex(0xc000580148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143474] 09:54:42.564430 futex(0xc000500148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143473] 09:54:42.564451 futex(0xdf3e80, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143472] 09:54:42.564471 futex(0xc000400148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143471] 09:54:42.564493 futex(0xc000380148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143470] 09:54:42.564515 futex(0xc000300148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143469] 09:54:42.564535 futex(0xdf3d38, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143468] 09:54:42.564556 futex(0xc000090148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143467] 09:54:42.564576 futex(0xc000023148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143466] 09:54:42.564596 futex(0xc000022d48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143465] 09:54:42.564616 restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 143464] 09:54:42.564637 waitid(P_PID, 143586, <unfinished ...>
[pid 143465] 09:55:14.806222 <... restart_syscall resumed>) = -1 ETIMEDOUT (Connection timed out)
[pid 143465] 09:55:14.806608 nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
[pid 143465] 09:55:14.807053 futex(0xdc55d8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0}
In another console I would request multiple times:
$ curl
http://localhost:8080or
$ wrk -d2
http://localhost:8080but strace won't show anything else. It just stays like that and CPU's show 0% activity when is not getting requests.
On the other hand, if I compile it:
$ go build test.go
$ ./test
Check strace in another console, it starts like this:
$ ps aux | grep ./test
bill 143817 0.0 0.0 1077296 4448 pts/3 Sl+ 09:57 0:00 ./test
bill 143829 0.0 0.0 11568 712 pts/4 S+ 09:58 0:00 grep --color=auto ./test
$ sudo strace -f -tt -p 143817
strace: Process 143817 attached with 6 threads
[pid 143818] 09:58:23.811515 restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 143822] 09:58:23.811538 futex(0x85f438, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143821] 09:58:23.811561 futex(0xc000080148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143820] 09:58:23.811580 futex(0xc000015148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143819] 09:58:23.811585 futex(0xc000014d48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143817] 09:58:23.811613 epoll_pwait(5,
But below is what I get with a single curl request:
$ sudo strace -f -tt -p 143817
strace: Process 143817 attached with 6 threads
[pid 143818] 09:58:23.811515 restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 143822] 09:58:23.811538 futex(0x85f438, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143821] 09:58:23.811561 futex(0xc000080148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143820] 09:58:23.811580 futex(0xc000015148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143819] 09:58:23.811585 futex(0xc000014d48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143817] 09:58:23.811613 epoll_pwait(5, <unfinished ...>
[pid 143818] 09:58:42.447774 <... restart_syscall resumed>) = -1 ETIMEDOUT (Connection timed out)
[pid 143818] 09:58:42.447898 nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
[pid 143818] 09:58:42.448141 futex(0x830ed8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
[pid 143817] 09:59:30.164453 <... epoll_pwait resumed>[{EPOLLIN, {u32=1288671128, u64=140674352517016}}], 128, -1, NULL, 0) = 1
[pid 143817] 09:59:30.164531 futex(0x830ed8, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 143818] 09:59:30.164725 <... futex resumed>) = 0
[pid 143817] 09:59:30.164737 accept4(3, <unfinished ...>
[pid 143818] 09:59:30.164859 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143817] 09:59:30.164889 <... accept4 resumed>{sa_family=AF_INET6, sin6_port=htons(40122), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, [112->28], SOCK_CLOEXEC|SOCK_NONBLOCK) = 4
[pid 143817] 09:59:30.164962 epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=1288670888, u64=140674352516776}} <unfinished ...>
[pid 143818] 09:59:30.164993 <... nanosleep resumed>NULL) = 0
[pid 143817] 09:59:30.165040 <... epoll_ctl resumed>) = 0
[pid 143818] 09:59:30.165072 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143817] 09:59:30.165097 getsockname(4, {sa_family=AF_INET6, sin6_port=htons(8080), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, [112->28]) = 0
[pid 143817] 09:59:30.165142 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 143817] 09:59:30.165162 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
[pid 143817] 09:59:30.165180 setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [15], 4) = 0
[pid 143817] 09:59:30.165199 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [15], 4 <unfinished ...>
[pid 143818] 09:59:30.165209 <... nanosleep resumed>NULL) = 0
[pid 143817] 09:59:30.165215 <... setsockopt resumed>) = 0
[pid 143818] 09:59:30.165222 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143817] 09:59:30.165271 futex(0xc000015148, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 143820] 09:59:30.165311 <... futex resumed>) = 0
[pid 143817] 09:59:30.165317 accept4(3, 0xc00012db20, [112], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 143818] 09:59:30.165373 <... nanosleep resumed>NULL) = 0
[pid 143820] 09:59:30.165380 epoll_pwait(5, <unfinished ...>
[pid 143818] 09:59:30.165403 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143820] 09:59:30.165438 <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT, {u32=1288670888, u64=140674352516776}}], 128, 0, NULL, 0) = 1
[pid 143820] 09:59:30.165501 epoll_pwait(5, <unfinished ...>
[pid 143818] 09:59:30.165509 <... nanosleep resumed>NULL) = 0
[pid 143817] 09:59:30.165515 read(4, <unfinished ...>
[pid 143818] 09:59:30.165536 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143817] 09:59:30.165564 <... read resumed>"GET / HTTP/1.1\r\nHost: localhost:"..., 4096) = 78
[pid 143817] 09:59:30.165595 futex(0xc000080148, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 143821] 09:59:30.165610 <... futex resumed>) = 0
[pid 143821] 09:59:30.165624 nanosleep({tv_sec=0, tv_nsec=3000}, <unfinished ...>
[pid 143817] 09:59:30.165632 write(4, "HTTP/1.1 200 OK\r\nDate: Mon, 11 A"..., 120 <unfinished ...>
[pid 143818] 09:59:30.165642 <... nanosleep resumed>NULL) = 0
[pid 143817] 09:59:30.165669 <... write resumed>) = 120
[pid 143821] 09:59:30.165700 <... nanosleep resumed>NULL) = 0
[pid 143818] 09:59:30.165706 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143821] 09:59:30.165716 futex(0xc000080148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143817] 09:59:30.165739 read(4, 0xc000176000, 4096) = -1 EAGAIN (Resource temporarily unavailable)
[pid 143817] 09:59:30.165785 futex(0x830108, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 143820] 09:59:30.165794 <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT|EPOLLRDHUP, {u32=1288670888, u64=140674352516776}}], 128, 4999, NULL, 34914803367208) = 1
[pid 143820] 09:59:30.165809 read(4, "", 4096) = 0
[pid 143820] 09:59:30.165832 epoll_ctl(5, EPOLL_CTL_DEL, 4, 0xc000123974) = 0
[pid 143820] 09:59:30.165850 close(4 <unfinished ...>
[pid 143818] 09:59:30.165858 <... nanosleep resumed>NULL) = 0
[pid 143818] 09:59:30.165870 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
[pid 143820] 09:59:30.165879 <... close resumed>) = 0
[pid 143820] 09:59:30.165892 epoll_pwait(5, [], 128, 0, NULL, 1) = 0
[pid 143820] 09:59:30.165909 epoll_pwait(5, <unfinished ...>
[pid 143818] 09:59:30.165953 <... nanosleep resumed>NULL) = 0
[pid 143818] 09:59:30.165963 futex(0x830ed8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0}
and the output keeps growing like that with every new curl request until eventually it starts constantly printing output like this:
[pid 143818] 10:02:22.697120 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.707609 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.718125 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.728753 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.739195 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.749618 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.760160 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.770669 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.781137 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.791581 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.802060 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.812540 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.822981 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
[pid 143818] 10:02:22.833468 nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
In htop, without strace attached, CPU is constant about 0.7% for the first one or two cores.