Locking (`__GI___pthread_mutex_unlock`) takes most of the execution time when using cgo with large number of CPUs

Bảo Phan Quốc

unread,

Jun 20, 2023, 8:32:25 AM6/20/23

to golang-nuts

I'm using cgo to call a C function from Go. Inside the C function there is a callback to a Go function. In other way, I'm calling Go -> C -> Go.

After running pprof, I noticed that the __GI___pthread_mutex_unlock took half of the execution time. AFAIK, cgo has an overhead, especially calling back from C to Go. But it's weird that cgo takes half of the execution time to do some locking. Is there something wrong in my code?

Here is the gist containing the code https://gist.github.com/phqb/2fa5bc76b77208d7f36b151b405e6dcc.

pprof output:

```

(pprof) top
Showing nodes accounting for 136.20s, 89.12% of 152.82s total
Dropped 252 nodes (cum <= 0.76s)
Showing top 10 nodes out of 64
flat flat% sum% cum cum%
68.20s 44.63% 44.63% 68.20s 44.63% __GI___pthread_mutex_unlock
54.32s 35.55% 80.17% 54.32s 35.55% __lll_lock_wait
3.57s 2.34% 82.51% 57.89s 37.88% __pthread_mutex_lock
2.38s 1.56% 84.07% 3.15s 2.06% runtime.casgstatus
1.81s 1.18% 85.25% 1.81s 1.18% runtime.futex
1.26s 0.82% 86.08% 3.99s 2.61% runtime.mallocgc
1.21s 0.79% 86.87% 2.54s 1.66% runtime.lock2
1.21s 0.79% 87.66% 1.48s 0.97% runtime.reentersyscall
1.15s 0.75% 88.41% 1.15s 0.75% runtime.procyield
1.09s 0.71% 89.12% 1.59s 1.04% runtime.exitsyscallfast

```

Running environment:

* Golang version: go version go1.20.5 linux/amd64

* `lscpu`:

```

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU @ 2.20GHz
Stepping: 0
CPU MHz: 2200.152
BogoMIPS: 4400.30
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 4 MiB
L3 cache: 55 MiB
NUMA node0 CPU(s): 0-31
...
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtop
ology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowpre
fetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities

```

Ian Lance Taylor

unread,

Jun 20, 2023, 10:04:43 AM6/20/23

to Bảo Phan Quốc, golang-nuts

On Tue, Jun 20, 2023 at 5:32 AM Bảo Phan Quốc <phq...@gmail.com> wrote:
>
> I'm using cgo to call a C function from Go. Inside the C function there is a callback to a Go function. In other way, I'm calling Go -> C -> Go.
>
> After running pprof, I noticed that the __GI___pthread_mutex_unlock took half of the execution time. AFAIK, cgo has an overhead, especially calling back from C to Go. But it's weird that cgo takes half of the execution time to do some locking. Is there something wrong in my code?

Every call from C to Go does acquire a mutex to check that the Go
runtime has been fully initialized. This is normally not a big deal
as the mutex is held only briefly. But your code does a lot of
parallel calls from C to Go. It is possible that the heavy contention
on the mutex is causing the results that you are seeing.

The code in question is the C function _cgo_wait_runtime_init_done in
runtime/gco/gcc_libinit.c. It might be possible to speed it up in
some way to avoid the mutex in the normal case.

Ian

robert engels

unread,

Jun 20, 2023, 6:36:46 PM6/20/23

to Ian Lance Taylor, Bảo Phan Quốc, golang-nuts

Since this is an initialization guard - can’t it simply be double checked with an atomic flag?

> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAOyqgcU7Csb_hf5gvy8dVHKEHd8GFkN-fUML18uW0HiQGOAXyg%40mail.gmail.com.

Ian Lance Taylor

unread,

Jun 20, 2023, 6:38:56 PM6/20/23

to robert engels, Bảo Phan Quốc, golang-nuts

On Tue, Jun 20, 2023 at 3:36 PM robert engels <ren...@ix.netcom.com> wrote:
>
> Since this is an initialization guard - can’t it simply be double checked with an atomic flag?

That would be a good first thing to try.

Ian

Reply all

Reply to author

Forward