I think either kmemleak or syzbot is misreporting this. I've added a
bunch of printks around all allocations performed by BPF ringbuf. When
I run the repro, I see this:
[ 26.013500] ALLOC rb_map ffff888118d7d000
[ 26.013946] ALLOC KMALLOC AREA ffff88810d538c00
[ 26.014439] ALLOC PAGES ffff88810d538c00
[ 26.014826] ALLOC PAGE[0] ffffea000419af00
[ 26.015272] ALLOC PAGE[1] ffffea000419aec0
[ 26.015686] ALLOC PAGE[2] ffffea000419ae80
[ 26.016090] ALLOC PAGE[3] ffffea00042e29c0
[ 26.016513] ALLOC PAGE[4] ffffea00042a1000
[ 26.016928] VMAP rb ffffc90000539000
[ 26.017291] ALLOC rb_map->rb ffffc90000539000
[ 26.017712] FINISHED ALLOC BPF_MAP ffff888118d7d000
[ 32.105069] ALLOC rb_map ffff888118d7d200
[ 32.105568] ALLOC KMALLOC AREA ffff88810d538c80
[ 32.106005] ALLOC PAGES ffff88810d538c80
[ 32.106407] ALLOC PAGE[0] ffffea000419aa80
[ 32.106805] ALLOC PAGE[1] ffffea000419ab00
[ 32.107206] ALLOC PAGE[2] ffffea000419abc0
[ 32.107607] ALLOC PAGE[3] ffffea0004284480
[ 32.108003] ALLOC PAGE[4] ffffea0004284440
[ 32.108419] VMAP rb ffffc900005ad000
[ 32.108765] ALLOC rb_map->rb ffffc900005ad000
[ 32.109186] FINISHED ALLOC BPF_MAP ffff888118d7d200
[ 33.592874] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 40.526922] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
On the repro side, I get these two warnings:
[vmuser@archvm bpf]$ sudo ./repro
BUG: memory leak
unreferenced object 0xffff88810d538c00 (size 64):
comm "repro", pid 2140, jiffies 4294692933 (age 14.540s)
hex dump (first 32 bytes):
00 af 19 04 00 ea ff ff c0 ae 19 04 00 ea ff ff ................
80 ae 19 04 00 ea ff ff c0 29 2e 04 00 ea ff ff .........)......
backtrace:
[<0000000077bfbfbd>] __bpf_map_area_alloc+0x31/0xc0
[<00000000587fa522>] ringbuf_map_alloc.cold.4+0x48/0x218
[<0000000044d49e96>] __do_sys_bpf+0x359/0x1d90
[<00000000f601d565>] do_syscall_64+0x2d/0x40
[<0000000043d3112a>] entry_SYSCALL_64_after_hwframe+0x44/0xae
BUG: memory leak
unreferenced object 0xffff88810d538c80 (size 64):
comm "repro", pid 2143, jiffies 4294699025 (age 8.448s)
hex dump (first 32 bytes):
80 aa 19 04 00 ea ff ff 00 ab 19 04 00 ea ff ff ................
c0 ab 19 04 00 ea ff ff 80 44 28 04 00 ea ff ff .........D(.....
backtrace:
[<0000000077bfbfbd>] __bpf_map_area_alloc+0x31/0xc0
[<00000000587fa522>] ringbuf_map_alloc.cold.4+0x48/0x218
[<0000000044d49e96>] __do_sys_bpf+0x359/0x1d90
[<00000000f601d565>] do_syscall_64+0x2d/0x40
[<0000000043d3112a>] entry_SYSCALL_64_after_hwframe+0x44/0xae
Note that both reported leaks (ffff88810d538c80 and ffff88810d538c00)
correspond to the pages arrays (the ALLOC PAGES lines above) that
bpf_ringbuf allocates and tracks properly internally.
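As a sanity check that the "leaked" object really is the pages array,
the hex dump in the first report can be decoded as little-endian 64-bit
words and compared against the ALLOC PAGE[n] addresses printed earlier.
A minimal sketch, assuming a little-endian x86-64 layout; decode_le64()
and leak_dump are names I made up for illustration, not anything from
the kernel:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode n little-endian 64-bit words from raw bytes. */
static void decode_le64(const uint8_t *bytes, size_t n, uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t v = 0;

		/* The most significant byte is stored last, so walk backwards. */
		for (int b = 7; b >= 0; b--)
			v = (v << 8) | bytes[i * 8 + b];
		out[i] = v;
	}
}

/* First 32 bytes of object ffff88810d538c00, copied from the kmemleak report. */
static const uint8_t leak_dump[32] = {
	0x00, 0xaf, 0x19, 0x04, 0x00, 0xea, 0xff, 0xff,
	0xc0, 0xae, 0x19, 0x04, 0x00, 0xea, 0xff, 0xff,
	0x80, 0xae, 0x19, 0x04, 0x00, 0xea, 0xff, 0xff,
	0xc0, 0x29, 0x2e, 0x04, 0x00, 0xea, 0xff, 0xff,
};
```

Calling decode_le64(leak_dump, 4, out) gives ffffea000419af00,
ffffea000419aec0, ffffea000419ae80 and ffffea00042e29c0, which are
exactly PAGE[0] through PAGE[3] of the first ringbuf.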
Note also that the syzbot repro doesn't close the FDs of the BPF
ringbufs it creates, and even after ./repro itself exits with an error,
two forked processes are still hanging around on my system. So the
ringbuf maps are clearly still alive at that point, and reporting a
memory leak looks wrong, because that memory is in use by an active,
referenced BPF ringbuf.
It's also unclear why the repro doesn't clean up its forks. But if I do
`pkill repro`, I see that all the allocated memory is properly
cleaned up:
[ 84.039790] MAP RELEASE MAP ffff888118d7d000
[ 84.039980] MAP RELEASE MAP ffff888118d7d200
[ 84.040421] MAP ffff888118d7d000 PUT USERCNT 0
[ 84.040849] MAP ffff888118d7d200 PUT USERCNT 0
[ 84.040854] MAP ffff888118d7d200 PUT REFCNT 0
[ 84.041485] MAP ffff888118d7d000 PUT REFCNT 0
[ 84.041513] MAP FREE DEFERRED MAP ffff888118d7d000
[ 84.041921] MAP FREE DEFERRED MAP ffff888118d7d200
[ 84.042530] VUNMAP rb ffffc90000539000
[ 84.043127] VUNMAP rb ffffc900005ad000
[ 84.043802] DEALLOC page[0] ffffea000419af00
[ 84.044258] DEALLOC page[0] ffffea000419aa80
[ 84.044814] DEALLOC page[1] ffffea000419aec0
[ 84.045180] DEALLOC page[1] ffffea000419ab00
[ 84.045772] DEALLOC page[2] ffffea000419ae80
[ 84.046188] DEALLOC page[2] ffffea000419abc0
[ 84.046817] DEALLOC page[3] ffffea00042e29c0
[ 84.047245] DEALLOC page[3] ffffea0004284480
[ 84.047895] DEALLOC page[4] ffffea00042a1000
[ 84.048371] DEALLOC page[4] ffffea0004284440
[ 84.048373] DEALLOC pages ffff88810d538c80
[ 84.048375] DEALLOC rb_map ffff888118d7d200
[ 84.052392] DEALLOC pages ffff88810d538c00
[ 84.053015] DEALLOC rb_map ffff888118d7d000
Note that the "leaks" are deallocated properly:
[ 84.048373] DEALLOC pages ffff88810d538c80
[ 84.052392] DEALLOC pages ffff88810d538c00
BTW, if I add a close() right after the bpf() syscall in the syzbot
repro, I see that everything is deallocated immediately, as designed,
and no memory leak is reported.
So I don't think the problem is anywhere in the bpf_ringbuf code, but
rather in the leak detection and/or the repro itself. Any suggestions
on how to silence or fix these reports?