memory leak in inotify_update_watch


syzbot

Jul 6, 2020, 11:42:26 AM
to amir...@gmail.com, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: 7cc2a8ea Merge tag 'block-5.8-2020-07-01' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17644c05100000
kernel config: https://syzkaller.appspot.com/x/.config?x=5ee23b9caef4e07a
dashboard link: https://syzkaller.appspot.com/bug?extid=dec34b033b3479b9ef13
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1478a67b100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+dec34b...@syzkaller.appspotmail.com

BUG: memory leak
unreferenced object 0xffff888115db8480 (size 576):
comm "systemd-udevd", pid 11037, jiffies 4295104591 (age 56.960s)
hex dump (first 32 bytes):
00 04 00 00 00 00 00 00 80 fd e8 15 81 88 ff ff ................
a0 02 dd 20 81 88 ff ff b0 81 d0 09 81 88 ff ff ... ............
backtrace:
[<00000000288c0066>] radix_tree_node_alloc.constprop.0+0xc1/0x140 lib/radix-tree.c:252
[<00000000f80ba6a7>] idr_get_free+0x231/0x3b0 lib/radix-tree.c:1505
[<00000000ec9ab938>] idr_alloc_u32+0x91/0x120 lib/idr.c:46
[<00000000aea98d29>] idr_alloc_cyclic+0x84/0x110 lib/idr.c:125
[<00000000dbad44a4>] inotify_add_to_idr fs/notify/inotify/inotify_user.c:365 [inline]
[<00000000dbad44a4>] inotify_new_watch fs/notify/inotify/inotify_user.c:578 [inline]
[<00000000dbad44a4>] inotify_update_watch+0x1af/0x2d0 fs/notify/inotify/inotify_user.c:617
[<00000000e141890d>] __do_sys_inotify_add_watch fs/notify/inotify/inotify_user.c:755 [inline]
[<00000000e141890d>] __se_sys_inotify_add_watch fs/notify/inotify/inotify_user.c:698 [inline]
[<00000000e141890d>] __x64_sys_inotify_add_watch+0x12f/0x180 fs/notify/inotify/inotify_user.c:698
[<00000000d872d7cc>] do_syscall_64+0x4c/0xe0 arch/x86/entry/common.c:359
[<000000005c62d8da>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff88811fb8c180 (size 192):
comm "systemd-udevd", pid 11486, jiffies 4295108810 (age 14.770s)
hex dump (first 32 bytes):
08 80 00 00 06 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 89 13 1a 81 88 ff ff ................
backtrace:
[<000000009fe0803b>] __d_alloc+0x2a/0x260 fs/dcache.c:1709
[<000000005a828803>] d_alloc+0x21/0xb0 fs/dcache.c:1788
[<00000000e0349988>] __lookup_hash+0x67/0xc0 fs/namei.c:1441
[<00000000907d6c36>] filename_create+0xa5/0x1c0 fs/namei.c:3459
[<0000000025ebf47f>] user_path_create fs/namei.c:3516 [inline]
[<0000000025ebf47f>] do_symlinkat+0x70/0x180 fs/namei.c:3973
[<00000000d872d7cc>] do_syscall_64+0x4c/0xe0 arch/x86/entry/common.c:359
[<000000005c62d8da>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff888107962b00 (size 704):
comm "systemd-udevd", pid 11486, jiffies 4295108810 (age 14.770s)
hex dump (first 32 bytes):
00 00 00 00 01 00 00 00 00 00 20 00 00 00 00 00 .......... .....
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000001bbffdf0>] shmem_alloc_inode+0x18/0x40 mm/shmem.c:3701
[<000000008bdb5db7>] alloc_inode+0x27/0xf0 fs/inode.c:232
[<00000000b322bd08>] new_inode_pseudo fs/inode.c:928 [inline]
[<00000000b322bd08>] new_inode+0x21/0xf0 fs/inode.c:957
[<0000000090aa6bc7>] shmem_get_inode+0x47/0x2b0 mm/shmem.c:2229
[<00000000d46b8299>] shmem_symlink+0x6b/0x290 mm/shmem.c:3080
[<00000000edfa50df>] vfs_symlink fs/namei.c:3953 [inline]
[<00000000edfa50df>] vfs_symlink+0x15a/0x230 fs/namei.c:3939
[<00000000a8f2bfa3>] do_symlinkat+0x14f/0x180 fs/namei.c:3980
[<00000000d872d7cc>] do_syscall_64+0x4c/0xe0 arch/x86/entry/common.c:359
[<000000005c62d8da>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff88811952fa80 (size 56):
comm "systemd-udevd", pid 11486, jiffies 4295108810 (age 14.770s)
hex dump (first 32 bytes):
a8 2c 96 07 81 88 ff ff e0 18 b9 81 ff ff ff ff .,..............
70 2b 96 07 81 88 ff ff 98 fa 52 19 81 88 ff ff p+........R.....
backtrace:
[<00000000369fbe38>] kmem_cache_zalloc include/linux/slab.h:659 [inline]
[<00000000369fbe38>] lsm_inode_alloc security/security.c:588 [inline]
[<00000000369fbe38>] security_inode_alloc+0x2e/0xb0 security/security.c:971
[<000000005b4a8c5f>] inode_init_always+0x10c/0x200 fs/inode.c:171
[<0000000022ebc8f1>] alloc_inode+0x44/0xf0 fs/inode.c:239
[<00000000b322bd08>] new_inode_pseudo fs/inode.c:928 [inline]
[<00000000b322bd08>] new_inode+0x21/0xf0 fs/inode.c:957
[<0000000090aa6bc7>] shmem_get_inode+0x47/0x2b0 mm/shmem.c:2229
[<00000000d46b8299>] shmem_symlink+0x6b/0x290 mm/shmem.c:3080
[<00000000edfa50df>] vfs_symlink fs/namei.c:3953 [inline]
[<00000000edfa50df>] vfs_symlink+0x15a/0x230 fs/namei.c:3939
[<00000000a8f2bfa3>] do_symlinkat+0x14f/0x180 fs/namei.c:3980
[<00000000d872d7cc>] do_syscall_64+0x4c/0xe0 arch/x86/entry/common.c:359
[<000000005c62d8da>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff88811f95dcc0 (size 192):
comm "systemd-udevd", pid 11488, jiffies 4295108822 (age 14.650s)
hex dump (first 32 bytes):
08 80 00 00 06 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 89 13 1a 81 88 ff ff ................
backtrace:
[<000000009fe0803b>] __d_alloc+0x2a/0x260 fs/dcache.c:1709
[<000000005a828803>] d_alloc+0x21/0xb0 fs/dcache.c:1788
[<00000000e0349988>] __lookup_hash+0x67/0xc0 fs/namei.c:1441
[<00000000907d6c36>] filename_create+0xa5/0x1c0 fs/namei.c:3459
[<0000000025ebf47f>] user_path_create fs/namei.c:3516 [inline]
[<0000000025ebf47f>] do_symlinkat+0x70/0x180 fs/namei.c:3973
[<00000000d872d7cc>] do_syscall_64+0x4c/0xe0 arch/x86/entry/common.c:359
[<000000005c62d8da>] entry_SYSCALL_64_after_hwframe+0x44/0xa9



---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Jan Kara

Jul 7, 2020, 11:24:13 AM
to syzbot, amir...@gmail.com, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Catalin Marinas
Hello!
I've been looking into this for a while and I don't think this is related
to inotify at all. Firstly, the reproducer looks totally benign:

prlimit64(0x0, 0xe, &(0x7f0000000280)={0x9, 0x8d}, 0x0)
sched_setattr(0x0, &(0x7f00000000c0)={0x38, 0x2, 0x0, 0x0, 0x9}, 0x0)
vmsplice(0xffffffffffffffff, 0x0, 0x0, 0x0)
perf_event_open(0x0, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
clone(0x20000103, 0x0, 0xfffffffffffffffe, 0x0, 0xffffffffffffffff)
syz_mount_image$vfat(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)

So the process seems to set the SCHED_RR class with priority 9 for itself
(decoded below); the rest of the syscalls appear to be invalid and should
fail. Secondly, the kernel log shows that we hit the OOM killer frequently,
and after one of these kills many leaked objects (among them this radix
tree node from the inotify idr) are reported. I'm not sure whether it could
be the leak detector getting confused (e.g. because it got ENOMEM at some
point) or something else... Catalin, any idea?
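
(For reference, the first two repro calls decode roughly to the C below.
This is a hand-translated sketch, not syzkaller output; the struct
sched_attr copy and the constants are my reading of the syz program and of
the usual uapi values, so treat them as assumptions.)

/*
 * Hand-decoded equivalent of the first two repro calls: RLIMIT_RTPRIO is
 * 0xe and SCHED_RR is 2, so the program raises its own RT-priority limit
 * and switches itself to SCHED_RR with priority 9.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* userspace copy of the kernel's struct sched_attr (no glibc wrapper) */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

int main(void)
{
	/* prlimit64(0x0, 0xe, {0x9, 0x8d}, 0x0) */
	struct rlimit rlim = { .rlim_cur = 0x9, .rlim_max = 0x8d };
	prlimit(0, RLIMIT_RTPRIO, &rlim, NULL);

	/* sched_setattr(0x0, {0x38, 0x2, 0x0, 0x0, 0x9}, 0x0) */
	struct sched_attr attr = {
		.size		= sizeof(attr),	/* 0x38 */
		.sched_policy	= SCHED_RR,	/* 2 */
		.sched_priority	= 9,
	};
	syscall(SYS_sched_setattr, 0, &attr, 0);

	return 0;
}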

Honza
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Catalin Marinas

Jul 7, 2020, 2:17:16 PM
to Jan Kara, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Kmemleak never performs well under heavy load. Normally you'd need to
let the system settle for a bit before checking whether the leaks are
still reported. The issue is that the memory scanning does not stop the
whole machine, so pointers may be hidden in registers on different CPUs
(list insertion/deletion, for example, causes transient kmemleak
confusion).

I think the syzkaller guys tried a year or so ago to run it in parallel
with kmemleak and gave up shortly afterwards. The proposal was to add a
"stopscan" command to kmemleak which would do the scan under
stop_machine(). However, no one got around to implementing it.

So, in this case, does the leak still appear with the reproducer once
the system has gone idle?
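
(One way to do that re-check by hand is via the documented
/sys/kernel/debug/kmemleak interface. This is a minimal sketch, assuming
CONFIG_DEBUG_KMEMLEAK, a mounted debugfs and an arbitrary 60s settle time;
the kmemleak_cmd() helper is made up for the example.)

/*
 * Clear the current kmemleak list, run the reproducer, let the system
 * settle, then trigger a fresh scan and dump whatever is still considered
 * unreferenced.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void kmemleak_cmd(const char *cmd)
{
	int fd = open("/sys/kernel/debug/kmemleak", O_WRONLY);

	if (fd < 0)
		return;
	write(fd, cmd, strlen(cmd));
	close(fd);
}

int main(void)
{
	char buf[4096];
	ssize_t n;
	int fd;

	kmemleak_cmd("clear");	/* drop everything reported so far */
	/* ... run the reproducer here ... */
	sleep(60);		/* let the system go idle */
	kmemleak_cmd("scan");	/* trigger a fresh scan */

	fd = open("/sys/kernel/debug/kmemleak", O_RDONLY);
	if (fd < 0)
		return 1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, n, stdout);
	close(fd);
	return 0;
}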

--
Catalin

Dmitry Vyukov

Jul 8, 2020, 3:17:50 AM
to Catalin Marinas, Jan Kara, syzbot, Amir Goldstein, linux-fsdevel, LKML, syzkaller-bugs, syzkaller
Hi Catalin,

This report came from syzbot, so obviously we did not give up :)

We don't run scanning in parallel with fuzzing; instead we do a very
intricate multi-step dance to overcome false positives:
https://github.com/google/syzkaller/blob/5962a2dc88f6511b77100acdf687c1088f253f6b/executor/common_linux.h#L3407-L3478
and only report leaks that are reproducible.
So far I have not seen any noticeable number of false positives, and
you can see 70 already fixed leaks here:
https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak
https://syzkaller.appspot.com/upstream?manager=ci-upstream-gce-leak

Catalin Marinas

Jul 8, 2020, 7:08:20 AM
to Dmitry Vyukov, Jan Kara, syzbot, Amir Goldstein, linux-fsdevel, LKML, syzkaller-bugs, syzkaller
On Wed, Jul 08, 2020 at 09:17:37AM +0200, Dmitry Vyukov wrote:
> On Tue, Jul 7, 2020 at 8:17 PM Catalin Marinas <catalin...@arm.com> wrote:
> > Kmemleak never performs well under heavy load. Normally you'd need to
> > let the system settle for a bit before checking whether the leaks are
> > still reported. The issue is that the memory scanning does not stop the
> > whole machine, so pointers may be hidden in registers on different CPUs
> > (list insertion/deletion, for example, causes transient kmemleak
> > confusion).
> >
> > I think the syzkaller guys tried a year or so ago to run it in parallel
> > with kmemleak and gave up shortly afterwards. The proposal was to add a
> > "stopscan" command to kmemleak which would do the scan under
> > stop_machine(). However, no one got around to implementing it.
> >
> > So, in this case, does the leak still appear with the reproducer once
> > the system has gone idle?
>
> This report came from syzbot, so obviously we did not give up :)

That's good to know ;).

> We don't run scanning in parallel with fuzzing; instead we do a very
> intricate multi-step dance to overcome false positives:
> https://github.com/google/syzkaller/blob/5962a2dc88f6511b77100acdf687c1088f253f6b/executor/common_linux.h#L3407-L3478
> and only report leaks that are reproducible.
> So far I have not seen any noticeable number of false positives, and
> you can see 70 already fixed leaks here:
> https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak
> https://syzkaller.appspot.com/upstream?manager=ci-upstream-gce-leak

Thanks for the information and the good work here. If you have time, you
could implement the stop_machine() kmemleak scan as well ;).

--
Catalin

Dmitry Vyukov

Jul 8, 2020, 7:15:13 AM
to Catalin Marinas, Jan Kara, syzbot, Amir Goldstein, linux-fsdevel, LKML, syzkaller-bugs, syzkaller
stop_machine() will only help with pointers that are held in registers or
are transiently moving around in memory. But there may be other sources of
false positives, like pointers hidden by hashing, offsets, or reused
low/high bits. Doing several scans and CRC checksums of object contents
helps with these as well, and is orthogonal to stop_machine().
So now I wonder whether using stop_machine() would actually solve all the
problems... because if it doesn't, then doing that work and still having
to do several scans and checksums anyway is kind of pointless...
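
(A made-up illustration of the kind of hidden pointer meant here, not taken
from any real subsystem: kmemleak scans memory for values that point into
tracked objects, so a pointer stored only in a transformed form is invisible
to the scan and the object looks unreferenced no matter how the scan is run.)

#include <linux/kmemleak.h>
#include <linux/types.h>

/* XOR-linked-list style node: neither neighbour pointer is stored as-is */
struct xor_node {
	unsigned long both;	/* (unsigned long)prev ^ (unsigned long)next */
};

static void xor_node_link(struct xor_node *node, void *prev, void *next)
{
	/*
	 * Neither prev nor next appears verbatim anywhere in memory, so a
	 * kmemleak scan cannot find a reference to them; stop_machine()
	 * would not help with this class of false positive.
	 */
	node->both = (unsigned long)prev ^ (unsigned long)next;

	/* the usual annotations to silence the resulting false positives */
	kmemleak_not_leak(prev);
	kmemleak_not_leak(next);
}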

Catalin Marinas

Jul 8, 2020, 8:03:33 AM
to Jan Kara, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Tue, Jul 07, 2020 at 05:24:11PM +0200, Jan Kara wrote:
Just wondering: if this leak is reproducible, could we have some condition
where inotify_remove_from_idr() is not called in the case of a forced exit
triggered by the OOM killer? Also, can the leak be reproduced without
hitting the OOM killer?
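
(For context, the pairing in question, sketched with the generic idr API
rather than the actual inotify code; example_idr, example_add() and
example_remove() are made-up names. An object inserted with
idr_alloc_cyclic() stays referenced by the idr's radix-tree nodes until
idr_remove() runs, so a teardown path that skips the removal would keep
objects like the radix_tree_node in the first report alive.)

#include <linux/idr.h>
#include <linux/gfp.h>

static DEFINE_IDR(example_idr);

/* cyclic id allocation, ids starting at 1, similar to inotify's wd space */
static int example_add(void *obj)
{
	return idr_alloc_cyclic(&example_idr, obj, 1, 0, GFP_KERNEL);
}

/* must run on every teardown path, including a forced exit */
static void example_remove(int id)
{
	idr_remove(&example_idr, id);
}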

--
Catalin

syzbot

Sep 14, 2022, 10:29:26 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity; existing reproducers are no longer triggering the issue.