WARNING in ovl_instantiate

syzbot

Nov 10, 2018, 8:56:03 PM
to linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: 442b8cea2477 Add linux-next specific files for 20181109
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=169a6fbd400000
kernel config: https://syzkaller.appspot.com/x/.config?x=2f72bdb11df9fbe8
dashboard link: https://syzkaller.appspot.com/bug?extid=9c69c282adc4edd2b540
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+9c69c2...@syzkaller.appspotmail.com

WARNING: CPU: 0 PID: 9768 at fs/overlayfs/dir.c:263
ovl_instantiate+0x369/0x400 fs/overlayfs/dir.c:263
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9768 Comm: syz-executor2 Not tainted 4.20.0-rc1-next-20181109+
#110
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x39d lib/dump_stack.c:113
panic+0x2ad/0x55c kernel/panic.c:188
__warn.cold.8+0x20/0x45 kernel/panic.c:540
report_bug+0x254/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:ovl_instantiate+0x369/0x400 fs/overlayfs/dir.c:263
Code: c3 89 c6 e8 69 84 ee fe 85 db 0f 85 9e 00 00 00 e8 4c 83 ee fe 4c 89
e7 45 31 f6 e8 11 18 45 ff e9 ec fe ff ff e8 37 83 ee fe <0f> 0b e9 e0 fe
ff ff e8 2b 83 ee fe 0f 0b e9 63 ff ff ff e8 1f db
RSP: 0018:ffff88018f31f990 EFLAGS: 00010212
RAX: 0000000000040000 RBX: ffff88018f31fa28 RCX: ffffc90013c02000
RDX: 000000000000a369 RSI: ffffffff82912579 RDI: 0000000000000007
RBP: ffff88018f31fa50 R08: ffff8801bb18a000 R09: ffffed0031e63ee5
R10: ffffed0031e63ee5 R11: 0000000000000003 R12: ffff8801cd1e8300
R13: ffff88018f31f9c8 R14: ffffffffffffff8c R15: 0000000000000000
ovl_create_over_whiteout fs/overlayfs/dir.c:518 [inline]
ovl_create_or_link+0xad6/0x1560 fs/overlayfs/dir.c:582
ovl_create_object+0x2e9/0x3a0 fs/overlayfs/dir.c:616
ovl_create+0x2b/0x30 fs/overlayfs/dir.c:630
vfs_create+0x388/0x5b0 fs/namei.c:2912
do_mknodat+0x410/0x530 fs/namei.c:3766
__do_sys_mknod fs/namei.c:3795 [inline]
__se_sys_mknod fs/namei.c:3793 [inline]
__x64_sys_mknod+0x7b/0xb0 fs/namei.c:3793
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2126d09c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000085
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000340
RBP: 000000000072bfa0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2126d0a6d4
R13: 00000000004c2a6e R14: 00000000004d4110 R15: 00000000ffffffff
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.

syzbot

Dec 15, 2018, 2:34:03 PM
to linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
syzbot has found a reproducer for the following crash on:

HEAD commit: d14b746c6c1c Add linux-next specific files for 20181214
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=143f9a15400000
kernel config: https://syzkaller.appspot.com/x/.config?x=1da6d2d18f803140
dashboard link: https://syzkaller.appspot.com/bug?extid=9c69c282adc4edd2b540
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12a6e543400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+9c69c2...@syzkaller.appspotmail.com

overlayfs: filesystem on './file0' not supported as upperdir
overlayfs: filesystem on './file0' not supported as upperdir
overlayfs: filesystem on './file0' not supported as upperdir
overlayfs: filesystem on './file0' not supported as upperdir
overlayfs: filesystem on './file0' not supported as upperdir
WARNING: CPU: 1 PID: 28918 at fs/overlayfs/dir.c:263
ovl_instantiate+0x369/0x400 fs/overlayfs/dir.c:263
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 28918 Comm: syz-executor1 Not tainted 4.20.0-rc6-next-20181214+
#171
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x39d lib/dump_stack.c:113
panic+0x2ad/0x632 kernel/panic.c:214
__warn.cold.8+0x20/0x4f kernel/panic.c:571
report_bug+0x254/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:ovl_instantiate+0x369/0x400 fs/overlayfs/dir.c:263
Code: c3 89 c6 e8 89 35 ed fe 85 db 0f 85 9e 00 00 00 e8 6c 34 ed fe 4c 89
e7 45 31 f6 e8 a1 b1 44 ff e9 ec fe ff ff e8 57 34 ed fe <0f> 0b e9 e0 fe
ff ff e8 4b 34 ed fe 0f 0b e9 63 ff ff ff e8 ef 88
RSP: 0018:ffff8881ca6679a8 EFLAGS: 00010293
RAX: ffff8881d39c44c0 RBX: ffff8881ca667a40 RCX: ffffffff8292cd44
RDX: 0000000000000000 RSI: ffffffff8292cec9 RDI: 0000000000000007
RBP: ffff8881ca667a68 R08: ffff8881d39c44c0 R09: ffffed10394ccee8
R10: ffffed10394ccee8 R11: 0000000000000003 R12: ffff8881a357c8c0
R13: ffff8881ca6679e0 R14: ffffffffffffff8c R15: 0000000000000000
ovl_create_over_whiteout fs/overlayfs/dir.c:518 [inline]
ovl_create_or_link+0xad6/0x1560 fs/overlayfs/dir.c:582
ovl_create_object+0x2e9/0x3a0 fs/overlayfs/dir.c:616
ovl_symlink+0x24/0x30 fs/overlayfs/dir.c:651
vfs_symlink+0x37a/0x5d0 fs/namei.c:4127
do_symlinkat+0x242/0x2d0 fs/namei.c:4154
__do_sys_symlink fs/namei.c:4173 [inline]
__se_sys_symlink fs/namei.c:4171 [inline]
__x64_sys_symlink+0x59/0x80 fs/namei.c:4171
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457659
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fbde1680c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000058
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 0000000000457659
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000020000040
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fbde16816d4
R13: 00000000004c532d R14: 00000000004d97a0 R15: 00000000ffffffff

Amir Goldstein

Dec 16, 2018, 12:00:45 PM
to syzbot+9c69c2...@syzkaller.appspotmail.com, Dmitry Vyukov, linux-kernel, overlayfs, Miklos Szeredi, syzkall...@googlegroups.com
On Sat, Dec 15, 2018 at 9:34 PM syzbot
<syzbot+9c69c2...@syzkaller.appspotmail.com> wrote:
>
> syzbot has found a reproducer for the following crash on:
>
> HEAD commit: d14b746c6c1c Add linux-next specific files for 20181214
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=143f9a15400000
> kernel config: https://syzkaller.appspot.com/x/.config?x=1da6d2d18f803140
> dashboard link: https://syzkaller.appspot.com/bug?extid=9c69c282adc4edd2b540
> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12a6e543400000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+9c69c2...@syzkaller.appspotmail.com
>
> overlayfs: filesystem on './file0' not supported as upperdir
> overlayfs: filesystem on './file0' not supported as upperdir
> overlayfs: filesystem on './file0' not supported as upperdir
> overlayfs: filesystem on './file0' not supported as upperdir
> overlayfs: filesystem on './file0' not supported as upperdir
> WARNING: CPU: 1 PID: 28918 at fs/overlayfs/dir.c:263
> ovl_instantiate+0x369/0x400 fs/overlayfs/dir.c:263

Looks like a corner-case race when the same dir is used as both upper and lower.
It doesn't look like a critical issue; I just don't know how to explain getting
into this state. I couldn't reproduce it on my target machine.

It would have been interesting for me to see the strace of the repro threads
when that WARN happens. I wonder if anyone else has already asked for it, and
how hard it would be to make that information available with the bug report.

Thanks,
Amir.

Dmitry Vyukov

Dec 17, 2018, 5:47:07 AM
to Amir Goldstein, syzbot+9c69c2...@syzkaller.appspotmail.com, linux-kernel, overlayfs, Miklos Szeredi, syzkaller-bugs
Hi Amir,

By strace you mean return values of syscalls, or something else?

We had only 1 strace-related request, and it was related to better
static decoding of inputs rather than dynamic behavior:
https://github.com/google/syzkaller/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+strace

I don't immediately see how to capture runtime behavior. It would work
if we dumped everything onto the console right away. But this would produce
tons of output (really lots), and that output would be intermixed
across parallel processes. It would be hard to tell exactly which
syscalls participated in the process that provoked the crash,
or whether it was syscalls from several processes that interacted. Lots
of output can also slow down and perturb execution.
Capturing return values in memory and then printing them for the crashed
program is also problematic. First, once the kernel has crashed we
can't really do anything (such as print return values). Second, it's
impossible to detect exactly which process triggered a kernel crash.
Crashes can happen in interrupts executed on behalf of some process, in
kernel threads, and again through interactions between several processes.

But meanwhile I was able to reproduce this on the first run within 4
minutes. Maybe you need to wait longer, it does not happen
immediately. Once you can reproduce it, you can do any amount of
custom instrumentation and printing for debugging.

root@syzkaller:~# ./syz-execprog -repeat=0 -procs=6 ovl
2018/12/17 10:20:24 parsed 1 programs
[ 37.279031] overlayfs: filesystem on './file0' not supported as upperdir
[ 37.280611] overlayfs: filesystem on './file0' not supported as upperdir
[ 37.292983] overlayfs: filesystem on './file0' not supported as upperdir
[ 37.321756] overlayfs: filesystem on './file0' not supported as upperdir
[ 37.350428] overlayfs: filesystem on './file0' not supported as upperdir
[ 37.422091] overlayfs: filesystem on './file0' not supported as upperdir
...
[ 256.420482] overlayfs: filesystem on './file0' not supported as upperdir
[ 256.426182] overlayfs: filesystem on './file0' not supported as upperdir
[ 256.508980] overlayfs: filesystem on './file0' not supported as upperdir
[ 256.515183] WARNING: CPU: 1 PID: 28156 at fs/overlayfs/dir.c:263
ovl_instantiate+0x369/0x400
[ 256.516468] Kernel panic - not syncing: panic_on_warn set ...
[ 256.517312] CPU: 1 PID: 28156 Comm: syz-executor Not tainted
4.20.0-rc6-next-20181214 #5
[ 256.518455] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1 04/01/2014
[ 256.519624] Call Trace:
[ 256.519995] dump_stack+0x244/0x39d
[ 256.520518] ? dump_stack_print_info.cold.1+0x20/0x20

[ 256.520635] kobject: 'loop4' (00000000e9855b9a):
kobject_uevent_env
[ 256.521244] panic+0x2ad/0x632
[ 256.522732] kobject: 'loop4' (00000000e9855b9a):
fill_kobj_path: path = '/devices/virtual/block/loop4'
[ 256.523137] ? add_taint.cold.5+0x16/0x16
[ 256.525766] ? __warn.cold.8+0x5/0x4f
[ 256.526511] ? __warn+0xe8/0x1d0
[ 256.527227] ? ovl_instantiate+0x369/0x400
[ 256.528046] __warn.cold.8+0x20/0x4f
[ 256.528727] ? rcu_softirq_qs+0x20/0x20
[ 256.529285] ? ovl_instantiate+0x369/0x400
[ 256.530050] report_bug+0x254/0x2d0
[ 256.530716] do_error_trap+0x11b/0x200
[ 256.531371] do_invalid_op+0x36/0x40
[ 256.531897] ? ovl_instantiate+0x369/0x400
[ 256.532496] invalid_op+0x14/0x20
[ 256.532981] RIP: 0010:ovl_instantiate+0x369/0x400
[ 256.533675] Code: c3 89 c6 e8 89 35 ed fe 85 db 0f 85 9e 00 00 00
e8 6c 34 ed fe 4c 89 e7 45 31 f6 e8 a1 b1 44 ff e9 ec fe ff ff e8 57
34 ed fe <0f> 0b e9 e0 fe ff ff e8 4b 34 ed fe 0f 0b e9 63 ff ff ff e8
ef 88
[ 256.536308] RSP: 0018:ffff888050ecf9a8 EFLAGS: 00010293
[ 256.537056] RAX: ffff888051030380 RBX: ffff888050ecfa40 RCX: ffffffff8292cd44
[ 256.538109] RDX: 0000000000000000 RSI: ffffffff8292cec9 RDI: 0000000000000007
[ 256.539116] RBP: ffff888050ecfa68 R08: ffff888051030380 R09: ffffed100a1d9ee8
[ 256.540115] R10: ffffed100a1d9ee8 R11: 0000000000000003 R12: ffff88804d0884a0
[ 256.541108] R13: ffff888050ecf9e0 R14: ffffffffffffff8c R15: 0000000000000000
[ 256.542126] ? ovl_instantiate+0x1e4/0x400
[ 256.542714] ? ovl_instantiate+0x369/0x400
[ 256.543304] ? ovl_instantiate+0x369/0x400
[ 256.543895] ? ovl_set_opaque_xerr+0x80/0x80
[ 256.544512] ovl_create_or_link+0xad6/0x1560
[ 256.545144] ? ovl_unlink+0x20/0x20
[ 256.545659] ? ovl_create_object+0x22f/0x3a0
[ 256.546270] ? lock_downgrade+0x900/0x900
[ 256.546852] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[ 256.547602] ? kasan_check_read+0x11/0x20
[ 256.548163] ? do_raw_spin_unlock+0xa7/0x330
[ 256.548781] ? do_raw_spin_trylock+0x270/0x270
[ 256.549423] ? ovl_fill_inode+0x32a/0x6f0
[ 256.550002] ? debug_lockdep_rcu_enabled+0x77/0x90
[ 256.550688] ovl_create_object+0x2e9/0x3a0
[ 256.551278] ? ovl_create_or_link+0x1560/0x1560
[ 256.551926] ? security_inode_permission+0xd2/0x100
[ 256.552636] ovl_symlink+0x24/0x30
[ 256.553132] vfs_symlink+0x37a/0x5d0
[ 256.553659] do_symlinkat+0x242/0x2d0
[ 256.554194] ? __ia32_sys_unlink+0x50/0x50
[ 256.554795] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 256.555543] ? trace_hardirqs_off_caller+0x310/0x310
[ 256.556254] __x64_sys_symlink+0x59/0x80
[ 256.556823] do_syscall_64+0x1b9/0x820
[ 256.557375] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
[ 256.558123] ? syscall_return_slowpath+0x5e0/0x5e0
[ 256.558807] ? trace_hardirqs_on_caller+0x310/0x310
[ 256.559504] ? prepare_exit_to_usermode+0x3b0/0x3b0
[ 256.559964] overlayfs: filesystem on './file0' not supported as upperdir
[ 256.560200] ? post_copy_siginfo_from_user.isra.25.part.26+0x250/0x250
[ 256.560213] ? __switch_to_asm+0x40/0x70
[ 256.562831] ? __switch_to_asm+0x34/0x70
[ 256.563390] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 256.564048] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 256.564743] RIP: 0033:0x4570d9
[ 256.565208] Code: 5d af fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66
90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b af fb ff c3 66 2e 0f 1f 84 00 00
00 00
[ 256.567790] RSP: 002b:00007fca039f7c88 EFLAGS: 00000246 ORIG_RAX:
0000000000000058
[ 256.568852] RAX: ffffffffffffffda RBX: 000000000071bf00 RCX: 00000000004570d9
[ 256.569904] RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000020000040
[ 256.570933] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
[ 256.571976] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fca039f86d4
[ 256.572993] R13: 00000000004ac640 R14: 00000000006ec840 R15: 00000000ffffffff
[ 256.574406] Kernel Offset: disabled
[ 256.574982] Rebooting in 86400 seconds..

Amir Goldstein

Dec 17, 2018, 8:30:27 AM
to Dmitry Vyukov, syzbot+9c69c2...@syzkaller.appspotmail.com, linux-kernel, overlayfs, Miklos Szeredi, syzkall...@googlegroups.com
I do mean return values.
Some of the commands in the repro are obviously going to fail, and
some will fail conditionally depending on who wins the race.
For analysis of the bug, it would have been good to know which
syscall sequence took place when the race happened.

> We had only 1 strace-related request, and it was related to better
> static decoding of inputs rather than dynamic behavior:
> https://github.com/google/syzkaller/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+strace
>
> I don't immediately see how to capture runtime behavior. It would work
> if we dumped everything onto the console right away. But this would produce
> tons of output (really lots), and that output would be intermixed
> across parallel processes. It would be hard to tell exactly which
> syscalls participated in the process that provoked the crash,
> or whether it was syscalls from several processes that interacted. Lots
> of output can also slow down and perturb execution.

Yeah, I figured. Maybe the return values of syscalls are something that syzkaller
should cache and, in case of failure, report recent run sequences in a format
similar to the repro program. Just a thought. Much easier said than done.
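
To make the caching idea concrete, here is a minimal sketch of a fixed-size
history of recent syscall results. It is purely illustrative and is not
syzkaller code: the names SyscallResult and RecentCalls, the record layout,
and the output format are all assumptions. It only covers the in-memory part
and does not address getting the data off a machine whose kernel has crashed.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SyscallResult:
    """One executed syscall (hypothetical record layout)."""
    proc: int    # index of the executor process that issued the call
    call: str    # the call, written in a repro-like text form
    ret: int     # return value, -1 on failure
    errno: int   # errno when ret == -1, otherwise 0


class RecentCalls:
    """Keeps only the most recent `capacity` results; older ones are dropped."""

    def __init__(self, capacity: int = 8) -> None:
        self._ring = deque(maxlen=capacity)

    def record(self, result: SyscallResult) -> None:
        self._ring.append(result)

    def dump(self) -> List[str]:
        """Formats the history oldest-first, one line per call."""
        lines = []
        for r in self._ring:
            suffix = f" (errno {r.errno})" if r.ret == -1 else ""
            lines.append(f"[proc {r.proc}] {r.call} = {r.ret}{suffix}")
        return lines


history = RecentCalls(capacity=2)
history.record(SyscallResult(0, "mkdir('./file0')", 0, 0))
history.record(SyscallResult(1, "mount(overlay)", -1, 22))
history.record(SyscallResult(0, "symlink('a', 'b')", -1, 5))
print("\n".join(history.dump()))
```

The bounded deque keeps memory use constant no matter how long the program
repeats, at the cost of losing anything older than the last few calls.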

>
> But meanwhile I was able to reproduce this on the first run within 4
> minutes. Maybe you need to wait longer, it does not happen
> immediately.

Oh! I wonder if this type of information, how long or how many repeats before
crash happens is available in the bug report and I missed it - if not, could be
useful to add it.

Anyway, the reason that WARN_ON is there is that I wasn't sure
if that could happen. Apparently it can with this weird setup. Once I am able to
understand how it happens, most likely the result will be to convert the WARN_ON
to pr_warn. The user gets an error anyway, so there is probably nothing to
worry about (famous last words).

Thanks,
Amir.

Dmitry Vyukov

Dec 18, 2018, 9:13:08 AM
to Amir Goldstein, syzbot+9c69c2...@syzkaller.appspotmail.com, linux-kernel, overlayfs, Miklos Szeredi, syzkaller-bugs
Yes, not so easy to do. This info needs to be piped through several
processes and then sent off the machine. But then again we won't know
exactly which execution provoked the crash. Reporting just any
execution can work for simpler cases, but in these cases this info is not
so useful because all executions are the same and a developer can
probably predict the outcome of syscalls, or re-reproduce locally and
strace. And it won't work for harder cases like this, where 1/10000
executions trigger the crash. In such cases reporting wrong info
may be worse than not reporting it at all.
Part of the idea is to provide enough info for a developer to
reproduce the crash locally, and then they can dump any kind of info,
add additional checks, debugging output, etc. We can't cover the debugging
problem in all its generality; there is no common recipe. And for
simpler cases a developer frequently does not need anything other than
the kernel crash message.


> > But meanwhile I was able to reproduce this on the first run within 4
> > minutes. Maybe you need to wait longer, it does not happen
> > immediately.
>
> Oh! I wonder if this type of information, how long or how many repeats before
> crash happens is available in the bug report and I missed it - if not, could be
> useful to add it.

There are some implicit signals like the log saying that it took 7 minutes:

2018/12/15 18:45:07 executed programs: 0
[ 1266.911390] IPVS: ftp: loaded support on port[0] = 21
...
[ 1665.610555] overlayfs: filesystem on './file0' not supported as upperdir
[ 1665.617831] WARNING: CPU: 1 PID: 28918 at fs/overlayfs/dir.c:263
ovl_instantiate+0x369/0x400

or mentions of any of these in the repro (which are all present in this repro):

"threaded":true,"collide":true,"repeat":true,"procs":6

But I filed https://github.com/google/syzkaller/issues/885 for a more
verbal signal.
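
The two implicit signals above can be read mechanically. The sketch below is
only an illustration with assumed helper names (seconds_until_warning,
racy_options); it treats the first timestamped console line as the start of
the run, and it assumes the option fragment quoted above is part of a JSON
object, so it wraps it in braces before parsing.

```python
import json
import re
from typing import Dict, Iterable, Optional

# Matches the "[ 1266.911390]" prefix on kernel console lines.
_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def seconds_until_warning(lines: Iterable[str]) -> Optional[float]:
    """Seconds between the first timestamped line and the first WARNING."""
    first = None
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match is None:
            continue  # e.g. the "executed programs" line has no kernel stamp
        stamp = float(match.group(1))
        if first is None:
            first = stamp
        if "WARNING:" in line:
            return stamp - first
    return None


def racy_options(options_json: str) -> Dict[str, object]:
    """Returns the repro options that hint at a hard-to-hit crash."""
    options = json.loads(options_json)
    found = {}
    for name in ("threaded", "collide", "repeat"):
        if options.get(name) is True:
            found[name] = True
    if options.get("procs", 1) > 1:
        found["procs"] = options["procs"]
    return found


LOG = [
    "2018/12/15 18:45:07 executed programs: 0",
    "[ 1266.911390] IPVS: ftp: loaded support on port[0] = 21",
    "[ 1665.610555] overlayfs: filesystem on './file0' not supported as upperdir",
    "[ 1665.617831] WARNING: CPU: 1 PID: 28918 at fs/overlayfs/dir.c:263",
]
elapsed = seconds_until_warning(LOG)
print(f"about {elapsed / 60:.1f} minutes until the WARNING")
print(racy_options('{"threaded":true,"collide":true,"repeat":true,"procs":6}'))
```

On the excerpt above the difference is roughly 399 seconds, which is in line
with the "7 minutes" estimate.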

syzbot

Mar 26, 2019, 8:10:01 AM
to amir...@gmail.com, dvy...@google.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, msze...@redhat.com, syzkall...@googlegroups.com
syzbot has bisected this bug to:

commit 01b39dcc95680b04c7af5de7f39f577e9c4865e3
Author: Amir Goldstein <amir...@gmail.com>
Date: Fri May 11 08:15:15 2018 +0000

ovl: use inode_insert5() to hash a newly created inode

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=176da0cd200000
start commit: de6629eb Merge tag 'pci-v5.0-fixes-1' of git://git.kernel...
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=14eda0cd200000
console output: https://syzkaller.appspot.com/x/log.txt?x=10eda0cd200000
kernel config: https://syzkaller.appspot.com/x/.config?x=edf1c3031097c304
dashboard link: https://syzkaller.appspot.com/bug?extid=9c69c282adc4edd2b540
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000

Reported-by: syzbot+9c69c2...@syzkaller.appspotmail.com
Fixes: 01b39dcc9568 ("ovl: use inode_insert5() to hash a newly created
inode")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Amir Goldstein

Apr 19, 2019, 4:21:15 AM
to syzbot, Dmitry Vyukov, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi, syzkaller-bugs
Dmitry,

The root cause of this bug is that the repro is mounting overlapping overlay
layers (i.e. upperdir=./file0,lowerdir=.:file0).
Miklos claimed that the fix should be to fail such mounts.
Below is a patch to test:

#syz test: https://github.com/amir73il/linux.git ovl-check-overlap
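
To show what "overlapping" means for the mount options above, here is a
user-space sketch that compares resolved paths. It is not the kernel patch and
makes no claim about how the patch detects overlap; the function names
(parse_mount_options, layers_overlap) are made up for this illustration, and
two layers are considered overlapping when they are the same directory or one
contains the other.

```python
import os
from typing import List, Tuple


def parse_mount_options(opts: str) -> Tuple[str, List[str]]:
    """Splits 'upperdir=A,lowerdir=B:C' into ('A', ['B', 'C'])."""
    fields = dict(item.split("=", 1) for item in opts.split(",") if "=" in item)
    return fields["upperdir"], fields["lowerdir"].split(":")


def _is_within(path: str, ancestor: str) -> bool:
    """True if `path` equals `ancestor` or lies somewhere below it."""
    path = os.path.realpath(path)
    ancestor = os.path.realpath(ancestor)
    return os.path.commonpath([path, ancestor]) == ancestor


def layers_overlap(upperdir: str, lowerdirs: List[str]) -> bool:
    """True if any two layers are the same dir or nested in one another."""
    layers = [upperdir, *lowerdirs]
    for i, first in enumerate(layers):
        for second in layers[i + 1:]:
            if _is_within(first, second) or _is_within(second, first):
                return True
    return False


upper, lowers = parse_mount_options("upperdir=./file0,lowerdir=.:file0")
print(layers_overlap(upper, lowers))
```

For the repro's options the check reports an overlap, because ./file0 lies
inside "." and is also listed a second time as a lower layer.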

However, I see that this specific overlapping layers mount has already
mutated to several other repros out there, like the ones in this bug:

https://syzkaller.appspot.com/bug?extid=a55ccfc8a853d3cff213

I believe that disallowing overlapping layers will silence some
bugs, whose root cause may be different.

Besides doing the overlapping layers mount, this repro family also
does extensive access to overlay underlying layers concurrently
with overlay access and *that* is the root cause for most of these
"possible deadlock" bugs (some false positives and some real).

Assuming that ovl-check-overlap gets merged, you may need to
hint syzkaller that overlapping layers are no longer a valid input, or
maybe it will figure it out on its own?...

Thanks,
Amir.

syzbot

Apr 19, 2019, 4:52:02 AM
to amir...@gmail.com, dvy...@google.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, msze...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by:
syzbot+9c69c2...@syzkaller.appspotmail.com

Tested on:

commit: 59ef64de ovl: detect overlapping layers
git tree: https://github.com/amir73il/linux.git ovl-check-overlap
kernel config: https://syzkaller.appspot.com/x/.config?x=f3ee46e9dfc8a383
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.

syzbot

Apr 19, 2019, 4:53:01 AM
to amir...@gmail.com, dvy...@google.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, msze...@redhat.com, syzkall...@googlegroups.com

Dmitry Vyukov

Apr 22, 2019, 6:12:16 AM
to Amir Goldstein, syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi, syzkaller-bugs
Hi Amir,

It should figure it out on its own; it's a coverage-guided fuzzer. And
unlearning things is easier than learning them :) But thanks for
thinking about this.
But maybe there is something else important in overlayfs that's not
covered. Here you can see the current coverage of overlayfs:
https://storage.googleapis.com/syzkaller/cover/ci-upstream-linux-next-kasan-gce-root.html#e2f448f0ca2e4397fd609ff8c42d4cd118411148
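
As a toy illustration of why a coverage-guided fuzzer can "unlearn" an input
class on its own: an input is kept only if it reaches code not seen before, so
once every overlapping mount ends in the same rejection path, further variants
of it add nothing. Everything below is a simplified assumption (toy_mount and
its block names are invented); the real syzkaller logic is far more elaborate.

```python
from typing import Callable, FrozenSet, Iterable, List, Set


def grow_corpus(inputs: Iterable[str],
                run: Callable[[str], FrozenSet[str]]) -> List[str]:
    """Keeps only the inputs that reach at least one not-yet-seen block."""
    seen: Set[str] = set()
    corpus: List[str] = []
    for candidate in inputs:
        coverage = run(candidate)
        if not coverage <= seen:   # something new was reached
            corpus.append(candidate)
            seen |= coverage
    return corpus


def toy_mount(opts: str) -> FrozenSet[str]:
    """Hypothetical target: returns the names of the blocks an input reached."""
    coverage = {"parse_options"}
    if "overlap" in opts:
        # After the fix, every overlapping mount stops at the same place.
        coverage.add("reject_overlap")
        return frozenset(coverage)
    coverage.add("mount_ok")
    if "redirect_dir" in opts:
        coverage.add("redirect_dir")
    return frozenset(coverage)


candidates = ["overlap-a", "overlap-b", "plain", "overlap-c", "plain,redirect_dir"]
print(grow_corpus(candidates, toy_mount))
```

Only the first overlapping input survives in the corpus; the later ones are
dropped because they reach nothing new.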

Amir Goldstein

Apr 22, 2019, 7:08:15 AM
to Dmitry Vyukov, syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi, syzkaller-bugs
That's nice, but the actual possible deadlocks that syzbot has currently
unveiled are not strictly by covering overlayfs code but rather by covering
VFS code that is *also* used by overlayfs.

See this thread for example:
https://lore.kernel.org/lkml/CAJfpegvt6eVhX8v5faMP76K0...@mail.gmail.com/

Documentation/filesystems/overlayfs.txt says:
"Changes to the underlying filesystems while part of a mounted overlay
filesystem are not allowed. If the underlying filesystem is changed,
the behavior of the overlay is undefined, though it will not result in
a crash or deadlock."

The part of "will not result in crash or deadlock" is only proven
empirically, so long as syzbot is not reproducing a crash or deadlock...

Thanks,
Amir.

Dmitry Vyukov

Apr 22, 2019, 7:59:10 AM
to Amir Goldstein, syzbot, linux-kernel, overlayfs, Miklos Szeredi, Miklos Szeredi, syzkaller-bugs
I see.
Still we generally only teach it interfaces, and then let it loose
combining them and building sequences of syscalls and figuring out
what's interesting and what's not.