KASAN: stack-out-of-bounds Write in compat_copy_entries

syzbot

Apr 25, 2018, 1:19:03 AM
to bri...@lists.linux-foundation.org, core...@netfilter.org, da...@davemloft.net, f...@strlen.de, kad...@blackhole.kfki.hu, linux-...@vger.kernel.org, net...@vger.kernel.org, netfilt...@vger.kernel.org, pa...@netfilter.org, ste...@networkplumber.org, syzkall...@googlegroups.com
Hello,

syzbot hit the following crash on upstream commit
24cac7009cb1b211f1c793ecb6a462c03dc35818 (Tue Apr 24 21:16:40 2018 +0000)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=4e42a04e0bc33cb6c087

So far this crash happened 3 times on upstream.
syzkaller reproducer:
https://syzkaller.appspot.com/x/repro.syz?id=4827027970457600
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=6212733133389824
Kernel config:
https://syzkaller.appspot.com/x/.config?id=7043958930931867332
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
user-space arch: i386

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4e42a0...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
IPVS: ftp: loaded support on port[0] = 21
==================================================================
BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300
[inline]
BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user
net/bridge/netfilter/ebtables.c:1957 [inline]
BUG: KASAN: stack-out-of-bounds in ebt_size_mwt
net/bridge/netfilter/ebtables.c:2059 [inline]
BUG: KASAN: stack-out-of-bounds in size_entry_mwt
net/bridge/netfilter/ebtables.c:2155 [inline]
BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0
net/bridge/netfilter/ebtables.c:2194
Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504

CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
memcpy+0x37/0x50 mm/kasan/kasan.c:303
strlcpy include/linux/string.h:300 [inline]
compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline]
ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline]
size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline]
compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194
compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285
compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367
compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline]
compat_nf_setsockopt+0x9b/0x140 net/netfilter/nf_sockopt.c:156
compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279
inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041
compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901
compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050
__compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403
__do_compat_sys_setsockopt net/compat.c:416 [inline]
__se_compat_sys_setsockopt net/compat.c:413 [inline]
__ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413
do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fb3cb9
RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

The buggy address belongs to the page:
page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2fffc0000000000()
raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8801b0abf780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8801b0abf800: 00 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2 f2 f2 f2
> ffff8801b0abf880: f2 00 00 00 07 f3 f3 f3 f3 00 00 00 00 00 00 00
^
ffff8801b0abf900: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
ffff8801b0abf980: 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
==================================================================
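
One way to read the shadow dump above, sketched in userspace C. It assumes generic KASAN, where one shadow byte describes 8 bytes of memory: 00 means all 8 bytes are addressable, 01..07 means only the first N bytes are, and f1/f2/f3 are stack redzones. The helper name `valid_bytes_from` and the array `report_row` are made up for this illustration; they are not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic KASAN: one shadow byte covers 8 bytes of memory. */
#define SHADOW_GRANULE 8

/*
 * Count how many bytes starting at addr are addressable according to one
 * row of shadow bytes. row_base is the address printed at the start of the
 * row; addr is assumed to be 8-byte aligned and to lie inside the row.
 */
static size_t valid_bytes_from(uint64_t row_base, const uint8_t *shadow,
                               size_t nshadow, uint64_t addr)
{
    size_t i = (size_t)((addr - row_base) / SHADOW_GRANULE);
    size_t total = 0;

    for (; i < nshadow; i++) {
        if (shadow[i] == 0x00) {
            total += SHADOW_GRANULE;      /* whole granule is valid */
        } else if (shadow[i] < SHADOW_GRANULE) {
            total += shadow[i];           /* partially valid granule */
            break;
        } else {
            break;                        /* redzone or other poison */
        }
    }
    return total;
}

/* The row marked with '>' in the report, copied byte for byte. */
static const uint8_t report_row[16] = {
    0xf2, 0x00, 0x00, 0x00, 0x07, 0xf3, 0xf3, 0xf3,
    0xf3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

Called with row base 0xffff8801b0abf880 and the faulting address 0xffff8801b0abf888, the helper counts 8 + 8 + 8 + 7 = 31 addressable bytes before the f3 redzone, so under these assumptions the reported 33-byte write runs 2 bytes past the stack object.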


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.

Paolo Abeni

Apr 26, 2018, 9:14:50 AM
to syzbot, syzkall...@googlegroups.com
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

more coffee is needed here, or is the generic strlcpy implementation
bugged ?!? (or at least not suitable for this use case)

---
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 28a4c3490359..72e6e1708f91 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1953,8 +1953,10 @@ static int compat_mtw_from_user(struct compat_ebt_entry_mwt *mwt,
 	void *dst = NULL;
 	int off, pad = 0;
 	unsigned int size_kern, match_size = mwt->match_size;
+	int len = strnlen(mwt->u.name, EBT_EXTENSION_MAXNAMELEN - 1);
 
-	strlcpy(name, mwt->u.name, sizeof(name));
+	memcpy(name, mwt->u.name, len);
+	name[len] = 0;
 
 	if (state->buf_kern_start)
 		dst = state->buf_kern_start + state->buf_kern_offset;
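
The idea behind the patch above (bound the scan of the source before copying) can be sketched in userspace C. This is only an illustration under assumptions: `bounded_len` and `bounded_copy` are hypothetical names, and `bounded_len` is a portable stand-in for strnlen built on memchr. It is not the kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Length of src, but never look at more than max bytes (like strnlen). */
static size_t bounded_len(const char *src, size_t max)
{
    const char *nul = memchr(src, '\0', max);

    return nul ? (size_t)(nul - src) : max;
}

/*
 * Copy at most dst_size - 1 bytes from src and always NUL-terminate dst.
 * src does not have to be NUL-terminated: at most dst_size - 1 bytes of it
 * are ever read. Returns the number of bytes copied, excluding the NUL.
 */
static size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst_size == 0)
        return 0;

    len = bounded_len(src, dst_size - 1);
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

With a 4-byte source that holds "abcd" and no terminator, and a 4-byte destination, `bounded_copy` reads only the first 3 source bytes and leaves "abc" in the destination.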

Dmitry Vyukov

Apr 26, 2018, 9:33:05 AM
to Paolo Abeni, syzbot, syzkaller-bugs, Kees Cook
On Thu, Apr 26, 2018 at 3:14 PM, Paolo Abeni <pab...@redhat.com> wrote:
>
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
>
> more coffee is needed here, or is the generic strlcpy implementation
> bugged ?!? (or at least not suitable for this use case)

strlcpy does require that src is a NUL-terminated string. It returns the
number of bytes that would have been copied if dst had enough space,
and that is simply not possible to compute without reading src up to
the NUL. In this respect strlcpy is different from strncpy.

On top of this, the "fortified" version of strlcpy currently has a
"feature": in such a case it can also overwrite dst with the full src
length, ignoring the provided bound (which is what we are seeing here).
It's kinda difficult to say who is guilty here, because the function's
preconditions are violated. But I would say that the fortified version
should still respect the bound for dst even if it does a wild read of
src. +Kees for this.
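
The contract described above can be sketched in userspace C roughly like this. It is a minimal sketch of the classic strlcpy semantics, not the kernel's generic or fortified source; the name `sketch_strlcpy` is made up for the illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, truncating to size - 1 bytes, and return strlen(src).
 * The return value is what forces the full scan of src: the strlen() call
 * below reads until it finds a NUL, no matter how small size is.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t ret = strlen(src);   /* unbounded read of src */

    if (size) {
        size_t len = (ret >= size) ? size - 1 : ret;

        memcpy(dst, src, len);
        dst[len] = '\0';
    }
    return ret;
}
```

Copying "abcdefgh" into a 4-byte buffer stores "abc" and returns 8. If src were not NUL-terminated, the strlen() call alone would already read out of bounds, whatever the size argument says.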

syzbot

Apr 26, 2018, 9:39:02 AM
to pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer still triggered
crash:
WARNING: inconsistent lock state


================================
WARNING: inconsistent lock state
4.17.0-rc2+ #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
syz-executor3/14919 [HC1[1]:SC0[0]:HE0:SE1] takes:
(ptrval) (fs_reclaim){?.+.}, at:
fs_reclaim_acquire.part.82+0x0/0x30 mm/page_alloc.c:463
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
fs_reclaim_acquire.part.82+0x24/0x30 mm/page_alloc.c:3739
fs_reclaim_acquire+0x14/0x20 mm/page_alloc.c:3740
slab_pre_alloc_hook mm/slab.h:418 [inline]
slab_alloc_node mm/slab.c:3299 [inline]
kmem_cache_alloc_node_trace+0x39/0x770 mm/slab.c:3661
kmalloc_node include/linux/slab.h:550 [inline]
kzalloc_node include/linux/slab.h:712 [inline]
alloc_worker+0xbd/0x2e0 kernel/workqueue.c:1704
init_rescuer.part.25+0x1f/0x190 kernel/workqueue.c:4000
init_rescuer kernel/workqueue.c:3997 [inline]
workqueue_init+0x51f/0x7d0 kernel/workqueue.c:5732
kernel_init_freeable+0x2ad/0x58e init/main.c:1115
kernel_init+0x11/0x1b3 init/main.c:1053
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
irq event stamp: 1324
hardirqs last enabled at (1323): [<ffffffff81b5a789>] qlink_free
mm/kasan/quarantine.c:150 [inline]
hardirqs last enabled at (1323): [<ffffffff81b5a789>]
qlist_free_all+0xe9/0x160 mm/kasan/quarantine.c:166
hardirqs last disabled at (1324): [<ffffffff87800905>]
interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
softirqs last enabled at (296): [<ffffffff87a00778>]
__do_softirq+0x778/0xaf5 kernel/softirq.c:311
softirqs last disabled at (217): [<ffffffff814750c1>] invoke_softirq
kernel/softirq.c:365 [inline]
softirqs last disabled at (217): [<ffffffff814750c1>] irq_exit+0x1d1/0x200
kernel/softirq.c:405

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by syz-executor3/14919:
#0: (ptrval) (remove_cache_srcu){....}, at:
quarantine_reduce+0x3f/0x170 mm/kasan/quarantine.c:261

stack backtrace:
CPU: 0 PID: 14919 Comm: syz-executor3 Not tainted 4.17.0-rc2+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_usage_bug.cold.59+0x320/0x41a kernel/locking/lockdep.c:2542
valid_state kernel/locking/lockdep.c:2555 [inline]
mark_lock_irq kernel/locking/lockdep.c:2749 [inline]
mark_lock+0x1034/0x19e0 kernel/locking/lockdep.c:3147
mark_irqflags kernel/locking/lockdep.c:3022 [inline]
__lock_acquire+0x1595/0x5140 kernel/locking/lockdep.c:3388
lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
fs_reclaim_acquire.part.82+0x24/0x30 mm/page_alloc.c:3739
fs_reclaim_acquire+0x14/0x20 mm/page_alloc.c:3740
slab_pre_alloc_hook mm/slab.h:418 [inline]
slab_alloc mm/slab.c:3378 [inline]
__do_kmalloc mm/slab.c:3716 [inline]
__kmalloc+0x45/0x760 mm/slab.c:3727
kmalloc_array include/linux/slab.h:631 [inline]
kcalloc include/linux/slab.h:642 [inline]
numa_crng_init drivers/char/random.c:798 [inline]
crng_reseed+0x427/0x920 drivers/char/random.c:923
credit_entropy_bits+0x98d/0xa30 drivers/char/random.c:708
add_interrupt_randomness+0x494/0x860 drivers/char/random.c:1254
handle_irq_event_percpu+0xf9/0x1c0 kernel/irq/handle.c:191
handle_irq_event+0xa7/0x135 kernel/irq/handle.c:206
handle_edge_irq+0x20f/0x870 kernel/irq/chip.c:791
generic_handle_irq_desc include/linux/irqdesc.h:159 [inline]
handle_irq+0x18c/0x2e7 arch/x86/kernel/irq_64.c:77
do_IRQ+0x78/0x190 arch/x86/kernel/irq.c:245
common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:642
</IRQ>
RIP: 0010:qlink_to_object mm/kasan/quarantine.c:136 [inline]
RIP: 0010:qlink_free mm/kasan/quarantine.c:141 [inline]
RIP: 0010:qlist_free_all+0x3e/0x160 mm/kasan/quarantine.c:166
RSP: 0018:ffff8801bb507b58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
RAX: ffff8801b13cdcc0 RBX: 0000000000000286 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffea0006c304df RDI: 0000000000000286
RBP: ffff8801bb507b90 R08: ffff8801ac7cef38 R09: 0000000000000006
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8801da94a500 R14: ffff8801b13cdcc0 R15: ffffffff88d18ae0
quarantine_reduce+0x141/0x170 mm/kasan/quarantine.c:259
kasan_kmalloc+0x99/0xe0 mm/kasan/kasan.c:538
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
slab_post_alloc_hook mm/slab.h:444 [inline]
slab_alloc mm/slab.c:3392 [inline]
kmem_cache_alloc+0x11b/0x760 mm/slab.c:3552
getname_flags+0xd0/0x5a0 fs/namei.c:140
getname fs/namei.c:211 [inline]
do_symlinkat+0x83/0x2b0 fs/namei.c:4113
__do_sys_symlink fs/namei.c:4143 [inline]
__se_sys_symlink fs/namei.c:4141 [inline]
__ia32_sys_symlink+0x57/0x80 fs/namei.c:4141
do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f68cb9
RSP: 002b:00000000ffeb3f1c EFLAGS: 00000246 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00000000ffeb4c5c RCX: 00000000080d0eea
RDX: 0000000000000003 RSI: 00000000000000fb RDI: 0000000000004c01
RBP: 0000000000000771 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at mm/slab.h:421
in_atomic(): 1, irqs_disabled(): 1, pid: 14919, name: syz-executor3
INFO: lockdep is turned off.
irq event stamp: 1324
hardirqs last enabled at (1323): [<ffffffff81b5a789>] qlink_free
mm/kasan/quarantine.c:150 [inline]
hardirqs last enabled at (1323): [<ffffffff81b5a789>]
qlist_free_all+0xe9/0x160 mm/kasan/quarantine.c:166
hardirqs last disabled at (1324): [<ffffffff87800905>]
interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
softirqs last enabled at (296): [<ffffffff87a00778>]
__do_softirq+0x778/0xaf5 kernel/softirq.c:311
softirqs last disabled at (217): [<ffffffff814750c1>] invoke_softirq
kernel/softirq.c:365 [inline]
softirqs last disabled at (217): [<ffffffff814750c1>] irq_exit+0x1d1/0x200
kernel/softirq.c:405
CPU: 0 PID: 14919 Comm: syz-executor3 Not tainted 4.17.0-rc2+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
___might_sleep.cold.87+0x11f/0x13a kernel/sched/core.c:6188
__might_sleep+0x95/0x190 kernel/sched/core.c:6141
slab_pre_alloc_hook mm/slab.h:421 [inline]
slab_alloc mm/slab.c:3378 [inline]
__do_kmalloc mm/slab.c:3716 [inline]
__kmalloc+0x2b9/0x760 mm/slab.c:3727
kmalloc_array include/linux/slab.h:631 [inline]
kcalloc include/linux/slab.h:642 [inline]
numa_crng_init drivers/char/random.c:798 [inline]
crng_reseed+0x427/0x920 drivers/char/random.c:923
credit_entropy_bits+0x98d/0xa30 drivers/char/random.c:708
add_interrupt_randomness+0x494/0x860 drivers/char/random.c:1254
handle_irq_event_percpu+0xf9/0x1c0 kernel/irq/handle.c:191
handle_irq_event+0xa7/0x135 kernel/irq/handle.c:206
handle_edge_irq+0x20f/0x870 kernel/irq/chip.c:791
generic_handle_irq_desc include/linux/irqdesc.h:159 [inline]
handle_irq+0x18c/0x2e7 arch/x86/kernel/irq_64.c:77
do_IRQ+0x78/0x190 arch/x86/kernel/irq.c:245
common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:642
</IRQ>
RIP: 0010:qlink_to_object mm/kasan/quarantine.c:136 [inline]
RIP: 0010:qlink_free mm/kasan/quarantine.c:141 [inline]
RIP: 0010:qlist_free_all+0x3e/0x160 mm/kasan/quarantine.c:166
RSP: 0018:ffff8801bb507b58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
RAX: ffff8801b13cdcc0 RBX: 0000000000000286 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffea0006c304df RDI: 0000000000000286
RBP: ffff8801bb507b90 R08: ffff8801ac7cef38 R09: 0000000000000006
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8801da94a500 R14: ffff8801b13cdcc0 R15: ffffffff88d18ae0
quarantine_reduce+0x141/0x170 mm/kasan/quarantine.c:259
kasan_kmalloc+0x99/0xe0 mm/kasan/kasan.c:538
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
slab_post_alloc_hook mm/slab.h:444 [inline]
slab_alloc mm/slab.c:3392 [inline]
kmem_cache_alloc+0x11b/0x760 mm/slab.c:3552
getname_flags+0xd0/0x5a0 fs/namei.c:140
getname fs/namei.c:211 [inline]
do_symlinkat+0x83/0x2b0 fs/namei.c:4113
__do_sys_symlink fs/namei.c:4143 [inline]
__se_sys_symlink fs/namei.c:4141 [inline]
__ia32_sys_symlink+0x57/0x80 fs/namei.c:4141
do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f68cb9
RSP: 002b:00000000ffeb3f1c EFLAGS: 00000246 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00000000ffeb4c5c RCX: 00000000080d0eea
RDX: 0000000000000003 RSI: 00000000000000fb RDI: 0000000000004c01
RBP: 0000000000000771 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
random: crng init done


Tested on net commit
25eb0ea7174c6e84f21fa59dccbddd0318b17b12 (Thu Apr 26 02:55:33 2018 +0000)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Patch: https://syzkaller.appspot.com/x/patch.diff?id=6538742961537024
Kernel config:
https://syzkaller.appspot.com/x/.config?id=7043958930931867332
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5410319962734592

Paolo Abeni

Apr 26, 2018, 10:35:30 AM
to Dmitry Vyukov, syzbot, syzkaller-bugs, Kees Cook
On Thu, 2018-04-26 at 15:32 +0200, Dmitry Vyukov wrote:
> On Thu, Apr 26, 2018 at 3:14 PM, Paolo Abeni <pab...@redhat.com> wrote:
> >
> > #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
> >
> > more coffee is needed here, or is the generic strlcpy implementation
> > bugged ?!? (or at least not suitable for this use case)
>
> strlcpy does require that src is a NUL-terminated string. It returns
> number of bytes that would have been copied provided that dst has
> enough space, and that is simply not possible to do without reading
> src up to NUL.

Oops, I missed the details about the strlcpy return value. I would say
that strlcpy() is not a good fit for untrusted, user-provided input.

Thanks,

Paolo

Paolo Abeni

Apr 26, 2018, 10:39:36 AM
to syzbot, syzkall...@googlegroups.com
This looks unrelated to the original bug. Possibly some other issue
introduced by in-between commits?

Is there an easy way to test a changeset on top of a specific known
status (commit hash), beyond publishing somewhere a tree with a branch
matching such hash?

Thanks!

Paolo

Dmitry Vyukov

Apr 26, 2018, 10:54:19 AM
to Paolo Abeni, syzbot, syzkaller-bugs
It's this guy:

https://syzkaller.appspot.com/bug?id=0e06e9b4ed9a043361196cb8413cdc16a15b4b1f

The problem is that it fires randomly. The guilty commit and a fix are
mentioned here:

https://groups.google.com/forum/#!msg/syzkaller-bugs/Z4a-3bfklR8/WsEX4JXoAgAJ

> Is there an easy way to test a changeset on top of a specific known
> status (commit hash), beyond publishing somewhere a tree with a branch
> matching such hash?

Yes, there is such a feature (since yesterday):

https://github.com/google/syzkaller/blob/master/docs/syzbot.md#testing-patches

Paolo Abeni

Apr 26, 2018, 11:07:50 AM
to syzbot, syzkall...@googlegroups.com
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 18b7fd1c93e5

trying again, explicitly avoiding the commit triggering the random
crash reported by syzbot on the previous attempt (thanks Dmitry!)

syzbot

Apr 26, 2018, 11:29:02 AM
to pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot tried to test the proposed patch but build/boot failed:

lost connection to test machine



INIT: Entering runlevel: 2

[info] Using makefile-style concurrent boot in runlevel 2.
[ ok ] Starting enhanced syslogd: rsyslogd.
[ ok ] Starting periodic command scheduler: cron.
[ ok ] Starting OpenBSD Secure Shell server: sshd.

Debian GNU/Linux 7 syzkaller ttyS0

Warning: Permanently added '10.128.0.24' (ECDSA) to the list of known hosts.
2018/04/26 15:27:50 fuzzer started
2018/04/26 15:27:50 connecting to host at 10.128.0.26:37751
2018/04/26 15:27:50 checking config...
syzkaller login: [ 36.238341] can: request_module (can-proto-0) failed.
[ 36.248842] can: request_module (can-proto-0) failed.
2018/04/26 15:27:54 enabled syscalls: 1625
2018/04/26 15:27:54 testing simple program...
[ 36.900351] IPVS: ftp: loaded support on port[0] = 21
[ 37.291223] bridge0: port 1(bridge_slave_0) entered blocking state
[ 37.297962] bridge0: port 1(bridge_slave_0) entered disabled state
[ 37.306696] device bridge_slave_0 entered promiscuous mode
[ 37.335658] bridge0: port 2(bridge_slave_1) entered blocking state
[ 37.342121] bridge0: port 2(bridge_slave_1) entered disabled state
[ 37.349407] device bridge_slave_1 entered promiscuous mode
[ 37.375932] IPv6: ADDRCONF(NETDEV_UP): veth0_to_bridge: link is not ready
[ 37.403283] IPv6: ADDRCONF(NETDEV_UP): veth1_to_bridge: link is not ready
[ 37.476280] bond0: Enslaving bond_slave_0 as an active interface with an
up link
[ 37.507760] bond0: Enslaving bond_slave_1 as an active interface with an
up link
[ 37.630777] IPv6: ADDRCONF(NETDEV_UP): team_slave_0: link is not ready
[ 37.638290] team0: Port device team_slave_0 added
[ 37.667264] IPv6: ADDRCONF(NETDEV_UP): team_slave_1: link is not ready
[ 37.675111] team0: Port device team_slave_1 added
[ 37.704725] IPv6: ADDRCONF(NETDEV_CHANGE): team_slave_0: link becomes
ready
[ 37.734549] IPv6: ADDRCONF(NETDEV_CHANGE): team_slave_1: link becomes
ready
[ 37.763449] IPv6: ADDRCONF(NETDEV_CHANGE): veth0_to_bridge: link becomes
ready
[ 37.792415] IPv6: ADDRCONF(NETDEV_CHANGE): veth1_to_bridge: link becomes
ready
[ 38.043127] bridge0: port 2(bridge_slave_1) entered blocking state
[ 38.050175] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 38.057371] bridge0: port 1(bridge_slave_0) entered blocking state
[ 38.063902] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 38.966258] 8021q: adding VLAN 0 to HW filter on device bond0
[ 39.053294] IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
[ 39.147430] IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
[ 39.153755] IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
[ 39.165913] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
[ 39.250890] 8021q: adding VLAN 0 to HW filter on device team0
2018/04/26 15:27:57 execution failed: executor 0: failed:
net.ipv6.conf.syz_tun.accept_dad = 0
net.ipv6.conf.syz_tun.router_solicitations = 0
RTNETLINK answers: Operation not supported
RTNETLINK answers: No buffer space available
RTNETLINK answers: Operation not supported
RTNETLINK answers: Operation not supported
RTNETLINK answers: Operation not supported
RTNETLINK answers: Operation not supported
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)
loop failed (errno 0)




Tested on git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git commit
18b7fd1c93e5204355ddbf2608a097d64df81b88 (Sat Apr 14 15:50:50 2018 +0000)
Merge branch 'akpm' (patches from Andrew)

compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Patch: https://syzkaller.appspot.com/x/patch.diff?id=5548307598278656
Kernel config:
https://syzkaller.appspot.com/x/.config?id=5872294200030231777

Dmitry Vyukov

Apr 26, 2018, 11:48:42 AM
to syzbot, Paolo Abeni, syzkaller-bugs
On Thu, Apr 26, 2018 at 5:29 PM, syzbot
<syzbot+4e42a0...@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot tried to test the proposed patch but build/boot failed:
>
> lost connection to test machine

Now, this one is a true failure. Compat ebtables was fixed only recently.
Perhaps you could resubmit the original request; there is a chance
that it won't fail with the "inconsistent lock state" again, because
that one is flaky and happens only sometimes.

Paolo Abeni

Apr 26, 2018, 12:04:44 PM
to syzbot, syzkall...@googlegroups.com
As per Dmitry's suggestion, resubmitting and hoping not to hit the
unrelated locking issue again.
---
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 28a4c3490359..6ba639f6c51d 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1954,7 +1954,8 @@ static int compat_mtw_from_user(struct compat_ebt_entry_mwt *mwt,
 	int off, pad = 0;
 	unsigned int size_kern, match_size = mwt->match_size;
 
-	strlcpy(name, mwt->u.name, sizeof(name));
+	if (strscpy(name, mwt->u.name, sizeof(name)) < 0)
+		return -EINVAL;
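
For comparison, the strscpy-style contract this patch relies on can be sketched in userspace C as below. This is an illustration under assumptions, not the kernel implementation: `sketch_strscpy` is a made-up name, and it returns long where the kernel function returns ssize_t.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy src into dst, reading at most count bytes from src.
 * Returns the number of characters copied (excluding the NUL), or -E2BIG
 * if count is 0 or src did not fit; on truncation dst is still
 * NUL-terminated.
 */
static long sketch_strscpy(char *dst, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;

    for (i = 0; i < count; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;     /* whole string fit */
    }

    dst[count - 1] = '\0';      /* truncated: terminate and report it */
    return -E2BIG;
}
```

Unlike strlcpy, the read of src is bounded by count, so an unterminated name of the same size as the buffer yields a negative return value instead of an out-of-bounds scan. The patch turns that negative value into -EINVAL.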

syzbot

Apr 26, 2018, 12:33:02 PM
to pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer still triggered
crash:
WARNING: inconsistent lock state


================================
WARNING: inconsistent lock state
4.17.0-rc2+ #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
syz-executor3/4839 [HC1[1]:SC0[0]:HE0:SE1] takes:
(ptrval) (fs_reclaim){?.+.}, at:
fs_reclaim_acquire.part.82+0x0/0x30 mm/page_alloc.c:463
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
fs_reclaim_acquire.part.82+0x24/0x30 mm/page_alloc.c:3739
fs_reclaim_acquire+0x14/0x20 mm/page_alloc.c:3740
slab_pre_alloc_hook mm/slab.h:418 [inline]
slab_alloc_node mm/slab.c:3299 [inline]
kmem_cache_alloc_node_trace+0x39/0x770 mm/slab.c:3661
kmalloc_node include/linux/slab.h:550 [inline]
kzalloc_node include/linux/slab.h:712 [inline]
alloc_worker+0xbd/0x2e0 kernel/workqueue.c:1704
init_rescuer.part.25+0x1f/0x190 kernel/workqueue.c:4000
init_rescuer kernel/workqueue.c:3997 [inline]
workqueue_init+0x51f/0x7d0 kernel/workqueue.c:5732
kernel_init_freeable+0x2ad/0x58e init/main.c:1115
kernel_init+0x11/0x1b3 init/main.c:1053
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
irq event stamp: 6410532
hardirqs last enabled at (6410531): [<ffffffff876ecf87>]
__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last enabled at (6410531): [<ffffffff876ecf87>]
_raw_spin_unlock_irq+0x27/0x70 kernel/locking/spinlock.c:192
hardirqs last disabled at (6410532): [<ffffffff87800905>]
interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
softirqs last enabled at (6410102): [<ffffffff81a21db7>] spin_unlock_bh
include/linux/spinlock.h:355 [inline]
softirqs last enabled at (6410102): [<ffffffff81a21db7>]
wb_wakeup_delayed+0xa7/0xf0 mm/backing-dev.c:284
softirqs last disabled at (6410098): [<ffffffff81a21d77>] spin_lock_bh
include/linux/spinlock.h:315 [inline]
softirqs last disabled at (6410098): [<ffffffff81a21d77>]
wb_wakeup_delayed+0x67/0xf0 mm/backing-dev.c:281

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

no locks held by syz-executor3/4839.

stack backtrace:
CPU: 1 PID: 4839 Comm: syz-executor3 Not tainted 4.17.0-rc2+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_usage_bug.cold.59+0x320/0x41a kernel/locking/lockdep.c:2542
valid_state kernel/locking/lockdep.c:2555 [inline]
mark_lock_irq kernel/locking/lockdep.c:2749 [inline]
mark_lock+0x1034/0x19e0 kernel/locking/lockdep.c:3147
mark_irqflags kernel/locking/lockdep.c:3022 [inline]
__lock_acquire+0x1595/0x5140 kernel/locking/lockdep.c:3388
lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
fs_reclaim_acquire.part.82+0x24/0x30 mm/page_alloc.c:3739
fs_reclaim_acquire+0x14/0x20 mm/page_alloc.c:3740
slab_pre_alloc_hook mm/slab.h:418 [inline]
slab_alloc mm/slab.c:3378 [inline]
__do_kmalloc mm/slab.c:3716 [inline]
__kmalloc+0x45/0x760 mm/slab.c:3727
kmalloc_array include/linux/slab.h:631 [inline]
kcalloc include/linux/slab.h:642 [inline]
numa_crng_init drivers/char/random.c:798 [inline]
crng_reseed+0x427/0x920 drivers/char/random.c:923
credit_entropy_bits+0x98d/0xa30 drivers/char/random.c:708
add_interrupt_randomness+0x494/0x860 drivers/char/random.c:1254
handle_irq_event_percpu+0xf9/0x1c0 kernel/irq/handle.c:191
handle_irq_event+0xa7/0x135 kernel/irq/handle.c:206
handle_edge_irq+0x20f/0x870 kernel/irq/chip.c:791
generic_handle_irq_desc include/linux/irqdesc.h:159 [inline]
handle_irq+0x18c/0x2e7 arch/x86/kernel/irq_64.c:77
do_IRQ+0x78/0x190 arch/x86/kernel/irq.c:245
common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:642
</IRQ>
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:793 [inline]
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168
[inline]
RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:192
RSP: 0018:ffff8801c8937910 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffd8
RAX: dffffc0000000000 RBX: ffff8801daf2c580 RCX: 0000000000000000
RDX: 1ffffffff11a315f RSI: 0000000000000001 RDI: ffffffff88d18af8
RBP: ffff8801c8937918 R08: ffffed003b5e58b1 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d320a040
R13: ffff8801c1840580 R14: 0000000000000000 R15: ffff8801d320a040
finish_lock_switch kernel/sched/core.c:2603 [inline]
finish_task_switch+0x1ca/0x810 kernel/sched/core.c:2701
context_switch kernel/sched/core.c:2851 [inline]
__schedule+0x809/0x1e30 kernel/sched/core.c:3490
schedule+0xef/0x430 kernel/sched/core.c:3549
exit_to_usermode_loop+0x220/0x310 arch/x86/entry/common.c:152
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_32_irqs_on arch/x86/entry/common.c:338 [inline]
do_fast_syscall_32+0xcc3/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f54cb9
RSP: 002b:00000000ffab99c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
RAX: 000000000000000c RBX: 00000000000000fb RCX: 00000000ffab99f0
RDX: 000000000000000c RSI: 0000000000035b55 RDI: 0000000000000000
RBP: 0000000000000660 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at mm/slab.h:421
in_atomic(): 1, irqs_disabled(): 1, pid: 4839, name: syz-executor3
INFO: lockdep is turned off.
irq event stamp: 6410532
hardirqs last enabled at (6410531): [<ffffffff876ecf87>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last enabled at (6410531): [<ffffffff876ecf87>] _raw_spin_unlock_irq+0x27/0x70 kernel/locking/spinlock.c:192
hardirqs last disabled at (6410532): [<ffffffff87800905>] interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
softirqs last enabled at (6410102): [<ffffffff81a21db7>] spin_unlock_bh include/linux/spinlock.h:355 [inline]
softirqs last enabled at (6410102): [<ffffffff81a21db7>] wb_wakeup_delayed+0xa7/0xf0 mm/backing-dev.c:284
softirqs last disabled at (6410098): [<ffffffff81a21d77>] spin_lock_bh include/linux/spinlock.h:315 [inline]
softirqs last disabled at (6410098): [<ffffffff81a21d77>] wb_wakeup_delayed+0x67/0xf0 mm/backing-dev.c:281
CPU: 1 PID: 4839 Comm: syz-executor3 Not tainted 4.17.0-rc2+ #1
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:793 [inline]
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:192
RSP: 0018:ffff8801c8937910 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffd8
RAX: dffffc0000000000 RBX: ffff8801daf2c580 RCX: 0000000000000000
RDX: 1ffffffff11a315f RSI: 0000000000000001 RDI: ffffffff88d18af8
RBP: ffff8801c8937918 R08: ffffed003b5e58b1 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d320a040
R13: ffff8801c1840580 R14: 0000000000000000 R15: ffff8801d320a040
finish_lock_switch kernel/sched/core.c:2603 [inline]
finish_task_switch+0x1ca/0x810 kernel/sched/core.c:2701
context_switch kernel/sched/core.c:2851 [inline]
__schedule+0x809/0x1e30 kernel/sched/core.c:3490
schedule+0xef/0x430 kernel/sched/core.c:3549
exit_to_usermode_loop+0x220/0x310 arch/x86/entry/common.c:152
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_32_irqs_on arch/x86/entry/common.c:338 [inline]
do_fast_syscall_32+0xcc3/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f54cb9
RSP: 002b:00000000ffab99c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
RAX: 000000000000000c RBX: 00000000000000fb RCX: 00000000ffab99f0
RDX: 000000000000000c RSI: 0000000000035b55 RDI: 0000000000000000
RBP: 0000000000000660 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
random: crng init done


Tested on net commit
25eb0ea7174c6e84f21fa59dccbddd0318b17b12 (Thu Apr 26 02:55:33 2018 +0000)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Patch: https://syzkaller.appspot.com/x/patch.diff?id=4808508071477248
https://syzkaller.appspot.com/x/log.txt?id=4683663035858944

Dmitry Vyukov

Apr 26, 2018, 12:50:49 PM4/26/18
to syzbot, Paolo Abeni, syzkaller-bugs
Okay, this is nasty.
I don't know. You can either submit 3 more in a row, or include the
fix into your patch.

Wait, there seems to be a window in net when "netfilter: ebtables:
don't attempt to allocate 0-sized compat array" is present, but
"random: set up the NUMA crng instances after the CRNG is fully
initialized" is not yet. So maybe testing on

commit 3f1e53abff84cf40b1adb3455d480dd295bf42e8
netfilter: ebtables: don't attempt to allocate 0-sized compat array

will do.
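
The window described above can be illustrated with a minimal sketch. It assumes a linear, oldest-first list of commit subjects, which is a simplification because real kernel history contains merges. The helper name `commits_in_window` and the "unrelated change" entries are hypothetical placeholders, not part of syzbot or git.

```python
from typing import List


def commits_in_window(history: List[str], wanted: str, unwanted: str) -> List[str]:
    """Return commits that already contain `wanted` but not yet `unwanted`.

    `history` is assumed to be a linear, oldest-first list of commit subjects.
    """
    start = history.index(wanted)      # first commit that includes the wanted fix
    try:
        end = history.index(unwanted)  # first commit that includes the unwanted change
    except ValueError:
        end = len(history)             # the unwanted change never landed in this history
    return history[start:end] if start < end else []


# Illustrative history; the "unrelated change" entries are placeholders.
history = [
    "unrelated change A",
    "netfilter: ebtables: don't attempt to allocate 0-sized compat array",
    "unrelated change B",
    "random: set up the NUMA crng instances after the CRNG is fully initialized",
]
window = commits_in_window(history, history[1], history[3])
print(window)
```

Any commit in the returned window would be a candidate base for a test request; on a real tree the same question is one of commit ancestry rather than list position.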

Paolo Abeni

Apr 27, 2018, 3:36:09 AM4/27/18
to syzbot, syzkall...@googlegroups.com
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 3f1e53abff84

ultimate attempt, testing vs a reasonably 'stable' hash

syzbot

Apr 27, 2018, 3:57:02 AM4/27/18
to pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by: syzbot+4e42a0...@syzkaller.appspotmail.com

Note: the tag will also help syzbot to understand when the bug is fixed.
3f1e53abff84cf40b1adb3455d480dd295bf42e8 (Wed Apr 4 19:13:30 2018 +0000)
netfilter: ebtables: don't attempt to allocate 0-sized compat array

compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Patch: https://syzkaller.appspot.com/x/patch.diff?id=5088798710956032
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-4044605596339572004

---
There is no WARRANTY for the result, to the extent permitted by applicable
law. Except when otherwise stated in writing syzbot provides the result
"AS IS" without warranty of any kind, either expressed or implied, but not
limited to, the implied warranties of merchantability and fitness for a
particular purpose. The entire risk as to the quality of the result is with
you. Should the result prove defective, you assume the cost of all necessary
servicing, repair or correction.