WARNING in refcount_sub_and

syzbot

unread,

Oct 24, 2017, 10:49:24 AM10/24/17

to da...@davemloft.net, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, nho...@tuxdriver.com, syzkall...@googlegroups.com, vyas...@gmail.com

Hello,

syzkaller hit the following crash on
49ca1943a7adb429b11b8e05d81bc821694b76c7
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ
for information about syzkaller reproducers

refcount_t: underflow; use-after-free.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 2992 at lib/refcount.c:186
refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 2992 Comm: syzkaller263121 Not tainted 4.14.0-rc5+ #91
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x417 kernel/panic.c:181
__warn+0x1c4/0x1d9 kernel/panic.c:542
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
RSP: 0018:ffff8801d1f4e9c8 EFLAGS: 00010282
RAX: 0000000000000026 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000026 RSI: 1ffff1003a3e9cf9 RDI: ffffed003a3e9d2d
RBP: ffff8801d1f4ea58 R08: 0000000000000000 R09: 1ffff1003a3e9ccb
R10: ffff8801d1f4e7f8 R11: ffffffff85b2cb78 R12: 1ffff1003a3e9d3a
R13: 00000000ffffff01 R14: 0000000000000100 R15: ffff8801d1d16a3c
sctp_wfree+0x183/0x620 net/sctp/socket.c:7927
skb_release_head_state+0x124/0x200 net/core/skbuff.c:616
skb_release_all+0x15/0x60 net/core/skbuff.c:629
__kfree_skb net/core/skbuff.c:645 [inline]
consume_skb+0x153/0x490 net/core/skbuff.c:705
sctp_chunk_destroy net/sctp/sm_make_chunk.c:1437 [inline]
sctp_chunk_put+0x29c/0x420 net/sctp/sm_make_chunk.c:1464
sctp_chunk_free+0x53/0x60 net/sctp/sm_make_chunk.c:1451
__sctp_outq_teardown+0xa5b/0x1230 net/sctp/outqueue.c:273
sctp_outq_free+0x15/0x20 net/sctp/outqueue.c:293
sctp_association_free+0x2d0/0x930 net/sctp/associola.c:357
sctp_cmd_delete_tcb net/sctp/sm_sideeffect.c:920 [inline]
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1336 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1203 [inline]
sctp_do_sm+0x271b/0x6a30 net/sctp/sm_sideeffect.c:1174
sctp_primitive_SHUTDOWN+0xa0/0xd0 net/sctp/primitive.c:104
sctp_close+0x3c6/0x980 net/sctp/socket.c:1533
inet_release+0xed/0x1c0 net/ipv4/af_inet.c:426
sock_release+0x8d/0x1e0 net/socket.c:597
sock_close+0x16/0x20 net/socket.c:1126
__fput+0x327/0x7e0 fs/file_table.c:210
____fput+0x15/0x20 fs/file_table.c:244
task_work_run+0x199/0x270 kernel/task_work.c:112
get_signal+0x1343/0x16d0 kernel/signal.c:2164
do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x214/0x310 arch/x86/entry/common.c:158
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath+0x42f/0x510 arch/x86/entry/common.c:266
entry_SYSCALL_64_fastpath+0xbc/0xbe
RIP: 0033:0x446539
RSP: 002b:00007f402614bdc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000120
RAX: fffffffffffffff2 RBX: 0000000000000000 RCX: 0000000000446539
RDX: 0000000020137ffc RSI: 0000000020b53ff0 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007f402614c700 R09: 00007f402614c700
R10: 0000000000080000 R11: 0000000000000202 R12: 0000000000000000
R13: 00000000007efe7f R14: 00007f402614c9c0 R15: 0000000000000000
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..

---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
Once a fix for this bug is committed, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.

config.txt

raw.log

repro.txt

repro.c

ChunYu Wang

unread,

Oct 26, 2017, 4:53:27 AM10/26/17

to syzbot, da...@davemloft.net, LKML, linux...@vger.kernel.org, net...@vger.kernel.org, nho...@tuxdriver.com, syzkall...@googlegroups.com, vyas...@gmail.com

Hi all,

I am failed to reproduce it on target kernel with the reproducer file
or replaying the target syzkaller description log file, do I made
something wrong or there exists more subjects then the line in
repro.txt:

#{Threaded:true Collide:true Repeat:false Procs:1 Sandbox:namespace
Fault:false FaultCall:-1 FaultNth:0 EnableTun:false UseTmpDir:true
HandleSegv:false WaitRepeat:false Debug:false Repro:false}

Thanks
- ChunYu

---
2017/10/26 04:49:15 reproducing crash 'hang': testing program
(duration=10s, {Threaded:true Collide:true Repeat:true Procs:32
Sandbox:namespace Fault:false FaultCall:-1 FaultNth:0 EnableTun:true
UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false
Repro:true}): mmap-socket$inet_sctp-listen-accept4-listen-sendto$inet
2017/10/26 04:49:15 running command: ssh []string{"-p", "22", "-o",
"ConnectionAttempts=10", "-o", "ConnectTimeout=10", "-o",
"BatchMode=yes", "-o", "UserKnownHostsFile=/dev/null", "-o",
"IdentitiesOnly=yes", "-o", "StrictHostKeyChecking=no", "-o",
"LogLevel=error", "-i", "/home/chunwang/.ssh/id_rsa",
"ro...@10.73.5.213", "cd /home/user/tmp/syz && exec
/home/user/tmp/syz/syz-execprog -executor
/home/user/tmp/syz/syz-executor -arch=amd64 -cover=0 -procs=32
-repeat=0 -sandbox namespace -threaded=true -collide=true
/home/user/tmp/syz/syzkaller140727249"}
2017/10/26 04:49:25 reproducing crash 'hang': program did not crash
2017/10/26 04:49:25 reproducing crash 'hang': single: failed to
extract reproducer
2017/10/26 04:49:25 reproducing crash 'hang': bisect: bisecting 1
programs with base timeout 10s
2017/10/26 04:49:25 reproducing crash 'hang': bisect: bisecting 1 programs
2017/10/26 04:49:25 reproducing crash 'hang': bisect: executing all 1 programs

Dmitry Vyukov

unread,

Oct 26, 2017, 10:49:26 AM10/26/17

to ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

On Thu, Oct 26, 2017 at 10:53 AM, ChunYu Wang <chun...@redhat.com> wrote:
> Hi all,
>
> I am failed to reproduce it on target kernel with the reproducer file
> or replaying the target syzkaller description log file, do I made
> something wrong or there exists more subjects then the line in
> repro.txt:
>
> #{Threaded:true Collide:true Repeat:false Procs:1 Sandbox:namespace
> Fault:false FaultCall:-1 FaultNth:0 EnableTun:false UseTmpDir:true
> HandleSegv:false WaitRepeat:false Debug:false Repro:false}

Hi ChunYu,

I've just re-tested the C repro and was able to trigger the bug in a second.
I've checked out 49ca1943a7adb429b11b8e05d81bc821694b76c7, copied the
provided config, run make olddefconfig, built with gcc-7 (you can get
the exact one here
https://storage.googleapis.com/syzkaller/gcc-7.tar.gz). Then run in
qemu (most of the flags are probably irrelevant):

qemu-system-x86_64 -hda wheezy.img -net
user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic -nographic -kernel
arch/x86/boot/bzImage -append "kvm-intel.nested=1
kvm-intel.unrestricted_guest=1 kvm-intel.ept=1
kvm-intel.flexpriority=1 kvm-intel.vpid=1
kvm-intel.emulate_invalid_guest_state=1 kvm-intel.eptad=1
kvm-intel.enable_shadow_vmcs=1 kvm-intel.pml=1
kvm-intel.enable_apicv=1 console=ttyS0 root=/dev/sda
earlyprintk=serial slub_debug=UZ vsyscall=native rodata=n oops=panic
panic_on_warn=1 panic=86400" -enable-kvm -pidfile vm_pid -m 2G -smp 4
-cpu host -usb -usbdevice mouse -usbdevice tablet -soundhw all

And running the provided C program instantly spewed the following.

Is there anything you did differently? I would like to understand
common reasons why syzbot reproducers don't work and outline them
here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md

Thanks

[ 588.444300] refcount_t: underflow; use-after-free.
[ 588.445812] ------------[ cut here ]------------
[ 588.447026] WARNING: CPU: 1 PID: 3086 at lib/refcount.c:186
refcount_sub_and_test+0x167/0x1b0
[ 588.449082] Kernel panic - not syncing: panic_on_warn set ...
[ 588.449082]
[ 588.450737] CPU: 1 PID: 3086 Comm: a.out Not tainted 4.14.0-rc5+ #9
[ 588.452160] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 588.454059] Call Trace:
[ 588.454658] dump_stack+0x194/0x257
[ 588.455538] ? arch_local_irq_restore+0x53/0x53
[ 588.456630] panic+0x1e4/0x417
[ 588.457367] ? __warn+0x1d9/0x1d9
[ 588.458171] ? show_regs_print_info+0x65/0x65
[ 588.459234] ? refcount_sub_and_test+0x167/0x1b0
[ 588.460262] __warn+0x1c4/0x1d9
[ 588.460958] ? refcount_sub_and_test+0x167/0x1b0
[ 588.461965] report_bug+0x211/0x2d0
[ 588.462756] fixup_bug+0x40/0x90
[ 588.463597] do_trap+0x260/0x390
[ 588.464304] do_error_trap+0x120/0x390
[ 588.465105] ? vprintk_emit+0x49b/0x590
[ 588.465929] ? do_trap+0x390/0x390
[ 588.466661] ? refcount_sub_and_test+0x167/0x1b0
[ 588.467646] ? vprintk_emit+0x3ea/0x590
[ 588.468475] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 588.469482] do_invalid_op+0x1b/0x20
[ 588.470262] invalid_op+0x18/0x20
[ 588.470988] RIP: 0010:refcount_sub_and_test+0x167/0x1b0
[ 588.472080] RSP: 0018:ffff88006550e9c8 EFLAGS: 00010282
[ 588.473224] RAX: 0000000000000026 RBX: 0000000000000001 RCX: 0000000000000000
[ 588.474643] RDX: 0000000000000026 RSI: 1ffff1000caa1cf9 RDI: ffffed000caa1d2d
[ 588.476091] RBP: ffff88006550ea58 R08: 0000000000000000 R09: 1ffff1000caa1ccb
[ 588.477520] R10: ffff88006550e7f8 R11: ffffffff85b2cb78 R12: 1ffff1000caa1d3a
[ 588.478967] R13: 00000000ffffff01 R14: 0000000000000100 R15: ffff88006a7f4a7c
[ 588.480413] ? refcount_sub_and_test+0x167/0x1b0
[ 588.481337] ? refcount_inc+0x50/0x50
[ 588.482081] ? __sctp_outq_teardown+0xa5b/0x1230
[ 588.483004] ? sctp_association_free+0x2d0/0x930
[ 588.484291] ? sctp_do_sm+0x271b/0x6a30
[ 588.485247] ? sctp_primitive_SHUTDOWN+0xa0/0xd0
[ 588.486295] ? sctp_close+0x3c6/0x980
[ 588.487058] ? inet_release+0xed/0x1c0
[ 588.488370] ? sock_release+0x8d/0x1e0
[ 588.489080] ? sock_close+0x16/0x20
[ 588.489759] sctp_wfree+0x183/0x620
[ 588.490430] ? entry_SYSCALL_64_fastpath+0xbc/0xbe
[ 588.491323] ? __sctp_write_space+0x910/0x910
[ 588.492177] skb_release_head_state+0x124/0x200
[ 588.493078] skb_release_all+0x15/0x60
[ 588.493938] consume_skb+0x153/0x490
[ 588.494605] ? sctp_chunk_put+0x99/0x420
[ 588.495388] ? alloc_skb_with_frags+0x750/0x750
[ 588.496119] ? sctp_chunk_hold+0x20/0x20
[ 588.496757] ? sctp_sched_dequeue_common+0x2aa/0x5d0
[ 588.497554] ? refcount_sub_and_test+0x115/0x1b0
[ 588.498296] ? refcount_inc+0x50/0x50
[ 588.498888] ? trace_hardirqs_off+0xd/0x10
[ 588.499567] ? quarantine_put+0xeb/0x190
[ 588.500215] sctp_chunk_put+0x29c/0x420
[ 588.500836] ? sctp_chunk_hold+0x20/0x20
[ 588.501491] ? sctp_transport_dst_confirm+0x50/0x50
[ 588.502266] ? sctp_sched_fcfs_dequeue+0x198/0x290
[ 588.503027] ? sctp_sched_dequeue_common+0x5d0/0x5d0
[ 588.504001] sctp_chunk_free+0x53/0x60
[ 588.504692] __sctp_outq_teardown+0xa5b/0x1230
[ 588.505493] ? sctp_inq_set_th_handler+0x1b0/0x1b0
[ 588.506322] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.507264] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.508089] ? check_preempt_wakeup+0x1320/0x1320
[ 588.508804] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.509632] ? default_wake_function+0x30/0x50
[ 588.510368] ? autoremove_wake_function+0x78/0x350
[ 588.511101] ? lock_acquire+0x1d5/0x580
[ 588.511686] ? lock_acquire+0x1d5/0x580
[ 588.512299] ? finish_wait+0x490/0x490
[ 588.512888] ? lock_acquire+0x1d5/0x580
[ 588.513583] ? lock_acquire+0x1d5/0x580
[ 588.514178] ? lock_acquire+0x1d5/0x580
[ 588.514774] ? __wake_up_common_lock+0x1c2/0x310
[ 588.515526] ? lock_acquire+0x1d5/0x580
[ 588.516121] ? sock_def_wakeup+0x1f9/0x350
[ 588.516752] ? lock_downgrade+0x990/0x990
[ 588.517385] ? lock_release+0xa40/0xa40
[ 588.517982] ? trace_raw_output_tick_stop+0x130/0x130
[ 588.518766] sctp_outq_free+0x15/0x20
[ 588.519391] sctp_association_free+0x2d0/0x930
[ 588.520136] ? sctp_asconf_queue_teardown+0x700/0x700
[ 588.521000] ? sock_def_wakeup+0x222/0x350
[ 588.521702] ? sk_dst_check+0x560/0x560
[ 588.522387] ? bpf_prog_kallsyms_find+0xbd/0x440
[ 588.523228] sctp_do_sm+0x271b/0x6a30
[ 588.523880] ? sctp_do_8_2_transport_strike.isra.16+0x8a0/0x8a0
[ 588.524728] ? do_raw_spin_trylock+0x190/0x190
[ 588.525378] ? SyS_setgroups+0x250/0x250
[ 588.525951] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.526690] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.527857] ? lock_acquire+0x1d5/0x580
[ 588.528737] ? lock_acquire+0x1d5/0x580
[ 588.529615] ? skb_dequeue+0x12a/0x180
[ 588.530476] ? lock_downgrade+0x990/0x990
[ 588.531445] ? do_raw_spin_trylock+0x190/0x190
[ 588.532466] ? lock_release+0xa40/0xa40
[ 588.533432] ? trace_hardirqs_on+0xd/0x10
[ 588.534373] sctp_primitive_SHUTDOWN+0xa0/0xd0
[ 588.535412] sctp_close+0x3c6/0x980
[ 588.536208] ? sctp_apply_peer_addr_params+0xf30/0xf30
[ 588.537322] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 588.538409] ? lock_acquire+0x1d5/0x580
[ 588.539262] ? locks_remove_file+0x3fa/0x5a0
[ 588.540192] ? cmp_ex_search+0x76/0xa0
[ 588.541006] ? fcntl_setlk+0x10c0/0x10c0
[ 588.541858] ? bsearch+0x83/0xa0
[ 588.542563] ? __fsnotify_parent+0xb4/0x3a0
[ 588.543572] ? ip_mc_drop_socket+0x1ce/0x230
[ 588.544336] inet_release+0xed/0x1c0
[ 588.544975] sock_release+0x8d/0x1e0
[ 588.545618] ? sock_release+0x1e0/0x1e0
[ 588.546300] sock_close+0x16/0x20
[ 588.546908] __fput+0x327/0x7e0
[ 588.547495] ? fput+0x140/0x140
[ 588.548067] ? __bad_area_nosemaphore+0x1f4/0x3e0
[ 588.548897] ____fput+0x15/0x20
[ 588.549461] task_work_run+0x199/0x270
[ 588.550127] ? task_work_cancel+0x210/0x210
[ 588.550859] ? __do_page_fault+0x3d6/0xd60
[ 588.551602] get_signal+0x1343/0x16d0
[ 588.552253] ? mm_fault_error+0x2c0/0x2c0
[ 588.552964] ? ptrace_notify+0x130/0x130
[ 588.553746] ? __do_page_fault+0xd60/0xd60
[ 588.554473] ? do_page_fault+0xee/0x720
[ 588.555168] ? lock_downgrade+0x990/0x990
[ 588.555896] ? __do_page_fault+0xd60/0xd60
[ 588.556634] ? do_raw_spin_trylock+0x190/0x190
[ 588.557411] ? lock_acquire+0x1d5/0x580
[ 588.558091] ? lock_acquire+0x1d5/0x580
[ 588.558777] do_signal+0x94/0x1ee0
[ 588.559400] ? lock_acquire+0x1d5/0x580
[ 588.560070] ? lock_acquire+0x1d5/0x580
[ 588.560721] ? put_unused_fd+0x62/0x70
[ 588.561360] ? lock_downgrade+0x990/0x990
[ 588.562044] ? setup_sigcontext+0x7d0/0x7d0
[ 588.562755] ? do_raw_spin_trylock+0x190/0x190
[ 588.563526] ? task_work_add+0x10e/0x1b0
[ 588.564254] ? __put_unused_fd+0x183/0x250
[ 588.564943] ? alloc_fdtable+0x280/0x280
[ 588.565605] ? cpumask_weight.constprop.3+0x45/0x45
[ 588.566421] ? _copy_to_user+0xa2/0xc0
[ 588.567081] ? _raw_spin_unlock+0x22/0x30
[ 588.567755] ? fput+0xd2/0x140
[ 588.568273] ? SYSC_accept4+0x4ec/0x850
[ 588.568915] ? kernel_accept+0x2f0/0x2f0
[ 588.569575] exit_to_usermode_loop+0x214/0x310
[ 588.570318] ? trace_event_raw_event_sys_exit+0x260/0x260
[ 588.571219] ? __do_page_fault+0xd60/0xd60
[ 588.571911] syscall_return_slowpath+0x42f/0x510
[ 588.572675] ? finish_task_switch+0x1aa/0x740
[ 588.573397] ? prepare_exit_to_usermode+0x2d0/0x2d0
[ 588.574297] ? prepare_exit_to_usermode+0x1a0/0x2d0
[ 588.575119] ? perf_trace_sys_enter+0xc20/0xc20
[ 588.575883] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 588.576656] entry_SYSCALL_64_fastpath+0xbc/0xbe
[ 588.577445] RIP: 0033:0x44cfe9
[ 588.577963] RSP: 002b:00007ff97ff43db8 EFLAGS: 00000293 ORIG_RAX:
0000000000000120
[ 588.579224] RAX: fffffffffffffff2 RBX: 0000000000000000 RCX: 000000000044cfe9
[ 588.580397] RDX: 0000000020137ffc RSI: 0000000020b53ff0 RDI: 0000000000000003
[ 588.581557] RBP: 00007ff97ff43dd0 R08: 00007ff97ff44700 R09: 0000000000000000
[ 588.582718] R10: 0000000000080000 R11: 0000000000000293 R12: 0000000000000000
[ 588.583964] R13: 0000000000000000 R14: 00007ff97ff449c0 R15: 00007ff97ff44700
[ 588.585673] Kernel Offset: disabled
[ 588.586263] Rebooting in 86400 seconds..

Xin Long

unread,

Oct 26, 2017, 11:49:50 AM10/26/17

to Dmitry Vyukov, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Just wondering where we can get wheezy.img, if I can't download
somewhere, can you provide one if possible ?

I made some imgs before, with kernel built with the .config mail-list
usually gave, the guest always failed to boot.

Thanks.

Dmitry Vyukov

unread,

Oct 26, 2017, 12:14:12 PM10/26/17

to Xin Long, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Makes sense. Added image/key links here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#crash-does-not-reproduce

Here are commands to start qemu, ssh into the VM. This just worked for
me to reproduce the crash.

qemu-system-x86_64 -hda wheezy.img -net
user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic -nographic -kernel
arch/x86/boot/bzImage -append "kvm-intel.nested=1
kvm-intel.unrestricted_guest=1 kvm-intel.ept=1
kvm-intel.flexpriority=1 kvm-intel.vpid=1
kvm-intel.emulate_invalid_guest_state=1 kvm-intel.eptad=1
kvm-intel.enable_shadow_vmcs=1 kvm-intel.pml=1
kvm-intel.enable_apicv=1 console=ttyS0 root=/dev/sda

earlyprintk=serial vsyscall=native rodata=n oops=panic panic_on_warn=1
panic=86400" -enable-kvm -m 2G -smp 4 -cpu host -usb -usbdevice mouse
-usbdevice tablet -soundhw all

ssh -i wheezy.img.key -p 10022 -o UserKnownHostsFile=/dev/null -o
StrictHostKeyChecking=no -o IdentitiesOnly=yes root@localhost

Xin Long

unread,

Oct 26, 2017, 12:56:19 PM10/26/17

to Dmitry Vyukov, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Works, and be able to reproduce the issue. Thanks Dmitry.

Another thing is (you might also notice):
https://paste.fedoraproject.org/paste/N~htmOMPUSiIXGUeLH7yIw
This call trace always comes up after kernel has started.

Xin Long

unread,

Oct 26, 2017, 2:39:30 PM10/26/17

to Dmitry Vyukov, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Fix for this crash:
@@ -8276,6 +8279,7 @@ static void sctp_sock_migrate(struct sock
*oldsk, struct sock *newsk,
struct sk_buff *skb, *tmp;
struct sctp_ulpevent *event;
struct sctp_bind_hashbucket *head;
+ struct sctp_chunk *chunk;

/* Migrate socket buffer sizes and all the socket level options to the
* new socket.
@@ -8379,7 +8383,12 @@ static void sctp_sock_migrate(struct sock
*oldsk, struct sock *newsk,
* paths won't try to lock it and then oldsk.
*/
lock_sock_nested(newsk, SINGLE_DEPTH_NESTING);
+ list_for_each_entry(chunk, &assoc->outqueue.out_chunk_list, list)
+ skb_orphan(chunk->skb);
+
sctp_assoc_migrate(assoc, newsk);
+ list_for_each_entry(chunk, &assoc->outqueue.out_chunk_list, list)
+ sctp_set_owner_w(chunk);

Other lists in assoc->outqueue probably need to do the similar, will
check for sure later.

ChunYu Wang

unread,

Oct 26, 2017, 10:30:06 PM10/26/17

to Dmitry Vyukov, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Maybe I have just made some mistakes on understanding the reproduction
methods, will try it again.

Thanks,
- ChunYu

On Thu, Oct 26, 2017 at 10:49 PM, Dmitry Vyukov <dvy...@google.com> wrote:

Dmitry Vyukov

unread,

Oct 27, 2017, 2:08:12 AM10/27/17

to ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

On Fri, Oct 27, 2017 at 4:30 AM, ChunYu Wang <chun...@redhat.com> wrote:
> Maybe I have just made some mistakes on understanding the reproduction
> methods, will try it again.

This is reproducible with the C program. If bot posts it, it was able
to reproduce the bug with the compiled C program. If it was not able
to reproduce with a C program, then it will post just syzkaller
program.

To answer your question re running these programs. To reproduce one
needs to save this to a file:

#{Threaded:true Collide:true Repeat:false Procs:1 Sandbox:namespace
Fault:false FaultCall:-1 FaultNth:0 EnableTun:false UseTmpDir:true
HandleSegv:false WaitRepeat:false Debug:false Repro:false}

mmap(&(0x7f0000000000/0xb50000)=nil, 0xb50000, 0x3, 0x32,
0xffffffffffffffff, 0x0)
r0 = socket$inet_sctp(0x2, 0x1, 0x84)
listen(r0, 0x11000000000000c8)
accept4(r0, &(0x7f0000b54000-0x10)=@ethernet={0x0, @local={[0x0, 0x0,
0x0, 0x0, 0x0], 0x0}, [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]},
&(0x7f0000138000-0x4)=0x10, 0x80000)
listen(r0, 0x0)
sendto$inet(r0,
&(0x7f0000002000-0x68)="3755cecb8ecfa33eced658b46a028cba4565dff33dff05002377",
0x1a, 0x4, &(0x7f0000944000)={0x2, 0x3, @loopback=0x7f000001, [0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]}, 0x10)

and then run:

./syz-execprog -sandbox=namespace saved.prog.file

If syz-executor is not in the current dir, then also add -executor
/path/to/syz-executor.
-threaded and -collide flags are true by default, so it's not
necessary to add them in this case.

If it does not reproduce, it may be useful to run:

./syz-execprog -sandbox=namespace -procs=8 -repeat=0 saved.prog.file
i.e. repeat executing it in infinite loop with 8 parallel processes,
as lots of bugs are caused by races.

Dmitry Vyukov

unread,

Oct 27, 2017, 2:09:41 AM10/27/17

to Xin Long, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Great!

> Another thing is (you might also notice):
> https://paste.fedoraproject.org/paste/N~htmOMPUSiIXGUeLH7yIw
> This call trace always comes up after kernel has started.

Yes, I've noticed this one. It seems to happen on a first incoming
network connection (ssh/scp). I have not seen it before.

Eric Dumazet

unread,

Oct 27, 2017, 5:36:44 AM10/27/17

to Dmitry Vyukov, Xin Long, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

On Fri, 2017-10-27 at 08:09 +0200, Dmitry Vyukov wrote:

> Yes, I've noticed this one. It seems to happen on a first incoming
> network connection (ssh/scp). I have not seen it before.

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=timers/core&id=52f737c2da40259ac9962170ce608b6fb1b55ee4

( Google-Bug-Id: 68003409 )

Dmitry Vyukov

unread,

Oct 27, 2017, 5:43:38 AM10/27/17

to Eric Dumazet, Xin Long, ChunYu Wang, syzbot, David Miller, LKML, linux...@vger.kernel.org, netdev, Neil Horman, syzkall...@googlegroups.com, Vladislav Yasevich

Good!
I've noticed that it does not happen on latest Linus tree, now I know why.

Eric Biggers

unread,

Mar 10, 2018, 7:24:24 PM3/10/18

to syzbot, syzkall...@googlegroups.com

No longer occurring, fix was commit e46ad8d885901:

#syz fix: sctp: reset owner sk for data chunks on out queues when migrating a sock

- Eric

Reply all

Reply to author

Forward

WARNING in refcount_sub_and_test

syzbot

ChunYu Wang

Dmitry Vyukov

Xin Long

Dmitry Vyukov

Xin Long

Xin Long

ChunYu Wang

Dmitry Vyukov

Dmitry Vyukov

Eric Dumazet

Dmitry Vyukov

Eric Biggers