use-after-free in sctp_do_sm


Dmitry Vyukov

Nov 24, 2015, 4:16:18 AM
to vyas...@gmail.com, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hello,

The following program triggers use-after-free in sctp_do_sm:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>

int main()
{
long r0 = syscall(SYS_socket, 0xaul, 0x80805ul, 0x0ul, 0, 0, 0);
long r1 = syscall(SYS_mmap, 0x20000000ul, 0x10000ul, 0x3ul,
0x32ul, 0xfffffffffffffffful, 0x0ul);
memcpy((void*)0x20002fe4,
"\x0a\x00\x33\xe7\xeb\x9d\xcf\x61\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xc5\xc8\x88\x64",
28);
long r3 = syscall(SYS_bind, r0, 0x20002fe4ul, 0x1cul, 0, 0, 0);
memcpy((void*)0x20000faa,
"\x9b\x01\x7d\xcd\xb8\x6a\xc7\x3d\x09\x3a\x07\x00\xa7\xc4\xe9\xee\x0a\xd6\xec\xde\x26\x75\x5f\x22\xae\x4e\x33\x00\xb0\x76\x10\x70\xd6\xca\x19\xbc\x15\x83\xcf\x2e\xbc\x99\x0c\x5e\x83\x89\xc1\x44\x9c\x6e\x74\xd8\x5d\x5d\xd0\xf0\xdf\x47\xc0\x00\x71\x0b\x55\x4c\xab\xf0\xd8\x90\xd5\x92\x8c\x6e\x33\x22\x15\x5b\x19\xfb\xed\xdd\xa6\xac\xcb\x60\xcf\xe2\xde\xed\xdb\x95\x5c\xaa\x20\xa3",
94);
memcpy((void*)0x2000033a,
"\x02\x00\x33\xe2\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
128);
long r6 = syscall(SYS_sendto, r0, 0x20000faaul, 0x5eul,
0x81ul, 0x2000033aul, 0x80ul);
return 0;
}


==================================================================
BUG: KASAN: use-after-free in sctp_do_sm+0x42f6/0x4f60 at addr ffff880036fa80a8
Read of size 4 by task a.out/5664
=============================================================================
BUG kmalloc-4096 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in sctp_association_new+0x6f/0x1ea0 age=8 cpu=1 pid=5664
[< none >] kmem_cache_alloc_trace+0x1cf/0x220 ./mm/slab.c:3707
[< none >] sctp_association_new+0x6f/0x1ea0
[< none >] sctp_sendmsg+0x1954/0x28e0
[< none >] inet_sendmsg+0x316/0x4f0 ./net/ipv4/af_inet.c:802
[< inline >] __sock_sendmsg_nosec ./net/socket.c:641
[< inline >] __sock_sendmsg ./net/socket.c:651
[< none >] sock_sendmsg+0xca/0x110 ./net/socket.c:662
[< none >] SYSC_sendto+0x208/0x350 ./net/socket.c:1841
[< none >] SyS_sendto+0x40/0x50 ./net/socket.c:1862
[< none >] entry_SYSCALL_64_fastpath+0x16/0x7a

INFO: Freed in sctp_association_put+0x150/0x250 age=14 cpu=1 pid=5664
[< none >] kfree+0x199/0x1b0 ./mm/slab.c:1211
[< none >] sctp_association_put+0x150/0x250
[< none >] sctp_association_free+0x498/0x630
[< none >] sctp_do_sm+0xd8b/0x4f60
[< none >] sctp_primitive_SHUTDOWN+0xa9/0xd0
[< none >] sctp_close+0x616/0x790
[< none >] inet_release+0xed/0x1c0 ./net/ipv4/af_inet.c:471
[< none >] inet6_release+0x50/0x70 ./net/ipv6/af_inet6.c:416
[< inline >] constant_test_bit ././arch/x86/include/asm/bitops.h:321
[< none >] sock_release+0x8d/0x200 ./net/socket.c:601
[< none >] sock_close+0x16/0x20 ./net/socket.c:1188
[< none >] __fput+0x21d/0x6e0 ./fs/file_table.c:265
[< none >] ____fput+0x15/0x20 ./fs/file_table.c:84
[< none >] task_work_run+0x163/0x1f0 ./include/trace/events/rcu.h:20
[< inline >] __list_add ./include/linux/list.h:42
[< inline >] list_add_tail ./include/linux/list.h:76
[< inline >] list_move_tail ./include/linux/list.h:168
[< inline >] reparent_leader ./kernel/exit.c:618
[< inline >] forget_original_parent ./kernel/exit.c:669
[< inline >] exit_notify ./kernel/exit.c:697
[< none >] do_exit+0x809/0x2b90 ./kernel/exit.c:878
[< none >] do_group_exit+0x108/0x320 ./kernel/exit.c:985

INFO: Slab 0xffffea0000dbea00 objects=7 used=1 fp=0xffff880036fa8000
flags=0x100000000004080
INFO: Object 0xffff880036fa8000 @offset=0 fp=0xffff880036fad668
CPU: 1 PID: 5664 Comm: a.out Tainted: G B 4.4.0-rc1+ #81
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff880061d6f700 ffffffff825d3336 ffff88003e806d00
ffff880036fa8000 ffff880036fa8000 ffff880061d6f730 ffffffff81618784
ffff88003e806d00 ffffea0000dbea00 ffff880036fa8000 0000000000000000

Call Trace:
[<ffffffff8162131e>] __asan_report_load4_noabort+0x3e/0x40
[<ffffffff8475ac76>] sctp_do_sm+0x42f6/0x4f60
[<ffffffff847b50e9>] sctp_primitive_SHUTDOWN+0xa9/0xd0
[<ffffffff847a1426>] sctp_close+0x616/0x790
[<ffffffff8409bb0d>] inet_release+0xed/0x1c0 ./net/ipv4/af_inet.c:471
[<ffffffff84192cc0>] inet6_release+0x50/0x70 ./net/ipv6/af_inet6.c:416
[< inline >] constant_test_bit ././arch/x86/include/asm/bitops.h:321
[<ffffffff83dc78cd>] sock_release+0x8d/0x200 ./net/socket.c:601
[<ffffffff83dc7a56>] sock_close+0x16/0x20 ./net/socket.c:1188
[<ffffffff81662f5d>] __fput+0x21d/0x6e0 ./fs/file_table.c:265
[<ffffffff816634a5>] ____fput+0x15/0x20 ./fs/file_table.c:84
[<ffffffff812a33d3>] task_work_run+0x163/0x1f0 ./include/trace/events/rcu.h:20
[< inline >] __list_add ./include/linux/list.h:42
[< inline >] list_add_tail ./include/linux/list.h:76
[< inline >] list_move_tail ./include/linux/list.h:168
[< inline >] reparent_leader ./kernel/exit.c:618
[< inline >] forget_original_parent ./kernel/exit.c:669
[< inline >] exit_notify ./kernel/exit.c:697
[<ffffffff812505d9>] do_exit+0x809/0x2b90 ./kernel/exit.c:878
[<ffffffff81252ad8>] do_group_exit+0x108/0x320 ./kernel/exit.c:985
[<ffffffff81252d0d>] SyS_exit_group+0x1d/0x20 ./kernel/exit.c:1002
[<ffffffff84bf0c36>] entry_SYSCALL_64_fastpath+0x16/0x7a
==================================================================


I am on commit 90b55590c43258a157a2a143748455dcc50fbb53 of net-next (Nov 22).


Thanks

Dmitry Vyukov

Nov 24, 2015, 4:31:52 AM
to vyas...@gmail.com, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
strace output for your convenience:

socket(PF_INET6, SOCK_SEQPACKET|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 3
mmap(0x20000000, 65536, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
bind(3, {sa_family=AF_INET6, sin6_port=htons(13287),
inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=1640996331,
sin6_scope_id=1686685893}, 28) = 0
sendto(3, "\233\1}\315\270j\307=\t:\7\0\247\304\351\356\n\326\354\336&u_\"\256N3\0\260v\20p"...,
94, MSG_OOB|MSG_EOR, {sa_family=AF_INET, sin_port=htons(13282),
sin_addr=inet_addr("127.0.0.1")}, 128) = 94
exit_group(0) = ?
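
To match the raw bytes in the reproducer against this strace output, the first four bytes of each sockaddr can be decoded by hand. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, and it assumes a little-endian host (as on x86-64) for the family field, while the port is in network byte order.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* sa_family is stored in host byte order; on x86-64 that is little-endian. */
static uint16_t decode_family_le(const unsigned char *sa)
{
    return (uint16_t)(sa[0] | (sa[1] << 8));
}

/* The port follows in network (big-endian) byte order. */
static uint16_t decode_port_be(const unsigned char *sa)
{
    return (uint16_t)((sa[2] << 8) | sa[3]);
}

/* First four bytes of the bind() and sendto() addresses from the reproducer. */
static const unsigned char bind_addr_prefix[4]   = { 0x0a, 0x00, 0x33, 0xe7 };
static const unsigned char sendto_addr_prefix[4] = { 0x02, 0x00, 0x33, 0xe2 };
```

With these inputs, the bind() address decodes to family 10 (AF_INET6 on Linux) and port 13287, and the sendto() address to family 2 (AF_INET) and port 13282, which agrees with the strace lines above.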

Dmitry Vyukov

Nov 24, 2015, 5:10:53 AM
to Vladislav Yasevich, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
The right commit is:

commit 7d267278a9ece963d77eefec61630223fce08c6c
Author: Rainer Weikusat
Date: Fri Nov 20 22:07:23 2015 +0000
unix: avoid use-after-free in ep_remove_wait_queue

Neil Horman

Nov 24, 2015, 3:46:17 PM
to Dmitry Vyukov, Vladislav Yasevich, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
This commit doesn't seem to exist

Neil


Eric Dumazet

Nov 24, 2015, 4:08:05 PM
to Neil Horman, Dmitry Vyukov, Vladislav Yasevich, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
It does, in David Miller's net tree:

commit 7d267278a9ece963d77eefec61630223fce08c6c
Author: Rainer Weikusat <rwei...@mobileactivedefense.com>

David Miller

Nov 24, 2015, 4:12:23 PM
to nho...@tuxdriver.com, dvy...@google.com, vyas...@gmail.com, linux...@vger.kernel.org, net...@vger.kernel.org, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com, edum...@google.com, ma...@google.com
From: Neil Horman <nho...@tuxdriver.com>
Date: Tue, 24 Nov 2015 15:45:54 -0500

>> The right commit is:
>>
>> commit 7d267278a9ece963d77eefec61630223fce08c6c
>> Author: Rainer Weikusat
>> Date: Fri Nov 20 22:07:23 2015 +0000
>> unix: avoid use-after-free in ep_remove_wait_queue
> This commit doesn't seem to exist

It's in the 'net' tree. Which hasn't been pulled into 'net-next' for
a few days.

Vlad Yasevich

Nov 25, 2015, 10:12:31 AM
to Neil Horman, Dmitry Vyukov, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
I don't think this matters... I think what's happening is that a close is happening on a
socket still in connection initialization phase and we've never handled that particularly
well...

Net-next kernel with mem debugging hangs on boot for me with a ton of printks suppressed.
Will try the net kernel to see if that's better

-vlad

Dmitry Vyukov

Nov 28, 2015, 10:51:16 AM
to syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
This also seems to lead to the following WARNINGs:

------------[ cut here ]------------
WARNING: CPU: 3 PID: 21734 at kernel/jump_label.c:77 __static_key_slow_dec+0xfb/0x120()
jump label: negative count!
Modules linked in:
CPU: 3 PID: 21734 Comm: executor Tainted: G B W 4.4.0-rc2+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff88006083f660 ffffffff82719fc6 ffff88006083f6d0
ffff88003bbf8000 ffffffff85a612e0 ffff88006083f6a0 ffffffff81244ec9
ffffffff8152c54b ffffed000c107ed6 ffffffff85a612e0 000000000000004d
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82719fc6>] dump_stack+0x68/0x92 lib/dump_stack.c:50
[<ffffffff81244ec9>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460
[<ffffffff81244fd9>] warn_slowpath_fmt+0xa9/0xd0 kernel/panic.c:472
[<ffffffff8152c54b>] __static_key_slow_dec+0xfb/0x120 kernel/jump_label.c:76
[<ffffffff8152c5c1>] static_key_slow_dec+0x51/0x90 kernel/jump_label.c:100
[<ffffffff84962d9b>] net_disable_timestamp+0x3b/0x50 net/core/dev.c:1709
[<ffffffff84914d43>] sock_disable_timestamp+0x93/0xb0 net/core/sock.c:444
[<ffffffff8491f82c>] sk_destruct+0xec/0x440 net/core/sock.c:1457
[<ffffffff8491fbd7>] __sk_free+0x57/0x200 net/core/sock.c:1476
[<ffffffff8491fdb0>] sk_free+0x30/0x40 net/core/sock.c:1487
[< inline >] sock_put include/net/sock.h:1623
[<ffffffff854c8a18>] sctp_close+0x628/0x790 net/sctp/socket.c:1546
[<ffffffff84d4b3ed>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:413
[<ffffffff84e70240>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:406
[<ffffffff84909bbd>] sock_release+0x8d/0x1d0 net/socket.c:571
[<ffffffff84909d16>] sock_close+0x16/0x20 net/socket.c:1022
[<ffffffff81663a00>] __fput+0x220/0x770 fs/file_table.c:208
[<ffffffff81663fd5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff8129f673>] task_work_run+0x163/0x1f0 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff8124d9e9>] do_exit+0x809/0x2ae0 kernel/exit.c:750
[<ffffffff8124fe38>] do_group_exit+0x108/0x320 kernel/exit.c:880
[<ffffffff81271df7>] get_signal+0x597/0x1630 kernel/signal.c:2307
[<ffffffff8114c77f>] do_signal+0x7f/0x18e0 arch/x86/kernel/signal.c:709
[<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0 arch/x86/entry/common.c:247
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:282
[<ffffffff8100616f>] syscall_return_slowpath+0x19f/0x210 arch/x86/entry/common.c:344
[<ffffffff85955362>] int_ret_from_sys_call+0x25/0x9f arch/x86/entry/entry_64.S:281
---[ end trace 3e42717665ff2020 ]---


These WARNINGs always accompany the original use-after-free reports, and
I was not able to reproduce this WARNING with sctp_association_destroy
commented out.

For reference, here is the syzkaller program that triggers the WARNING:

r0 = socket(0xa, 0x1, 0x84)
mmap(&(0x7f0000000000)=nil, (0x10000), 0x3, 0x32, 0xffffffffffffffff, 0x0)
bind(r0, &(0x7f0000000000)="0a0033e049d02e70000000000000000000000000000000014c37ffc4", 0x1c)
connect(r0, &(0x7f0000001000)="020033d97f000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
0x80)
setsockopt$sock_int(r0, 0x1, 0x1d, &(0x7f0000001000+0x336)=0x1, 0x4)
listen(r0, 0xbb3)
r1 = accept(r0, &(0x7f0000003000+0xfd6)=nil, &(0x7f0000004000-0x2)=nil)

Marcelo Ricardo Leitner

Dec 3, 2015, 8:05:30 AM
to Dmitry Vyukov, vyas...@gmail.com, linux...@vger.kernel.org, netdev, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hi,

On Tue, Nov 24, 2015 at 10:15:57AM +0100, Dmitry Vyukov wrote:
>
> Call Trace:
> [<ffffffff8162131e>] __asan_report_load4_noabort+0x3e/0x40
> [<ffffffff8475ac76>] sctp_do_sm+0x42f6/0x4f60
> [<ffffffff847b50e9>] sctp_primitive_SHUTDOWN+0xa9/0xd0
> [<ffffffff847a1426>] sctp_close+0x616/0x790
> [<ffffffff8409bb0d>] inet_release+0xed/0x1c0 ./net/ipv4/af_inet.c:471
> [<ffffffff84192cc0>] inet6_release+0x50/0x70 ./net/ipv6/af_inet6.c:416
> [< inline >] constant_test_bit ././arch/x86/include/asm/bitops.h:321
> [<ffffffff83dc78cd>] sock_release+0x8d/0x200 ./net/socket.c:601
> [<ffffffff83dc7a56>] sock_close+0x16/0x20 ./net/socket.c:1188
> [<ffffffff81662f5d>] __fput+0x21d/0x6e0 ./fs/file_table.c:265
> [<ffffffff816634a5>] ____fput+0x15/0x20 ./fs/file_table.c:84
> [<ffffffff812a33d3>] task_work_run+0x163/0x1f0 ./include/trace/events/rcu.h:20
> [< inline >] __list_add ./include/linux/list.h:42

By any chance, did you have the pr_debug()s enabled?
Because that would trigger a use-after-free on debug_post_sfx()
macro expansion when the asoc is freed:

#define debug_post_sfx() \
pr_debug("%s[post-sfx]: error:%d, asoc:%p[%s]\n", __func__, error, \
asoc, sctp_state_tbl[(asoc && sctp_id2assoc(ep->base.sk, \
sctp_assoc2id(asoc))) ? asoc->state : SCTP_STATE_CLOSED])

Marcelo

Dmitry Vyukov

Dec 3, 2015, 8:46:03 AM
to syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
No, I don't. But pr_debug always computes its arguments. See no_printk
in printk.h. So this use-after-free happens for all users.

Eric Dumazet

Dec 3, 2015, 9:48:45 AM
to Dmitry Vyukov, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
>
> No, I don't. But pr_debug always computes its arguments. See no_printk
> in printk.h. So this use-after-free happens for all users.

Hmm.

pr_debug() should be a nop unless either DEBUG or CONFIG_DYNAMIC_DEBUG are set

On our production kernels, pr_debug() is a nop.

Can you double check ? Thanks !

Dmitry Vyukov

Dec 3, 2015, 10:56:04 AM
to Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Why should it be nop? no_printk thing in printk.h pretty much
explicitly makes it not a nop...

Double-checked: debug_post_sfx leads to some generated code:

debug_post_sfx();
ffffffff8229f256: 48 8b 85 58 fe ff ff mov -0x1a8(%rbp),%rax
ffffffff8229f25d: 48 85 c0 test %rax,%rax
ffffffff8229f260: 74 24 je ffffffff8229f286 <sctp_do_sm+0x176>
ffffffff8229f262: 8b b0 a8 00 00 00 mov 0xa8(%rax),%esi
ffffffff8229f268: 48 8b 85 60 fe ff ff mov -0x1a0(%rbp),%rax
ffffffff8229f26f: 44 89 85 74 fe ff ff mov %r8d,-0x18c(%rbp)
ffffffff8229f276: 48 8b 78 20 mov 0x20(%rax),%rdi
ffffffff8229f27a: e8 71 28 01 00 callq ffffffff822b1af0 <sctp_id2assoc>
ffffffff8229f27f: 44 8b 85 74 fe ff ff mov -0x18c(%rbp),%r8d

return error;
}
ffffffff8229f286: 48 81 c4 a0 01 00 00 add $0x1a0,%rsp
ffffffff8229f28d: 44 89 c0 mov %r8d,%eax
ffffffff8229f290: 5b pop %rbx
ffffffff8229f291: 41 5c pop %r12
ffffffff8229f293: 41 5d pop %r13
ffffffff8229f295: 41 5e pop %r14
ffffffff8229f297: 41 5f pop %r15
ffffffff8229f299: 5d pop %rbp
ffffffff8229f29a: c3 retq

Marcelo Ricardo Leitner

Dec 3, 2015, 11:15:18 AM
to Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, Dec 03, 2015 at 04:55:44PM +0100, Dmitry Vyukov wrote:
> On Thu, Dec 3, 2015 at 3:48 PM, Eric Dumazet <edum...@google.com> wrote:
> >>
> >> No, I don't. But pr_debug always computes its arguments. See no_printk
> >> in printk.h. So this use-after-free happens for all users.
> >
> > Hmm.
> >
> > pr_debug() should be a nop unless either DEBUG or CONFIG_DYNAMIC_DEBUG are set
> >
> > On our production kernels, pr_debug() is a nop.
> >
> > Can you double check ? Thanks !
>
>
> Why should it be nop? no_printk thing in printk.h pretty much
> explicitly makes it not a nop...
>
> Double-checked: debug_post_sfx leads to some generated code:

Oops. I was under that impression too: that it would sanity-check the
arguments while being optimized out.

I'll think about a fix for this.

Thanks,
Marcelo

Marcelo Ricardo Leitner

Dec 3, 2015, 11:51:40 AM
to Dmitry Vyukov, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski
These two are unrelated, actually.

Do you know if this accept() returned something? It seems so.
The problem seems to originate in
sctp_v6_create_accept_sk() -> sctp_copy_sock():

void sctp_copy_sock(struct sock *newsk, struct sock *sk,
struct sctp_association *asoc)
{
struct inet_sock *inet = inet_sk(sk);
struct inet_sock *newinet;

newsk->sk_type = sk->sk_type;
newsk->sk_bound_dev_if = sk->sk_bound_dev_if;
newsk->sk_flags = sk->sk_flags; <---

As the program enabled SO_TIMESTAMP on the listening socket, this flag is
copied to the new socket and triggers a second net_disable_timestamp() when
that second socket is destroyed, even though the matching enable was never
called for it.

This also happens via sctp peeloff operation.
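
The imbalance described above can be sketched in user space as follows. This is a toy model, not kernel code: every name in it is hypothetical, the counter merely stands in for the timestamp static key count, and the boolean stands in for the timestamp bit in sk_flags.

```c
#include <stdbool.h>

/* Toy model, not kernel code: all names below are hypothetical. */
static int timestamp_users;            /* stands in for the static key count */

struct toy_sock {
    bool timestamp_flag;               /* stands in for the timestamp bit in sk_flags */
};

/* setsockopt(SO_TIMESTAMP) path: set the flag and take a reference. */
static void toy_enable_timestamp(struct toy_sock *sk)
{
    if (!sk->timestamp_flag) {
        sk->timestamp_flag = true;
        timestamp_users++;
    }
}

/* Plain flag copy, as in the sctp_copy_sock() excerpt: no reference is taken. */
static void toy_copy_flags(struct toy_sock *newsk, const struct toy_sock *sk)
{
    newsk->timestamp_flag = sk->timestamp_flag;
}

/* Destruction path: drop a reference whenever the flag is set. */
static void toy_destroy(struct toy_sock *sk)
{
    if (sk->timestamp_flag) {
        sk->timestamp_flag = false;
        timestamp_users--;
    }
}

/* Returns the counter after the listener and the accepted socket are destroyed. */
static int toy_accept_and_close(void)
{
    struct toy_sock listener = { false }, accepted = { false };

    timestamp_users = 0;
    toy_enable_timestamp(&listener);        /* count: 1 */
    toy_copy_flags(&accepted, &listener);   /* count still 1, but two flags set */
    toy_destroy(&accepted);                 /* count: 0 */
    toy_destroy(&listener);                 /* count: -1 */
    return timestamp_users;
}
```

In this model toy_accept_and_close() ends with the counter at -1, which corresponds to the "jump label: negative count!" warning: two sockets carry the flag, but only one reference was ever taken.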

Marcelo

Eric Dumazet

Dec 3, 2015, 12:02:01 PM
to Dmitry Vyukov, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
This is a serious concern, because in the past we accepted a lot of patches
converting traditional

#ifdef DEBUG
# define some_hand_coded_ugly_debug() printk( ....)
#else
# define some_hand_coded_ugly_debug()
#endif

on the premise that pr_debug() would be a nop.

It seems it is not always the case. This is a very serious problem.

We probably have hundreds of potential bugs, because few people
actually make sure all debugging stuff is correct,
just as comments can be wrong because they are not updated properly over time.

It is definitely a nop for many cases.

+void eric_test_pr_debug(struct sock *sk)
+{
+ if (atomic_read(&sk->sk_omem_alloc))
+ pr_debug("%s: optmem leakage for sock %p\n",
+ __func__, sk);
+}

->

0000000000004740 <eric_test_pr_debug>:
4740: e8 00 00 00 00 callq 4745 <eric_test_pr_debug+0x5>
4741: R_X86_64_PC32 __fentry__-0x4
4745: 55 push %rbp
4746: 8b 87 24 01 00 00 mov 0x124(%rdi),%eax // atomic_read() but nothing follows
474c: 48 89 e5 mov %rsp,%rbp
474f: 5d pop %rbp
4750: c3 retq

Dmitry Vyukov

Dec 3, 2015, 12:13:00 PM
to Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I would expect it to be a nop when argument evaluation does not have
side effects. For example, the compiler will most likely elide a load of
a variable (though it does not have to, because the load is spelled out
in the code, so it can also legally emit the load and not use the
result).
But if argument computation has side effects (or the compiler can't prove
otherwise), it must emit code. It must emit code for function calls
when the function is defined in a different translation unit, for
volatile accesses (most likely including atomic accesses), etc.
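
The point about side effects can be reproduced in user space. The sketch below is an illustration under stated assumptions, not the kernel's code: swallow_printf() is a hypothetical stand-in for an empty variadic inline function in the style of no_printk(), and lookup_state() is a hypothetical argument with an observable side effect.

```c
static int lookups;                    /* counts calls to lookup_state() */

/* Hypothetical stand-in for a debug argument with a side effect. */
static int lookup_state(void)
{
    lookups++;
    return 42;
}

/* Models an empty variadic inline function that swallows debug output. */
static inline int swallow_printf(const char *fmt, ...)
{
    (void)fmt;
    return 0;
}

/* The "disabled" debug macro still expands to a real function call. */
#define DEBUG_VIA_FUNCTION(fmt, ...) swallow_printf(fmt, __VA_ARGS__)

/* Returns how many times lookup_state() ran for one debug statement. */
static int count_evaluations_via_function(void)
{
    lookups = 0;
    DEBUG_VIA_FUNCTION("state=%d\n", lookup_state());
    return lookups;
}
```

Here count_evaluations_via_function() returns 1: nothing is printed, yet the argument is evaluated because C evaluates call arguments before the call, and the increment is an observable side effect the compiler cannot drop.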

Marcelo Ricardo Leitner

Dec 3, 2015, 12:44:05 PM
to Vlad Yasevich, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski, Dmitry Vyukov
Vlad, others,

It's been a long time but this was introduced by commit 914e1c8b6980
("sctp: Inherit all socket options from parent correctly."). This is not
very consistent with how other protocols work and it will be hard to
keep tracking a negative mask of flags that we can't copy.

I reviewed the list of options and I'm thinking that only
SO_BINDTODEVICE is worth copying, leaving the others for the application
to re-set, as it is for other protocols. So I'm thinking of simply:

- newsk->sk_flags = sk->sk_flags;
+ newsk->sk_flags = sk->sk_flags & SO_BINDTODEVICE;

in the above.

What do you think?

Marcelo

Eric Dumazet

Dec 3, 2015, 12:59:13 PM
to Marcelo Ricardo Leitner, Vlad Yasevich, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski, Dmitry Vyukov
On Thu, 2015-12-03 at 15:43 -0200, Marcelo Ricardo Leitner wrote:

> Vlad, others,
>
> It's been a long time but this was introduced by commit 914e1c8b6980
> ("sctp: Inherit all socket options from parent correctly."). This is not
> very consistent with how other protocols work and it will be hard to
> keep tracking a negative mask of flags that we can't copy.
>
> I reviewed the list of options and I'm thinking that only
> SO_BINDTODEVICE is worth copying, leaving the others for the application
> to re-set, as it is for other protocols. So I'm thinking on simply:
>
> - newsk->sk_flags = sk->sk_flags;
> + newsk->sk_flags = sk->sk_flags & SO_BINDTODEVICE;
>
> in the above.
>
> What do you think?

I think SO_BINDTODEVICE is not a flag ;)

#define SO_BINDTODEVICE 25
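
To make the difference concrete, here is a small sketch of what masking a flag word with an option number does compared with masking with a bit mask. The FLAG_* names and bit positions are hypothetical and do not correspond to the kernel's sock flag layout; only the value 25 comes from the #define above.

```c
/* Illustration only: FLAG_* names and bit positions are hypothetical. */
#define OPT_BINDTODEVICE 25UL          /* option number, as in the #define above */

enum { FLAG_A_BIT = 0, FLAG_B_BIT = 1, FLAG_C_BIT = 3 };

/* Masking with an option number keeps whatever bits 25 (binary 11001) covers. */
static unsigned long mask_with_option_number(unsigned long flags)
{
    return flags & OPT_BINDTODEVICE;
}

/* Masking with a real bit mask keeps exactly one flag. */
static unsigned long mask_with_flag_bit(unsigned long flags, int bit)
{
    return flags & (1UL << bit);
}
```

With every bit set in the input, mask_with_option_number() keeps bits 0, 3 and 4 (value 25), which are three unrelated flags, whereas mask_with_flag_bit() keeps only the requested one.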


Marcelo

Dec 3, 2015, 1:06:41 PM
to Eric Dumazet, Vlad Yasevich, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski, Dmitry Vyukov
Oops, indeed!
Idea persists.
Thx!
--
Sent from mobile. Please excuse my brevity.

Vlad Yasevich

Dec 3, 2015, 1:35:40 PM
to Marcelo, Eric Dumazet, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski, Dmitry Vyukov
Hmm... sk_clone_lock() appears to copy the flags as well, so it would
appear the tcp accept() sockets would also have timestamping set.

I can see how we probably shouldn't be copying sk_flags, as there isn't
much there that needs to be set.

-vlad


Marcelo

Dec 3, 2015, 1:43:24 PM
to Vlad Yasevich, Eric Dumazet, syzkaller, Neil Horman, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, Maciej Żenczykowski, Dmitry Vyukov
Ahh right, through a memcpy. I completely missed that.

And later on it does:
if (sock_needs_netstamp(sk) &&
newsk->sk_flags & SK_FLAGS_TIMESTAMP)
net_enable_timestamp();

> I can see how we probably shouldn't being copying sk_flags as there isn't
> much there that need to be set.

I take that back then, we can enable timestamp like the above instead.
I'll test and post a patch soon.
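
The approach of taking a reference when a copied flag is set can be sketched in user space as follows. This is a toy model with hypothetical names, not the patch itself; it only illustrates why pairing the copy with an enable keeps the count balanced.

```c
#include <stdbool.h>

/* Toy model, not kernel code: all names below are hypothetical. */
static int ts_refs;                    /* stands in for the static key count */

struct model_sock {
    bool ts_flag;                      /* stands in for the timestamp bit */
};

/* Copy the flag and, if it is set, take a matching reference for the copy. */
static void model_copy_flags_balanced(struct model_sock *newsk,
                                      const struct model_sock *sk)
{
    newsk->ts_flag = sk->ts_flag;
    if (newsk->ts_flag)
        ts_refs++;
}

/* Destruction path: drop a reference whenever the flag is set. */
static void model_destroy(struct model_sock *sk)
{
    if (sk->ts_flag) {
        sk->ts_flag = false;
        ts_refs--;
    }
}

/* Returns the counter after both sockets are destroyed. */
static int model_balanced_accept_and_close(void)
{
    struct model_sock listener = { true }, accepted = { false };

    ts_refs = 1;                                       /* held by the listener */
    model_copy_flags_balanced(&accepted, &listener);   /* count: 2 */
    model_destroy(&accepted);                          /* count: 1 */
    model_destroy(&listener);                          /* count: 0 */
    return ts_refs;
}
```

In this model the counter returns to 0 after both sockets are gone, because each socket that carries the flag also holds its own reference.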

Thanks,
Marcelo

Aaron Conole

Dec 3, 2015, 1:52:17 PM
to Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Joe Perches
Dmitry Vyukov <dvy...@google.com> writes:
> On Thu, Dec 3, 2015 at 6:02 PM, Eric Dumazet <edum...@google.com> wrote:
>> On Thu, Dec 3, 2015 at 7:55 AM, Dmitry Vyukov <dvy...@google.com> wrote:
>>> On Thu, Dec 3, 2015 at 3:48 PM, Eric Dumazet <edum...@google.com> wrote:
>>>>>
>>>>> No, I don't. But pr_debug always computes its arguments. See no_printk
>>>>> in printk.h. So this use-after-free happens for all users.
>>>>
>>>> Hmm.
>>>>
>>>> pr_debug() should be a nop unless either DEBUG or
>>>> CONFIG_DYNAMIC_DEBUG are set
>>>>
>>>> On our production kernels, pr_debug() is a nop.
>>>>
>>>> Can you double check ? Thanks !
>>>
>>>
>>> Why should it be nop? no_printk thing in printk.h pretty much
>>> explicitly makes it not a nop...

Because it was until commit 5264f2f75d8. It also violates my reading of
the following from printk.h:

* All of these will print unconditionally, although note that pr_debug()
* and other debug macros are compiled out unless either DEBUG is defined
* or CONFIG_DYNAMIC_DEBUG is set.
+1

>> #ifdef DEBUG
>> # define some_hand_coded_ugly_debug() printk( ...._
>> #else
>> # define some_hand_coded_ugly_debug()
>> #endif
>>
>> On the premise pr_debug() would be a nop.
>>
>> It seems it is not always the case. This is a very serious problem.

+1
This isn't 100% true. As you state, in order to reach the return 0, all
side effects must be evaluated. Load generally does not have side
effects, so it can be safely elided, but function() must be emitted.

However, that is _not_ required to get the desired warning emission on a
printf argument function, see http://pastebin.com/UHuaydkj for an
example.

I think that as a minimum, the following patch should be evaluated, but am
unsure to whom I should submit it (after I test):

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 9729565..cd24d2d 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -286,7 +286,7 @@ extern asmlinkage void dump_stack(void) __cold;
printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
#else
#define pr_debug(fmt, ...) \
- no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
+ ({ if(0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); 0;})
#endif

/*

Joe Perches

Dec 3, 2015, 2:07:06 PM
to Aaron Conole, Dmitry Vyukov, Andrew Morton, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, 2015-12-03 at 13:52 -0500, Aaron Conole wrote:
> Dmitry Vyukov <dvy...@google.com> writes:
> > On Thu, Dec 3, 2015 at 6:02 PM, Eric Dumazet <edum...@google.com> wrote:
> > > On Thu, Dec 3, 2015 at 7:55 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> > > > On Thu, Dec 3, 2015 at 3:48 PM, Eric Dumazet wrote:
> > > > > >
> > > > > > No, I don't. But pr_debug always computes its arguments. See no_printk
> > > > > > in printk.h. So this use-after-free happens for all users.
> > > > >
> > > > > Hmm.
> > > > >
> > > > > pr_debug() should be a nop unless either DEBUG or
> > > > > CONFIG_DYNAMIC_DEBUG are set
> > > > >
> > > > > On our production kernels, pr_debug() is a nop.
> > > > >
> > > > > Can you double check ? Thanks !
> > > >
> > > >
> > > > Why should it be nop? no_printk thing in printk.h pretty much
> > > > explicitly makes it not a nop...
>
> Because it was until commit 5264f2f75d8. It also violates my reading of
> the following from printk.h:
>
>  * All of these will print unconditionally, although note that pr_debug()
>  * and other debug macros are compiled out unless either DEBUG is defined
>  * or CONFIG_DYNAMIC_DEBUG is set.
>
> > > >
> > > > Double-checked: debug_post_sfx leads to some generated code:
> > > >
> > > >         debug_post_sfx();
> > > > ffffffff8229f256:       48 8b 85 58 fe ff ff    mov    -0x1a8(%rbp),%rax
> > > > ffffffff8229f25d:       48 85 c0                test   %rax,%rax
> > > > ffffffff8229f260:       74 24                   je     ffffffff8229f286 <sctp_do_sm+0x176>
> > > > ffffffff8229f262:       8b b0 a8 00 00 00       mov    0xa8(%rax),%esi
> > > > ffffffff8229f268:       48 8b 85 60 fe ff ff    mov    -0x1a0(%rbp),%rax
> > > > ffffffff8229f26f:       44 89 85 74 fe ff ff    mov    %r8d,-0x18c(%rbp)
> > > > ffffffff8229f276:       48 8b 78 20             mov    0x20(%rax),%rdi
> > > > ffffffff8229f27a:       e8 71 28 01 00          callq  ffffffff822b1af0 <sctp_id2assoc>
> > > 0000000000004740 <eric_test_pr_debug>:
> > >     4740: e8 00 00 00 00       callq  4745 <eric_test_pr_debug+0x5>
> > > 4741: R_X86_64_PC32 __fentry__-0x4
> > >     4745: 55                   push   %rbp
> > >     4746: 8b 87 24 01 00 00     mov    0x124(%rdi),%eax     // atomic_read()  but nothing follows
> > >     474c: 48 89 e5             mov    %rsp,%rbp
> > >     474f: 5d                   pop    %rbp
> > >     4750: c3                   retq
> >
> >
> >
> > I would expect that it is nop when argument evaluation does not have
> > side-effects. For example, for a load of a variable compiler will most
> > likely elide it (though, it does not have to elide it, because the
> > load is spelled in the code, so it can also legally emit the load and
> > doesn't use the result).
> > But if argument computation has side-effect (or compiler can't prove
> > otherwise), it must emit code. It must emit code for function calls
> > when the function is defined in a different translation unit, and for
> > volatile accesses (most likely including atomic accesses), etc
>
> This isn't 100% true. As you state, in order to reach the return 0, all
> side effects must be evaluated. Load generally does not have side
> effects, so it can be safely elided, but function() must be emitted.
>
> However, that is _not_ required to get the desired warning emission on a
> printf argument function, see http://pastebin.com/UHuaydkj for an
> example.
>
> I think that as a minimum, the following patch should be evaluted, but am
> unsure to whom I should submit it (after I test):

Andrew Morton <ak...@linux-foundation.org> (cc'd)

> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 9729565..cd24d2d 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -286,7 +286,7 @@ extern asmlinkage void dump_stack(void) __cold;
>         printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
>  #else
>  #define pr_debug(fmt, ...) \
> -       no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
> +       ({ if(0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); 0;})

More common is to use do {} while (0) instead of a
statement expression.

I think it'd be good to change pr_debug and variants to
do { if (0) no_printk(...); } while (0)
or some other form that completely eliminates all the
side-effects/function evaluations.

I think the same should be true when CONFIG_PRINTK is
not enabled.

https://lkml.org/lkml/2014/12/3/696
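
A user-space sketch of the do { if (0) ... } while (0) form is shown below. It is an illustration under stated assumptions rather than the proposed kernel change: printf() stands in for printk(), and probe_state() is a hypothetical argument with an observable side effect.

```c
#include <stdio.h>

static int probe_calls;                /* counts calls to probe_state() */

/* Hypothetical stand-in for a debug argument with a side effect. */
static int probe_state(void)
{
    probe_calls++;
    return 42;
}

/* Dead branch: the call is still type-checked, but never executed. */
#define DEBUG_COMPILED_OUT(fmt, ...) \
    do { if (0) printf(fmt, __VA_ARGS__); } while (0)

/* Returns how many times probe_state() ran for one debug statement. */
static int count_evaluations_compiled_out(void)
{
    probe_calls = 0;
    DEBUG_COMPILED_OUT("state=%d\n", probe_state());
    return probe_calls;
}
```

Here count_evaluations_compiled_out() returns 0: the compiler still sees the printf() call and can check the format string against its arguments, but the branch is never taken, so the argument expression is not evaluated at run time.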

Jason Baron

Dec 3, 2015, 2:32:14 PM
to Aaron Conole, Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Joe Perches
Agreed - the intention here is certainly to have no side effects. It
looks like 'no_printk()' is used in quite a few other places that would
benefit from this change. So we probably want a generic
'really_no_printk()' macro.

Thanks,

-Jason

>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 9729565..cd24d2d 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -286,7 +286,7 @@ extern asmlinkage void dump_stack(void) __cold;
> printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
> #else
> #define pr_debug(fmt, ...) \
> - no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
> + ({ if(0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); 0;})
> #endif
>
> /*

Joe Perches

Dec 3, 2015, 3:03:07 PM
to Jason Baron, Aaron Conole, Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
> On 12/03/2015 01:52 PM, Aaron Conole wrote:
> > I think that as a minimum, the following patch should be evaluated,
> > but am unsure to whom I should submit it (after I test):
[]
> Agreed - the intention here is certainly to have no side effects. It
> looks like 'no_printk()' is used in quite a few other places that would
> benefit from this change. So we probably want a generic
> 'really_no_printk()' macro.

https://lkml.org/lkml/2012/6/17/231

Jason Baron

Dec 3, 2015, 3:10:16 PM12/3/15
to Joe Perches, Aaron Conole, Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I don't see this in the tree. Also maybe we should just convert
no_printk() to do what your 'eliminated_printk()' does. So we can convert
all users with this change?

Thanks,

-Jason

Joe Perches

Dec 3, 2015, 3:24:11 PM12/3/15
to Jason Baron, Aaron Conole, Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, 2015-12-03 at 15:10 -0500, Jason Baron wrote:
> On 12/03/2015 03:03 PM, Joe Perches wrote:
> > On Thu, 2015-12-03 at 14:32 -0500, Jason Baron wrote:
> > > On 12/03/2015 01:52 PM, Aaron Conole wrote:
> > > > I think that as a minimum, the following patch should be evaluated,
> > > > but am unsure to whom I should submit it (after I test):
> > []
> > > Agreed - the intention here is certainly to have no side effects. It
> > > looks like 'no_printk()' is used in quite a few other places that would
> > > benefit from this change. So we probably want a generic
> > > 'really_no_printk()' macro.
> >
> > https://lkml.org/lkml/2012/6/17/231
>
> I don't see this in the tree.

It never got applied.

> Also maybe we should just convert
> no_printk() to do what your 'eliminated_printk()'.

Some of them at least.

> So we can convert all users with this change?

I don't think so, I think there are some
function evaluation/side effects that are
required.  I believe some do hardware I/O.

It'd be good to at least isolate them.

I'm not sure how to find them via some
automated tool/mechanism though.

I asked Julia Lawall about it once in this
thread: https://lkml.org/lkml/2014/12/3/696

Jason Baron

Dec 3, 2015, 3:42:59 PM12/3/15
to Joe Perches, Aaron Conole, Dmitry Vyukov, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Seems rather fragile to have side effects that we rely
upon hidden in a printk().

Just convert them and see what breaks :)

Joe Perches

Dec 3, 2015, 3:51:30 PM12/3/15
to Jason Baron, Aaron Conole, Dmitry Vyukov, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
(adding lkml as this is likely better discussed there)
Yup.

> Just convert them and see what breaks :)

I appreciate your optimism.  It's very 1995.
Try it and see what happens.

Dmitry Vyukov

Dec 4, 2015, 5:40:23 AM12/4/15
to Joe Perches, Jason Baron, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Whatever is the resolution for pr_debug, we still need to fix this
particular use-after-free. It affects stability of debug builds, gives
invalid debug output, prevents us from finding more bugs in SCTP. And
maybe somebody uses CONFIG_DYNAMIC_DEBUG in production.

Dmitry Vyukov

Dec 4, 2015, 5:41:56 AM12/4/15
to Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin, LKML
FWIW I enabled CONFIG_DYNAMIC_DEBUG on my fuzzer. Not that it gives
any particular guarantees, but still can catch some of these.

Marcelo Ricardo Leitner

Dec 4, 2015, 7:55:51 AM12/4/15
to Dmitry Vyukov, Joe Perches, Jason Baron, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Agreed. I'm already working on a fix for this particular use-after-free.

Another interesting thing about this is that sctp_do_sm() is called for
nearly everything that happens on an SCTP socket. That said, the
always-running IDR search hidden in that debug statement does have a
nasty performance impact, especially because it's serialized on a
spinlock. This wouldn't happen if the statement were fully elided, and it
would be OK if that pr_debug() were really being printed, but not as it
is. Kudos to this report, which let me notice this. I'm trying to fix
this on the SCTP side as well.

Marcelo

Vlad Yasevich

Dec 4, 2015, 10:37:30 AM12/4/15
to Marcelo Ricardo Leitner, Dmitry Vyukov, Joe Perches, Jason Baron, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
YUCK! I didn't really pay much attention to those debug macros before, but
debug_post_sfx() is truly awful.

This wasn't such a bad thing when these macros depended on CONFIG_SCTP_DEBUG,
but now that they are always built, we need to fix them.

-vlad

Aaron Conole

Dec 4, 2015, 10:51:42 AM12/4/15
to Vlad Yasevich, Marcelo Ricardo Leitner, Dmitry Vyukov, Joe Perches, Jason Baron, Andrew Morton, LKML, Eric Dumazet, syzkaller, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I've proposed a patch to linux-kernel to fix them, but I don't think
it's really as bad as folks imagine. Ubuntu, RHEL, and Fedora all use the
DYNAMIC_DEBUG configuration option, which means that the code is getting
emitted anyway (correctly, I'll add) and is shunted out by a dynamic
debug flag. So for the average user, it's not even really a blip.

That does mean there's a cool side-effect of the entire print-macro setup
which implies we execute less code when running with DYNAMIC_DEBUG=y in
the "normal" case. "Turn on the dynamic debugging config and watch
everything get better" isn't the worst mantra, is it? :)

Dmitry Vyukov

Dec 4, 2015, 11:12:32 AM12/4/15
to Joe Perches, Jason Baron, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, Dec 3, 2015 at 9:51 PM, Joe Perches <j...@perches.com> wrote:
But Aaron says that DYNAMIC_DEBUG is enabled in most major
distributions, and all these side-effects don't happen with
DYNAMIC_DEBUG. This suggests that we can make these side-effects not
happen without DYNAMIC_DEBUG as well.
Or am I missing something here?

Jason Baron

Dec 4, 2015, 11:47:53 AM12/4/15
to Dmitry Vyukov, Joe Perches, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
When DYNAMIC_DEBUG is enabled we have this wrapper from
include/linux/dynamic_debug.h:

if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
<do debug stuff>

So the compiler is not emitting the side-effects in this
case.

> This suggests that we can make these side-effects not
> happen without DYNAMIC_DEBUG as well.
> Or I am missing something here?
>

When DYNAMIC_DEBUG is disabled we are instead replacing
pr_debug() with the 'no_printk()' function as you've pointed
out. We are changing this to emit no code at all:

http://marc.info/?l=linux-kernel&m=144918276518878&w=2

Thanks,

-Jason

Joe Perches

Dec 4, 2015, 12:03:15 PM12/4/15
to Jason Baron, Dmitry Vyukov, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, 2015-12-04 at 11:47 -0500, Jason Baron wrote:
> When DYNAMIC_DEBUG is enabled we have this wrapper from
> include/linux/dynamic_debug.h:
>
> if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
> <do debug stuff>
>
> So the compiler is not emitting the side-effects in this
> case.

Huh?  Do I misunderstand what you are writing?

You are testing a variable that is not generally set
so the call is not being performed in the general case,
but the compiler can not elide the code.

If the variable was enabled via the control file, the
__dynamic_pr_debug would be performed with the
use-after-free.

Jason Baron

Dec 4, 2015, 12:11:06 PM12/4/15
to Joe Perches, Dmitry Vyukov, Aaron Conole, Andrew Morton, LKML, Eric Dumazet, syzkaller, Vladislav Yasevich, linux...@vger.kernel.org, netdev, Kostya Serebryany, Alexander Potapenko, Sasha Levin


On 12/04/2015 12:03 PM, Joe Perches wrote:
> On Fri, 2015-12-04 at 11:47 -0500, Jason Baron wrote:
>> When DYNAMIC_DEBUG is enabled we have this wrapper from
>> include/linux/dynamic_debug.h:
>>
>> if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT))
>> <do debug stuff>
>>
>> So the compiler is not emitting the side-effects in this
>> case.
>
> Huh? Do I misunderstand what you are writing?

Yes, I wasn't terribly clear - I was trying to say that the
'side-effects', in this case the debug code and use-after-free, are
hidden behind the branch. They aren't invoked unless we enable the debug
statement.

Thanks,

-Jason
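
To make the distinction concrete, here is a small user-space sketch of a run-time guarded debug statement. It is only an illustration under stated assumptions: struct dbg_site, DBG_FLAG_PRINT and dyn_dbg() are hypothetical names loosely modelled on the descriptor test quoted above, not the definitions from include/linux/dynamic_debug.h.

```c
#include <assert.h>
#include <stdio.h>

#define DBG_FLAG_PRINT 0x1u /* hypothetical flag, modelled on _DPRINTK_FLAGS_PRINT */

/* Hypothetical per-call-site descriptor whose flags can change at run time. */
struct dbg_site {
    unsigned int flags;
};

/* Counts how often the argument expression is evaluated. */
static int lookups;

static int expensive_lookup(void)
{
    lookups++;
    return 7;
}

/* The arguments sit behind the branch: the code is compiled in, but it
 * only runs when the flag is set. */
#define dyn_dbg(site, fmt, ...) \
    do { \
        if ((site)->flags & DBG_FLAG_PRINT) \
            printf(fmt, ##__VA_ARGS__); \
    } while (0)

/* Number of lookups caused by one debug statement with the given flags. */
int lookups_for_flags(unsigned int flags)
{
    struct dbg_site site = { .flags = flags };

    lookups = 0;
    dyn_dbg(&site, "value=%d\n", expensive_lookup());
    return lookups;
}
```

With the flag clear the lookup is skipped, which matches the "hidden behind the branch" description. With the flag set the lookup runs, so a bug inside the argument expression (here, the use-after-free) is still reachable once the statement is enabled.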

Marcelo Ricardo Leitner

Dec 4, 2015, 12:14:58 PM12/4/15
to net...@vger.kernel.org, Vlad Yasevich, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
Hi,

These are a couple of fixes regarding sctp/packet timestamps.

Dmitry Vyukov <dvy...@google.com> reported the counter leak on missing
net_enable_timestamp() (2nd patch) and further testing here revealed the
other two issues.

Please consider these for -stable.

Thanks!

Marcelo Ricardo Leitner (3):
sctp: use the same clock as if sock source timestamps were on
sctp: update the netstamp_needed counter when copying sockets
sctp: also copy sk_tsflags when copying the socket

include/net/sock.h | 2 ++
net/core/sock.c | 2 --
net/sctp/sm_make_chunk.c | 4 ++--
net/sctp/socket.c | 4 ++++
4 files changed, 8 insertions(+), 4 deletions(-)

--
2.5.0

Marcelo Ricardo Leitner

Dec 4, 2015, 12:15:00 PM12/4/15
to net...@vger.kernel.org, Vlad Yasevich, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
SCTP echoes a cookie on INIT ACK chunks that contains a timestamp, for
detecting stale cookies. This cookie is echoed back to the server by the
client and then that timestamp is checked.

Thing is, if the listening socket is using packet timestamping, the
cookie is encoded with ktime_get() value and checked against
ktime_get_real(), as done by __net_timestamp().

The fix is to make sctp also use ktime_get_real(), so we can compare bananas
with bananas later no matter if packet timestamping was enabled or not.

Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime")
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
net/sctp/sm_make_chunk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 763e06a55155b2a9e0a9d918ecc1fe2dd6d9e0c0..5d6a03fad3789a12290f5f14c5a7efa69c98f41a 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1652,7 +1652,7 @@ static sctp_cookie_param_t *sctp_pack_cookie(const struct sctp_endpoint *ep,

/* Set an expiration time for the cookie. */
cookie->c.expiration = ktime_add(asoc->cookie_life,
- ktime_get());
+ ktime_get_real());

/* Copy the peer's init packet. */
memcpy(&cookie->c.peer_init[0], init_chunk->chunk_hdr,
@@ -1780,7 +1780,7 @@ no_hmac:
if (sock_flag(ep->base.sk, SOCK_TIMESTAMP))
kt = skb_get_ktime(skb);
else
- kt = ktime_get();
+ kt = ktime_get_real();

if (!asoc && ktime_before(bear_cookie->expiration, kt)) {
/*
--
2.5.0
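
The clock mismatch described in this patch can be mimicked in user space with POSIX clocks. This is a sketch under stated assumptions, not the kernel code: CLOCK_MONOTONIC stands in for ktime_get(), CLOCK_REALTIME for ktime_get_real(), and now_ns(), cookie_is_stale() and the two stale_* helpers are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current time of the given clock in nanoseconds. */
static int64_t now_ns(clockid_t clk)
{
    struct timespec ts;

    clock_gettime(clk, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Mirrors the ktime_before(expiration, kt) check: stale if already past. */
int cookie_is_stale(int64_t expiration_ns, int64_t now)
{
    return expiration_ns < now;
}

/* Expiration stamped with the monotonic clock, checked against wall time. */
int stale_with_mixed_clocks(int64_t life_ns)
{
    int64_t expiration = now_ns(CLOCK_MONOTONIC) + life_ns;

    return cookie_is_stale(expiration, now_ns(CLOCK_REALTIME));
}

/* Expiration stamped and checked with the same wall clock. */
int stale_with_same_clock(int64_t life_ns)
{
    int64_t expiration = now_ns(CLOCK_REALTIME) + life_ns;

    return cookie_is_stale(expiration, now_ns(CLOCK_REALTIME));
}
```

On a typical system the monotonic clock counts from boot while the wall clock counts from 1970, so stale_with_mixed_clocks() is expected to report a freshly made cookie as stale, whereas stale_with_same_clock() does not for any reasonable lifetime.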

Marcelo Ricardo Leitner

Dec 4, 2015, 12:15:04 PM12/4/15
to net...@vger.kernel.org, Vlad Yasevich, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.

When SCTP accepts an association or peels one off, it copies sock flags
but forgets to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.

The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
include/net/sock.h | 2 ++
net/core/sock.c | 2 --
net/sctp/socket.c | 3 +++
3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 52d27ee924f47867026d8f65c65551a9137219d3..b1d475b5db6825e13df3e3e147fed8654e1cf086 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -740,6 +740,8 @@ enum sock_flags {
SOCK_SELECT_ERR_QUEUE, /* Wake select on error queue */
};

+#define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
+
static inline void sock_copy_flags(struct sock *nsk, struct sock *osk)
{
nsk->sk_flags = osk->sk_flags;
diff --git a/net/core/sock.c b/net/core/sock.c
index e31dfcee1729aa23bdd2ed692fda1b90bd75afb8..d01c8f42dbb2f040fd48009b2767bd4e80aea8ab 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -433,8 +433,6 @@ static bool sock_needs_netstamp(const struct sock *sk)
}
}

-#define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
-
static void sock_disable_timestamp(struct sock *sk, unsigned long flags)
{
if (sk->sk_flags & flags) {
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 03c8256063ec6355fcce034366aa5d005d75b5f7..4c9282bdd06790a0cca7f7c33986e7eb6c541398 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7199,6 +7199,9 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
newinet->mc_ttl = 1;
newinet->mc_index = 0;
newinet->mc_list = NULL;
+
+ if (newsk->sk_flags & SK_FLAGS_TIMESTAMP)
+ net_enable_timestamp();
}

static inline void sctp_copy_descendant(struct sock *sk_to,
--
2.5.0
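
The counter imbalance fixed here can be shown with a toy model. It is a sketch only: struct fake_sock and all helper names are invented for illustration, and the single integer merely plays the role of the netstamp_needed counter that net_enable_timestamp() and net_disable_timestamp() adjust.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the global count of sockets that need packet timestamps. */
static int netstamp_needed;

struct fake_sock {
    bool timestamp; /* models a timestamping flag in sk_flags */
};

static void enable_timestamp(struct fake_sock *sk)
{
    if (!sk->timestamp) {
        sk->timestamp = true;
        netstamp_needed++;
    }
}

/* Closing a socket drops its reference if the flag is set. */
static void close_sock(struct fake_sock *sk)
{
    if (sk->timestamp) {
        sk->timestamp = false;
        netstamp_needed--;
    }
}

/* Before the fix: the flag is copied, the counter is not touched. */
static void copy_sock_buggy(struct fake_sock *newsk, const struct fake_sock *sk)
{
    newsk->timestamp = sk->timestamp;
}

/* After the fix: a copied flag also takes its own reference. */
static void copy_sock_fixed(struct fake_sock *newsk, const struct fake_sock *sk)
{
    newsk->timestamp = sk->timestamp;
    if (newsk->timestamp)
        netstamp_needed++;
}

/* Runs listen -> accept (copy) -> close both; returns the final counter. */
int counter_after_lifecycle(bool fixed)
{
    struct fake_sock listener = { false };
    struct fake_sock child = { false };

    netstamp_needed = 0;
    enable_timestamp(&listener);
    if (fixed)
        copy_sock_fixed(&child, &listener);
    else
        copy_sock_buggy(&child, &listener);
    close_sock(&child);
    close_sock(&listener);
    return netstamp_needed;
}
```

In this model the buggy copy ends at -1 (one disable too many) and the fixed copy ends at 0.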

Marcelo Ricardo Leitner

Dec 4, 2015, 12:15:07 PM12/4/15
to net...@vger.kernel.org, Vlad Yasevich, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
As we are keeping timestamps on when copying the socket, we also have to
copy sk_tsflags.

This is needed since b9f40e21ef42 ("net-timestamp: move timestamp flags
out of sk_flags").

Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
net/sctp/socket.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 4c9282bdd06790a0cca7f7c33986e7eb6c541398..1a32ecdb8bae98de2e76591f0f5ffee1441ff04d 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7167,6 +7167,7 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
newsk->sk_type = sk->sk_type;
newsk->sk_bound_dev_if = sk->sk_bound_dev_if;
newsk->sk_flags = sk->sk_flags;
+ newsk->sk_tsflags = sk->sk_tsflags;
newsk->sk_no_check_tx = sk->sk_no_check_tx;
newsk->sk_no_check_rx = sk->sk_no_check_rx;
newsk->sk_reuse = sk->sk_reuse;
--
2.5.0

Marcelo Ricardo Leitner

Dec 4, 2015, 12:48:28 PM12/4/15
to Dmitry Vyukov, net...@vger.kernel.org, Vlad Yasevich, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi Dmitry,

Can you please test this patch?
I'll re-post with proper subject if it works.

Thanks.

---8<---

Dmitry Vyukov reported a use-after-free in the code expanded by the
macro debug_post_sfx, which is caused by the use of the asoc pointer
after it was freed within the scope of sctp_side_effects().

This patch fixes it by allowing sctp_side_effects() to clear that asoc
pointer when the TCB is freed.

The macro is already prepared to handle such a NULL pointer.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
net/sctp/sm_sideeffect.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 6098d4c42fa91287d3cde36ac05d860f76d4fe32..05594dcd93e0d649cace5215d225bef2713f9310 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -63,7 +63,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
sctp_state_t state,
struct sctp_endpoint *ep,
- struct sctp_association *asoc,
+ struct sctp_association **asoc,
void *event_arg,
sctp_disposition_t status,
sctp_cmd_seq_t *commands,
@@ -1123,7 +1123,7 @@ int sctp_do_sm(struct net *net, sctp_event_t event_type, sctp_subtype_t subtype,
debug_post_sfn();

error = sctp_side_effects(event_type, subtype, state,
- ep, asoc, event_arg, status,
+ ep, &asoc, event_arg, status,
&commands, gfp);
debug_post_sfx();

@@ -1136,7 +1136,7 @@ int sctp_do_sm(struct net *net, sctp_event_t event_type, sctp_subtype_t subtype,
static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
sctp_state_t state,
struct sctp_endpoint *ep,
- struct sctp_association *asoc,
+ struct sctp_association **asoc,
void *event_arg,
sctp_disposition_t status,
sctp_cmd_seq_t *commands,
@@ -1151,7 +1151,7 @@ static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
* disposition SCTP_DISPOSITION_CONSUME.
*/
if (0 != (error = sctp_cmd_interpreter(event_type, subtype, state,
- ep, asoc,
+ ep, *asoc,
event_arg, status,
commands, gfp)))
goto bail;
@@ -1175,6 +1175,7 @@ static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,

case SCTP_DISPOSITION_DELETE_TCB:
/* This should now be a command. */
+ *asoc = NULL;
break;

case SCTP_DISPOSITION_CONSUME:
--
2.5.0
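
The pointer-to-pointer change in this patch follows a common C pattern, sketched below in user space. Assumptions: struct assoc, the DISP_* values and every function name are simplified stand-ins. In the real code the association is freed by the command interpreter and the status switch only clears the pointer; the sketch folds both steps into side_effects().

```c
#include <assert.h>
#include <stdlib.h>

struct assoc {
    int state;
};

enum disposition { DISP_CONSUME, DISP_DELETE_TCB };

/* Takes the caller's pointer by reference so it can be cleared once the
 * object it points to is gone. */
static void side_effects(struct assoc **asoc, enum disposition status)
{
    if (status == DISP_DELETE_TCB) {
        free(*asoc);
        *asoc = NULL; /* the caller must not see a dangling pointer */
    }
}

/* Stand-in for the debug macro: tolerates a NULL association. */
static int state_for_debug(const struct assoc *asoc)
{
    return asoc ? asoc->state : -1;
}

/* Returns the state the debug statement sees after the side effects ran,
 * or -2 if the allocation failed. */
int run_event(enum disposition status)
{
    struct assoc *asoc = malloc(sizeof(*asoc));
    int seen;

    if (!asoc)
        return -2;
    asoc->state = 1;
    side_effects(&asoc, status);
    seen = state_for_debug(asoc);
    free(asoc); /* free(NULL) is a no-op */
    return seen;
}
```

Had side_effects() taken a plain struct assoc *, the caller's copy would still hold the old address after the free, and state_for_debug() would read freed memory.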

Dmitry Vyukov

Dec 4, 2015, 3:25:56 PM12/4/15
to Marcelo Ricardo Leitner, netdev, Vlad Yasevich, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, Dec 4, 2015 at 6:48 PM, Marcelo Ricardo Leitner
<marcelo...@gmail.com> wrote:
> Hi Dmitry,
>
> Can you please test this patch?
> I'll re-post with proper subject if it works.

Still happening with the same stacks.

Vlad Yasevich

Dec 4, 2015, 3:31:07 PM12/4/15
to Marcelo Ricardo Leitner, net...@vger.kernel.org, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
On 12/04/2015 12:14 PM, Marcelo Ricardo Leitner wrote:
> SCTP echoes a cookie on INIT ACK chunks that contains a timestamp, for
> detecting stale cookies. This cookie is echoed back to the server by the
> client and then that timestamp is checked.
>
> Thing is, if the listening socket is using packet timestamping, the
> cookie is encoded with ktime_get() value and checked against
> ktime_get_real(), as done by __net_timestamp().
>
> The fix is to make sctp also use ktime_get_real(), so we can compare bananas
> with bananas later no matter if packet timestamping was enabled or not.
>
> Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime")
> Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>

Acked-by: Vlad Yasevich <vyas...@gmail.com>

-vlad

Vlad Yasevich

Dec 4, 2015, 3:33:04 PM12/4/15
to Marcelo Ricardo Leitner, net...@vger.kernel.org, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
On 12/04/2015 12:14 PM, Marcelo Ricardo Leitner wrote:
> Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
> related to disabling sock timestamp.
>
> When SCTP accepts an association or peels one off, it copies sock flags
> but forgets to call net_enable_timestamp() if a packet timestamping flag
> was copied, leading to extra calls to net_disable_timestamp() whenever
> such clones were closed.
>
> The fix is to call net_enable_timestamp() whenever we copy a sock with
> that flag on, like tcp does.
>
> Reported-by: Dmitry Vyukov <dvy...@google.com>
> Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>

Acked-by: Vlad Yasevich <vyas...@gmail.com>

-vlad

Vlad Yasevich

Dec 4, 2015, 3:33:28 PM12/4/15
to Marcelo Ricardo Leitner, net...@vger.kernel.org, Neil Horman, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Maciej Żenczykowski, Dmitry Vyukov
On 12/04/2015 12:14 PM, Marcelo Ricardo Leitner wrote:
> As we are keeping timestamps on when copying the socket, we also have to
> copy sk_tsflags.
>
> This is needed since b9f40e21ef42 ("net-timestamp: move timestamp flags
> out of sk_flags").
>
> Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>

Acked-by: Vlad Yasevich <vyas...@gmail.com>

-vlad

Marcelo Ricardo Leitner

Dec 4, 2015, 4:34:33 PM12/4/15
to Dmitry Vyukov, netdev, Vlad Yasevich, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, Dec 04, 2015 at 09:25:35PM +0100, Dmitry Vyukov wrote:
> On Fri, Dec 4, 2015 at 6:48 PM, Marcelo Ricardo Leitner
> <marcelo...@gmail.com> wrote:
> > Hi Dmitry,
> >
> > Can you please test this patch?
> > I'll re-post with proper subject if it works.
>
> Still happening with the same stacks.

Then there may be another one, I'm afraid.

I'm using the testapp you shared in the first email, with that debug line
enabled and added a new one:
+ pr_debug("%p %d\n", asoc, asoc ? asoc->state : 0);
debug_post_sfx();
(should have used %x, but ok)

Also enabled slub_debug=PUZ, and I get:

without the patch:
[ 87.873640] sctp: ffff8800b71533d8 1
[ 87.873647] sctp: sctp_do_sm[post-sfx]: error:0,
asoc:ffff8800b71533d8[STATE_CLOSED]
[ 87.873739] sctp: ffff8800b71533d8 1
[ 87.873742] sctp: sctp_do_sm[post-sfx]: error:0,
asoc:ffff8800b71533d8[STATE_CLOSED]
[ 87.875149] sctp: ffff8800b71533d8 1802201963
[ 87.875238] sctp: sctp_do_sm[post-sfx]: error:0,
asoc:ffff8800b71533d8[STATE_CLOSED]

1802201963 = 0x6b6b6b6b, poison

with the patch:
[ 81.071265] sctp: ffff880137571148 1
[ 81.071273] sctp: sctp_do_sm[post-sfx]: error:0,
asoc:ffff880137571148[STATE_CLOSED]
[ 81.071372] sctp: ffff880137571148 1
[ 81.071375] sctp: sctp_do_sm[post-sfx]: error:0,
asoc:ffff880137571148[STATE_CLOSED]
[ 81.072423] sctp: (null) 0
[ 81.072427] sctp: sctp_do_sm[post-sfx]: error:0, asoc:
(null)[STATE_CLOSED]

This one, at least, is gone with this patch.

Marcelo
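
For reference, the poison check above is plain arithmetic: with slab poisoning enabled, freed objects are filled with the byte 0x6b, so a 32-bit field read from freed memory shows up as 0x6b6b6b6b, which is 1802201963 in decimal. A small sketch (POISON_FREE_BYTE and looks_like_freed_memory() are names made up for this example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POISON_FREE_BYTE 0x6b /* byte written over freed slab objects */

/* True if every byte of the 32-bit value is the free-poison byte. */
bool looks_like_freed_memory(uint32_t value)
{
    int i;

    for (i = 0; i < 4; i++) {
        if (((value >> (8 * i)) & 0xff) != POISON_FREE_BYTE)
            return false;
    }
    return true;
}
```

The third "without the patch" state value (1802201963) matches this pattern, while the earlier value 1 does not.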

Dmitry Vyukov

Dec 4, 2015, 4:38:38 PM12/4/15
to Marcelo Ricardo Leitner, netdev, Vlad Yasevich, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, Dec 4, 2015 at 10:34 PM, Marcelo Ricardo Leitner
I will try to extract a reproducer next week.

Vlad Yasevich

Dec 5, 2015, 11:39:57 AM12/5/15
to Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi Marcelo

I think you also need to catch the SCTP_DISPOSITION_ABORT and update
the pointer. There are some issues there though as some functions report
that code without actually destroying the association. This happens when
the ABORT chunk may be dropped.

I think this might be why we still see the issue.

-vlad

David Miller

Dec 5, 2015, 10:24:06 PM12/5/15
to marcelo...@gmail.com, net...@vger.kernel.org, vyas...@gmail.com, nho...@tuxdriver.com, eric.d...@gmail.com, syzk...@googlegroups.com, linux...@vger.kernel.org, k...@google.com, gli...@google.com, sasha...@oracle.com, ma...@google.com, dvy...@google.com
From: Marcelo Ricardo Leitner <marcelo...@gmail.com>
Date: Fri, 4 Dec 2015 15:14:02 -0200

> These are a couple of fixes regarding sctp/packet timestamps.
>
> Dmitry Vyukov <dvy...@google.com> reported the counter leak on missing
> net_enable_timestamp() (2nd patch) and further testing here revealed the
> other two issues.
>
> Please consider these for -stable.

Thanks for this, applied and queued up for -stable.

Dmitry Vyukov

Dec 7, 2015, 6:26:30 AM12/7/15
to Vlad Yasevich, Marcelo Ricardo Leitner, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Marcelo,

Is this info enough for you to cook another fix?

Marcelo Ricardo Leitner

Dec 7, 2015, 8:15:31 AM12/7/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi, I think so. I was really wondering how you could trigger that issue
without the timestamp fix and Vlad's comment does shed some light on it.

I'll do more tests later today, but what did you have connecting to the
listening socket? Somehow you got that accept() call to return.

Marcelo

Dmitry Vyukov

Dec 7, 2015, 8:21:07 AM12/7/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Mon, Dec 7, 2015 at 2:15 PM, Marcelo Ricardo Leitner
Local connect in another thread I guess.

Marcelo Ricardo Leitner

Dec 7, 2015, 1:52:22 PM12/7/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Mon, Dec 07, 2015 at 02:20:47PM +0100, Dmitry Vyukov wrote:
> On Mon, Dec 7, 2015 at 2:15 PM, Marcelo Ricardo Leitner
> <marcelo...@gmail.com> wrote:
> > On Mon, Dec 07, 2015 at 12:26:09PM +0100, Dmitry Vyukov wrote:
> >> On Sat, Dec 5, 2015 at 5:39 PM, Vlad Yasevich <vyas...@gmail.com> wrote:
...
> >> > Hi Marcelo
> >> >
> >> > I think you also need to catch the SCTP_DISPOSITION_ABORT and update
> >> > the pointer. There are some issues there though as some functions report
> >> > that code without actually destroying the association. This happens when
> >> > the ABORT chunk may be dropped.
> >> >
> >> > I think this might be why we still see the issue.
> >>
> >>
> >> Marcelo,
> >>
> >> Is this info enough for you to cook another fix?
> >
> > Hi, I think so. I was really wondering how you could trigger that issue
> > without the timestamp fix and Vlad's comment does shed some light on it.
> >
> > I'll do more tests later today, but what did you have connecting to the
> > listening socket? Somehow you made that accept() call to return..
>
> Local connect in another thread I guess.

Vlad, I reviewed the places on which it returns SCTP_DISPOSITION_ABORT,
and if I didn't miss something in there all of them either issue
SCTP_CMD_ASSOC_FAILED or SCTP_CMD_INIT_FAILED before returning it, thus
delaying DELETE_TCB and with that the asoc free. There is one place,
though, that may not do it that way, it's sctp_sf_abort_violation(), but
then that code only runs if asoc is already NULL by then.

Dmitry, still no luck here; I cannot reproduce another hit.
I'm using sctp_test and a custom test of mine, both on localhost so that I
would catch it on either the server or the client side, but nothing so far.

I need more info. Please enable the pr_debug() in the debug_post_sfn() macro
and see which status is being reported when you trigger the issue.
And/or share a traffic capture so we can see what's going on with the
association.

Marcelo

Vlad Yasevich

Dec 7, 2015, 2:33:55 PM12/7/15
to Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On 12/07/2015 01:52 PM, Marcelo Ricardo Leitner wrote:
> On Mon, Dec 07, 2015 at 02:20:47PM +0100, Dmitry Vyukov wrote:
>> On Mon, Dec 7, 2015 at 2:15 PM, Marcelo Ricardo Leitner
>> <marcelo...@gmail.com> wrote:
>>> On Mon, Dec 07, 2015 at 12:26:09PM +0100, Dmitry Vyukov wrote:
>>>> On Sat, Dec 5, 2015 at 5:39 PM, Vlad Yasevich <vyas...@gmail.com> wrote:
> ...
>>>>> Hi Marcelo
>>>>>
>>>>> I think you also need to catch the SCTP_DISPOSITION_ABORT and update
>>>>> the pointer. There are some issues there though as some functions report
>>>>> that code without actually destroying the association. This happens when
>>>>> the ABORT chunk may be dropped.
>>>>>
>>>>> I think this might be why we still see the issue.
>>>>
>>>>
>>>> Marcelo,
>>>>
>>>> Is this info enough for you to cook another fix?
>>>
>>> Hi, I think so. I was really wondering how you could trigger that issue
>>> without the timestamp fix and Vlad's comment does shed some light on it.
>>>
>>> I'll do more tests later today, but what did you have connecting to the
>>> listening socket? Somehow you made that accept() call to return..
>>
>> Local connect in another thread I guess.
>
> Vlad, I reviewed the places on which it returns SCTP_DISPOSITION_ABORT,
> and if I didn't miss something in there all of them either issue
> SCTP_CMD_ASSOC_FAILED or SCTP_CMD_INIT_FAILED before returning it, thus
> delaying DELETE_TCB and with that the asoc free.

They delay it from the perspective of the command interpreter since the command
to delete the TCB happens a little later, but status code is checked after all
commands are processed and command processing doesn't change it. So the 'status'
code would still be SCTP_DISPOSITION_ABORT after DELETE_TCB command was processed.
So, I think we may still have a use-after-free issue here.

> There is one place,
> though, that may not do it that way, it's sctp_sf_abort_violation(), but
> then that code only runs if asoc is already NULL by then.

I don't believe so. The violation state function can run with a non-NULL association
if we are encountering protocol violations after the association is established.

-vlad
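
The ordering argument above (commands run first, the status is inspected afterwards and never rewritten) can be modelled in a few lines. This is a deliberately simplified sketch: the enums, struct outcome and all function names are hypothetical, and the real interpreter handles many more commands and dispositions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum disposition { DISP_CONSUME, DISP_DELETE_TCB, DISP_ABORT };
enum command { CMD_NOP, CMD_DELETE_TCB };

struct outcome {
    bool asoc_freed;      /* a queued command released the association */
    bool pointer_cleared; /* the status switch reset the caller's pointer */
};

/* Policy that only treats DELETE_TCB as "association is gone". */
bool clears_on_delete_only(enum disposition s)
{
    return s == DISP_DELETE_TCB;
}

/* Policy that also treats ABORT as "association is gone". */
bool clears_on_delete_or_abort(enum disposition s)
{
    return s == DISP_DELETE_TCB || s == DISP_ABORT;
}

struct outcome run(enum disposition status, const enum command *cmds,
                   size_t n, bool (*clears)(enum disposition))
{
    struct outcome out = { false, false };
    size_t i;

    /* All queued commands run first; none of them rewrites 'status'. */
    for (i = 0; i < n; i++) {
        if (cmds[i] == CMD_DELETE_TCB)
            out.asoc_freed = true;
    }
    /* Only afterwards is the status looked at. */
    out.pointer_cleared = clears(status);
    return out;
}

/* An ABORT disposition whose command list ends up deleting the TCB. */
bool abort_leaves_dangling_pointer(bool (*clears)(enum disposition))
{
    const enum command cmds[] = { CMD_NOP, CMD_DELETE_TCB };
    struct outcome out = run(DISP_ABORT, cmds, 2, clears);

    return out.asoc_freed && !out.pointer_cleared;
}
```

In this model, a status switch that clears the pointer only for DELETE_TCB leaves it dangling when the disposition is ABORT; also handling ABORT closes that gap.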

Marcelo Ricardo Leitner

Dec 7, 2015, 2:50:37 PM12/7/15
to Vlad Yasevich, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Gotcha! That's pretty much it then. From that point of view now, there
shouldn't be a case that it returns _ABORT without freeing the asoc in
the same loop. (more below)

> > There is one place,
> > though, that may not do it that way, it's sctp_sf_abort_violation(), but
> > then that code only runs if asoc is already NULL by then.
>
> I don't believe so. The violation state function can run with a non-NULL association
> if we are encountering protocol violations after the association is established.

Yup, that's correct. I was just trying to point at one case in which it
would return _ABORT without issuing any of those _FAILEDs first (meaning
the association could still be valid), but in that case the asoc was
already NULL.

Dmitry, please give this one a run, as I still cannot reproduce your use
case..

---8<---

commit b63ad8dc45257dd6c536ac0227fcc623efd9328b
Author: Marcelo Ricardo Leitner <marcelo...@gmail.com>
Date: Fri Dec 4 15:30:23 2015 -0200

sctp: fix use-after-free in pr_debug statement

    Dmitry Vyukov reported a use-after-free in the code expanded by the
    macro debug_post_sfx, which is caused by the use of the asoc pointer
    after it was freed within the scope of sctp_side_effects().

    This patch fixes it by allowing sctp_side_effects() to clear that asoc
    pointer when the TCB is freed.

    As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
    because it will trigger DELETE_TCB too on that same loop.

    The macro is already prepared to handle such a NULL pointer.

Reported-by: Dmitry Vyukov <dvy...@google.com>

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 6098d4c42fa9..be23d5c2074f 100644
@@ -1174,11 +1174,12 @@ static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
break;

case SCTP_DISPOSITION_DELETE_TCB:
+ case SCTP_DISPOSITION_ABORT:
/* This should now be a command. */
+ *asoc = NULL;
break;

case SCTP_DISPOSITION_CONSUME:
- case SCTP_DISPOSITION_ABORT:
/*
* We should no longer have much work to do here as the
* real work has been done as explicit commands above.
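
Editorial note for readers following the thread: the idea in the commit message above can be modeled in user space. The sketch below is only an illustration under stated assumptions. `struct assoc`, `assoc2id()`, `side_effects()` and `do_sm()` are hypothetical stand-ins, not the kernel's definitions, and the free is folded into the disposition switch for brevity (in the kernel the association is freed while the queued commands are interpreted).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
enum disposition { DISP_CONSUME, DISP_DELETE_TCB, DISP_ABORT };

struct assoc { int assoc_id; };

/* NULL-tolerant lookup, as the debug macro is said to rely on. */
static int assoc2id(const struct assoc *asoc)
{
    return asoc ? asoc->assoc_id : 0;
}

/* Frees the association for terminal dispositions and clears the
 * caller's pointer so it cannot be dereferenced afterwards. */
static void side_effects(struct assoc **asoc, enum disposition status)
{
    switch (status) {
    case DISP_DELETE_TCB:
    case DISP_ABORT:
        free(*asoc);
        *asoc = NULL;
        break;
    case DISP_CONSUME:
        break;
    }
}

/* Runs one step and returns the id a trailing debug statement would log. */
static int do_sm(int id, enum disposition status)
{
    struct assoc *asoc = malloc(sizeof(*asoc));
    int logged;

    assert(asoc != NULL);
    asoc->assoc_id = id;
    side_effects(&asoc, status);
    logged = assoc2id(asoc);   /* safe: pointer was cleared if freed */
    free(asoc);                /* free(NULL) is a no-op */
    return logged;
}
```

In this model, `do_sm(7, DISP_ABORT)` logs 0 instead of reading freed memory, while `do_sm(7, DISP_CONSUME)` still logs 7.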

Vlad Yasevich

Dec 7, 2015, 3:38:02 PM12/7/15
to Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I think it is possible to hit the 'discard:' tag in that function while still
having a valid association. That happens when ABORT chunk is required to be
authenticated. In that case, instead of generating an ABORT and terminating the
current association, we just drop the packet, but still report an _ABORT disposition code.

This probably needs to change if we are going to catch the _ABORT disposition and
clear the asoc pointer.

-vlad
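
Editorial note: the concern above can be expressed as an invariant, namely that an _ABORT disposition should be reported only when the association is actually torn down. The following is a minimal sketch of that invariant, not the kernel code. `struct outcome`, `abort_violation()` and `disposition_matches()` are hypothetical names, and the boolean parameter stands in for the "ABORT chunk must be authenticated" condition described in the message.

```c
#include <assert.h>
#include <stdbool.h>

enum disposition { DISP_DISCARD, DISP_ABORT };

/* What a state function reports, and whether the association goes away. */
struct outcome {
    enum disposition status;
    bool asoc_freed;
};

/* True when the reported disposition agrees with what happens to the asoc. */
static bool disposition_matches(struct outcome out)
{
    return (out.status == DISP_ABORT) == out.asoc_freed;
}

/* Hypothetical model of the violation handler after the proposed change:
 * a dropped packet keeps the association and reports DISCARD. */
static struct outcome abort_violation(bool abort_must_be_authenticated)
{
    struct outcome out;

    if (abort_must_be_authenticated) {
        out.status = DISP_DISCARD;   /* packet dropped, association kept */
        out.asoc_freed = false;
    } else {
        out.status = DISP_ABORT;     /* association is torn down */
        out.asoc_freed = true;
    }
    assert(disposition_matches(out));
    return out;
}
```

Reporting `DISP_ABORT` with `asoc_freed == false` is the combination described in the message: the caller would clear a pointer to an association that is still valid.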

Marcelo Ricardo Leitner

Dec 7, 2015, 3:52:17 PM12/7/15
to Vlad Yasevich, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Oops. Nice one. I'll switch it to SCTP_DISPOSITION_DISCARD if it hits
that if() then. Thanks Vlad.

Marcelo

Dmitry Vyukov

Dec 8, 2015, 12:31:11 PM12/8/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
So I am waiting for a new patch, right?
Can you please combine all changes into a single patch (as far as I
understand the previous one must be applied on top of the first one)?

Marcelo Ricardo Leitner

Dec 8, 2015, 12:40:44 PM12/8/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
The patches were combined already, but this last pick by Vlad is just
not yet patched. It's not necessary for your testing and I didn't want
to interrupt it in case you were already testing it.

You can use my last patch here, from 2 emails ago, the one which
contains this line:
- case SCTP_DISPOSITION_ABORT:

Marcelo

Dmitry Vyukov

Dec 8, 2015, 2:22:23 PM12/8/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Tue, Dec 8, 2015 at 6:40 PM, Marcelo Ricardo Leitner
You are right. I missed that they are combined. Testing with it now.

Dmitry Vyukov

Dec 9, 2015, 9:41:50 AM12/9/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Use-after-free still happens.
I am on commit aa53685549a2cfb5f175b0c4a20bc9aa1e5a1b85 (Dec 8) plus
the following sctp-related changes:

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 6098d4c..be23d5c 100644
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 03c8256..4c9282b 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7199,6 +7199,9 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
newinet->mc_ttl = 1;
newinet->mc_index = 0;
newinet->mc_list = NULL;
+
+ if (newsk->sk_flags & SK_FLAGS_TIMESTAMP)
+ net_enable_timestamp();
}



The new program is:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <string.h>
#include <stdint.h>
#include <pthread.h>

long r0;

void *thr(void *arg)
{
memcpy((void*)0x20001000,
"\x02\x00\x33\xd9\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
128);
long r5 = syscall(SYS_connect, r0, 0x20001000ul, 0x80ul, 0, 0, 0);
return 0;
}

int main()
{
r0 = syscall(SYS_socket, 0xaul, 0x1ul, 0x84ul, 0, 0, 0);
long r1 = syscall(SYS_mmap, 0x20000000ul, 0x10000ul, 0x3ul,
0x32ul, 0xfffffffffffffffful, 0x0ul);
memcpy((void*)0x20000000,
"\x0a\x00\x33\xe0\x49\xd0\x2e\x70\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x4c\x37\xff\xc4",
28);
long r3 = syscall(SYS_bind, r0, 0x20000000ul, 0x1cul, 0, 0, 0);
pthread_t th;
pthread_create(&th, 0, thr, 0);
*(uint32_t*)0x20002ff8 = 0x6;
*(uint32_t*)0x20002ffc = 0x0;
long r8 = syscall(SYS_setsockopt, r0, 0x1ul, 0xdul,
0x20002ff8ul, 0x8ul, 0);
*(uint64_t*)0x20003ffd = 0x0;
long r10 = syscall(SYS_sendfile, r0, r0, 0x20003ffdul, 0xc0ul, 0, 0);
memcpy((void*)0x20004f90,
"\x88\x24\x1a\xa0\xa9\x55\x4a\x24\x5b\xe8\x4f\x5d\x46\x39\x42\x26\x62\xc3\xd5\xd1\x1c\x00\xf1\x73\x4c\x11\x8d\x48\xbd\x25\x4f\xd3\xc1\xef\xc7\xbf\x1d\x0c\xe1\xf2\xc6\x64\x9d\xb5\x98\x5e\xc0\x1b\x7e\x83\xee\x06\x79\x10\x3b\xeb\x3c\x89\x9e\x30\xb6\xb5\xbd\xf9\xaa\xc1\xe0\x47\xdf\xed\x94\xda\xc5\xcb\x21\x32\x66\xbd\xc9\xa5\x84\xbc\x32\x8f\xce\x8e\xff\x1f\x76\x63\x67\x2f\x40\xc7\x42\xa3\x60\x17\xd6\x05\x45\xc2\x10\xd1\x53\x5f\x0d\x02\xcd\xf1\x44\x30",
112);
memcpy((void*)0x20004f80,
"\x0a\x00\x33\xdc\x14\x4d\x5b\xd1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xdd\x01\xf8\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
128);
long r13 = syscall(SYS_sendto, r0, 0x20004f90ul, 0x70ul,
0x0ul, 0x20004f80ul, 0x80ul);
long r14 = syscall(SYS_listen, r0, 0x3ul, 0, 0, 0, 0);
long r15 = syscall(SYS_accept4, r0, 0x20003f80ul,
0x20003ab4ul, 0x80800ul, 0, 0);
*(uint64_t*)0x20003000 = 0x2;
*(uint64_t*)0x20003008 = 0x2;
*(uint64_t*)0x20003010 = 0x1;
*(uint64_t*)0x20003018 = 0x7;
*(uint64_t*)0x20003020 = 0x7;
*(uint64_t*)0x20003028 = 0x5f;
*(uint64_t*)0x20003030 = 0x9;
*(uint64_t*)0x20003038 = 0x88;
long r24 = syscall(SYS_setsockopt, r0, 0xfffffffffffffff7ul,
0x8ul, 0x20003000ul, 0x40ul, 0);
long r25 = syscall(SYS_dup3, r15, r0, 0x80000ul, 0, 0, 0);
memcpy((void*)0x20006000,
"\xd9\x4f\xbe\x3f\x43\x89\x02\x0d\x1e\x84\x8d\x16\xe8\xdf\xdd\x27\x1f\xfe\xc6\x4a\xfa\x93\x00\xb9\xaf\xd7\x5e\xf1\x1f\x88\xc4\x57\x12\x70\xb4\xc5\xa6\xfc\xb9\x99\xd2\x80\x30\x2a\x53\xda\xd2\x57\x6d\xdc",
50);
long r27 = syscall(SYS_setsockopt, r25, 0x117ul, 0x1ul,
0x20006000ul, 0x32ul, 0);
long r28 = syscall(SYS_close, r15, 0, 0, 0, 0, 0);
return 0;
}


The use-after-free reports:

==================================================================
BUG: KASAN: use-after-free in sctp_do_sm+0x530e/0x5d90 at addr ffff880069d4c808
Read of size 4 by task a.out/8211
=============================================================================
BUG kmalloc-4096 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in sctp_association_new+0xbd/0x21d0 age=9 cpu=3 pid=8211
[< none >] ___slab_alloc+0x648/0x8c0 mm/slub.c:2468
[< none >] __slab_alloc+0x4c/0x90 mm/slub.c:2497
[< inline >] slab_alloc_node mm/slub.c:2560
[< inline >] slab_alloc mm/slub.c:2602
[< none >] kmem_cache_alloc_trace+0x23c/0x3f0 mm/slub.c:2619
[< inline >] kmalloc include/linux/slab.h:458
[< inline >] kzalloc include/linux/slab.h:602
[< none >] sctp_association_new+0xbd/0x21d0 net/sctp/associola.c:302
[< none >] __sctp_connect+0x5e8/0xd80 net/sctp/socket.c:1161
[< none >] sctp_connect+0xdc/0x130 net/sctp/socket.c:3874
[< none >] inet_dgram_connect+0x136/0x2a0 net/ipv4/af_inet.c:528
[< none >] SYSC_connect+0x263/0x380 net/socket.c:1542
[< none >] SyS_connect+0x24/0x30 net/socket.c:1523
[< none >] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

INFO: Freed in sctp_association_put+0x179/0x2c0 age=11 cpu=3 pid=8211
[< none >] __slab_free+0x21e/0x3e0 mm/slub.c:2678
[< inline >] slab_free mm/slub.c:2833
[< none >] kfree+0x26f/0x3e0 mm/slub.c:3662
[< inline >] sctp_association_destroy net/sctp/associola.c:424
[< none >] sctp_association_put+0x179/0x2c0 net/sctp/associola.c:860
[< none >] sctp_association_free+0x416/0x5d0 net/sctp/associola.c:402
[< inline >] sctp_cmd_delete_tcb net/sctp/sm_sideeffect.c:867
[< inline >] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1287
[< inline >] sctp_side_effects net/sctp/sm_sideeffect.c:1153
[< none >] sctp_do_sm+0x1364/0x5d90 net/sctp/sm_sideeffect.c:1125
[< none >] sctp_primitive_ABORT+0xa9/0xd0 net/sctp/primitive.c:119
[< none >] sctp_close+0x2ad/0x9b0 net/sctp/socket.c:1517
[< none >] inet_release+0x111/0x270 net/ipv4/af_inet.c:413
[< none >] inet6_release+0x55/0x90 net/ipv6/af_inet6.c:406
[< none >] sock_release+0x96/0x260 net/socket.c:571
[< none >] sock_close+0x16/0x20 net/socket.c:1022
[< none >] __fput+0x244/0x860 fs/file_table.c:208
[< none >] ____fput+0x15/0x20 fs/file_table.c:244
[< none >] task_work_run+0x130/0x240 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[< none >] do_exit+0x885/0x3050 kernel/exit.c:750
[< none >] do_group_exit+0xec/0x390 kernel/exit.c:880

INFO: Slab 0xffffea0001a75200 objects=7 used=2 fp=0xffff880069d4c760
flags=0x5fffc0000004080
INFO: Object 0xffff880069d4c760 @offset=18272 fp=0xffff880069d4b588
CPU: 3 PID: 8211 Comm: a.out Tainted: G B 4.4.0-rc4+ #158
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
0000000000000003 ffff880032a8f3e0 ffffffff82e0f6d8 0000000041b58ab3
ffffffff87aa2c7d ffffffff82e0f626 ffff880031901740 ffffffff87ac3e19
ffff88003e806a00 0000000000000008 ffff880069d4c760 ffff880032a8f3e0

Call Trace:
[<ffffffff818450f4>] __asan_report_load4_noabort+0x54/0x70
mm/kasan/report.c:294
[< inline >] sctp_assoc2id include/net/sctp/sctp.h:323
[<ffffffff864927fe>] sctp_do_sm+0x530e/0x5d90 net/sctp/sm_sideeffect.c:1128
[<ffffffff864f9899>] sctp_primitive_ABORT+0xa9/0xd0 net/sctp/primitive.c:119
[<ffffffff864e55ad>] sctp_close+0x2ad/0x9b0 net/sctp/socket.c:1517
[<ffffffff85bfe691>] inet_release+0x111/0x270 net/ipv4/af_inet.c:413
[<ffffffff85d60ce5>] inet6_release+0x55/0x90 net/ipv6/af_inet6.c:406
[<ffffffff856b3b96>] sock_release+0x96/0x260 net/socket.c:571
[<ffffffff856b3d76>] sock_close+0x16/0x20 net/socket.c:1022
[<ffffffff8189d304>] __fput+0x244/0x860 fs/file_table.c:208
[<ffffffff8189d9b5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff813e2dc0>] task_work_run+0x130/0x240 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff8137d1e5>] do_exit+0x885/0x3050 kernel/exit.c:750
[<ffffffff8137fb0c>] do_group_exit+0xec/0x390 kernel/exit.c:880
[<ffffffff813aa957>] get_signal+0x677/0x1bf0 kernel/signal.c:2307
[<ffffffff8118645e>] do_signal+0x7e/0x20a0 arch/x86/kernel/signal.c:712
[<ffffffff81003a1e>] exit_to_usermode_loop+0xfe/0x1e0
arch/x86/entry/common.c:247
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:282
[<ffffffff8100733b>] syscall_return_slowpath+0x16b/0x240
arch/x86/entry/common.c:344
[<ffffffff86a92662>] int_ret_from_sys_call+0x25/0x9f
arch/x86/entry/entry_64.S:281
==================================================================

Marcelo Ricardo Leitner

Dec 9, 2015, 10:04:02 AM12/9/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Wed, Dec 09, 2015 at 03:41:29PM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 8, 2015 at 8:22 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> > On Tue, Dec 8, 2015 at 6:40 PM, Marcelo Ricardo Leitner
> > <marcelo...@gmail.com> wrote:
...
> >> The patches were combined already, but this last pick by Vlad is just
> >> not yet patched. It's not necessary for your testing and I didn't want
> >> to interrupt it in case you were already testing it.
> >>
> >> You can use my last patch here, from 2 emails ago, the one which
> >> contains this line:
> >> - case SCTP_DISPOSITION_ABORT:
> >
> >
> > You are right. I missed that they are combined. Testing with it now.
>
>
>
>
> Use-after-free still happens.
> I am on commit aa53685549a2cfb5f175b0c4a20bc9aa1e5a1b85 (Dec 8) plus
> the following sctp-related changes:

Changes are fine. Ugh. Ok, I'll try your new reproducer here.

Marcelo

Marcelo Ricardo Leitner

Dec 9, 2015, 11:41:14 AM12/9/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Heh I wasn't going to reproduce this by myself anytime soon, I think.
It's using the same socket to connect to itself, and only happens if the
connect() gets there before the listen() call. Figured this out because
I could only reproduce it under strace at first.

Please give this other patch a try. A state command
(sctp_sf_cookie_wait_prm_abort) was issuing SCTP_CMD_INIT_FAILED, which
leads to SCTP_CMD_DELETE_TCB, but returning SCTP_DISPOSITION_CONSUME,
which fooled the patch.
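
Editorial note: the mismatch described above can be sketched as a check over a queued command list. This is a simplified user-space model with hypothetical names (`enum cmd`, `cmds_free_asoc()`, `caller_clears_ptr()`, `leaves_dangling_ptr()`). It assumes, as the thread states, that INIT_FAILED leads to DELETE_TCB and that the caller clears its pointer only for the ABORT disposition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cmd { CMD_TIMER_STOP, CMD_REPLY, CMD_INIT_FAILED, CMD_DELETE_TCB };
enum disposition { DISP_CONSUME, DISP_ABORT };

/* True if interpreting the queued commands would free the association. */
static bool cmds_free_asoc(const enum cmd *cmds, size_t n)
{
    assert(cmds != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (cmds[i] == CMD_INIT_FAILED || cmds[i] == CMD_DELETE_TCB)
            return true;
    return false;
}

/* True if the caller clears its asoc pointer for this disposition. */
static bool caller_clears_ptr(enum disposition status)
{
    return status == DISP_ABORT;
}

/* A dangling pointer remains when the commands free the association
 * but the returned disposition does not tell the caller about it. */
static bool leaves_dangling_ptr(const enum cmd *cmds, size_t n,
                                enum disposition status)
{
    return cmds_free_asoc(cmds, n) && !caller_clears_ptr(status);
}
```

For a queue of TIMER_STOP, REPLY and INIT_FAILED, the model flags `DISP_CONSUME` as leaving a dangling pointer and `DISP_ABORT` as safe, which is the change the patch below makes.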

---8<---
commit 9f84d50e36cee0ce66e4ce9b3b1665e0a1dbcdd3
Author: Marcelo Ricardo Leitner <marcelo...@gmail.com>
Date: Fri Dec 4 15:30:23 2015 -0200

sctp: fix use-after-free in pr_debug statement

Dmitry Vyukov reported a use-after-free in the code expanded by the
macro debug_post_sfx, which is caused by the use of the asoc pointer
after it was freed within sctp_side_effect() scope.

This patch fixes it by allowing sctp_side_effect to clear that asoc
pointer when the TCB is freed.

As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
because it will trigger DELETE_TCB too on that same loop.

Also, there was a place issuing SCTP_CMD_INIT_FAILED but returning
SCTP_DISPOSITION_CONSUME, which would fool the scheme above. Fix it by
returning SCTP_DISPOSITION_ABORT instead.

The macro is already prepared to handle such NULL pointer.

Reported-by: Dmitry Vyukov <dvy...@google.com>

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 6098d4c42fa9..be23d5c2074f 100644
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 6f46aa16cb76..d801e151498a 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -4959,12 +4959,10 @@ sctp_disposition_t sctp_sf_cookie_wait_prm_abort(
sctp_cmd_seq_t *commands)
{
struct sctp_chunk *abort = arg;
- sctp_disposition_t retval;

/* Stop T1-init timer */
sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_STOP,
SCTP_TO(SCTP_EVENT_TIMEOUT_T1_INIT));
- retval = SCTP_DISPOSITION_CONSUME;

sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(abort));

@@ -4983,7 +4981,7 @@ sctp_disposition_t sctp_sf_cookie_wait_prm_abort(
sctp_add_cmd_sf(commands, SCTP_CMD_INIT_FAILED,
SCTP_PERR(SCTP_ERROR_USER_ABORT));

- return retval;
+ return SCTP_DISPOSITION_ABORT;
}

/*

Dmitry Vyukov

Dec 11, 2015, 8:35:53 AM12/11/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Wed, Dec 9, 2015 at 5:41 PM, Marcelo Ricardo Leitner
Still happens...
I am on commit aa53685549a2cfb5f175b0c4a20bc9aa1e5a1b85 with your
latest patch applied.
Can you figure out what happens now from the report below? If not I
can create a repro, it's just somewhat time consuming.


BUG: KASAN: use-after-free in sctp_do_sm+0x4bca/0x4db0 at addr ffff880067c600a8
Read of size 4 by task syzkaller_execu/10266
=============================================================================
BUG kmalloc-4096 (Tainted: G W ): kasan: bad access detected
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in sctp_association_new+0x6f/0x1da0 age=53 cpu=2 pid=10265
[< none >] ___slab_alloc+0x489/0x4e0 mm/slub.c:2468
[< none >] __slab_alloc+0x4c/0x90 mm/slub.c:2497
[< inline >] slab_alloc_node mm/slub.c:2560
[< inline >] slab_alloc mm/slub.c:2602
[< none >] kmem_cache_alloc_trace+0x264/0x2f0 mm/slub.c:2619
[< inline >] kmalloc include/linux/slab.h:458
[< inline >] kzalloc include/linux/slab.h:602
[< none >] sctp_association_new+0x6f/0x1da0 net/sctp/associola.c:302
[< none >] sctp_unpack_cookie+0x8b0/0x11c0
net/sctp/sm_make_chunk.c:1812
[< none >] sctp_sf_do_5_1D_ce+0x3ca/0x1410 net/sctp/sm_statefuns.c:702
[< none >] sctp_do_sm+0x20d/0x4db0 net/sctp/sm_sideeffect.c:1122
[< none >] sctp_endpoint_bh_rcv+0x38d/0x830 net/sctp/endpointola.c:486
[< none >] sctp_inq_push+0x12c/0x190 net/sctp/inqueue.c:95
[< none >] sctp_rcv+0x1d3b/0x2840 net/sctp/input.c:270
[< none >] ip_local_deliver_finish+0x2b0/0xa50 net/ipv4/ip_input.c:216
[< inline >] NF_HOOK_THRESH include/linux/netfilter.h:226
[< inline >] NF_HOOK include/linux/netfilter.h:249
[< none >] ip_local_deliver+0x1c4/0x2f0 net/ipv4/ip_input.c:257
[< inline >] dst_input include/net/dst.h:465
[< none >] ip_rcv_finish+0x5ea/0x1730 net/ipv4/ip_input.c:365
[< inline >] NF_HOOK_THRESH include/linux/netfilter.h:226
[< inline >] NF_HOOK include/linux/netfilter.h:249
[< none >] ip_rcv+0x963/0x1080 net/ipv4/ip_input.c:455
[< none >] __netif_receive_skb_core+0x1636/0x2f90 net/core/dev.c:3943
[< none >] __netif_receive_skb+0x2a/0x160 net/core/dev.c:3978

INFO: Freed in sctp_association_put+0x150/0x250 age=0 cpu=2 pid=10266
[< none >] __slab_free+0x1fc/0x320 mm/slub.c:2678
[< inline >] slab_free mm/slub.c:2833
[< none >] kfree+0x26a/0x290 mm/slub.c:3662
[< inline >] sctp_association_destroy net/sctp/associola.c:424
[< none >] sctp_association_put+0x150/0x250 net/sctp/associola.c:860
[< none >] sctp_association_free+0x3dc/0x520 net/sctp/associola.c:402
[< inline >] sctp_cmd_delete_tcb net/sctp/sm_sideeffect.c:867
[< inline >] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1287
[< inline >] sctp_side_effects net/sctp/sm_sideeffect.c:1153
[< none >] sctp_do_sm+0x175c/0x4db0 net/sctp/sm_sideeffect.c:1125
[< none >] sctp_primitive_ABORT+0xa9/0xd0 net/sctp/primitive.c:119
[< none >] sctp_close+0x274/0x7b0 net/sctp/socket.c:1517
[< none >] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:413
[< none >] sock_release+0x8d/0x1d0 net/socket.c:571
[< none >] sock_close+0x16/0x20 net/socket.c:1022
[< none >] __fput+0x233/0x780 fs/file_table.c:208
[< none >] ____fput+0x15/0x20 fs/file_table.c:244
[< none >] task_work_run+0x16b/0x200 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[< none >] do_exit+0x8bb/0x2b20 kernel/exit.c:750
[< none >] do_group_exit+0x108/0x320 kernel/exit.c:880
[< none >] get_signal+0x5e4/0x1500 kernel/signal.c:2307

INFO: Slab 0xffffea00019f1800 objects=7 used=1 fp=0xffff880067c60000
flags=0x5fffc0000004080
INFO: Object 0xffff880067c60000 @offset=0 fp=0xffff880067c623b0
CPU: 2 PID: 10266 Comm: syzkaller_execu Tainted: G B W
4.4.0-rc4+ #160
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff880038777458 ffffffff82899b0d ffff88003e806a00
ffff880067c60000 ffff880067c60000 ffff880038777488 ffffffff816c5564
ffff88003e806a00 ffffea00019f1800 ffff880067c60000 ffff880067c60000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82899b0d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<ffffffff816c5564>] print_trailer+0xf4/0x150 mm/slub.c:659
[<ffffffff816cbd1f>] object_err+0x2f/0x40 mm/slub.c:689
[< inline >] print_address_description mm/kasan/report.c:138
[<ffffffff816ce6dd>] kasan_report_error+0x25d/0x560 mm/kasan/report.c:251
[< inline >] kasan_report mm/kasan/report.c:274
[<ffffffff816cea9e>] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
[< inline >] sctp_assoc2id include/net/sctp/sctp.h:323
[<ffffffff8566f14a>] sctp_do_sm+0x4bca/0x4db0 net/sctp/sm_sideeffect.c:1128
[<ffffffff856c6289>] sctp_primitive_ABORT+0xa9/0xd0 net/sctp/primitive.c:119
[<ffffffff856b1d94>] sctp_close+0x274/0x7b0 net/sctp/socket.c:1517
[<ffffffff84f29dcd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:413
[<ffffffff84ae520d>] sock_release+0x8d/0x1d0 net/socket.c:571
[<ffffffff84ae5366>] sock_close+0x16/0x20 net/socket.c:1022
[<ffffffff81715f73>] __fput+0x233/0x780 fs/file_table.c:208
[<ffffffff81716545>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff8134679b>] task_work_run+0x16b/0x200 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff812f4d3b>] do_exit+0x8bb/0x2b20 kernel/exit.c:750
[<ffffffff812f7118>] do_group_exit+0x108/0x320 kernel/exit.c:880
[<ffffffff813196c4>] get_signal+0x5e4/0x1500 kernel/signal.c:2307
[<ffffffff811507a3>] do_signal+0x83/0x1c90 arch/x86/kernel/signal.c:712
[<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0
arch/x86/entry/common.c:247
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:282
[<ffffffff8100631f>] syscall_return_slowpath+0x19f/0x210
arch/x86/entry/common.c:344
[<ffffffff85b676e2>] int_ret_from_sys_call+0x25/0x9f
arch/x86/entry/entry_64.S:281

Marcelo Ricardo Leitner

Dec 11, 2015, 8:51:28 AM12/11/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I can imagine. I don't know how this fuzzer works, but it would be nice
if this reproducer extractor could be run more easily. So far, we have
already identified 3 different issues leading to this bug:
- 1st, the handling of DELETE_TCB
- 2nd, the handling of DISPOSITION_ABORT
- 3rd, the bad combination of an internal state-machine command and a
return value

I can and will review it again, but it's doing nasty stuff like using
the same socket to connect to itself. It's hard to keep in mind all the
combinations that might lead to that use-after-free.

Keep you posted.. thanks.

Marcelo

Marcelo Ricardo Leitner

Dec 11, 2015, 9:03:38 AM12/11/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Found a similar place in abort primitive handling like in this last
patch update, it's probably the issue you're still triggering.

Also found another place that may lead to this use after free, in case
we receive a packet with a chunk that has no data.

Oh my.. :)

Marcelo

Dmitry Vyukov

Dec 11, 2015, 9:30:30 AM12/11/15
to Marcelo Ricardo Leitner, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, Dec 11, 2015 at 3:03 PM, Marcelo Ricardo Leitner
It would be very nice, but it is not always trivial.

The fuzzer pretty much tries to trigger everything that is triggerable
from user space. Sometimes what it does makes no sense, but it is
still super-important for contexts like Android, where programs can be
as malicious as you can imagine and the system relies heavily on the
kernel for protection.

>> identified 3 different issues already leading to this bug:
>> - 1st, the handling on DELETE_TCB
>> - 2nd, the handling on DISPOSITION_ABORT
>> - 3rd, the bad combination on internal state-machine command to a return
>> value
>>
>> I can and will review it again, but it's doing nasty stuff like using the
>> same socket to connect to itself. It's hard to imagine all those
>> combinations in mind that might lead to that use-after-free.
>>
>> Keep you posted.. thanks.
>
> Found a similar place in abort primitive handling like in this last
> patch update, it's probably the issue you're still triggering.
>
> Also found another place that may lead to this use after free, in case
> we receive a packet with a chunk that has no data.

I see that sctp_cmd_interpreter does:

sctp_cmd_delete_tcb(commands, asoc);
asoc = NULL;

Wouldn't it be simpler to pass sctp_association ** to this function and
let it clear the pointer whenever it decides to free the object, rather
than trying to duplicate its logic at a higher level? Just a blind thought.
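
Editorial note: a minimal user-space sketch of this suggestion, assuming a hypothetical `cmd_interpreter()` that receives the caller's pointer by address and clears it at the exact point where it frees the object. None of these names or signatures are the kernel's.

```c
#include <assert.h>
#include <stdlib.h>

enum cmd { CMD_REPLY, CMD_DELETE_TCB };

struct assoc { int assoc_id; };

/* The interpreter owns the decision to free, so it also clears the
 * caller's pointer; no disposition code has to mirror that decision. */
static void cmd_interpreter(struct assoc **asoc, const enum cmd *cmds, size_t n)
{
    assert(asoc != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cmds[i] == CMD_DELETE_TCB && *asoc != NULL) {
            free(*asoc);
            *asoc = NULL;
        }
    }
}

/* Runs the commands, then returns the id a debug statement would log
 * (0 when the association is gone). */
static int run_and_log(const enum cmd *cmds, size_t n)
{
    struct assoc *asoc = malloc(sizeof(*asoc));
    int id;

    assert(asoc != NULL);
    asoc->assoc_id = 42;
    cmd_interpreter(&asoc, cmds, n);
    id = asoc ? asoc->assoc_id : 0;
    free(asoc);   /* free(NULL) is a no-op */
    return id;
}
```

The design trade-off is that the pointer is cleared regardless of which disposition the state function returns, so a mismatched return value can no longer leave the caller with a stale pointer.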



> Oh my.. :)
>
> Marcelo
>

Marcelo Ricardo Leitner

Dec 11, 2015, 10:56:05 AM12/11/15
to Dmitry Vyukov, Vlad Yasevich, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
That's like a short-circuit between the two logics; it's already
somewhat duplicated. I'm afraid that the other places still returning
DISPOSITION_CONSUME may not be aware that the assoc is going away in
the short term, so maybe we have some other bug there too.

If/when we simplify sctp_side_effects() and get rid of that switch
case, that's probably how it will work, though.

Marcelo

Vlad Yasevich

Dec 11, 2015, 1:37:56 PM12/11/15
to Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Yes. This is what I was worried about... Anything that triggers
a DELETE_TCB command has to return a code that we can trap.

The other way is to do what Dmitry suggested, but even there, we
need to be very careful.

-vlad
>
> Marcelo
>

David Laight

Dec 14, 2015, 4:52:49 AM12/14/15
to Vlad Yasevich, Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
From: Vlad Yasevich
> Sent: 11 December 2015 18:38
...
> > Found a similar place in abort primitive handling like in this last
> > patch update, it's probably the issue you're still triggering.
> >
> > Also found another place that may lead to this use after free, in case
> > we receive a packet with a chunk that has no data.
> >
> > Oh my.. :)
>
> Yes. This is what I was worried about... Anything that triggers
> a DELETE_TCB command has to return a code that we can trap.
>
> The other way is to do what Dmitry suggested, but even there, we
> need to be very careful.

I'm always wary of anything that queues actions up for later processing.
It is far too easy (as found here) to end up processing actions
in invalid states, or to process actions in 'unusual' orders when
specific events happen close together.

I wonder how much fallout there'd be from getting the sctp code
to immediately action things, instead of queuing the actions for later.
It would certainly remove a lot of the unusual combinations of events.

David


Vlad Yasevich

Dec 14, 2015, 9:25:57 AM12/14/15
to David Laight, Marcelo Ricardo Leitner, Dmitry Vyukov, netdev, Eric Dumazet, syzkaller, linux...@vger.kernel.org, Kostya Serebryany, Alexander Potapenko, Sasha Levin
We've bandied this idea around for a while, but no one has had the time
to tackle it. It would be a rather time-consuming task, but in the end
it might be a good idea.

-vlad

> David
>
>

Marcelo Ricardo Leitner

Jan 8, 2016, 8:01:04 AM1/8/16
to net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, vyas...@gmail.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
Couldn't get syzkaller working over here, so I still need your help on
testing this. I expect this will be the last cycle, though.

If it does generate another trace, I'll need the reproducer too because
I can't find anything else just with code review.

Thanks

--8<--

Dmitry Vyukov reported a use-after-free in the code expanded by the
macro debug_post_sfx, which is caused by the use of the asoc pointer
after it was freed within sctp_side_effect() scope.

This patch fixes it by allowing sctp_side_effect to clear that asoc
pointer when the TCB is freed.

As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
because it will trigger DELETE_TCB too on that same loop.

Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
above. Fix it by returning SCTP_DISPOSITION_ABORT instead.

The macro is already prepared to handle such NULL pointer.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
---
net/sctp/sm_sideeffect.c | 11 ++++++-----
net/sctp/sm_statefuns.c | 17 ++++-------------
2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 4f170ad38ff4f7d345d8e3a3fee7d691df64d9cb..2e21384697c2a6a5fd045142bcd9c39992d3867f 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -63,7 +63,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
sctp_state_t state,
struct sctp_endpoint *ep,
- struct sctp_association *asoc,
+ struct sctp_association **asoc,
void *event_arg,
sctp_disposition_t status,
sctp_cmd_seq_t *commands,
@@ -1125,7 +1125,7 @@ int sctp_do_sm(struct net *net, sctp_event_t event_type, sctp_subtype_t subtype,
debug_post_sfn();

error = sctp_side_effects(event_type, subtype, state,
- ep, asoc, event_arg, status,
+ ep, &asoc, event_arg, status,
&commands, gfp);
debug_post_sfx();

@@ -1138,7 +1138,7 @@ int sctp_do_sm(struct net *net, sctp_event_t event_type, sctp_subtype_t subtype,
static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
sctp_state_t state,
struct sctp_endpoint *ep,
- struct sctp_association *asoc,
+ struct sctp_association **asoc,
void *event_arg,
sctp_disposition_t status,
sctp_cmd_seq_t *commands,
@@ -1153,7 +1153,7 @@ static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
* disposition SCTP_DISPOSITION_CONSUME.
*/
if (0 != (error = sctp_cmd_interpreter(event_type, subtype, state,
- ep, asoc,
+ ep, *asoc,
event_arg, status,
commands, gfp)))
goto bail;
@@ -1176,11 +1176,12 @@ static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
break;

case SCTP_DISPOSITION_DELETE_TCB:
+ case SCTP_DISPOSITION_ABORT:
/* This should now be a command. */
+ *asoc = NULL;
break;

case SCTP_DISPOSITION_CONSUME:
- case SCTP_DISPOSITION_ABORT:
/*
* We should no longer have much work to do here as the
* real work has been done as explicit commands above.
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 22c2bf367d7e8c7025065f33eabfd7e93a7f4021..f1f08c8f277bd8719299d1ed21eb23e36d55f7e2 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -2976,7 +2976,7 @@ sctp_disposition_t sctp_sf_eat_data_6_2(struct net *net,
SCTP_INC_STATS(net, SCTP_MIB_IN_DATA_CHUNK_DISCARDS);
goto discard_force;
case SCTP_IERROR_NO_DATA:
- goto consume;
+ return SCTP_DISPOSITION_ABORT;
case SCTP_IERROR_PROTO_VIOLATION:
return sctp_sf_abort_violation(net, ep, asoc, chunk, commands,
(u8 *)chunk->subh.data_hdr, sizeof(sctp_datahdr_t));
@@ -3043,9 +3043,6 @@ discard_noforce:
sctp_add_cmd_sf(commands, SCTP_CMD_GEN_SACK, force);

return SCTP_DISPOSITION_DISCARD;
-consume:
- return SCTP_DISPOSITION_CONSUME;
-
}

/*
@@ -3093,7 +3090,7 @@ sctp_disposition_t sctp_sf_eat_data_fast_4_4(struct net *net,
case SCTP_IERROR_BAD_STREAM:
break;
case SCTP_IERROR_NO_DATA:
- goto consume;
+ return SCTP_DISPOSITION_ABORT;
case SCTP_IERROR_PROTO_VIOLATION:
return sctp_sf_abort_violation(net, ep, asoc, chunk, commands,
(u8 *)chunk->subh.data_hdr, sizeof(sctp_datahdr_t));
@@ -3119,7 +3116,6 @@ sctp_disposition_t sctp_sf_eat_data_fast_4_4(struct net *net,
SCTP_TO(SCTP_EVENT_TIMEOUT_T2_SHUTDOWN));
}

-consume:
return SCTP_DISPOSITION_CONSUME;
}

@@ -4825,9 +4821,6 @@ sctp_disposition_t sctp_sf_do_9_1_prm_abort(
* if necessary to fill gaps.
*/
struct sctp_chunk *abort = arg;
- sctp_disposition_t retval;
-
- retval = SCTP_DISPOSITION_CONSUME;

if (abort)
sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(abort));
@@ -4845,7 +4838,7 @@ sctp_disposition_t sctp_sf_do_9_1_prm_abort(
SCTP_INC_STATS(net, SCTP_MIB_ABORTEDS);
SCTP_DEC_STATS(net, SCTP_MIB_CURRESTAB);

- return retval;
+ return SCTP_DISPOSITION_ABORT;
}

/* We tried an illegal operation on an association which is closed. */
@@ -4960,12 +4953,10 @@ sctp_disposition_t sctp_sf_cookie_wait_prm_abort(
sctp_cmd_seq_t *commands)
{
struct sctp_chunk *abort = arg;
- sctp_disposition_t retval;

/* Stop T1-init timer */
sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_STOP,
SCTP_TO(SCTP_EVENT_TIMEOUT_T1_INIT));
- retval = SCTP_DISPOSITION_CONSUME;

if (abort)
sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(abort));
@@ -4985,7 +4976,7 @@ sctp_disposition_t sctp_sf_cookie_wait_prm_abort(
sctp_add_cmd_sf(commands, SCTP_CMD_INIT_FAILED,
SCTP_PERR(SCTP_ERROR_USER_ABORT));

- return retval;
+ return SCTP_DISPOSITION_ABORT;
}

/*
--
2.5.0

Vlad Yasevich

Jan 11, 2016, 12:00:06 PM
to Marcelo Ricardo Leitner, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
On 01/08/2016 08:00 AM, Marcelo Ricardo Leitner wrote:
> Couldn't get syzkaller working over here, so I still need your help on
> testing this. I expect this will be the last cycle, though.
>
> If it does generate another trace, I'll need the reproducer too because
> I can't find anything else just with code review.
>
> Thanks

Looks to me like you got all of them.

>
> --8<--
>
> Dmitry Vyukov reported a use-after-free in the code expanded by the
> macro debug_post_sfx, which is caused by the use of the asoc pointer
> after it was freed within sctp_side_effect() scope.
>
> This patch fixes it by allowing sctp_side_effect to clear that asoc
> pointer when the TCB is freed.
>
> As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
> because it will trigger DELETE_TCB too on that same loop.
>
> Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
> but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
> above. Fix it by returning SCTP_DISPOSITION_ABORT instead.
>
> The macro is already prepared to handle such NULL pointer.
>
> Reported-by: Dmitry Vyukov <dvy...@google.com>
> Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>

Acked-by: Vlad Yasevich <vyas...@gmail.com>

Thanks
-vlad

David Miller

Jan 11, 2016, 5:13:51 PM
to marcelo...@gmail.com, net...@vger.kernel.org, linux...@vger.kernel.org, dvy...@google.com, vyas...@gmail.com, eric.d...@gmail.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com, sasha...@oracle.com
From: Marcelo Ricardo Leitner <marcelo...@gmail.com>
Date: Fri, 8 Jan 2016 11:00:54 -0200

> Dmitry Vyukov reported a use-after-free in the code expanded by the
> macro debug_post_sfx, which is caused by the use of the asoc pointer
> after it was freed within sctp_side_effect() scope.
>
> This patch fixes it by allowing sctp_side_effect to clear that asoc
> pointer when the TCB is freed.
>
> As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
> because it will trigger DELETE_TCB too on that same loop.
>
> Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
> but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
> above. Fix it by returning SCTP_DISPOSITION_ABORT instead.
>
> The macro is already prepared to handle such NULL pointer.
>
> Reported-by: Dmitry Vyukov <dvy...@google.com>
> Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>

Applied, thank you.

Dmitry Vyukov

Jan 12, 2016, 3:41:39 AM
to David Miller, Marcelo Ricardo Leitner, netdev, linux...@vger.kernel.org, Vladislav Yasevich, Eric Dumazet, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Tested with this patch for half a day. I did not see any reports
related to pr_debug. Let's consider this as fixed.
Thanks!
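
For readers following the thread, the scheme described in the commit message can be sketched in user-space C. This is only an illustrative model, not the kernel code: the names disposition, assoc, side_effect and run_event are hypothetical, and "freeing" is modeled by setting a dead flag so that the sketch never touches freed memory. It shows two things: the side-effect step takes the pointer by address so it can clear it, and a state function that queues a freeing command but reports CONSUME leaves the caller with a stale pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's SCTP definitions. */
enum disposition { DISP_CONSUME, DISP_ABORT, DISP_DELETE_TCB };

struct assoc {
    int id;
    int dead;   /* set where the real code would free the association */
};

/*
 * Models the side-effect step. A queued command may tear down the
 * association (cmd_frees_asoc). The caller's pointer is cleared only
 * when the disposition reports that the association is gone.
 */
void side_effect(enum disposition status, int cmd_frees_asoc,
                 struct assoc **asoc)
{
    assert(asoc && *asoc);

    if (cmd_frees_asoc)
        (*asoc)->dead = 1;   /* stand-in for the actual free */

    switch (status) {
    case DISP_ABORT:
    case DISP_DELETE_TCB:
        *asoc = NULL;        /* later debug code sees NULL */
        break;
    case DISP_CONSUME:
        break;               /* pointer left untouched */
    }
}

/*
 * Runs one simulated event. Returns 1 if the caller is left holding a
 * pointer to a dead association, 0 if not, -1 on allocation failure.
 */
int run_event(enum disposition status, int cmd_frees_asoc)
{
    struct assoc *asoc = malloc(sizeof(*asoc));
    struct assoc *owned = asoc;   /* kept so the model can clean up */
    int stale;

    if (!asoc)
        return -1;
    asoc->id = 1;
    asoc->dead = 0;

    side_effect(status, cmd_frees_asoc, &asoc);
    stale = (asoc != NULL && asoc->dead);

    free(owned);
    return stale;
}
```

In this model, run_event(DISP_CONSUME, 1) corresponds to the old behavior of the changed state functions (a teardown command queued, CONSUME returned) and reports a stale pointer, while run_event(DISP_ABORT, 1) corresponds to the patched behavior and does not.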