KASAN: use-after-free Read in ucma_close (2)

syzbot

Sep 10, 2020, 10:09:25 AM
to dled...@redhat.com, j...@ziepe.ca, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 34d4ddd3 Merge tag 'linux-kselftest-5.9-rc5' of git://git...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1002ea2d900000
kernel config: https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
dashboard link: https://syzkaller.appspot.com/bug?extid=cc6fc752b3819e082d0c
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1600e053900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cc6fc7...@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in ucma_close+0x2a4/0x310 drivers/infiniband/core/ucma.c:1839
Read of size 4 at addr ffff8880a748b538 by task syz-executor.0/7260

CPU: 0 PID: 7260 Comm: syz-executor.0 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
ucma_close+0x2a4/0x310 drivers/infiniband/core/ucma.c:1839
__fput+0x285/0x920 fs/file_table.c:281
task_work_run+0xdd/0x190 kernel/task_work.c:141
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x416f01
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffd5e376f90 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000416f01
RDX: 0000000000000001 RSI: 0000000000000080 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 00007ffd5e377080 R11: 0000000000000293 R12: 0000000001190ed0
R13: 000000000007fa65 R14: ffffffffffffffff R15: 000000000118cfec

Allocated by task 7261:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
kmem_cache_alloc_trace+0x174/0x2c0 mm/slab.c:3550
kmalloc include/linux/slab.h:554 [inline]
kzalloc include/linux/slab.h:666 [inline]
ucma_alloc_ctx+0x4b/0x480 drivers/infiniband/core/ucma.c:212
ucma_create_id+0x11b/0x590 drivers/infiniband/core/ucma.c:502
ucma_write+0x288/0x350 drivers/infiniband/core/ucma.c:1768
vfs_write+0x2b0/0x730 fs/read_write.c:576
ksys_write+0x1ee/0x250 fs/read_write.c:631
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 7261:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x10e/0x2b0 mm/slab.c:3756
ucma_free_ctx+0x7f6/0xae0 drivers/infiniband/core/ucma.c:600
ucma_destroy_id+0x30c/0x460 drivers/infiniband/core/ucma.c:644
ucma_write+0x288/0x350 drivers/infiniband/core/ucma.c:1768
vfs_write+0x2b0/0x730 fs/read_write.c:576
ksys_write+0x1ee/0x250 fs/read_write.c:631
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880a748b400
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 312 bytes inside of
512-byte region [ffff8880a748b400, ffff8880a748b600)
The buggy address belongs to the page:
page:000000002b52c09c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xa748b
flags: 0xfffe0000000200(slab)
raw: 00fffe0000000200 ffffea0002416c48 ffff8880aa041750 ffff8880aa040600
raw: 0000000000000000 ffff8880a748b000 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880a748b400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880a748b480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880a748b500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                        ^
ffff8880a748b580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880a748b600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Hillf Danton

Sep 11, 2020, 12:17:02 AM
to syzbot, dled...@redhat.com, j...@ziepe.ca, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, hda...@sina.com, syzkall...@googlegroups.com

Thu, 10 Sep 2020 07:09:24 -0700
Detect race destroying ctx in order to avoid UAF.

--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -625,6 +625,10 @@ static ssize_t ucma_destroy_id(struct uc
 		return PTR_ERR(ctx);

 	mutex_lock(&ctx->file->mut);
+	if (ctx->destroying == 1) {
+		mutex_unlock(&ctx->file->mut);
+		return -ENXIO;
+	}
 	ctx->destroying = 1;
 	mutex_unlock(&ctx->file->mut);

@@ -1826,6 +1830,8 @@ static int ucma_close(struct inode *inod

 	mutex_lock(&file->mut);
 	list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
+		if (ctx->destroying == 1)
+			continue;
 		ctx->destroying = 1;
 		mutex_unlock(&file->mut);


syzbot

Sep 11, 2020, 4:13:06 AM
to anant.th...@gmail.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING: bad unlock balance detected!

=====================================
WARNING: bad unlock balance detected!
5.9.0-rc4-syzkaller #0 Not tainted
-------------------------------------
syz-executor.2/8340 is trying to release lock (
==================================================================
BUG: KASAN: use-after-free in print_lockdep_cache+0x70/0xf5 kernel/locking/lockdep.c:677
Read of size 8 at addr ffff88809827f480 by task syz-executor.2/8340

CPU: 1 PID: 8340 Comm: syz-executor.2 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
print_lockdep_cache+0x70/0xf5 kernel/locking/lockdep.c:677
print_unlock_imbalance_bug.part.0+0x8b/0xe6 kernel/locking/lockdep.c:4471
print_unlock_imbalance_bug kernel/locking/lockdep.c:4040 [inline]
__lock_release kernel/locking/lockdep.c:4715 [inline]
lock_release.cold+0x21/0x4a kernel/locking/lockdep.c:5026
__mutex_unlock_slowpath+0x81/0x610 kernel/locking/mutex.c:1228
ucma_destroy_id+0x21e/0x460 drivers/infiniband/core/ucma.c:629
ucma_write+0x288/0x350 drivers/infiniband/core/ucma.c:1768
vfs_write+0x2b0/0x730 fs/read_write.c:576
ksys_write+0x1ee/0x250 fs/read_write.c:631
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45d5b9
Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f581a554c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000038340 RCX: 000000000045d5b9
RDX: 0000000000000018 RSI: 00000000200000c0 RDI: 0000000000000005
RBP: 000000000118d020 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118cfec
R13: 00007fff6e89cf5f R14: 00007f581a5559c0 R15: 000000000118cfec

Allocated by task 8340:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
kmem_cache_alloc_trace+0x174/0x2c0 mm/slab.c:3550
kmalloc include/linux/slab.h:554 [inline]
ucma_open+0x4a/0x270 drivers/infiniband/core/ucma.c:1800
misc_open+0x372/0x4a0 drivers/char/misc.c:141
chrdev_open+0x266/0x770 fs/char_dev.c:414
do_dentry_open+0x4b9/0x11b0 fs/open.c:817
do_open fs/namei.c:3251 [inline]
path_openat+0x1b9a/0x2730 fs/namei.c:3368
do_filp_open+0x17e/0x3c0 fs/namei.c:3395
do_sys_openat2+0x16d/0x420 fs/open.c:1168
do_sys_open fs/open.c:1184 [inline]
__do_sys_openat fs/open.c:1200 [inline]
__se_sys_openat fs/open.c:1195 [inline]
__x64_sys_openat+0x13f/0x1f0 fs/open.c:1195
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 8333:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x10e/0x2b0 mm/slab.c:3756
ucma_close+0x26d/0x310 drivers/infiniband/core/ucma.c:1856
__fput+0x285/0x920 fs/file_table.c:281
task_work_run+0xdd/0x190 kernel/task_work.c:141
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff88809827f400
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 128 bytes inside of
512-byte region [ffff88809827f400, ffff88809827f600)
The buggy address belongs to the page:
page:00000000fda0c912 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9827f
flags: 0xfffe0000000200(slab)
raw: 00fffe0000000200 ffffea00025f37c8 ffffea00024ee548 ffff8880aa040600
raw: 0000000000000000 ffff88809827f000 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88809827f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88809827f400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88809827f480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
ffff88809827f500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88809827f580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


Tested on:

commit: 581cb3a2 Merge tag 'f2fs-for-5.9-rc5' of git://git.kernel...
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=11c1ffcd900000

Jason Gunthorpe

Sep 11, 2020, 7:57:53 AM
to Hillf Danton, syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
On Fri, Sep 11, 2020 at 12:16:40PM +0800, Hillf Danton wrote:
> Detect race destroying ctx in order to avoid UAF.
>
> +++ b/drivers/infiniband/core/ucma.c
> @@ -625,6 +625,10 @@ static ssize_t ucma_destroy_id(struct uc
>  		return PTR_ERR(ctx);
>
>  	mutex_lock(&ctx->file->mut);
> +	if (ctx->destroying == 1) {
> +		mutex_unlock(&ctx->file->mut);
> +		return -ENXIO;
> +	}
>  	ctx->destroying = 1;
>  	mutex_unlock(&ctx->file->mut);
>
> @@ -1826,6 +1830,8 @@ static int ucma_close(struct inode *inod
>
>  	mutex_lock(&file->mut);
>  	list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
> +		if (ctx->destroying == 1)
> +			continue;
>  		ctx->destroying = 1;
>  		mutex_unlock(&file->mut);
>

ucma_destroy_id() is called from write() and ucma_close is release(),
so there is no way these can race?

Jason

Jason Gunthorpe

Sep 11, 2020, 8:02:25 AM
to syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, Sep 10, 2020 at 07:09:24AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 34d4ddd3 Merge tag 'linux-kselftest-5.9-rc5' of git://git...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1002ea2d900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=cc6fc752b3819e082d0c
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1600e053900000

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git 308571debccd7004acf02ea1b7163a96ad772292

Jason

Hillf Danton

Sep 11, 2020, 11:21:17 AM
to Jason Gunthorpe, Hillf Danton, syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Sounds good, but what's reported is a UAF in the close path, which is
impossible without another thread releasing the ctx a step ahead of
the closer.
Can we call it a race if that's true?

syzbot

Sep 11, 2020, 11:49:05 AM
to dled...@redhat.com, j...@ziepe.ca, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: use-after-free Read in __destroy_id

==================================================================
BUG: KASAN: use-after-free in __destroy_id+0x9f5/0xc60 drivers/infiniband/core/ucma.c:620
Read of size 4 at addr ffff88808e210128 by task syz-executor.2/11716

CPU: 1 PID: 11716 Comm: syz-executor.2 Not tainted 5.9.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x18f/0x20d lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
__destroy_id+0x9f5/0xc60 drivers/infiniband/core/ucma.c:620
ucma_destroy_id+0x172/0x240 drivers/infiniband/core/ucma.c:654
ucma_write+0x288/0x350 drivers/infiniband/core/ucma.c:1784
vfs_write+0x2b0/0x730 fs/read_write.c:576
ksys_write+0x1ee/0x250 fs/read_write.c:631
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45d5b9
Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ff28d19dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000038340 RCX: 000000000045d5b9
RDX: 0000000000000018 RSI: 00000000200000c0 RDI: 0000000000000005
RBP: 000000000118cf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118cf4c
R13: 00007ffcebc2643f R14: 00007ff28d19e9c0 R15: 000000000118cf4c

Allocated by task 11716:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
kmem_cache_alloc_trace+0x16e/0x2c0 mm/slab.c:3550
kmalloc include/linux/slab.h:554 [inline]
kzalloc include/linux/slab.h:666 [inline]
ucma_alloc_ctx+0x41/0x330 drivers/infiniband/core/ucma.c:211
ucma_create_id+0x10f/0x410 drivers/infiniband/core/ucma.c:497
ucma_write+0x288/0x350 drivers/infiniband/core/ucma.c:1784
vfs_write+0x2b0/0x730 fs/read_write.c:576
ksys_write+0x1ee/0x250 fs/read_write.c:631
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 11715:
kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x103/0x2c0 mm/slab.c:3756
ucma_free_ctx drivers/infiniband/core/ucma.c:599 [inline]
__destroy_id+0x8a2/0xc60 drivers/infiniband/core/ucma.c:628
ucma_close+0xe1/0x190 drivers/infiniband/core/ucma.c:1849
__fput+0x285/0x920 fs/file_table.c:281
task_work_run+0xdd/0x190 kernel/task_work.c:141
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:139 [inline]
exit_to_user_mode_prepare+0x195/0x1c0 kernel/entry/common.c:166
syscall_exit_to_user_mode+0x59/0x2b0 kernel/entry/common.c:241
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff88808e210000
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 296 bytes inside of
512-byte region [ffff88808e210000, ffff88808e210200)
The buggy address belongs to the page:
page:000000007a3e4a58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88808e210c00 pfn:0x8e210
flags: 0xfffe0000000200(slab)
raw: 00fffe0000000200 ffffea0002404fc8 ffffea00028aa008 ffff8880aa040600
raw: ffff88808e210c00 ffff88808e210000 0000000100000003 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88808e210000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88808e210080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88808e210100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
ffff88808e210180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88808e210200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================


Tested on:

commit: 308571de RDMA/ucma: Do not use file->mut to lock destroying
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
console output: https://syzkaller.appspot.com/x/log.txt?x=112f7cdd900000
kernel config: https://syzkaller.appspot.com/x/.config?x=3d400a47d1416652

Jason Gunthorpe

Sep 11, 2020, 1:01:43 PM
to Hillf Danton, syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Migrate is the cause, very tricky:

       CPU0                      CPU1
 ucma_destroy_id()
                           ucma_migrate_id()
                             ucma_get_ctx()
   xa_lock()
     _ucma_find_context()
     xa_erase()
                             xa_lock()
                               ctx->file = new_file
                               list_move()
                             xa_unlock()
                             ucma_put_ctx
                           ucma_close()
                             _destroy_id()

 _destroy_id()
   wait_for_completion()
   // boom


i.e. the destroy_id() on the initial FD captures the ctx right before
migrate moves it, then the new FD closes, calling destroy while the
other destroy is still running.

Sigh, I will rewrite migrate too..
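
The interleaving described above can be replayed step by step in a small
userspace model. This is only a sketch: model_ctx, model_file, table_slot
and the helper functions are hypothetical stand-ins for the ctx, the
struct file's ctx_list and the ctx_table entry, not the kernel code.
Destruction is counted rather than performed, so the double destroy is
observable without touching freed memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_ctx;
struct model_file { struct model_ctx *list_head; };  /* models file->ctx_list (one entry) */
struct model_ctx  { struct model_file *file; int destroy_calls; };

static struct model_ctx *table_slot;  /* models the ctx_table entry for one id */

/* CPU0: destroy_id finds the ctx in the table and erases the entry. */
static struct model_ctx *destroy_lookup_and_erase(void)
{
    struct model_ctx *ctx = table_slot;

    table_slot = NULL;
    return ctx;
}

/* CPU1: migrate moves a ctx it looked up earlier onto another file's list. */
static void migrate(struct model_ctx *ctx, struct model_file *to)
{
    ctx->file->list_head = NULL;
    ctx->file = to;
    to->list_head = ctx;
}

/* CPU1: close reaches the ctx through the list, not through the table. */
static void close_file(struct model_file *f)
{
    if (f->list_head) {
        f->list_head->destroy_calls++;  /* the kernel frees the ctx here */
        f->list_head = NULL;
    }
}

/* CPU0: the destroy that captured the ctx before the move finishes last. */
static void destroy_finish(struct model_ctx *ctx)
{
    ctx->destroy_calls++;  /* in the kernel this touches freed memory */
}

/* Replays the interleaving sequentially; returns how often destroy ran. */
int replay_race(void)
{
    struct model_file fda = { NULL }, fdb = { NULL };
    struct model_ctx ctx = { &fda, 0 };
    struct model_ctx *migrating, *destroying;

    fda.list_head = &ctx;
    table_slot = &ctx;

    migrating = table_slot;                   /* CPU1: ucma_get_ctx() */
    destroying = destroy_lookup_and_erase();  /* CPU0: find + erase */
    assert(migrating == destroying);          /* both hold the same ctx */
    migrate(migrating, &fdb);                 /* CPU1: move to new file */
    close_file(&fdb);                         /* CPU1: first destroy */
    destroy_finish(destroying);               /* CPU0: second destroy */
    return ctx.destroy_calls;
}
```

In this model replay_race() returns 2: the close of the new FD and the
destroy on the initial FD both run destruction on the same ctx.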

Jason

Jason Gunthorpe

Sep 11, 2020, 2:19:41 PM
to syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, Sep 10, 2020 at 07:09:24AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 34d4ddd3 Merge tag 'linux-kselftest-5.9-rc5' of git://git...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1002ea2d900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=cc6fc752b3819e082d0c
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1600e053900000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+cc6fc7...@syzkaller.appspotmail.com

#syz test: https://github.com/jgunthorpe/linux ucma_migrate_fix

Jason

syzbot

Sep 11, 2020, 4:53:06 PM
to dled...@redhat.com, j...@ziepe.ca, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+cc6fc7...@syzkaller.appspotmail.com

Tested on:

commit: 7c003f9a RDMA/ucma: Rework ucma_migrate_id() to avoid race..
git tree: https://github.com/jgunthorpe/linux ucma_migrate_fix
kernel config: https://syzkaller.appspot.com/x/.config?x=3c5f6ce8d5b68299
dashboard link: https://syzkaller.appspot.com/bug?extid=cc6fc752b3819e082d0c
compiler: gcc (GCC) 10.1.0-syz 20200507

Note: testing is done by a robot and is best-effort only.

Hillf Danton

Sep 11, 2020, 9:35:29 PM
to Jason Gunthorpe, Hillf Danton, syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
I now have more trouble understanding this: the ctx is reported to be
freed in the write path, while, if I don't misread the chart above,
you're trying to pull another closer after migrate into the race.

Jason Gunthorpe

Sep 14, 2020, 11:01:52 AM
to Hillf Danton, syzbot, dled...@redhat.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, syzkall...@googlegroups.com
Migrate moves the ctx between two struct files, so the race is to be
destroying on the first struct file, move to the second struct
file, then close the second struct file.

Now close and destroy_id can race directly, which shouldn't be
allowed.
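
One generic way to keep two destroyers from both running is to make each
of them claim the ctx atomically from a shared slot before destroying it.
The sketch below is a userspace illustration of that idea only; it is an
assumption, not a description of how the migrate rework does it.
claimed_ctx, claim_slot, claim_ctx() and destroy_if_owner() are
hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ctx; destruction is only counted. */
struct claimed_ctx { int destroy_calls; };

/* Hypothetical single slot standing in for the ctx's table entry. */
static _Atomic(struct claimed_ctx *) claim_slot;

/* True only for the one caller that swaps the ctx out of the slot. */
static bool claim_ctx(struct claimed_ctx *ctx)
{
    struct claimed_ctx *expected = ctx;

    return atomic_compare_exchange_strong(&claim_slot, &expected,
                                          (struct claimed_ctx *)NULL);
}

/* Every destroy path (write or close) goes through the same claim. */
static bool destroy_if_owner(struct claimed_ctx *ctx)
{
    if (!claim_ctx(ctx))
        return false;      /* someone else owns destruction; back off */
    ctx->destroy_calls++;  /* the real code would tear down and free here */
    return true;
}

/* Two destroyers reach the same ctx; returns how often destroy ran. */
int replay_with_claim(void)
{
    struct claimed_ctx ctx = { 0 };
    bool close_won, destroy_won;

    atomic_store(&claim_slot, &ctx);
    close_won = destroy_if_owner(&ctx);    /* close of the second file */
    destroy_won = destroy_if_owner(&ctx);  /* destroy_id on the first file */
    assert(close_won != destroy_won);      /* exactly one of them wins */
    return ctx.destroy_calls;
}
```

Here replay_with_claim() returns 1. The loser never dereferences the ctx
after a failed claim, which is the property that matters once the winner
is allowed to free it.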

Jason