KASAN: use-after-free Read in posix_lock_inode

33 views
Skip to first unread message

syzbot

unread,
Jan 2, 2019, 5:31:05 AM1/2/19
to bfi...@fieldses.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot found the following crash on:

HEAD commit: e1ef035d272e Merge tag 'armsoc-defconfig' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16bb4c4b400000
kernel config: https://syzkaller.appspot.com/x/.config?x=9c6a26e22579190b
dashboard link: https://syzkaller.appspot.com/bug?extid=239d99847eb49ecb3899
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=128aa377400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+239d99...@syzkaller.appspotmail.com

IPv6: ADDRCONF(NETDEV_UP): vxcan1: link is not ready
IPv6: ADDRCONF(NETDEV_UP): vxcan1: link is not ready
8021q: adding VLAN 0 to HW filter on device batadv0
8021q: adding VLAN 0 to HW filter on device batadv0
==================================================================
BUG: KASAN: use-after-free in what_owner_is_waiting_for fs/locks.c:1000
[inline]
BUG: KASAN: use-after-free in posix_locks_deadlock fs/locks.c:1023 [inline]
BUG: KASAN: use-after-free in posix_lock_inode+0x1f9e/0x2750 fs/locks.c:1163
Read of size 8 at addr ffff88808791b000 by task syz-executor2/10100

CPU: 1 PID: 10100 Comm: syz-executor2 Not tainted 4.20.0+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
what_owner_is_waiting_for fs/locks.c:1000 [inline]
posix_locks_deadlock fs/locks.c:1023 [inline]
posix_lock_inode+0x1f9e/0x2750 fs/locks.c:1163
posix_lock_file fs/locks.c:1346 [inline]
vfs_lock_file fs/locks.c:2314 [inline]
vfs_lock_file+0xc7/0xf0 fs/locks.c:2309
do_lock_file_wait.part.0+0xe5/0x260 fs/locks.c:2328
do_lock_file_wait fs/locks.c:2324 [inline]
fcntl_setlk+0x2f1/0xfe0 fs/locks.c:2413
do_fcntl+0x843/0x12b0 fs/fcntl.c:370
__do_sys_fcntl fs/fcntl.c:463 [inline]
__se_sys_fcntl fs/fcntl.c:448 [inline]
__x64_sys_fcntl+0x16d/0x1e0 fs/fcntl.c:448
do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f58bbb50c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000048
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000020000140 RSI: 0000000000000007 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f58bbb516d4
R13: 00000000004be5f0 R14: 00000000004ceab0 R15: 00000000ffffffff

Allocated by task 10100:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
kasan_kmalloc mm/kasan/common.c:482 [inline]
kasan_kmalloc+0xcf/0xe0 mm/kasan/common.c:455
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:397
kmem_cache_alloc+0x12d/0x710 mm/slab.c:3541
kmem_cache_zalloc include/linux/slab.h:730 [inline]
locks_alloc_lock+0x8e/0x2f0 fs/locks.c:344
fcntl_setlk+0xa9/0xfe0 fs/locks.c:2362
do_fcntl+0x843/0x12b0 fs/fcntl.c:370
__do_sys_fcntl fs/fcntl.c:463 [inline]
__se_sys_fcntl fs/fcntl.c:448 [inline]
__x64_sys_fcntl+0x16d/0x1e0 fs/fcntl.c:448
do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 10100:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:444
kasan_slab_free+0xe/0x10 mm/kasan/common.c:452
__cache_free mm/slab.c:3485 [inline]
kmem_cache_free+0x86/0x260 mm/slab.c:3747
locks_free_lock+0x27a/0x3f0 fs/locks.c:381
fcntl_setlk+0x7b5/0xfe0 fs/locks.c:2439
do_fcntl+0x843/0x12b0 fs/fcntl.c:370
__do_sys_fcntl fs/fcntl.c:463 [inline]
__se_sys_fcntl fs/fcntl.c:448 [inline]
__x64_sys_fcntl+0x16d/0x1e0 fs/fcntl.c:448
do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff88808791b000
which belongs to the cache file_lock_cache of size 264
The buggy address is located 0 bytes inside of
264-byte region [ffff88808791b000, ffff88808791b108)
The buggy address belongs to the page:
page:ffffea00021e46c0 count:1 mapcount:0 mapping:ffff8880aa16a1c0 index:0x0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea0002333508 ffffea00021d76c8 ffff8880aa16a1c0
raw: 0000000000000000 ffff88808791b000 000000010000000c 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88808791af00: fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc
ffff88808791af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88808791b000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88808791b080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88808791b100: fb fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Jeff Layton

unread,
Jan 2, 2019, 9:24:31 AM1/2/19
to syzbot, bfi...@fieldses.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk, NeilBrown
The interesting bit is that the crash, alloc and free all seem to have
occurred in the same kernel task (PID 10100).

Here's the loop in what_owner_is_waiting_for():

----------------8<------------------
hash_for_each_possible(blocked_hash, fl, fl_link, posix_owner_key(block_fl)) {
if (posix_same_owner(fl, block_fl)) {
while (fl->fl_blocker) <<<<<< CRASH HERE
fl = fl->fl_blocker;
return fl;
}
}
----------------8<------------------

So fl got freed while we were walking down the chain of blocked locks.
At a quick glance, I'm now wondering whether the lockless optimization
to avoid the blocked_lock_lock in locks_delete_block is actually ok.

Neil, any thoughts?
--
Jeff Layton <jla...@kernel.org>

Jeff Layton

unread,
Jan 2, 2019, 1:20:44 PM1/2/19
to syzbot, bfi...@fieldses.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Wed, 2019-01-02 at 02:31 -0800, syzbot wrote:
I've given this a harder look and I really don't quite grok what
this output is telling me:

All 3 stack traces say they come from PID 10100, but the use-after-free
seems to occur well before the free could have occurred in the context
of the current fcntl call.

So, I guess that leaves the possibility that we freed a lock request
from an earlier fcntl call without removing it properly from the tree,
but (a) I don't see how that could happen, _and_ (b) why didn't that
trip the BUG_ONs in locks_free_lock?

I'll keep looking at this, but I'm a bit stumped at the moment.

Cheers,
--
Jeff Layton <jla...@kernel.org>

Dmitry Vyukov

unread,
Jan 2, 2019, 1:51:48 PM1/2/19
to Jeff Layton, syzbot, Bruce Fields, linux-fsdevel, LKML, syzkaller-bugs, Al Viro
Interestingly it is the case in all crashes for this bug. It may be
something inherent, or maybe just the particular program that
triggered this is such that these accesses happen in the same thread.

> So, I guess that leaves the possibility that we freed a lock request
> from an earlier fcntl call without removing it properly from the tree,
> but (a) I don't see how that could happen, _and_ (b) why didn't that
> trip the BUG_ONs in locks_free_lock?
>
> I'll keep looking at this, but I'm a bit stumped at the moment.

The simplest repro for this is:

# See https://goo.gl/kgGztJ for information about syzkaller reproducers.
#{"threaded":true,"collide":true,"repeat":true,"procs":6,"sandbox":"none","fault_call":-1,"tun":true,"tmpdir":true,"cgroups":true,"netdev":true,"resetnet":true,"segv":true}
r0 = epoll_create1(0x0)
fcntl$lock(r0, 0x7, &(0x7f0000000080))
fcntl$lock(r0, 0x7, &(0x7f0000000140)={0x1000000000001, 0x0, 0x1000000})

"collide":true means that the 2 fcntl's were executed in parallel. But
still the alloc/free/access always happened in the same thread, so the
thread interaction seems to be somewhat unusual.

Dmitry Vyukov

unread,
Jan 2, 2019, 1:54:14 PM1/2/19
to Jeff Layton, syzbot, Bruce Fields, linux-fsdevel, LKML, syzkaller-bugs, Al Viro
Looking at frequency of this crash, repro properties and some other
signals my money are on a race/atomicity violation with a narrow
inconsistency window. E.g. unlock something and then expect things
have not changed after re-lock, or remove from list and reset a
pointer few instructions later.

Jeff Layton

unread,
Jan 2, 2019, 4:42:24 PM1/2/19
to Dmitry Vyukov, syzbot, Bruce Fields, linux-fsdevel, LKML, syzkaller-bugs, Al Viro
Thanks Dmitry,

The good news is that it's quite reproducible. I used syz-prog2c on the
reproducer and got the attached program (build with -lpthread).

I didn't have KASAN enabled on my throwaway VM, but I got a GPF and
stack trace much like the one above when I ran this on a kernel with the
thundering herd set. A kernel based on a commit from earlier in the
merge window didn't show the problem.

I tested commenting out some lockless shortcuts, but they didn't fix it.
I'll keep playing with it.
repro.c

NeilBrown

unread,
Jan 2, 2019, 7:04:18 PM1/2/19
to Jeff Layton, syzbot, bfi...@fieldses.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
The repro didn't trigger for me, but code inspection suggested I should
invest in brown paper bags.

Note that while this patch clearly fixes a bug, and I suspect it is
the cause of this report, I cannot confirm with testing as I cannot
trigger the bug. I enabled KASAN and ran "qemu -smp 4" to no avail.

Thanks,
NeilBrown

From: NeilBrown <ne...@suse.com>
Date: Thu, 3 Jan 2019 10:59:50 +1100
Subject: [PATCH] locks: fix error in locks_move_blocks()

After moving all requests from
fl->fl_blocked_requests
to
new->fl_blocked_requests

it is nonsensical to do anything to all the remaining elements, there
aren't any.
This should do something to all the requests that have been
moved - for simplicity, it does it to all requests in the target
list.
Setting "f->fl_blocker = new" to all members of
new->fl_blocked_requests is "obviously correct" as it preserves
the invariant of the linkage among requests.

Reported-by: syzbot+239d99...@syzkaller.appspotmail.com
Fixes: 5946c4319ebb ("fs/locks: allow a lock request to block other requests.")
Signed-off-by: NeilBrown <ne...@suse.com>
---
fs/locks.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index f0b24d98f36b..ff6af2c32601 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -453,7 +453,7 @@ static void locks_move_blocks(struct file_lock *new, struct file_lock *fl)
return;
spin_lock(&blocked_lock_lock);
list_splice_init(&fl->fl_blocked_requests, &new->fl_blocked_requests);
- list_for_each_entry(f, &fl->fl_blocked_requests, fl_blocked_member)
+ list_for_each_entry(f, &new->fl_blocked_requests, fl_blocked_member)
f->fl_blocker = new;
spin_unlock(&blocked_lock_lock);
}
--
2.14.0.rc0.dirty

signature.asc

Jeff Layton

unread,
Jan 2, 2019, 8:24:33 PM1/2/19
to NeilBrown, syzbot, bfi...@fieldses.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Thanks Neil,

The VM I was using seemed to reproduce this pretty readily, and the
patch does seem to fix it. I've gone ahead and added it to -next for
now, and will plan to pass it on to Linus in the next few days after
I've tested it a bit more.
--
Jeff Layton <jla...@kernel.org>

NeilBrown

unread,
Jan 2, 2019, 8:59:47 PM1/2/19
to Jeff Layton, syzbot, bfi...@fieldses.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Thanks, that's encouraging.

NeilBrown
signature.asc

syzbot

unread,
Jan 3, 2019, 8:13:04 AM1/3/19
to jla...@kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by:
syzbot+239d99...@syzkaller.appspotmail.com

Tested on:

commit: bf77ae4c98d7 locks: fix error in locks_move_blocks()
git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git locks-4.21
kernel config: https://syzkaller.appspot.com/x/.config?x=c2ab9708c613a224
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.
Reply all
Reply to author
Forward
0 new messages