[syzbot] [lsm?] general protection fault in hook_inode_free_security

26 views
Skip to first unread message

syzbot

unread,
May 8, 2024, 3:32:23 PMMay 8
to jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, m...@digikod.net, pa...@paul-moore.com, se...@hallyn.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: dccb07f2914c Merge tag 'for-6.9-rc7-tag' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14a46760980000
kernel config: https://syzkaller.appspot.com/x/.config?x=6d14c12b661fb43
dashboard link: https://syzkaller.appspot.com/bug?extid=5446fbf332b0602ede0b
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/39d66018d8ad/disk-dccb07f2.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c160b651d1bc/vmlinux-dccb07f2.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3662a33ac713/bzImage-dccb07f2.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5446fb...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc018f62f515: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000c7b17a8a8-0x0000000c7b17a8af]
CPU: 1 PID: 5102 Comm: syz-executor.1 Not tainted 6.9.0-rc7-syzkaller-00012-gdccb07f2914c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
RIP: 0010:hook_inode_free_security+0x5b/0xb0 security/landlock/fs.c:1047
Code: 8a fd 48 8b 1b 48 c7 c0 c4 4e d5 8d 48 c1 e8 03 42 0f b6 04 30 84 c0 75 3e 48 63 05 33 59 65 09 48 01 c3 48 89 d8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 df e8 66 be 8a fd 48 83 3b 00 75 0d e8
RSP: 0018:ffffc9000307f9a8 EFLAGS: 00010212
RAX: 000000018f62f515 RBX: 0000000c7b17a8a8 RCX: ffff888027668000
RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff88805c0bb270
RBP: ffffffff8c01fb00 R08: ffffffff82132a15 R09: 1ffff1100b81765f
R10: dffffc0000000000 R11: ffffffff846ff540 R12: dffffc0000000000
R13: 1ffff1100b817683 R14: dffffc0000000000 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f43c42de000 CR3: 00000000635f8000 CR4: 0000000000350ef0
Call Trace:
<TASK>
security_inode_free+0x4a/0xd0 security/security.c:1613
__destroy_inode+0x2d9/0x650 fs/inode.c:286
destroy_inode fs/inode.c:309 [inline]
evict+0x521/0x630 fs/inode.c:682
dispose_list fs/inode.c:700 [inline]
evict_inodes+0x5f9/0x690 fs/inode.c:750
generic_shutdown_super+0x9d/0x2d0 fs/super.c:626
kill_block_super+0x44/0x90 fs/super.c:1675
deactivate_locked_super+0xc6/0x130 fs/super.c:472
cleanup_mnt+0x426/0x4c0 fs/namespace.c:1267
task_work_run+0x251/0x310 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xa1b/0x27e0 kernel/exit.c:878
do_group_exit+0x207/0x2c0 kernel/exit.c:1027
__do_sys_exit_group kernel/exit.c:1038 [inline]
__se_sys_exit_group kernel/exit.c:1036 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1036
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f731567dd69
Code: Unable to access opcode bytes at 0x7f731567dd3f.
RSP: 002b:00007fff4f0804d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f73156c93a3 RCX: 00007f731567dd69
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000002 R08: 00007fff4f07e277 R09: 00007fff4f081790
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff4f081790
R13: 00007f73156c937e R14: 00000000000154d0 R15: 000000000000001e
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:hook_inode_free_security+0x5b/0xb0 security/landlock/fs.c:1047
Code: 8a fd 48 8b 1b 48 c7 c0 c4 4e d5 8d 48 c1 e8 03 42 0f b6 04 30 84 c0 75 3e 48 63 05 33 59 65 09 48 01 c3 48 89 d8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 df e8 66 be 8a fd 48 83 3b 00 75 0d e8
RSP: 0018:ffffc9000307f9a8 EFLAGS: 00010212
RAX: 000000018f62f515 RBX: 0000000c7b17a8a8 RCX: ffff888027668000
RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff88805c0bb270
RBP: ffffffff8c01fb00 R08: ffffffff82132a15 R09: 1ffff1100b81765f
R10: dffffc0000000000 R11: ffffffff846ff540 R12: dffffc0000000000
R13: 1ffff1100b817683 R14: dffffc0000000000 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555587f03978 CR3: 0000000049876000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
0: 8a fd mov %ch,%bh
2: 48 8b 1b mov (%rbx),%rbx
5: 48 c7 c0 c4 4e d5 8d mov $0xffffffff8dd54ec4,%rax
c: 48 c1 e8 03 shr $0x3,%rax
10: 42 0f b6 04 30 movzbl (%rax,%r14,1),%eax
15: 84 c0 test %al,%al
17: 75 3e jne 0x57
19: 48 63 05 33 59 65 09 movslq 0x9655933(%rip),%rax # 0x9655953
20: 48 01 c3 add %rax,%rbx
23: 48 89 d8 mov %rbx,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 30 00 cmpb $0x0,(%rax,%r14,1) <-- trapping instruction
2f: 74 08 je 0x39
31: 48 89 df mov %rbx,%rdi
34: e8 66 be 8a fd call 0xfd8abe9f
39: 48 83 3b 00 cmpq $0x0,(%rbx)
3d: 75 0d jne 0x4c
3f: e8 .byte 0xe8


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Paul Moore

unread,
May 9, 2024, 8:02:01 PMMay 9
to syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, m...@digikod.net, se...@hallyn.com, syzkall...@googlegroups.com
On Wed, May 8, 2024 at 3:32 PM syzbot
<syzbot+5446fb...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: dccb07f2914c Merge tag 'for-6.9-rc7-tag' of git://git.kern..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14a46760980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=6d14c12b661fb43
> dashboard link: https://syzkaller.appspot.com/bug?extid=5446fbf332b0602ede0b
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/39d66018d8ad/disk-dccb07f2.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/c160b651d1bc/vmlinux-dccb07f2.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3662a33ac713/bzImage-dccb07f2.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5446fb...@syzkaller.appspotmail.com
>
> general protection fault, probably for non-canonical address 0xdffffc018f62f515: 0000 [#1] PREEMPT SMP KASAN NOPTI
> KASAN: probably user-memory-access in range [0x0000000c7b17a8a8-0x0000000c7b17a8af]
> CPU: 1 PID: 5102 Comm: syz-executor.1 Not tainted 6.9.0-rc7-syzkaller-00012-gdccb07f2914c #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
> RIP: 0010:hook_inode_free_security+0x5b/0xb0 security/landlock/fs.c:1047

Possibly a Landlock issue, Mickaël?
--
paul-moore.com

Mickaël Salaün

unread,
May 15, 2024, 11:13:27 AMMay 15
to Paul Moore, Jann Horn, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org, Casey Schaufler, Christian Brauner
On Thu, May 09, 2024 at 08:01:49PM -0400, Paul Moore wrote:
> On Wed, May 8, 2024 at 3:32 PM syzbot
> <syzbot+5446fb...@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: dccb07f2914c Merge tag 'for-6.9-rc7-tag' of git://git.kern..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=14a46760980000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=6d14c12b661fb43
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5446fbf332b0602ede0b
> > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/39d66018d8ad/disk-dccb07f2.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/c160b651d1bc/vmlinux-dccb07f2.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/3662a33ac713/bzImage-dccb07f2.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+5446fb...@syzkaller.appspotmail.com
> >
> > general protection fault, probably for non-canonical address 0xdffffc018f62f515: 0000 [#1] PREEMPT SMP KASAN NOPTI
> > KASAN: probably user-memory-access in range [0x0000000c7b17a8a8-0x0000000c7b17a8af]
> > CPU: 1 PID: 5102 Comm: syz-executor.1 Not tainted 6.9.0-rc7-syzkaller-00012-gdccb07f2914c #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
> > RIP: 0010:hook_inode_free_security+0x5b/0xb0 security/landlock/fs.c:1047
>
> Possibly a Landlock issue, Mickaël?

It looks like security_inode_free() is called two times on the same
inode. This could happen if an inode labeled by Landlock is put
concurrently with release_inode() for a closed ruleset or with
hook_sb_delete(). I didn't find any race condition that could lead to
two calls to iput() though. Could WRITE_ONCE(object->underobj, NULL)
change anything even if object->lock is locked?

A bit unrelated but looking at the SELinux code, I see that selinux_inode()
checks `!inode->i_security`. In which case could this happen?

Mickaël Salaün

unread,
May 16, 2024, 3:52:19 AMMay 16
to Paul Moore, Jann Horn, Mathieu Desnoyers, Paul E. McKenney, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org, Casey Schaufler, Christian Brauner
Adding membarrier experts.

Mathieu Desnoyers

unread,
May 16, 2024, 9:16:16 AMMay 16
to Mickaël Salaün, Paul Moore, Jann Horn, Paul E. McKenney, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org, Casey Schaufler, Christian Brauner
On 2024-05-16 03:31, Mickaël Salaün wrote:
> Adding membarrier experts.

I do not see how this relates to the membarrier(2) system call.

Thanks,

Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Mickaël Salaün

unread,
Jun 27, 2024, 9:37:00 AM (7 days ago) Jun 27
to Paul Moore, Jann Horn, Christian Brauner, Paul E. McKenney, Casey Schaufler, Kees Cook, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org
I didn't find specific issues with Landlock's code except the extra
check in hook_inode_free_security(). It looks like inode->i_security is
a dangling pointer, leading to UAF.

Reading security_inode_free() comments, two things looks weird to me:
> /**
> * security_inode_free() - Free an inode's LSM blob
> * @inode: the inode
> *
> * Deallocate the inode security structure and set @inode->i_security to NULL.

I don't see where i_security is set to NULL.

> */
> void security_inode_free(struct inode *inode)
> {

Shouldn't we add this check here?
if (!inode->i_security)
return;

> call_void_hook(inode_free_security, inode);
> /*
> * The inode may still be referenced in a path walk and
> * a call to security_inode_permission() can be made
> * after inode_free_security() is called. Ideally, the VFS
> * wouldn't do this, but fixing that is a much harder
> * job. For now, simply free the i_security via RCU, and
> * leave the current inode->i_security pointer intact.
> * The inode will be freed after the RCU grace period too.

It's not clear to me why this should be safe if an LSM try to use the
partially-freed blob after the hook calls and before the actual blob
free.

> */
> if (inode->i_security)
> call_rcu((struct rcu_head *)inode->i_security,
> inode_free_by_rcu);

And then:
inode->i_security = NULL;

But why call_rcu()? i_security is not protected by RCU barriers.
I don't think so anymore, the issue is with i_security, not the blob
content.

> >
> > A bit unrelated but looking at the SELinux code, I see that selinux_inode()
> > checks `!inode->i_security`. In which case could this happen?

I think this shouldn't happen, and that might actually be an issue for
SELinux. See my above comment about security_free_inode().

Mickaël Salaün

unread,
Jun 27, 2024, 9:37:00 AM (7 days ago) Jun 27
to Mathieu Desnoyers, Paul Moore, Jann Horn, Paul E. McKenney, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org, Casey Schaufler, Christian Brauner
On Thu, May 16, 2024 at 09:07:29AM GMT, Mathieu Desnoyers wrote:
> On 2024-05-16 03:31, Mickaël Salaün wrote:
> > Adding membarrier experts.
>
> I do not see how this relates to the membarrier(2) system call.

I meant SMP barrier and RCU, so mostly Paul E. McKenney.
I'll remove you from the thread.

Kees Cook

unread,
Jun 27, 2024, 2:12:46 PM (7 days ago) Jun 27
to Mickaël Salaün, Christian Brauner, Steven Rostedt, Paul Moore, Jann Horn, Paul E. McKenney, Casey Schaufler, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org
On Thu, Jun 27, 2024 at 03:34:41PM +0200, Mickaël Salaün wrote:
> I didn't find specific issues with Landlock's code except the extra
> check in hook_inode_free_security(). It looks like inode->i_security is
> a dangling pointer, leading to UAF.
>
> Reading security_inode_free() comments, two things looks weird to me:
> > /**
> > * security_inode_free() - Free an inode's LSM blob
> > * @inode: the inode
> > *
> > * Deallocate the inode security structure and set @inode->i_security to NULL.
>
> I don't see where i_security is set to NULL.

Yeah, I don't either...

> > */
> > void security_inode_free(struct inode *inode)
> > {
>
> Shouldn't we add this check here?
> if (!inode->i_security)
> return;

Probably, yes. The LSMs that check for NULL i_security in the free hook
all do so right at the beginning...

>
> > call_void_hook(inode_free_security, inode);
> > /*
> > * The inode may still be referenced in a path walk and
> > * a call to security_inode_permission() can be made
> > * after inode_free_security() is called. Ideally, the VFS
> > * wouldn't do this, but fixing that is a much harder
> > * job. For now, simply free the i_security via RCU, and
> > * leave the current inode->i_security pointer intact.
> > * The inode will be freed after the RCU grace period too.
>
> It's not clear to me why this should be safe if an LSM try to use the
> partially-freed blob after the hook calls and before the actual blob
> free.

Yeah, it's not clear to me what the expected lifetime is here. How is
inode_permission() being called if all inode reference counts are 0? It
does seem intentional, though.

The RCU logic was introduced in commit 3dc91d4338d6 ("SELinux: Fix possible
NULL pointer dereference in selinux_inode_permission()"), with much
discussion:
https://lore.kernel.org/lkml/20140109101...@gandalf.local.home/
(This commit seems to remove setting "i_security = NULL", though, which
the comment implies is intended, but then it also seems to depend on
finding a NULL?)

LSMs using i_security are:

security/bpf/hooks.c: .lbs_inode = sizeof(struct bpf_storage_blob),
security/integrity/evm/evm_main.c: .lbs_inode = sizeof(struct evm_iint_cache),
security/integrity/ima/ima_main.c: .lbs_inode = sizeof(struct ima_iint_cache *),
security/landlock/setup.c: .lbs_inode = sizeof(struct landlock_inode_security),
security/selinux/hooks.c: .lbs_inode = sizeof(struct inode_security_struct),
security/smack/smack_lsm.c: .lbs_inode = sizeof(struct inode_smack),

SELinux is still checking for NULL. See selinux_inode() and
selinux_inode_free_security(), as do bpf_inode() and
bpf_inode_storage_free(). evm and ima also check for NULL.

landlock_inode() does not, though.

Smack doesn't hook the free, but it should still check for NULL, and it's not.

So I think this needs fixing in Landlock and Smack.

I kind of think that the LSM infrastructure needs to provide a common
helper for the "access the blob" action, as we've got it repeated in
each LSM, and we have 2 implementations that are missing NULL checks...

>
> > */
> > if (inode->i_security)
> > call_rcu((struct rcu_head *)inode->i_security,
> > inode_free_by_rcu);
>
> And then:
> inode->i_security = NULL;
>
> But why call_rcu()? i_security is not protected by RCU barriers.

I assume it's because security_inode_free() via __destroy_inode() via
destroy_inode() via evict() via iput_final() via iput() may be running
in interrupt context?

But I still don't see where i_security gets set to NULL. This won't fix
the permissions hook races for Landlock and Smack, but should make
lifetime a bit more clear?


diff --git a/security/security.c b/security/security.c
index 9c3fb2f60e2a..a8658ebcaf0c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1613,7 +1613,8 @@ static void inode_free_by_rcu(struct rcu_head *head)
*/
void security_inode_free(struct inode *inode)
{
- call_void_hook(inode_free_security, inode);
+ struct rcu_head *inode_blob = inode->i_security;
+
/*
* The inode may still be referenced in a path walk and
* a call to security_inode_permission() can be made
@@ -1623,9 +1624,11 @@ void security_inode_free(struct inode *inode)
* leave the current inode->i_security pointer intact.
* The inode will be freed after the RCU grace period too.
*/
- if (inode->i_security)
- call_rcu((struct rcu_head *)inode->i_security,
- inode_free_by_rcu);
+ if (inode_blob) {
+ call_void_hook(inode_free_security, inode);
+ inode->i_security = NULL;
+ call_rcu(inode_blob, inode_free_by_rcu);
+ }
}

/**


-Kees
--
Kees Cook

Paul Moore

unread,
Jun 27, 2024, 2:28:16 PM (7 days ago) Jun 27
to Mickaël Salaün, Jann Horn, Christian Brauner, Paul E. McKenney, Casey Schaufler, Kees Cook, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org
On Thu, Jun 27, 2024 at 9:34 AM Mickaël Salaün <m...@digikod.net> wrote:
>
> I didn't find specific issues with Landlock's code except the extra
> check in hook_inode_free_security(). It looks like inode->i_security is
> a dangling pointer, leading to UAF.
>
> Reading security_inode_free() comments, two things looks weird to me:
>
> > /**
> > * security_inode_free() - Free an inode's LSM blob
> > * @inode: the inode
> > *
> > * Deallocate the inode security structure and set @inode->i_security to NULL.
>
> I don't see where i_security is set to NULL.

The function header comments are known to be a bit suspect, a side
effect of being detached from the functions for many years, this may
be one of those cases. I tried to fix up the really awful ones when I
moved the comments back, back I didn't have time to go through each
one in detail. Patches to correct the function header comments are
welcome and encouraged! :)

> > */
> > void security_inode_free(struct inode *inode)
> > {
>
> Shouldn't we add this check here?
> if (!inode->i_security)
> return;

Unless I'm remembering something wrong, I believe we *should* always
have a valid i_security pointer each time we are called, if not
something has gone wrong, e.g. the security_inode_free() hook is no
longer being called from the right place. If we add a NULL check, we
should probably have a WARN_ON(), pr_err(), or something similar to
put some spew on the console/logs.

All that said, it would be good to hear some confirmation from the VFS
folks that the security_inode_free() hook is located in a spot such
that once it exits it's current RCU critical section it is safe to
release the associated LSM state.

It's also worth mentioning that while we always allocate i_security in
security_inode_alloc() right now, I can see a world where we allocate
the i_security field based on need using the lsm_blob_size info (maybe
that works today? not sure how kmem_cache handled 0 length blobs?).
The result is that there might be a legitimate case where i_security
is NULL, yet we still want to call into the LSM using the
inode_free_security() implementation hook.

> > call_void_hook(inode_free_security, inode);
> > /*
> > * The inode may still be referenced in a path walk and
> > * a call to security_inode_permission() can be made
> > * after inode_free_security() is called. Ideally, the VFS
> > * wouldn't do this, but fixing that is a much harder
> > * job. For now, simply free the i_security via RCU, and
> > * leave the current inode->i_security pointer intact.
> > * The inode will be freed after the RCU grace period too.
>
> It's not clear to me why this should be safe if an LSM try to use the
> partially-freed blob after the hook calls and before the actual blob
> free.

I had the same thought while looking at this just now. At least in
the SELinux case I think this "works" simply because SELinux doesn't
do much here, it just drops the inode from a SELinux internal list
(long story) and doesn't actually release any memory or reset the
inode's SELinux state (there really isn't anything to "free" in the
SELinux case). I haven't checked the other LSMs, but they may behave
similarly.

We may want (need?) to consider two LSM implementation hooks called
from within security_inode_free(): the first where the existing
inode_free_security() implementation hook is called, the second inside
the inode_free_by_rcu() callback immediately before the i_security
data is free'd.

... or we find a better placement in the VFS for
security_inode_free(), is that is possible. It may not be, our VFS
friends should be able to help here.

> > */
> > if (inode->i_security)
> > call_rcu((struct rcu_head *)inode->i_security,
> > inode_free_by_rcu);
>
> And then:
> inode->i_security = NULL;

According to the comment we may still need i_security for permission
checks. See my comment about decomposing the LSM implementation into
two hooks to better handle this for LSMs.

> But why call_rcu()? i_security is not protected by RCU barriers.

I believe the issue is that the inode is protected by RCU and that
affects the lifetime of the i_security blob.

--
paul-moore.com

Paul E. McKenney

unread,
Jun 27, 2024, 2:45:42 PM (7 days ago) Jun 27
to Kees Cook, Mickaël Salaün, Christian Brauner, Steven Rostedt, Paul Moore, Jann Horn, Casey Schaufler, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org
On Thu, Jun 27, 2024 at 11:12:43AM -0700, Kees Cook wrote:
> On Thu, Jun 27, 2024 at 03:34:41PM +0200, Mickaël Salaün wrote:
> > I didn't find specific issues with Landlock's code except the extra
> > check in hook_inode_free_security(). It looks like inode->i_security is
> > a dangling pointer, leading to UAF.
> >
> > Reading security_inode_free() comments, two things looks weird to me:
> > > /**
> > > * security_inode_free() - Free an inode's LSM blob
> > > * @inode: the inode
> > > *
> > > * Deallocate the inode security structure and set @inode->i_security to NULL.
> >
> > I don't see where i_security is set to NULL.
>
> Yeah, I don't either...
>
> > > */
> > > void security_inode_free(struct inode *inode)
> > > {
> >
> > Shouldn't we add this check here?
> > if (!inode->i_security)
> > return;
>
> Probably, yes. The LSMs that check for NULL i_security in the free hook
> all do so right at the beginning...

Given your fix below, I am assuming that they do this check within some
sort of an RCU read-side critical section.
This approach looks plausible to me.

Thanx, Paul

Paul Moore

unread,
Jun 27, 2024, 3:30:01 PM (7 days ago) Jun 27
to Kees Cook, Mickaël Salaün, Christian Brauner, Steven Rostedt, Jann Horn, Paul E. McKenney, Casey Schaufler, syzbot, jmo...@namei.org, linux-...@vger.kernel.org, linux-secu...@vger.kernel.org, se...@hallyn.com, syzkall...@googlegroups.com, linux-...@vger.kernel.org
See my response to Mickaël, unless we can get some guidance from the
VFS folks on the possibility of calling security_inode_free() once
when the inode has already been marked for death and is no longer in
use, I'd rather see us split the LSM implementation into two hooks,
something like this pseudo code (very hand-wavy around the rcu_head
inode container bits):

void inode_free_rcu(rhead)
{
inode = multiple_container_of_magic(rhead);
/* LSMs can finally free any internal state */
call_void_hook(inode_free_rcu, inode)
inode->i_security = NULL;
}

void security_inode_free(inode)
{
/* LSMs can mark their state as "dead", but perm checks may still happen */
call_void_hook(inode_free, inode);
if (inode->i_security)
call_rcu(inode, inode_free_rcu);
}

> + }
> }

--
paul-moore.com
Reply all
Reply to author
Forward
0 new messages