[syzbot] [overlayfs?] general protection fault in d_path

21 views
Skip to first unread message

syzbot

unread,
Jul 5, 2023, 8:00:48 AM7/5/23
to amir...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: d528014517f2 Revert ".gitignore: ignore *.cover and *.mbx"
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14fad002a80000
kernel config: https://syzkaller.appspot.com/x/.config?x=1085b4238c9eb6ba
dashboard link: https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/fef94e788067/disk-d5280145.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/576412ea518b/vmlinux-d5280145.xz
kernel image: https://storage.googleapis.com/syzbot-assets/685a0e4be06b/bzImage-d5280145.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a67fc5...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc000000000a: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000050-0x0000000000000057]
CPU: 1 PID: 10127 Comm: syz-executor.3 Not tainted 6.4.0-syzkaller-11478-gd528014517f2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5012
Code: 85 75 18 00 00 83 3d 15 c8 2c 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 5c de b3 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 b4 51 79 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc900169be9e0 EFLAGS: 00010006
RAX: 000000000000000a RBX: 1ffff92002d37d60 RCX: 0000000000000002
RDX: 0000000000000050 RSI: 0000000000000000 RDI: 0000000000000050
RBP: ffffc900169beca8 R08: dffffc0000000000 R09: 0000000000000001
R10: dffffc0000000000 R11: fffffbfff1d2fe76 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000002 R15: ffff88802091d940
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa22a3fe000 CR3: 000000004b5e1000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761
seqcount_lockdep_reader_access+0x139/0x220 include/linux/seqlock.h:102
get_fs_root_rcu fs/d_path.c:244 [inline]
d_path+0x2f0/0x6e0 fs/d_path.c:286
audit_log_d_path+0xd3/0x310 kernel/audit.c:2139
dump_common_audit_data security/lsm_audit.c:224 [inline]
common_lsm_audit+0x7cf/0x1a90 security/lsm_audit.c:458
smack_log+0x421/0x540 security/smack/smack_access.c:383
smk_tskacc+0x2ff/0x360 security/smack/smack_access.c:253
smack_inode_getattr+0x203/0x270 security/smack/smack_lsm.c:1202
security_inode_getattr+0xd3/0x120 security/security.c:2114
vfs_getattr+0x25/0x70 fs/stat.c:167
ovl_getattr+0x1b1/0xf70 fs/overlayfs/inode.c:174
ima_check_last_writer security/integrity/ima/ima_main.c:171 [inline]
ima_file_free+0x26e/0x4b0 security/integrity/ima/ima_main.c:203
__fput+0x36a/0x950 fs/file_table.c:378
task_work_run+0x24a/0x300 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x68f/0x2290 kernel/exit.c:874
do_group_exit+0x206/0x2c0 kernel/exit.c:1024
get_signal+0x1709/0x17e0 kernel/signal.c:2877
arch_do_signal_or_restart+0x91/0x670 arch/x86/kernel/signal.c:308
exit_to_user_mode_loop+0x6a/0x100 kernel/entry/common.c:168
exit_to_user_mode_prepare+0xb1/0x140 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x64/0x280 kernel/entry/common.c:297
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7f3ca8c389
Code: Unable to access opcode bytes at 0x7f7f3ca8c35f.
RSP: 002b:00007f7f3d741168 EFLAGS: 00000246 ORIG_RAX: 0000000000000052
RAX: fffffffffffffffb RBX: 00007f7f3cbac050 RCX: 00007f7f3ca8c389
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000020000180
RBP: 00007f7f3cad7493 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff8432555f R14: 00007f7f3d741300 R15: 0000000000022000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5012
Code: 85 75 18 00 00 83 3d 15 c8 2c 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 5c de b3 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 b4 51 79 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc900169be9e0 EFLAGS: 00010006
RAX: 000000000000000a RBX: 1ffff92002d37d60 RCX: 0000000000000002
RDX: 0000000000000050 RSI: 0000000000000000 RDI: 0000000000000050
RBP: ffffc900169beca8 R08: dffffc0000000000 R09: 0000000000000001
R10: dffffc0000000000 R11: fffffbfff1d2fe76 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000002 R15: ffff88802091d940
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa22a3fe000 CR3: 000000004b5e1000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 85 75 18 test %esi,0x18(%rbp)
3: 00 00 add %al,(%rax)
5: 83 3d 15 c8 2c 0d 00 cmpl $0x0,0xd2cc815(%rip) # 0xd2cc821
c: 48 89 9c 24 10 01 00 mov %rbx,0x110(%rsp)
13: 00
14: 0f 84 f8 0f 00 00 je 0x1012
1a: 83 3d 5c de b3 0b 00 cmpl $0x0,0xbb3de5c(%rip) # 0xbb3de7d
21: 74 34 je 0x57
23: 48 89 d0 mov %rdx,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) <-- trapping instruction
2f: 74 1a je 0x4b
31: 48 89 d7 mov %rdx,%rdi
34: e8 b4 51 79 00 callq 0x7951ed
39: 48 rex.W
3a: 8b .byte 0x8b
3b: 94 xchg %eax,%esp
3c: 24 80 and $0x80,%al


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Christian Brauner

unread,
Jul 5, 2023, 9:06:02 AM7/5/23
to syzbot, Jeff Layton, amir...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Ugh, I think the root of this might all be the call back into
vfs_getattr() that happens on overlayfs:

__fput()
-> ima_file_free()
-> mutex_lock()
-> vfs_getattr_nosec()
-> i_op->getattr() == ovl_getattr()
-> vfs_getattr()
-> security_inode_getattr()
-> mutex_unlock()

So either overlayfs needs to call vfs_getattr_nosec() when the request
comes from vfs_getattr_nosec() or this needs to use
backing_file_real_path() to operate on the real underlying path.

Thoughts?

Amir Goldstein

unread,
Jul 5, 2023, 9:39:23 AM7/5/23
to Christian Brauner, syzbot, Jeff Layton, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, Mimi Zohar
The latter.

IMA code cannot operate on a mixture of real inode (file_inode())
real dentry (file_dentry()) and ovl path, especially for reading
stat.change_cookie which is not really well defined in ovl.

At least those direct f_path references need to be fixed:

security/integrity/ima/ima_main.c:
vfs_getattr_nosec(&file->f_path, &stat,
security/integrity/ima/ima_api.c: result =
vfs_getattr_nosec(&file->f_path, &stat, STATX_CHANGE_COOKIE,
security/integrity/ima/ima_crypto.c: f =
dentry_open(&file->f_path, flags, file->f_cred);

and then all the places that format full path for audit logs:
security/integrity/ima/ima_main.c: *pathname =
ima_d_path(&file->f_path, pathbuf, filename);

Need to decide if it is prefered to log the full ovl path or the
relative real path (relative to the private mount clone of the ovl layer).

Thanks,
Amir.

Jeff Layton

unread,
Jul 5, 2023, 9:51:54 AM7/5/23
to Christian Brauner, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
When you say "this needs to use backing_file_real_path()", what do you
mean by "this"? IMA?

That said, passing some sort of NOSEC flag to vfs_getattr that
designates the call as kernel-internal seems like the more correct thing
to do here, and might be useful in other weird stacking cases like this.
--
Jeff Layton <jla...@kernel.org>

Amir Goldstein

unread,
Jul 5, 2023, 9:54:57 AM7/5/23
to Jeff Layton, Christian Brauner, syzbot, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
I don't think that NOSEC is the root cause.

If you ever noticed file_dentry() sprinkled through fs code,
it is only there because if that code were to call use helpers
that rely on file_inode() and d_inode(file->f_path.dentry) being
the same - bad things will happen and NOSEC will not cover
all those bad things.

IMA code also has file_dentry() sprinkled.
But it still accesses file->f_path in a few places and that
can result in bad things.

Thanks,
Amir.

Jeff Layton

unread,
Jul 5, 2023, 10:10:59 AM7/5/23
to Amir Goldstein, Christian Brauner, syzbot, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Ok, that makes sense, and is a lot less invasive than having to rework
vfs_getattr.
--
Jeff Layton <jla...@kernel.org>

syzbot

unread,
Sep 12, 2023, 6:10:54 PM9/12/23
to amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
syzbot has found a reproducer for the following issue on:

HEAD commit: a747acc0b752 Merge tag 'linux-kselftest-next-6.6-rc2' of g..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=11c82308680000
kernel config: https://syzkaller.appspot.com/x/.config?x=df91a3034fe3f122
dashboard link: https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1671b694680000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14ec94d8680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/b28ecb88c714/disk-a747acc0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/03dd2cd5356f/vmlinux-a747acc0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/63365d9bf980/bzImage-a747acc0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a67fc5...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
CPU: 0 PID: 5030 Comm: syz-executor173 Not tainted 6.6.0-rc1-syzkaller-00014-ga747acc0b752 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
RIP: 0010:__seqprop_spinlock_sequence include/linux/seqlock.h:275 [inline]
RIP: 0010:get_fs_root_rcu fs/d_path.c:244 [inline]
RIP: 0010:d_path+0x2f0/0x6e0 fs/d_path.c:286
Code: 30 00 74 08 48 89 df e8 be 20 e1 ff 4c 8b 23 4d 8d 6c 24 48 49 81 c4 88 00 00 00 4c 89 eb 48 c1 eb 03 4c 89 ef e8 00 1e 00 00 <42> 0f b6 04 33 84 c0 0f 85 89 00 00 00 45 8b 7d 00 44 89 fe 83 e6
RSP: 0018:ffffc90003a7eee0 EFLAGS: 00010246
RAX: 7e73051ae5315e00 RBX: 0000000000000009 RCX: ffff88807da73b80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003a7eff0 R08: ffffffff82068d08 R09: 1ffffffff1d34ccd
R10: dffffc0000000000 R11: fffffbfff1d34cce R12: 0000000000000088
R13: 0000000000000048 R14: dffffc0000000000 R15: ffff8880206d8000
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f351862ebb8 CR3: 00000000276a7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
audit_log_d_path+0xd3/0x310 kernel/audit.c:2138
dump_common_audit_data security/lsm_audit.c:224 [inline]
common_lsm_audit+0x7cf/0x1a90 security/lsm_audit.c:458
smack_log+0x421/0x540 security/smack/smack_access.c:383
smk_tskacc+0x2ff/0x360 security/smack/smack_access.c:253
smack_inode_getattr+0x203/0x270 security/smack/smack_lsm.c:1271
security_inode_getattr+0xd3/0x120 security/security.c:2153
vfs_getattr+0x2a/0x3a0 fs/stat.c:206
ovl_getattr+0x1b1/0xf70 fs/overlayfs/inode.c:174
ima_check_last_writer security/integrity/ima/ima_main.c:171 [inline]
ima_file_free+0x26e/0x4b0 security/integrity/ima/ima_main.c:203
__fput+0x36a/0x910 fs/file_table.c:378
task_work_run+0x24a/0x300 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x68f/0x2290 kernel/exit.c:874
do_group_exit+0x206/0x2c0 kernel/exit.c:1024
get_signal+0x175d/0x1840 kernel/signal.c:2892
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop+0x6a/0x100 kernel/entry/common.c:168
exit_to_user_mode_prepare+0xb1/0x140 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x64/0x280 kernel/entry/common.c:296
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f35185d8529
Code: Unable to access opcode bytes at 0x7f35185d84ff.
RSP: 002b:00007f3518599218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007f3518662308 RCX: 00007f35185d8529
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00007f351866230c
RBP: 00007f3518662300 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f351862f064
R13: 0031656c69662f2e R14: 6e6f3d7865646e69 R15: 0079616c7265766f
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__seqprop_spinlock_sequence include/linux/seqlock.h:275 [inline]
RIP: 0010:get_fs_root_rcu fs/d_path.c:244 [inline]
RIP: 0010:d_path+0x2f0/0x6e0 fs/d_path.c:286
Code: 30 00 74 08 48 89 df e8 be 20 e1 ff 4c 8b 23 4d 8d 6c 24 48 49 81 c4 88 00 00 00 4c 89 eb 48 c1 eb 03 4c 89 ef e8 00 1e 00 00 <42> 0f b6 04 33 84 c0 0f 85 89 00 00 00 45 8b 7d 00 44 89 fe 83 e6
RSP: 0018:ffffc90003a7eee0 EFLAGS: 00010246
RAX: 7e73051ae5315e00 RBX: 0000000000000009 RCX: ffff88807da73b80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003a7eff0 R08: ffffffff82068d08 R09: 1ffffffff1d34ccd
R10: dffffc0000000000 R11: fffffbfff1d34cce R12: 0000000000000088
R13: 0000000000000048 R14: dffffc0000000000 R15: ffff8880206d8000
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f351862ebb8 CR3: 000000007e769000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 30 00 xor %al,(%rax)
2: 74 08 je 0xc
4: 48 89 df mov %rbx,%rdi
7: e8 be 20 e1 ff call 0xffe120ca
c: 4c 8b 23 mov (%rbx),%r12
f: 4d 8d 6c 24 48 lea 0x48(%r12),%r13
14: 49 81 c4 88 00 00 00 add $0x88,%r12
1b: 4c 89 eb mov %r13,%rbx
1e: 48 c1 eb 03 shr $0x3,%rbx
22: 4c 89 ef mov %r13,%rdi
25: e8 00 1e 00 00 call 0x1e2a
* 2a: 42 0f b6 04 33 movzbl (%rbx,%r14,1),%eax <-- trapping instruction
2f: 84 c0 test %al,%al
31: 0f 85 89 00 00 00 jne 0xc0
37: 45 8b 7d 00 mov 0x0(%r13),%r15d
3b: 44 89 fe mov %r15d,%esi
3e: 83 .byte 0x83
3f: e6 .byte 0xe6


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Amir Goldstein

unread,
Sep 13, 2023, 3:10:01 AM9/13/23
to syzbot, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
#syz set subsystems: integrity, overlayfs

#syz test: https://github.com/amir73il/linux ima-ovl-fix

syzbot

unread,
Sep 13, 2023, 5:06:35 AM9/13/23
to amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in d_path

general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
CPU: 0 PID: 5465 Comm: syz-executor.0 Not tainted 6.6.0-rc1-syzkaller-00004-g965067e2f71e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
RIP: 0010:__seqprop_spinlock_sequence include/linux/seqlock.h:275 [inline]
RIP: 0010:get_fs_root_rcu fs/d_path.c:244 [inline]
RIP: 0010:d_path+0x2f0/0x6e0 fs/d_path.c:286
Code: 30 00 74 08 48 89 df e8 be 20 e1 ff 4c 8b 23 4d 8d 6c 24 48 49 81 c4 88 00 00 00 4c 89 eb 48 c1 eb 03 4c 89 ef e8 00 1e 00 00 <42> 0f b6 04 33 84 c0 0f 85 89 00 00 00 45 8b 7d 00 44 89 fe 83 e6
RSP: 0018:ffffc90005056ec0 EFLAGS: 00010246
RAX: be27ea831a7ad800 RBX: 0000000000000009 RCX: ffff88801c713b80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90005056fd0 R08: ffffffff82068d08 R09: 1ffffffff1d34ccd
R10: dffffc0000000000 R11: fffffbfff1d34cce R12: 0000000000000088
R13: 0000000000000048 R14: dffffc0000000000 R15: ffff888076dcc000
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4b67d70420 CR3: 0000000016f66000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
audit_log_d_path+0xd3/0x310 kernel/audit.c:2138
dump_common_audit_data security/lsm_audit.c:224 [inline]
common_lsm_audit+0x7cf/0x1a90 security/lsm_audit.c:458
smack_log+0x421/0x540 security/smack/smack_access.c:383
smk_tskacc+0x2ff/0x360 security/smack/smack_access.c:253
smack_inode_getattr+0x203/0x270 security/smack/smack_lsm.c:1271
security_inode_getattr+0xd3/0x120 security/security.c:2153
vfs_getattr+0x2a/0x3a0 fs/stat.c:206
ovl_getattr+0x1b1/0xf70 fs/overlayfs/inode.c:174
ima_check_last_writer security/integrity/ima/ima_main.c:171 [inline]
ima_file_free+0x2c3/0x560 security/integrity/ima/ima_main.c:203
__fput+0x36a/0x910 fs/file_table.c:378
task_work_run+0x24a/0x300 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x68f/0x2290 kernel/exit.c:874
do_group_exit+0x206/0x2c0 kernel/exit.c:1024
get_signal+0x175d/0x1840 kernel/signal.c:2892
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop+0x6a/0x100 kernel/entry/common.c:168
exit_to_user_mode_prepare+0xb1/0x140 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x64/0x280 kernel/entry/common.c:296
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f2a67a7cae9
Code: Unable to access opcode bytes at 0x7f2a67a7cabf.
RSP: 002b:00007f2a6875c178 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007f2a67b9bf88 RCX: 00007f2a67a7cae9
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00007f2a67b9bf8c
RBP: 00007f2a67b9bf80 R08: 00007fffba3690b0 R09: 00007f2a6875c6c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2a67b9bf8c
R13: 000000000000000b R14: 00007fffba21b880 R15: 00007fffba21b968
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__seqprop_spinlock_sequence include/linux/seqlock.h:275 [inline]
RIP: 0010:get_fs_root_rcu fs/d_path.c:244 [inline]
RIP: 0010:d_path+0x2f0/0x6e0 fs/d_path.c:286
Code: 30 00 74 08 48 89 df e8 be 20 e1 ff 4c 8b 23 4d 8d 6c 24 48 49 81 c4 88 00 00 00 4c 89 eb 48 c1 eb 03 4c 89 ef e8 00 1e 00 00 <42> 0f b6 04 33 84 c0 0f 85 89 00 00 00 45 8b 7d 00 44 89 fe 83 e6
RSP: 0018:ffffc90005056ec0 EFLAGS: 00010246
RAX: be27ea831a7ad800 RBX: 0000000000000009 RCX: ffff88801c713b80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90005056fd0 R08: ffffffff82068d08 R09: 1ffffffff1d34ccd
R10: dffffc0000000000 R11: fffffbfff1d34cce R12: 0000000000000088
R13: 0000000000000048 R14: dffffc0000000000 R15: ffff888076dcc000
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4b67d70420 CR3: 0000000016f66000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 30 00 xor %al,(%rax)
2: 74 08 je 0xc
4: 48 89 df mov %rbx,%rdi
7: e8 be 20 e1 ff call 0xffe120ca
c: 4c 8b 23 mov (%rbx),%r12
f: 4d 8d 6c 24 48 lea 0x48(%r12),%r13
14: 49 81 c4 88 00 00 00 add $0x88,%r12
1b: 4c 89 eb mov %r13,%rbx
1e: 48 c1 eb 03 shr $0x3,%rbx
22: 4c 89 ef mov %r13,%rdi
25: e8 00 1e 00 00 call 0x1e2a
* 2a: 42 0f b6 04 33 movzbl (%rbx,%r14,1),%eax <-- trapping instruction
2f: 84 c0 test %al,%al
31: 0f 85 89 00 00 00 jne 0xc0
37: 45 8b 7d 00 mov 0x0(%r13),%r15d
3b: 44 89 fe mov %r15d,%esi
3e: 83 .byte 0x83
3f: e6 .byte 0xe6


Tested on:

commit: 965067e2 ima: fix wrong dereferences of file->f_path
git tree: https://github.com/amir73il/linux ima-ovl-fix
console output: https://syzkaller.appspot.com/x/log.txt?x=109b00e8680000
kernel config: https://syzkaller.appspot.com/x/.config?x=df91a3034fe3f122
dashboard link: https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

Mimi Zohar

unread,
Sep 14, 2023, 6:12:55 PM9/14/23
to syzbot, amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
On Wed, 2023-09-13 at 02:06 -0700, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> general protection fault in d_path

Is an IMA policy being loaded? Please include the boot command line.

syzbot

unread,
Sep 17, 2023, 8:04:35 PM9/17/23
to amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, ste...@linux.ibm.com, syzkall...@googlegroups.com, zo...@linux.ibm.com
syzbot has bisected this issue to:

commit db1d1e8b9867aae5c3e61ad7859abfcc4a6fd6c7
Author: Jeff Layton <jla...@kernel.org>
Date: Mon Apr 17 16:55:51 2023 +0000

IMA: use vfs_getattr_nosec to get the i_version

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=106f7e54680000
start commit: a747acc0b752 Merge tag 'linux-kselftest-next-6.6-rc2' of g..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=126f7e54680000
console output: https://syzkaller.appspot.com/x/log.txt?x=146f7e54680000
Reported-by: syzbot+a67fc5...@syzkaller.appspotmail.com
Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
Message has been deleted

Stefan Berger

unread,
Sep 20, 2023, 1:39:01 PM9/20/23
to syzbot, amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
The final oops shows this here:

BUG: kernel NULL pointer dereference, address: 0000000000000058
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 3192 Comm: syz-executor.0 Not tainted 6.4.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 08/04/2023
RIP: 0010:__lock_acquire+0x35/0x490 kernel/locking/lockdep.c:4946
Code: 83 ec 18 65 4c 8b 35 aa 60 f4 7e 83 3d b7 11 e4 02 00 0f 84 05 02
00 00 4c 89 cb 89 cd 41 89 d5 49 89 ff 83 fe 01 77 0c 89 f0 <49> 8b 44
c7 08 48 85 c0 75 1b 4c 89 ff 31 d2 45 89 c4 e8 74 f6 ff
RSP: 0018:ffffc90002edb840 EFLAGS: 00010097
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000050
RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffff888102ea5340 R15: 0000000000000050
FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000058 CR3: 0000000003aa8000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 lock_acquire+0xd8/0x1f0 kernel/locking/lockdep.c:5691
 seqcount_lockdep_reader_access include/linux/seqlock.h:102 [inline]
 get_fs_root_rcu fs/d_path.c:243 [inline]
 d_path+0xd1/0x1f0 fs/d_path.c:285
 audit_log_d_path+0x65/0x130 kernel/audit.c:2139
 dump_common_audit_data security/lsm_audit.c:224 [inline]
 common_lsm_audit+0x3b3/0x840 security/lsm_audit.c:458
 smack_log+0xad/0x130 security/smack/smack_access.c:383
 smk_tskacc+0xb1/0xd0 security/smack/smack_access.c:253
 smack_inode_getattr+0x8a/0xb0 security/smack/smack_lsm.c:1187
 security_inode_getattr+0x32/0x50 security/security.c:2114
 vfs_getattr+0x1b/0x40 fs/stat.c:167
 ovl_getattr+0xa6/0x3e0 fs/overlayfs/inode.c:173
 ima_check_last_writer security/integrity/ima/ima_main.c:171 [inline]
 ima_file_free+0xbd/0x130 security/integrity/ima/ima_main.c:203
 __fput+0xc7/0x220 fs/file_table.c:315
 task_work_run+0x7d/0xa0 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0x2c7/0xa80 kernel/exit.c:871 <-----------------------
 do_group_exit+0x85/0xa0 kernel/exit.c:1021
 get_signal+0x73c/0x7f0 kernel/signal.c:2874
 arch_do_signal_or_restart+0x89/0x290 arch/x86/kernel/signal.c:306
 exit_to_user_mode_loop+0x61/0xb0 kernel/entry/common.c:168
 exit_to_user_mode_prepare+0x64/0xb0 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x2b/0x1d0 kernel/entry/common.c:297
 do_syscall_64+0x4d/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd


do_exit has called exit_fs(tsk) [
https://elixir.bootlin.com/linux/v6.4-rc2/source/kernel/exit.c#L867 ]

exit_fs(tsk) has set tsk->fs = NULL [
https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/fs_struct.c#L103 ]

I think this then bites in d_path() where it calls:

    get_fs_root_rcu(current->fs, &root);   [
https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/d_path.c#L285 ]

current->fs is likely NULL here.

If this was correct it would have nothing to do with the actual patch,
though, but rather with the fact that smack logs on process termination.
I am not sure what the solution would be other than testing for
current->fs == NULL in d_path before using it and returning an error
that is not normally returned or trying to intercept this case in smack.

   Stefan

Stefan Berger

unread,
Sep 20, 2023, 4:37:58 PM9/20/23
to syzbot, amir...@gmail.com, bra...@kernel.org, jla...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com, ca...@schaufler-ca.com
I have now been able to recreate the syzbot issue with the test program
and the issue is exactly the one described here, current->fs == NULL.

   Stefan


Jeff Layton

unread,
Sep 20, 2023, 5:16:07 PM9/20/23
to Stefan Berger, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com, ca...@schaufler-ca.com
Earlier in this thread, Amir had a diagnosis that IMA is inappropriately
trying to use f_path directly instead of using the helpers that are
friendly for stacking filesystems.

https://lore.kernel.org/linux-fsdevel/CAOQ4uxgjnYyeQL-LbS5kQ7+C0d6sjzKqMDWAtZW8cAkPaed6=Q...@mail.gmail.com/

I'm not an IMA hacker so I'm not planning to roll a fix here. Perhaps
someone on the IMA team could try this approach?

Cheers,
--
Jeff Layton <jla...@kernel.org>

Stefan Berger

unread,
Sep 20, 2023, 6:09:18 PM9/20/23
to Jeff Layton, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com, ca...@schaufler-ca.com
I have applied this patch here from Amir now and it does NOT resolve the
issue:

https://lore.kernel.org/linux-integrity/296dae962a2a488bde682d3...@linux.ibm.com/T/#m4ebdb780bf6952e7f210c55e87950d0cfa1d5eb0

    Stefan


Stefan Berger

unread,
Sep 20, 2023, 8:10:56 PM9/20/23
to Jeff Layton, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com, ca...@schaufler-ca.com
This seems to resolve the issue:

diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 585e5e35710b..57afcea1e39b 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -347,6 +347,9 @@ void smack_log(char *subject_label, char
*object_label, int request,
        struct smack_audit_data *sad;
        struct common_audit_data *a = &ad->a;

+       if (current->flags & PF_EXITING)
+               return;
+
        /* check if we have to log the current event */
        if (result < 0 && (log_policy & SMACK_AUDIT_DENIED) == 0)
                return;


Casey Schaufler

unread,
Sep 20, 2023, 8:52:46 PM9/20/23
to Stefan Berger, Jeff Layton, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com, Casey Schaufler
Based on what I see here I can understand that this prevents the panic,
but it isn't so clear what changed that introduced the problem.

Jeff Layton

unread,
Sep 21, 2023, 6:32:38 AM9/21/23
to Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
Apparently, it's this patch:

db1d1e8b9867 IMA: use vfs_getattr_nosec to get the i_version

At one time, IMA would reach directly into the inode to get the
i_version and ctime. That was fine for certain filesystems, but with
more recent changes it needs to go through ->getattr instead. Evidently,
it's selecting the wrong inode to query when dealing with overlayfs and
that's causing panics at times.

As to why the above patch helps, I'm not sure, but given that it doesn't
seem to change which inode is being queried via getattr, it seems like
this is probably papering over the real bug. That said, IMA and
overlayfs are not really in my wheelhouse, so I could be very wrong
here.
--
Jeff Layton <jla...@kernel.org>

Mimi Zohar

unread,
Sep 21, 2023, 7:24:33 AM9/21/23
to Jeff Layton, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Yes, the syzbot was updated with that info.

> At one time, IMA would reach directly into the inode to get the
> i_version and ctime. That was fine for certain filesystems, but with
> more recent changes it needs to go through ->getattr instead. Evidently,
> it's selecting the wrong inode to query when dealing with overlayfs and
> that's causing panics at times.
>
> As to why the above patch helps, I'm not sure, but given that it doesn't
> seem to change which inode is being queried via getattr, it seems like
> this is probably papering over the real bug. That said, IMA and
> overlayfs are not really in my wheelhouse, so I could be very wrong
> here.

The call to vfs_getattr_nosec() somehow triggers a call to
security_inode_getattr(). Without the call neither ovl_getattr() nor
smack_inode_getattr() would be called.

Mimi

Christian Brauner

unread,
Sep 21, 2023, 7:48:55 AM9/21/23
to Mimi Zohar, Jeff Layton, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
ima_file_free()
-> ima_check_last_writer()
-> vfs_getattr_nosec()
-> i_op->getattr() == ovl_getattr()
-> vfs_getattr()
-> security_inode_getattr()
-> real_i_op->getattr()

is the callchain that triggers this.

ima_file_free() is called in a very sensitive location: __fput() that
can be called from task work when the process is already PF_EXITING.

The ideal solution would be for ima to stop calling back into the
filesystems in this location at all but that's probably not going to
happen because I now realize you also set extended attributes from
__fput():


ima_check_last_writer()
-> ima_update_xatt()
-> ima_fix_xattr()
-> __vfs_setxattr_noperm()

The __vfs_setxattr_noperm() codepath can itself trigger
security_inode_post_setxattr() and security_inode_setsecurity(). So
those hooks are hopefully safe to be called with PF_EXITING tasks as
well...

Imho, this is all very wild but I'm not judging.

Two solutions imho:
(1) teach stacking filesystems like overlayfs and ecryptfs to use
vfs_getattr_nosec() in their ->getattr() implementation when they
are themselves called via vfs_getattr_nosec(). This will fix this by
not triggering another LSM hook.
(2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
anything

Amir Goldstein

unread,
Sep 21, 2023, 8:03:47 AM9/21/23
to Jeff Layton, Casey Schaufler, Stefan Berger, syzbot, bra...@kernel.org, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com, zo...@linux.ibm.com
This is a partial story - one which I told and tried to solve with my patch.
It is true in some cases, but it wasn't the case in this reproducer.

Overlayfs keeps *two* files open, one for the ovl inode and one for
the 'real' inode in underlying fs.

IMA hooks can happen on either of these files.
fput of the ovl file will trigger fput of the real file.
My patch fixes IMA code that triggers on the real file,
which must use file_real_path() to get the path.
My patch really fixes a bug that was also introduced by your
patch, but it is not the bug that syzbot reported.

The syzbot reproducer is triggered on fput of the ovl file.
This is why my patch did not fix it.

> As to why the above patch helps, I'm not sure, but given that it doesn't
> seem to change which inode is being queried via getattr, it seems like
> this is probably papering over the real bug. That said, IMA and
> overlayfs are not really in my wheelhouse, so I could be very wrong
> here.

IMO, the solution is one of the two solutions that Christian proposed -
you had also proposed to propage the nosec context yourself
before I suggested my fix. I never said that it was wrong - just didn't
think it was the root cause for this regression, because there was
another bug under the lamppost.

Thanks,
Amir.

Stefan Berger

unread,
Sep 21, 2023, 10:29:10 AM9/21/23
to Christian Brauner, Mimi Zohar, Jeff Layton, Casey Schaufler, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
LSM inode_post_setxattr has two users, Smack and SELinux. Smack does not
call smack_log/audit in this case. SELinux seems safe as well.

LSM inode_setsecurity has two users, Smack and SELinux. Smack does not
call smack_log/audit in this case. SELinux seems safe as well.


>
> Imho, this is all very wild but I'm not judging.
>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.
> (2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
> anything

Re (2): LSM's getattr should still make policy decision, right? So
callers should be allowed to go to some depth into these functions but
avoid calls deeper into smack_log (for example) that trigger the call path
  common_lsm_audit
   -> dump_common_audit_data
    -> audit_log_d_path via a->type LSM_AUDIT_DATA_PATH or
LSM_AUDIT_DATA_FILE or  LSM_AUDIT_DATA_IOCTL_OP and then possibly run
into current->fs == NULL in d_path.

To avoid audit_log_d_path being called the a->type should not be either
one of the 3 mentioned above:

LSM_AUDIT_DATA_PATH: used by Smack & SELinux
LSM_AUDIT_DATA_FILE: used by SELinux
LSM_AUDIT_DATA_IOCTL: used by SELinux


The LSM getattr has users AppArmor, SELinux, Smack and Tomoyo.

Tomoyo: seems safe
SELinux:

selinux_inode_getattr
  -> path_has_perm  [ sets LSM_AUDIT_DATA_FILE !! ]
    -> inode_has_perm
     -> avc_has_perm
      -> avc_audit
        -> slow_avc_audit
          -> common_lsm_audit
           -> dump_common_audit_data
            -> audit_log_d_path
             -> d_path
              -> get_fs_root_rcu with current->fs = NULL

Smack: (the known path per this syzbot issue)

smack_inode_getattr   [ sets LSM_AUDIT_DATA_PATH !! ]
  -> smk_curacc
    -> smk_tskacc
      -> smack_log
       -> common_lsm_audit
         -> dump_common_audit_data
           -> audit_log_d_path
             -> d_path
              -> get_fs_root_rcu with current->fs = NULL

AppArmor:

apparmor_inode_getattr
  -> common_perm_cond
     -> common_perm
       -> aa_path_perm
          -> profile_path_perm
              -> aa_path_perm
                 -> aa_audit_file      [ sets LSM_AUDIT_DATA_TASK ]
                    -> aa_audit
                     -> aa_audit_msg
                       -> common_lsm_audit
                        -> dump_common_audit_data
                          -> DOES NOT call audit_log_d_path but calls
task_tgid_nr(tsk)


So, SELinux and Smack would be affected. The common code path starts
with common_lsm_audit and either a check for current->flags & PF_EXITING
or a more fine-grained one like this here could be a solution for (2)

diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 849e832719e2..4f3570322851 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -445,6 +445,18 @@ void common_lsm_audit(struct common_audit_data *a,

        if (a == NULL)
                return;
+
+       if (current->flags & PF_EXITING) {
+               /*
+                * Avoid running into audit_log_d_path -> d_path
+                * -> get_fs_root_rcu with current->fs = NULL
+                */
+               if (a->type == LSM_AUDIT_DATA_PATH ||
+                   a->type == LSM_AUDIT_DATA_FILE ||
+                   a->type == LSM_AUDIT_DATA_IOCTL_OP)
+                       return;
+       }
+
        /* we use GFP_ATOMIC so we won't sleep */
        ab = audit_log_start(audit_context(), GFP_ATOMIC | __GFP_NOWARN,
                             AUDIT_AVC);

Mimi Zohar

unread,
Sep 21, 2023, 10:52:45 AM9/21/23
to Christian Brauner, Jeff Layton, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Thank you for the explanation as to why ovl_getattr() and subsequently
smack_inode_getattr() is being called.

>
> ima_file_free() is called in a very sensitive location: __fput() that
> can be called from task work when the process is already PF_EXITING.
>
> The ideal solution would be for ima to stop calling back into the
> filesystems in this location at all but that's probably not going to
> happen because I now realize you also set extended attributes from
> __fput():
>
>
> ima_check_last_writer()
> -> ima_update_xatt()
> -> ima_fix_xattr()
> -> __vfs_setxattr_noperm()
>
> The __vfs_setxattr_noperm() codepath can itself trigger
> security_inode_post_setxattr() and security_inode_setsecurity(). So
> those hooks are hopefully safe to be called with PF_EXITING tasks as
> well...
>
> Imho, this is all very wild but I'm not judging.

Measuring and verifying immutable files is straight forward.
Measuring, verifiying, and updating mutable file hashes is a lot more
complicated. Re-calculating the file hash everytime the file changes
would impact performance. The file hash is currently updated as the
last writer closes the file (__fput). One of the reasons for the wq
was for IMA to safely calculate the file hash and and take the i_mutex
to write the xattr.

IMA support for mutable files makes IMA a lot more complicated. Any
improvement suggestions would be appreciated.

>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.
> (2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
> anything

The original problem was detecting i_version change on overlayfs.

Amir's proposed patch might resolve it without commit db1d1e8b9867
("IMA: use vfs_getattr_nosec to get the i_version"). However, as Amir
said, it does not address the new problem introduced by it. Assuming
Amir's proposed patch resolves the original problem, an alternative
solution would be to revert commit db1d1e8b9867.

Mimi

Jeff Layton

unread,
Sep 21, 2023, 11:10:20 AM9/21/23
to Mimi Zohar, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
If you're going to revert that commit, then I'm wondering what you
intend to do instead. Reaching directly into the inode to get this
information is really no bueno.
--
Jeff Layton <jla...@kernel.org>

Amir Goldstein

unread,
Sep 21, 2023, 11:19:11 AM9/21/23
to Jeff Layton, Mimi Zohar, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Just for overlayfs or in general?

The way I see it, IMA support on overlayfs is buggy.
There are many open issues in syzbot and not overlayfs developers
and not IMA developers really care enough to fix these bugs.
Is it a setup that is interesting for IMA users at all?

If not, then we should definitely prefer a correctness bug
in IMA+overlayfs over a crash.

If it is possible, I would recommend to opt-out overlayfs
from the default IMA policy to silence syzbot bug reports
this setup.

Thanks,
Amir.

Mimi Zohar

unread,
Sep 21, 2023, 11:19:30 AM9/21/23
to Jeff Layton, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
IMA detecting file change based on i_version has been there since IMA
was upstreamed. Please explain why this is not a good idea.

Mimi

Jeff Layton

unread,
Sep 21, 2023, 11:39:50 AM9/21/23
to Mimi Zohar, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Not all i_version values are managed in the same way. Network
filesystems need to pass through the value from the server, whereas with
a local filesystems the kernel needs to manage the increment.

IMA is mostly interested in local filesystems at the moment. The main
kernel-managed versions in the kernel are in btrfs, ext4 and xfs and
tmpfs. Until recently, only btrfs had one that functioned properly. Both
ext4 and xfs would also increment their values on atime updates. In
practical terms, this means that IMA ends up doing unnecessary
remeasurements after read events in some cases.

ext4 recently had its i_version value fixed to not do this, but the XFS
developers are unable to fix theirs to avoid incrementing on atime
updates. For that, I'm working on the multigrain ctime patches which
should allow XFS to fake up a STATX_CHANGE_COOKIE in its getattr
routine.

IMA has no practical way to tell what the filesystem can do if it's
groveling around inside struct inode, which is why I recommended using
vfs_getattr_nosec to grab this info. If that's problematic then by all
means, back out that patch, but you'll need to come up with some way to
deal with the different nuances of the different i_version counters in
across different filesystems.

Mimi Zohar

unread,
Sep 21, 2023, 12:31:02 PM9/21/23
to Jeff Layton, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Got it. This is basically a performance issue, because i_version is
being updated too frequently on some filesystems. It's not an issue of
missing measurements or not re-evaluting the file's integrity when
needed.

Let's see if Amir's patch actually fixes the original problem before
making any decisions. (Wishing for a reproducer of the original
problem.)

Mimi

Amir Goldstein

unread,
Sep 21, 2023, 1:02:05 PM9/21/23
to Mimi Zohar, Jeff Layton, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Confused. What is the "original problem"?
I never claimed that my patch fixes the "original problem".
I claimed [1] that my patch fixes a problem that existed before
db1d1e8b9867, but db1d1e8b9867 added two more instances
of that bug (wrong dereference of file->f_path).
Apparently, db1d1e8b9867 introduced another bug.

It looks like you should revert db1d1e8b9867, but regardless,
I recommend that you apply my patch. My patch conflicts
with the revert but the conflict is trivial - the two hunks that
fix the new vfs_getattr_nosec() calls are irrelevant - the rest are.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-unionfs/20230913073755.3...@gmail.com/

Mimi Zohar

unread,
Sep 26, 2023, 10:40:20 AM9/26/23
to Amir Goldstein, Jeff Layton, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
Here are some of the issues with IMA/Overlay:

1. False positive syzbot IMA/overlay lockdep warnings.
2, Not detecting file change with squashfs + overlay.
3. Changes to the backing file are not detected by overlay (when
backing file is not in policy).

Commit db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version")
upstreamed to address 2, but has become unnecessary due to other
changes. According to Stefan, the problem subsequently was resolved
without either commit db1d1e8b9867 or 18b44bc5a672. (Kernel was not
bi-sected to find bug resolution.)

Commit 18b44bc5a672 ("ovl: Always reevaluate the file signature for
IMA") to address 3.

[PATCH] "ima: fix wrong dereferences of file->f_path" is probably
correct. Does it address any syzbot reports?

--
thanks,

Mimi

Stefan Berger

unread,
Sep 28, 2023, 8:02:56 PM9/28/23
to Christian Brauner, Mimi Zohar, Jeff Layton, Casey Schaufler, syzbot, amir...@gmail.com, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com

On 9/21/23 07:48, Christian Brauner wrote:
>
> Imho, this is all very wild but I'm not judging.
>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.

This somewhat lengthy patch I think would be a solution for (1). I don't
think the Fixes tag is correct but IMO it should propagate farther back,
if possible.


From 01467f6e879c4cd757abb4d79cb18bf11150bed8 Mon Sep 17 00:00:00 2001
From: Stefan Berger <ste...@linux.ibm.com>
Date: Thu, 28 Sep 2023 14:42:39 -0400
Subject: [PATCH] fs: Enable GETATTR_NOSEC parameter for getattr interface
 function

When vfs_getattr_nosec() calls a filesystem's getattr interface function
then the 'nosec' should propagate into this function so that
vfs_getattr_nosec() can again be called from the filesystem's gettattr
rather than vfs_getattr(). The latter would add unnecessary security
checks that the initial vfs_getattr_nosec() call wanted to avoid.
Therefore, introduce the getattr flag GETATTR_NOSEC and allow to pass
with the new getattr_flags parameter to the getattr interface function.
In overlayfs and ecryptfs use this flag to determine which one of the
two functions to call.

In a recent code change introduced to IMA vfs_getattr_nosec() ended up
calling vfs_getattr() in overlayfs, which in turn called
security_inode_getattr() on an exiting process that did not have
current->fs set anymore, which then caused a kernel NULL pointer
dereference. With this change the call to security_inode_getattr() can
be avoided, thus avoiding the NULL pointer dereference.

Reported-by: syzbot+a67fc5...@syzkaller.appspotmail.com
Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version")
Signed-off-by: Stefan Berger <ste...@linux.ibm.com>

---

simple_getattr has been adjusted and all files returned by the following
grep have been adjusted as well.

 grep -rEI "^[[:space:]]+\.getattr" ./ | \
   grep -v simple_getattr  | \
   cut -d ":" -f1 | sort | uniq
---
 fs/9p/vfs_inode.c             |  3 ++-
 fs/9p/vfs_inode_dotl.c        |  3 ++-
 fs/afs/inode.c                |  3 ++-
 fs/afs/internal.h             |  2 +-
 fs/bad_inode.c                |  3 ++-
 fs/btrfs/inode.c              |  3 ++-
 fs/ceph/inode.c               |  9 ++++++---
 fs/ceph/super.h               |  3 ++-
 fs/coda/coda_linux.h          |  2 +-
 fs/coda/inode.c               |  3 ++-
 fs/ecryptfs/inode.c           | 14 ++++++++++----
 fs/erofs/inode.c              |  2 +-
 fs/erofs/internal.h           |  2 +-
 fs/exfat/exfat_fs.h           |  2 +-
 fs/exfat/file.c               |  2 +-
 fs/ext2/ext2.h                |  2 +-
 fs/ext2/inode.c               |  3 ++-
 fs/ext4/ext4.h                |  6 ++++--
 fs/ext4/inode.c               |  9 ++++++---
 fs/ext4/symlink.c             |  6 ++++--
 fs/f2fs/f2fs.h                |  3 ++-
 fs/f2fs/file.c                |  3 ++-
 fs/f2fs/namei.c               |  6 ++++--
 fs/fat/fat.h                  |  3 ++-
 fs/fat/file.c                 |  3 ++-
 fs/fuse/dir.c                 |  3 ++-
 fs/gfs2/inode.c               |  4 +++-
 fs/hfsplus/hfsplus_fs.h       |  2 +-
 fs/hfsplus/inode.c            |  2 +-
 fs/kernfs/inode.c             |  3 ++-
 fs/kernfs/kernfs-internal.h   |  3 ++-
 fs/libfs.c                    |  5 +++--
 fs/minix/inode.c              |  3 ++-
 fs/minix/minix.h              |  3 ++-
 fs/nfs/inode.c                |  3 ++-
 fs/nfs/namespace.c            |  5 +++--
 fs/ntfs3/file.c               |  3 ++-
 fs/ntfs3/ntfs_fs.h            |  3 ++-
 fs/ocfs2/file.c               |  3 ++-
 fs/ocfs2/file.h               |  3 ++-
 fs/orangefs/inode.c           |  3 ++-
 fs/orangefs/orangefs-kernel.h |  3 ++-
 fs/overlayfs/inode.c          |  8 ++++++--
 fs/overlayfs/overlayfs.h      |  3 ++-
 fs/proc/base.c                |  6 ++++--
 fs/proc/fd.c                  |  3 ++-
 fs/proc/generic.c             |  3 ++-
 fs/proc/internal.h            |  3 ++-
 fs/proc/proc_net.c            |  3 ++-
 fs/proc/proc_sysctl.c         |  3 ++-
 fs/proc/root.c                |  3 ++-
 fs/smb/client/cifsfs.h        |  3 ++-
 fs/smb/client/inode.c         |  3 ++-
 fs/stat.c                     |  3 ++-
 fs/sysv/itree.c               |  3 ++-
 fs/sysv/sysv.h                |  2 +-
 fs/ubifs/dir.c                |  3 ++-
 fs/ubifs/file.c               |  6 ++++--
 fs/ubifs/ubifs.h              |  3 ++-
 fs/udf/symlink.c              |  3 ++-
 fs/vboxsf/utils.c             |  3 ++-
 fs/vboxsf/vfsmod.h            |  2 +-
 fs/xfs/xfs_iops.c             |  3 ++-
 include/linux/fs.h            | 10 ++++++++--
 include/linux/nfs_fs.h        |  2 +-
 mm/shmem.c                    |  3 ++-
 66 files changed, 159 insertions(+), 82 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0d28ecf668d0..9c5a7e653bb1 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -1000,7 +1000,8 @@ v9fs_vfs_rename(struct mnt_idmap *idmap, struct
inode *old_dir,

 static int
 v9fs_vfs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags)
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct inode *inode = d_inode(dentry);
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 1312f68965ac..e4238ee243bf 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -439,7 +439,8 @@ static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
 static int
 v9fs_vfs_getattr_dotl(struct mnt_idmap *idmap,
               const struct path *path, struct kstat *stat,
-              u32 request_mask, unsigned int flags)
+              u32 request_mask, unsigned int flags,
+              unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct v9fs_session_info *v9ses;
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 1c794a1896aa..8763e6126a8c 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -750,7 +750,8 @@ int afs_validate(struct afs_vnode *vnode, struct key
*key)
  * read the attributes of an inode
  */
 int afs_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int query_flags)
+        struct kstat *stat, u32 request_mask, unsigned int query_flags,
+        unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct afs_vnode *vnode = AFS_FS_I(inode);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index da73b97e19a9..b8dfb6232086 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1175,7 +1175,7 @@ extern bool afs_check_validity(struct afs_vnode *);
 extern int afs_validate(struct afs_vnode *, struct key *);
 bool afs_pagecache_valid(struct afs_vnode *);
 extern int afs_getattr(struct mnt_idmap *idmap, const struct path *,
-               struct kstat *, u32, unsigned int);
+               struct kstat *, u32, unsigned int, unsigned int);
 extern int afs_setattr(struct mnt_idmap *idmap, struct dentry *,
struct iattr *);
 extern void afs_evict_inode(struct inode *);
 extern int afs_drop_inode(struct inode *);
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 83f9566c973b..22219161382d 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -97,7 +97,8 @@ static int bad_inode_permission(struct mnt_idmap *idmap,

 static int bad_inode_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
     return -EIO;
 }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7814b9d654ce..bc9fbaa42b93 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8624,7 +8624,8 @@ int __init btrfs_init_cachep(void)

 static int btrfs_getattr(struct mnt_idmap *idmap,
              const struct path *path, struct kstat *stat,
-             u32 request_mask, unsigned int flags)
+             u32 request_mask, unsigned int flags,
+             unsigned int getattr_flags)
 {
     u64 delalloc_bytes;
     u64 inode_bytes;
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 800ab7920513..a798f0a7238f 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2255,11 +2255,13 @@ static const char
*ceph_encrypted_get_link(struct dentry *dentry,
 static int ceph_encrypted_symlink_getattr(struct mnt_idmap *idmap,
                       const struct path *path,
                       struct kstat *stat, u32 request_mask,
-                      unsigned int query_flags)
+                      unsigned int query_flags,
+                      unsigned int getattr_flags)
 {
     int ret;

-    ret = ceph_getattr(idmap, path, stat, request_mask, query_flags);
+    ret = ceph_getattr(idmap, path, stat, request_mask, query_flags,
+               getattr_flags);
     if (ret)
         return ret;
     return fscrypt_symlink_getattr(path, stat);
@@ -2960,7 +2962,8 @@ static int statx_to_caps(u32 want, umode_t mode)
  * then we can avoid talking to the MDS at all.
  */
 int ceph_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags)
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct super_block *sb = inode->i_sb;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 51c7f2b14f6f..f472cefd21bd 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1100,7 +1100,8 @@ extern int ceph_setattr(struct mnt_idmap *idmap,
             struct dentry *dentry, struct iattr *attr);
 extern int ceph_getattr(struct mnt_idmap *idmap,
             const struct path *path, struct kstat *stat,
-            u32 request_mask, unsigned int flags);
+            u32 request_mask, unsigned int flags,
+            unsigned int getattr_flags);
 void ceph_inode_shutdown(struct inode *inode);

 static inline bool ceph_inode_is_shutdown(struct inode *inode)
diff --git a/fs/coda/coda_linux.h b/fs/coda/coda_linux.h
index dd6277d87afb..ec6e30cbb35f 100644
--- a/fs/coda/coda_linux.h
+++ b/fs/coda/coda_linux.h
@@ -50,7 +50,7 @@ int coda_permission(struct mnt_idmap *idmap, struct
inode *inode,
             int mask);
 int coda_revalidate_inode(struct inode *);
 int coda_getattr(struct mnt_idmap *, const struct path *, struct kstat *,
-         u32, unsigned int);
+         u32, unsigned int, unsigned int);
 int coda_setattr(struct mnt_idmap *, struct dentry *, struct iattr *);

 /* this file:  helpers */
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index 0c7c2528791e..52465963c455 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -252,7 +252,8 @@ static void coda_evict_inode(struct inode *inode)
 }

 int coda_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags)
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags)
 {
     int err = coda_revalidate_inode(d_inode(path->dentry));
     if (!err)
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 992d9c7e64ae..31173a4534d2 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -974,7 +974,8 @@ static int ecryptfs_setattr(struct mnt_idmap *idmap,

 static int ecryptfs_getattr_link(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int flags)
+                 u32 request_mask, unsigned int flags,
+                 unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
@@ -1000,14 +1001,19 @@ static int ecryptfs_getattr_link(struct
mnt_idmap *idmap,

 static int ecryptfs_getattr(struct mnt_idmap *idmap,
                 const struct path *path, struct kstat *stat,
-                u32 request_mask, unsigned int flags)
+                u32 request_mask, unsigned int flags,
+                unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct kstat lower_stat;
     int rc;

-    rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
-             request_mask, flags);
+    if (getattr_flags & GETATTR_NOSEC)
+        rc = vfs_getattr_nosec(ecryptfs_dentry_to_lower_path(dentry),
+                       &lower_stat, request_mask, flags);
+    else
+        rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry),
+                 &lower_stat, request_mask, flags);
     if (!rc) {
         fsstack_copy_attr_all(d_inode(dentry),
                       ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index edc8ec7581b8..f9ac4c001c17 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -358,7 +358,7 @@ struct inode *erofs_iget(struct super_block *sb,
erofs_nid_t nid)

 int erofs_getattr(struct mnt_idmap *idmap, const struct path *path,
           struct kstat *stat, u32 request_mask,
-          unsigned int query_flags)
+          unsigned int query_flags, unsigned int getattr_flags)
 {
     struct inode *const inode = d_inode(path->dentry);

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4ff88d0dd980..16fbb39d670f 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -422,7 +422,7 @@ int erofs_map_blocks(struct inode *inode, struct
erofs_map_blocks *map);
 struct inode *erofs_iget(struct super_block *sb, erofs_nid_t nid);
 int erofs_getattr(struct mnt_idmap *idmap, const struct path *path,
           struct kstat *stat, u32 request_mask,
-          unsigned int query_flags);
+          unsigned int query_flags, unsigned int getattr_flags);
 int erofs_namei(struct inode *dir, const struct qstr *name,
         erofs_nid_t *nid, unsigned int *d_type);

diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index f55498e5c23d..5e86c62de12f 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -454,7 +454,7 @@ int exfat_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
           struct iattr *attr);
 int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
           struct kstat *stat, unsigned int request_mask,
-          unsigned int query_flags);
+          unsigned int query_flags, unsigned int getattr_flags);
 int exfat_file_fsync(struct file *file, loff_t start, loff_t end, int
datasync);
 long exfat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
 long exfat_compat_ioctl(struct file *filp, unsigned int cmd,
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 32395ef686a2..06cb318bfa0a 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -227,7 +227,7 @@ void exfat_truncate(struct inode *inode)

 int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
           struct kstat *stat, unsigned int request_mask,
-          unsigned int query_flags)
+          unsigned int query_flags, unsigned int getattr_flags)
 {
     struct inode *inode = d_backing_inode(path->dentry);
     struct exfat_inode_info *ei = EXFAT_I(inode);
diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index 7fdd685c384d..a2b460786d21 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -748,7 +748,7 @@ void ext2_write_failed(struct address_space
*mapping, loff_t to);
 extern int ext2_get_block(struct inode *, sector_t, struct buffer_head
*, int);
 extern int ext2_setattr (struct mnt_idmap *, struct dentry *, struct
iattr *);
 extern int ext2_getattr (struct mnt_idmap *, const struct path *,
-             struct kstat *, u32, unsigned int);
+             struct kstat *, u32, unsigned int, unsigned int);
 extern void ext2_set_inode_flags(struct inode *inode);
 extern int ext2_fiemap(struct inode *inode, struct fiemap_extent_info
*fieinfo,
                u64 start, u64 len);
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 314b415ee518..153738e254ac 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1611,7 +1611,8 @@ int ext2_write_inode(struct inode *inode, struct
writeback_control *wbc)
 }

 int ext2_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int query_flags)
+         struct kstat *stat, u32 request_mask, unsigned int query_flags,
+         unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct ext2_inode_info *ei = EXT2_I(inode);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9418359b1d9d..90b1e08bd89a 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2977,11 +2977,13 @@ extern int  ext4_setattr(struct mnt_idmap *,
struct dentry *,
              struct iattr *);
 extern u32  ext4_dio_alignment(struct inode *inode);
 extern int  ext4_getattr(struct mnt_idmap *, const struct path *,
-             struct kstat *, u32, unsigned int);
+             struct kstat *, u32, unsigned int,
+             unsigned int);
 extern void ext4_evict_inode(struct inode *);
 extern void ext4_clear_inode(struct inode *);
 extern int  ext4_file_getattr(struct mnt_idmap *, const struct path *,
-                  struct kstat *, u32, unsigned int);
+                  struct kstat *, u32, unsigned int,
+                  unsigned int);
 extern void ext4_dirty_inode(struct inode *, int);
 extern int ext4_change_inode_journal_flag(struct inode *, int);
 extern int ext4_get_inode_loc(struct inode *, struct ext4_iloc *);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4ce35f1c8b0a..ede71d313519 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5515,7 +5515,8 @@ u32 ext4_dio_alignment(struct inode *inode)
 }

 int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int query_flags)
+         struct kstat *stat, u32 request_mask, unsigned int query_flags,
+         unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct ext4_inode *raw_inode;
@@ -5577,12 +5578,14 @@ int ext4_getattr(struct mnt_idmap *idmap, const
struct path *path,

 int ext4_file_getattr(struct mnt_idmap *idmap,
               const struct path *path, struct kstat *stat,
-              u32 request_mask, unsigned int query_flags)
+              u32 request_mask, unsigned int query_flags,
+              unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     u64 delalloc_blocks;

-    ext4_getattr(idmap, path, stat, request_mask, query_flags);
+    ext4_getattr(idmap, path, stat, request_mask, query_flags,
+             getattr_flags);

     /*
      * If there is inline data in the inode, the inode will normally not
diff --git a/fs/ext4/symlink.c b/fs/ext4/symlink.c
index 75bf1f88843c..abc30ebb2be2 100644
--- a/fs/ext4/symlink.c
+++ b/fs/ext4/symlink.c
@@ -58,9 +58,11 @@ static const char *ext4_encrypted_get_link(struct
dentry *dentry,
 static int ext4_encrypted_symlink_getattr(struct mnt_idmap *idmap,
                       const struct path *path,
                       struct kstat *stat, u32 request_mask,
-                      unsigned int query_flags)
+                      unsigned int query_flags,
+                      unsigned int getattr_flags)
 {
-    ext4_getattr(idmap, path, stat, request_mask, query_flags);
+    ext4_getattr(idmap, path, stat, request_mask, query_flags,
+             getattr_flags);

     return fscrypt_symlink_getattr(path, stat);
 }
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6d688e42d89c..0a8436cd2f5f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3477,7 +3477,8 @@ int f2fs_do_truncate_blocks(struct inode *inode,
u64 from, bool lock);
 int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock);
 int f2fs_truncate(struct inode *inode);
 int f2fs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags);
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags);
 int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
          struct iattr *attr);
 int f2fs_truncate_hole(struct inode *inode, pgoff_t pg_start, pgoff_t
pg_end);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ca5904129b16..e05bdd318aef 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -832,7 +832,8 @@ static bool f2fs_force_buffered_io(struct inode
*inode, int rw)
 }

 int f2fs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int query_flags)
+         struct kstat *stat, u32 request_mask, unsigned int query_flags,
+         unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct f2fs_inode_info *fi = F2FS_I(inode);
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 193b22a2d6bf..cfdcdeb2c0b0 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1356,9 +1356,11 @@ static const char *f2fs_encrypted_get_link(struct
dentry *dentry,
 static int f2fs_encrypted_symlink_getattr(struct mnt_idmap *idmap,
                       const struct path *path,
                       struct kstat *stat, u32 request_mask,
-                      unsigned int query_flags)
+                      unsigned int query_flags,
+                      unsigned int getattr_flags)
 {
-    f2fs_getattr(idmap, path, stat, request_mask, query_flags);
+    f2fs_getattr(idmap, path, stat, request_mask, query_flags,
+             getattr_flags);

     return fscrypt_symlink_getattr(path, stat);
 }
diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 66cf4778cf3b..74b57f2e1c36 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -403,7 +403,8 @@ extern int fat_setattr(struct mnt_idmap *idmap,
struct dentry *dentry,
 extern void fat_truncate_blocks(struct inode *inode, loff_t offset);
 extern int fat_getattr(struct mnt_idmap *idmap,
                const struct path *path, struct kstat *stat,
-               u32 request_mask, unsigned int flags);
+               u32 request_mask, unsigned int flags,
+               unsigned int getattr_flags);
 extern int fat_file_fsync(struct file *file, loff_t start, loff_t end,
               int datasync);

diff --git a/fs/fat/file.c b/fs/fat/file.c
index e887e9ab7472..2177784f54a2 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -396,7 +396,8 @@ void fat_truncate_blocks(struct inode *inode, loff_t
offset)
 }

 int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int flags)
+        struct kstat *stat, u32 request_mask, unsigned int flags,
+        unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d707e6987da9..3531d4239635 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -2069,7 +2069,8 @@ static int fuse_setattr(struct mnt_idmap *idmap,
struct dentry *entry,

 static int fuse_getattr(struct mnt_idmap *idmap,
             const struct path *path, struct kstat *stat,
-            u32 request_mask, unsigned int flags)
+            u32 request_mask, unsigned int flags,
+            unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct fuse_conn *fc = get_fuse_conn(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 0eac04507904..50383ec30573 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -2038,6 +2038,7 @@ static int gfs2_setattr(struct mnt_idmap *idmap,
  * @stat: The inode's stats
  * @request_mask: Mask of STATX_xxx flags indicating the caller's
interests
  * @flags: AT_STATX_xxx setting
+ * @getattr_flags: GETATTR_xxx
  *
  * This may be called from the VFS directly, or from within GFS2 with the
  * inode locked, so we look to see if the glock is already locked and only
@@ -2050,7 +2051,8 @@ static int gfs2_setattr(struct mnt_idmap *idmap,

 static int gfs2_getattr(struct mnt_idmap *idmap,
             const struct path *path, struct kstat *stat,
-            u32 request_mask, unsigned int flags)
+            u32 request_mask, unsigned int flags,
+            unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct gfs2_inode *ip = GFS2_I(inode);
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index 7ededcb720c1..3da3a79c9742 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -483,7 +483,7 @@ int hfsplus_cat_read_inode(struct inode *inode,
struct hfs_find_data *fd);
 int hfsplus_cat_write_inode(struct inode *inode);
 int hfsplus_getattr(struct mnt_idmap *idmap, const struct path *path,
             struct kstat *stat, u32 request_mask,
-            unsigned int query_flags);
+            unsigned int query_flags, unsigned int getattr_flags);
 int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end,
                int datasync);
 int hfsplus_fileattr_get(struct dentry *dentry, struct fileattr *fa);
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index c65c8c4b03dd..afa7e8ee8cb2 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -278,7 +278,7 @@ static int hfsplus_setattr(struct mnt_idmap *idmap,

 int hfsplus_getattr(struct mnt_idmap *idmap, const struct path *path,
             struct kstat *stat, u32 request_mask,
-            unsigned int query_flags)
+            unsigned int query_flags, unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 922719a343a7..92af4b394274 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -182,7 +182,8 @@ static void kernfs_refresh_inode(struct kernfs_node
*kn, struct inode *inode)

 int kernfs_iop_getattr(struct mnt_idmap *idmap,
                const struct path *path, struct kstat *stat,
-               u32 request_mask, unsigned int query_flags)
+               u32 request_mask, unsigned int query_flags,
+               unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct kernfs_node *kn = inode->i_private;
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index a9b854cdfdb5..ccf74e08105c 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -135,7 +135,8 @@ int kernfs_iop_setattr(struct mnt_idmap *idmap,
struct dentry *dentry,
                struct iattr *iattr);
 int kernfs_iop_getattr(struct mnt_idmap *idmap,
                const struct path *path, struct kstat *stat,
-               u32 request_mask, unsigned int query_flags);
+               u32 request_mask, unsigned int query_flags,
+               unsigned int getattr_flags);
 ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t
size);
 int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);

diff --git a/fs/libfs.c b/fs/libfs.c
index 37f2d34ee090..757d24b8f4be 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -30,7 +30,7 @@

 int simple_getattr(struct mnt_idmap *idmap, const struct path *path,
            struct kstat *stat, u32 request_mask,
-           unsigned int query_flags)
+           unsigned int query_flags, unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
@@ -1579,7 +1579,8 @@ static struct dentry *empty_dir_lookup(struct
inode *dir, struct dentry *dentry,

 static int empty_dir_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index df575473c1cc..74ae2c33f5a5 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -651,7 +651,8 @@ static int minix_write_inode(struct inode *inode,
struct writeback_control *wbc)
 }

 int minix_getattr(struct mnt_idmap *idmap, const struct path *path,
-          struct kstat *stat, u32 request_mask, unsigned int flags)
+          struct kstat *stat, u32 request_mask, unsigned int flags,
+          unsigned int getattr_flags)
 {
     struct super_block *sb = path->dentry->d_sb;
     struct inode *inode = d_inode(path->dentry);
diff --git a/fs/minix/minix.h b/fs/minix/minix.h
index d493507c064f..6ac55d3b649c 100644
--- a/fs/minix/minix.h
+++ b/fs/minix/minix.h
@@ -52,7 +52,8 @@ extern int minix_new_block(struct inode * inode);
 extern void minix_free_block(struct inode *inode, unsigned long block);
 extern unsigned long minix_count_free_blocks(struct super_block *sb);
 extern int minix_getattr(struct mnt_idmap *, const struct path *,
-             struct kstat *, u32, unsigned int);
+             struct kstat *, u32, unsigned int,
+             unsigned int);
 extern int minix_prepare_chunk(struct page *page, loff_t pos, unsigned
len);

 extern void V1_minix_truncate(struct inode *);
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index e21c073158e5..c5cad2515b37 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -831,7 +831,8 @@ static u32 nfs_get_valid_attrmask(struct inode *inode)
 }

 int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int query_flags)
+        struct kstat *stat, u32 request_mask, unsigned int query_flags,
+        unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct nfs_server *server = NFS_SERVER(inode);
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index e7494cdd957e..d4c3622ff4ed 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -210,11 +210,12 @@ struct vfsmount *nfs_d_automount(struct path *path)
 static int
 nfs_namespace_getattr(struct mnt_idmap *idmap,
               const struct path *path, struct kstat *stat,
-              u32 request_mask, unsigned int query_flags)
+              u32 request_mask, unsigned int query_flags,
+              unsigned int getattr_flags)
 {
     if (NFS_FH(d_inode(path->dentry))->size != 0)
         return nfs_getattr(idmap, path, stat, request_mask,
-                   query_flags);
+                   query_flags, getattr_flags);
     generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
              stat);
     return 0;
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 962f12ce6c0a..729ca3c09958 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -72,7 +72,8 @@ static long ntfs_compat_ioctl(struct file *filp, u32
cmd, unsigned long arg)
  * ntfs_getattr - inode_operations::getattr
  */
 int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, u32 flags)
+         struct kstat *stat, u32 request_mask, u32 flags,
+         unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct ntfs_inode *ni = ntfs_i(inode);
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index 629403ede6e5..3c83352883f2 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -492,7 +492,8 @@ extern const struct file_operations ntfs_dir_operations;

 /* Globals from file.c */
 int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, u32 flags);
+         struct kstat *stat, u32 request_mask, u32 flags,
+         unsigned int getattr_flags);
 int ntfs3_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
           struct iattr *attr);
 void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST
vcn,
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index c45596c25c66..7e367cd801d3 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1300,7 +1300,8 @@ int ocfs2_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
 }

 int ocfs2_getattr(struct mnt_idmap *idmap, const struct path *path,
-          struct kstat *stat, u32 request_mask, unsigned int flags)
+          struct kstat *stat, u32 request_mask, unsigned int flags,
+          unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct super_block *sb = path->dentry->d_sb;
diff --git a/fs/ocfs2/file.h b/fs/ocfs2/file.h
index 8e53e4ac1120..2dbf4edd94da 100644
--- a/fs/ocfs2/file.h
+++ b/fs/ocfs2/file.h
@@ -52,7 +52,8 @@ int ocfs2_zero_extend(struct inode *inode, struct
buffer_head *di_bh,
 int ocfs2_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
           struct iattr *attr);
 int ocfs2_getattr(struct mnt_idmap *idmap, const struct path *path,
-          struct kstat *stat, u32 request_mask, unsigned int flags);
+          struct kstat *stat, u32 request_mask, unsigned int flags,
+          unsigned int getattr_flags);
 int ocfs2_permission(struct mnt_idmap *idmap,
              struct inode *inode,
              int mask);
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 085912268442..f7c2c318d392 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -859,7 +859,8 @@ int orangefs_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
  * Obtain attributes of an object given a dentry
  */
 int orangefs_getattr(struct mnt_idmap *idmap, const struct path *path,
-             struct kstat *stat, u32 request_mask, unsigned int flags)
+             struct kstat *stat, u32 request_mask, unsigned int flags,
+             unsigned int getattr_flags)
 {
     int ret;
     struct inode *inode = path->dentry->d_inode;
diff --git a/fs/orangefs/orangefs-kernel.h b/fs/orangefs/orangefs-kernel.h
index b711654ca18a..f0682b51e5e0 100644
--- a/fs/orangefs/orangefs-kernel.h
+++ b/fs/orangefs/orangefs-kernel.h
@@ -365,7 +365,8 @@ int __orangefs_setattr_mode(struct dentry *dentry,
struct iattr *iattr);
 int orangefs_setattr(struct mnt_idmap *, struct dentry *, struct iattr *);

 int orangefs_getattr(struct mnt_idmap *idmap, const struct path *path,
-             struct kstat *stat, u32 request_mask, unsigned int flags);
+             struct kstat *stat, u32 request_mask, unsigned int flags,
+             unsigned int getattr_flags);

 int orangefs_permission(struct mnt_idmap *idmap,
             struct inode *inode, int mask);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 83ef66644c21..8c0f3e125d09 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -155,7 +155,8 @@ static void ovl_map_dev_ino(struct dentry *dentry,
struct kstat *stat, int fsid)
 }

 int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int flags)
+        struct kstat *stat, u32 request_mask, unsigned int flags,
+        unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     enum ovl_path_type type;
@@ -171,7 +172,10 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,

     type = ovl_path_real(dentry, &realpath);
     old_cred = ovl_override_creds(dentry->d_sb);
-    err = vfs_getattr(&realpath, stat, request_mask, flags);
+    if (getattr_flags & GETATTR_NOSEC)
+        err = vfs_getattr_nosec(&realpath, stat, request_mask, flags);
+    else
+        err = vfs_getattr(&realpath, stat, request_mask, flags);
     if (err)
         goto out;

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9817b2dcb132..6f91bfb1bbbd 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -687,7 +687,8 @@ unsigned int ovl_get_nlink(struct ovl_fs *ofs,
struct dentry *lowerdentry,
 int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
         struct iattr *attr);
 int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int flags);
+        struct kstat *stat, u32 request_mask, unsigned int flags,
+        unsigned int getattr_flags);
 int ovl_permission(struct mnt_idmap *idmap, struct inode *inode,
            int mask);
 int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const
char *name,
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ffd54617c354..3e733cd17fba 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1960,7 +1960,8 @@ static struct inode
*proc_pid_make_base_inode(struct super_block *sb,
 }

 int pid_getattr(struct mnt_idmap *idmap, const struct path *path,
-        struct kstat *stat, u32 request_mask, unsigned int query_flags)
+        struct kstat *stat, u32 request_mask, unsigned int query_flags,
+        unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb);
@@ -3896,7 +3897,8 @@ static int proc_task_readdir(struct file *file,
struct dir_context *ctx)

 static int proc_task_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct task_struct *p = get_proc_task(inode);
diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 6276b3938842..1ea9eea9953d 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -347,7 +347,8 @@ int proc_fd_permission(struct mnt_idmap *idmap,

 static int proc_fd_getattr(struct mnt_idmap *idmap,
             const struct path *path, struct kstat *stat,
-            u32 request_mask, unsigned int query_flags)
+            u32 request_mask, unsigned int query_flags,
+            unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     int rv = 0;
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 775ce0bcf08c..4acb07536329 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -135,7 +135,8 @@ static int proc_notify_change(struct mnt_idmap *idmap,

 static int proc_getattr(struct mnt_idmap *idmap,
             const struct path *path, struct kstat *stat,
-            u32 request_mask, unsigned int query_flags)
+            u32 request_mask, unsigned int query_flags,
+            unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct proc_dir_entry *de = PDE(inode);
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 9a8f32f21ff5..c3bc8eee4771 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -163,7 +163,8 @@ extern int proc_pid_statm(struct seq_file *, struct
pid_namespace *,
  */
 extern const struct dentry_operations pid_dentry_operations;
 extern int pid_getattr(struct mnt_idmap *, const struct path *,
-               struct kstat *, u32, unsigned int);
+               struct kstat *, u32, unsigned int,
+               unsigned int);
 extern int proc_setattr(struct mnt_idmap *, struct dentry *,
             struct iattr *);
 extern void proc_pid_evict_inode(struct proc_inode *);
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 2ba31b6d68c0..e76019d1fe3b 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -301,7 +301,8 @@ static struct dentry *proc_tgid_net_lookup(struct
inode *dir,

 static int proc_tgid_net_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct net *net;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c88854df0b62..3dc86f02f64e 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -842,7 +842,8 @@ static int proc_sys_setattr(struct mnt_idmap *idmap,

 static int proc_sys_getattr(struct mnt_idmap *idmap,
                 const struct path *path, struct kstat *stat,
-                u32 request_mask, unsigned int query_flags)
+                u32 request_mask, unsigned int query_flags,
+                unsigned int getattr_flags)
 {
     struct inode *inode = d_inode(path->dentry);
     struct ctl_table_header *head = grab_header(inode);
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 9191248f2dac..ab9113fc119d 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -312,7 +312,8 @@ void __init proc_root_init(void)

 static int proc_root_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
     generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
              stat);
diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h
index 41daebd220ff..453a3bd7ba65 100644
--- a/fs/smb/client/cifsfs.h
+++ b/fs/smb/client/cifsfs.h
@@ -73,7 +73,8 @@ extern int cifs_invalidate_mapping(struct inode *inode);
 extern int cifs_revalidate_mapping(struct inode *inode);
 extern int cifs_zap_mapping(struct inode *inode);
 extern int cifs_getattr(struct mnt_idmap *, const struct path *,
-            struct kstat *, u32, unsigned int);
+            struct kstat *, u32, unsigned int,
+            unsigned int);
 extern int cifs_setattr(struct mnt_idmap *, struct dentry *,
             struct iattr *);
 extern int cifs_fiemap(struct inode *, struct fiemap_extent_info *,
u64 start,
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index d7c302442c1e..1aa1bd0c85fd 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -2570,7 +2570,8 @@ int cifs_revalidate_dentry(struct dentry *dentry)
 }

 int cifs_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags)
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct cifs_sb_info *cifs_sb = CIFS_SB(dentry->d_sb);
diff --git a/fs/stat.c b/fs/stat.c
index d43a5cc1bfa4..fc4f6c0a91a8 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -133,7 +133,8 @@ int vfs_getattr_nosec(const struct path *path,
struct kstat *stat,
     idmap = mnt_idmap(path->mnt);
     if (inode->i_op->getattr)
         return inode->i_op->getattr(idmap, path, stat,
-                        request_mask, query_flags);
+                        request_mask, query_flags,
+                        GETATTR_NOSEC);

     generic_fillattr(idmap, request_mask, inode, stat);
     return 0;
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
index edb94e55de8e..3ff96577f0db 100644
--- a/fs/sysv/itree.c
+++ b/fs/sysv/itree.c
@@ -446,7 +446,8 @@ static unsigned sysv_nblocks(struct super_block *s,
loff_t size)
 }

 int sysv_getattr(struct mnt_idmap *idmap, const struct path *path,
-         struct kstat *stat, u32 request_mask, unsigned int flags)
+         struct kstat *stat, u32 request_mask, unsigned int flags,
+         unsigned int getattr_flags)
 {
     struct super_block *s = path->dentry->d_sb;
     generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index e3f988b469ee..1e6e1e2faad4 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -142,7 +142,7 @@ extern int sysv_write_inode(struct inode *, struct
writeback_control *wbc);
 extern int sysv_sync_inode(struct inode *);
 extern void sysv_set_inode(struct inode *, dev_t);
 extern int sysv_getattr(struct mnt_idmap *, const struct path *,
-            struct kstat *, u32, unsigned int);
+            struct kstat *, u32, unsigned int, unsigned int);
 extern int sysv_init_icache(void);
 extern void sysv_destroy_icache(void);

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 2f48c58d47cd..30fe9f64292c 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -1632,7 +1632,8 @@ static int ubifs_rename(struct mnt_idmap *idmap,
 }

 int ubifs_getattr(struct mnt_idmap *idmap, const struct path *path,
-          struct kstat *stat, u32 request_mask, unsigned int flags)
+          struct kstat *stat, u32 request_mask, unsigned int flags,
+          unsigned int getattr_flags)
 {
     loff_t size;
     struct inode *inode = d_inode(path->dentry);
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index e5382f0b2587..ef0dda82adfc 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1622,9 +1622,11 @@ static const char *ubifs_get_link(struct dentry
*dentry,

 static int ubifs_symlink_getattr(struct mnt_idmap *idmap,
                  const struct path *path, struct kstat *stat,
-                 u32 request_mask, unsigned int query_flags)
+                 u32 request_mask, unsigned int query_flags,
+                 unsigned int getattr_flags)
 {
-    ubifs_getattr(idmap, path, stat, request_mask, query_flags);
+    ubifs_getattr(idmap, path, stat, request_mask, query_flags,
+              getattr_flags);

     if (IS_ENCRYPTED(d_inode(path->dentry)))
         return fscrypt_symlink_getattr(path, stat);
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index ebb3ad6b5e7e..bf1429096176 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -2033,7 +2033,8 @@ int ubifs_update_time(struct inode *inode, int flags);
 struct inode *ubifs_new_inode(struct ubifs_info *c, struct inode *dir,
                   umode_t mode, bool is_xattr);
 int ubifs_getattr(struct mnt_idmap *idmap, const struct path *path,
-          struct kstat *stat, u32 request_mask, unsigned int flags);
+          struct kstat *stat, u32 request_mask, unsigned int flags,
+          unsigned int getattr_flags);
 int ubifs_check_dir_empty(struct inode *dir);

 /* xattr.c */
diff --git a/fs/udf/symlink.c b/fs/udf/symlink.c
index f7eaf7b14594..743f5f59e94c 100644
--- a/fs/udf/symlink.c
+++ b/fs/udf/symlink.c
@@ -143,7 +143,8 @@ static int udf_symlink_filler(struct file *file,
struct folio *folio)

 static int udf_symlink_getattr(struct mnt_idmap *idmap,
                    const struct path *path, struct kstat *stat,
-                   u32 request_mask, unsigned int flags)
+                   u32 request_mask, unsigned int flags,
+                   unsigned int getattr_flags)
 {
     struct dentry *dentry = path->dentry;
     struct inode *inode = d_backing_inode(dentry);
diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
index 83f20dd15522..8bbc31c99ede 100644
--- a/fs/vboxsf/utils.c
+++ b/fs/vboxsf/utils.c
@@ -232,7 +232,8 @@ int vboxsf_inode_revalidate(struct dentry *dentry)
 }

 int vboxsf_getattr(struct mnt_idmap *idmap, const struct path *path,
-           struct kstat *kstat, u32 request_mask, unsigned int flags)
+           struct kstat *kstat, u32 request_mask, unsigned int flags,
+           unsigned int getattr_flags)
 {
     int err;
     struct dentry *dentry = path->dentry;
diff --git a/fs/vboxsf/vfsmod.h b/fs/vboxsf/vfsmod.h
index 05973eb89d52..8ec1edc0cbe6 100644
--- a/fs/vboxsf/vfsmod.h
+++ b/fs/vboxsf/vfsmod.h
@@ -99,7 +99,7 @@ int vboxsf_stat_dentry(struct dentry *dentry, struct
shfl_fsobjinfo *info);
 int vboxsf_inode_revalidate(struct dentry *dentry);
 int vboxsf_getattr(struct mnt_idmap *idmap, const struct path *path,
            struct kstat *kstat, u32 request_mask,
-           unsigned int query_flags);
+           unsigned int query_flags, unsigned int getattr_flags);
 int vboxsf_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
            struct iattr *iattr);
 struct shfl_string *vboxsf_path_from_dentry(struct vboxsf_sbi *sbi,
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 1c1e6171209d..d70ab9d791a9 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -552,7 +552,8 @@ xfs_vn_getattr(
     const struct path    *path,
     struct kstat        *stat,
     u32            request_mask,
-    unsigned int        query_flags)
+    unsigned int        query_flags,
+    unsigned int        getattr_flags)
 {
     struct inode        *inode = d_inode(path->dentry);
     struct xfs_inode    *ip = XFS_I(inode);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..f2ec5a48e5dc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1924,7 +1924,7 @@ struct inode_operations {
             struct inode *, struct dentry *, unsigned int);
     int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *);
     int (*getattr) (struct mnt_idmap *, const struct path *,
-            struct kstat *, u32, unsigned int);
+            struct kstat *, u32, unsigned int, unsigned int);
     ssize_t (*listxattr) (struct dentry *, char *, size_t);
     int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
               u64 len);
@@ -2098,6 +2098,11 @@ static inline bool sb_rdonly(const struct
super_block *sb) { return sb->s_flags
 #define IS_WHITEOUT(inode)    (S_ISCHR(inode->i_mode) && \
                  (inode)->i_rdev == WHITEOUT_DEV)

+/*
+ * getattr flags
+ */
+#define GETATTR_NOSEC    (1 << 0)
+
 static inline bool HAS_UNMAPPED_ID(struct mnt_idmap *idmap,
                    struct inode *inode)
 {
@@ -3066,7 +3071,8 @@ extern int dcache_readdir(struct file *, struct
dir_context *);
 extern int simple_setattr(struct mnt_idmap *, struct dentry *,
               struct iattr *);
 extern int simple_getattr(struct mnt_idmap *, const struct path *,
-              struct kstat *, u32, unsigned int);
+              struct kstat *, u32, unsigned int,
+              unsigned int);
 extern int simple_statfs(struct dentry *, struct kstatfs *);
 extern int simple_open(struct inode *inode, struct file *file);
 extern int simple_link(struct dentry *, struct inode *, struct dentry *);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 279262057a92..f6cecdbe11ca 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -421,7 +421,7 @@ extern int nfs_post_op_update_inode(struct inode
*inode, struct nfs_fattr *fattr
 extern int nfs_post_op_update_inode_force_wcc(struct inode *inode,
struct nfs_fattr *fattr);
 extern int nfs_post_op_update_inode_force_wcc_locked(struct inode
*inode, struct nfs_fattr *fattr);
 extern int nfs_getattr(struct mnt_idmap *, const struct path *,
-               struct kstat *, u32, unsigned int);
+               struct kstat *, u32, unsigned int, unsigned int);
 extern void nfs_access_add_cache(struct inode *, struct
nfs_access_entry *, const struct cred *);
 extern void nfs_access_set_mask(struct nfs_access_entry *, u32);
 extern int nfs_permission(struct mnt_idmap *, struct inode *, int);
diff --git a/mm/shmem.c b/mm/shmem.c
index 69595d341882..ff43a0d248bb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1119,7 +1119,8 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range);

 static int shmem_getattr(struct mnt_idmap *idmap,
              const struct path *path, struct kstat *stat,
-             u32 request_mask, unsigned int query_flags)
+             u32 request_mask, unsigned int query_flags,
+             unsigned int getattr_flags)
 {
     struct inode *inode = path->dentry->d_inode;
     struct shmem_inode_info *info = SHMEM_I(inode);
--
2.40.1

Amir Goldstein

unread,
Sep 29, 2023, 12:25:21 AM9/29/23
to Stefan Berger, Christian Brauner, Mimi Zohar, Jeff Layton, Casey Schaufler, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
You can avoid all this churn.
Just use the existing query_flags arg.
Nothing outside the AT_STATX_SYNC_TYPE query_flags is
passed into filesystems from userspace.

Mast out AT_STATX_SYNC_TYPE in vfs_getattr()
And allow kernel internal request_flags in vfs_getattr_nosec()

The AT_ flag namespace is already a challenge, but mixing user
flags and kernel-only flags in vfs interfaces has been done before.

...

> @@ -171,7 +172,10 @@ int ovl_getattr(struct mnt_idmap *idmap, const
> struct path *path,
>
> type = ovl_path_real(dentry, &realpath);
> old_cred = ovl_override_creds(dentry->d_sb);
> - err = vfs_getattr(&realpath, stat, request_mask, flags);
> + if (getattr_flags & GETATTR_NOSEC)
> + err = vfs_getattr_nosec(&realpath, stat, request_mask, flags);
> + else
> + err = vfs_getattr(&realpath, stat, request_mask, flags);
> if (err)
> goto out;
>

There are two more vfs_getattr() calls in this function that you missed.

Please create ovl_do_getattr() inline wrapper in overlayfs.h next to all
the other ovl_do_ wrappers and use it here.

Thanks,
Amir.

Amir Goldstein

unread,
Sep 29, 2023, 12:30:22 AM9/29/23
to Mimi Zohar, Jeff Layton, Christian Brauner, Casey Schaufler, Stefan Berger, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
On Tue, Sep 26, 2023 at 5:40 PM Mimi Zohar <zo...@linux.ibm.com> wrote:
>
> On Thu, 2023-09-21 at 20:01 +0300, Amir Goldstein wrote:
> > On Thu, Sep 21, 2023 at 7:31 PM Mimi Zohar <zo...@linux.ibm.com> wrote:
> > >
...
Not that I know of.

Mimi,

I am going to change my recommendation to -
Please wait with applying my patch unless you know that it
fixes a known bug, because:

1. I don't have a complete picture of ovl+IMA
2. I didn't find any specific test case to prove the bug
3. I have a plan to get rid of the file_real_path() anomaly

Thanks,
Amir.

Stefan Berger

unread,
Sep 29, 2023, 8:39:54 AM9/29/23
to Amir Goldstein, Christian Brauner, Mimi Zohar, Jeff Layton, Casey Schaufler, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com

On 9/29/23 00:25, Amir Goldstein wrote:
> On Fri, Sep 29, 2023 at 3:02 AM Stefan Berger <ste...@linux.ibm.com> wrote:
>>
>> On 9/21/23 07:48, Christian Brauner wrote:
>>> Imho, this is all very wild but I'm not judging.
>>>
>>> Two solutions imho:
>>> (1) teach stacking filesystems like overlayfs and ecryptfs to use
>>> vfs_getattr_nosec() in their ->getattr() implementation when they
>>> are themselves called via vfs_getattr_nosec(). This will fix this by
>>> not triggering another LSM hook.
>>
>> You can avoid all this churn.
>> Just use the existing query_flags arg.
>> Nothing outside the AT_STATX_SYNC_TYPE query_flags is
>> passed into filesystems from userspace.
>>
>> Mast out AT_STATX_SYNC_TYPE in vfs_getattr()
>> And allow kernel internal request_flags in vfs_getattr_nosec()
Hm, I thought that vfs_getattr_nosec needs to pass AT_GETATTR_NOSEC into
->getattr().
>>
>> The AT_ flag namespace is already a challenge, but mixing user
>> flags and kernel-only flags in vfs interfaces has been done before.
>>
>> ...


That's what I wanted to avoid since now all filesystems' getattr() may
have the AT_GETATTR_NOSEC mixed into the query_flags.

Anyway, here's what I currently have:

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 992d9c7e64ae..f7b5b1843dcc 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -998,16 +998,28 @@ static int ecryptfs_getattr_link(struct mnt_idmap
*idmap,
        return rc;
 }

+static int ecryptfs_do_getattr(bool nosec, const struct path *path,
+                              struct kstat *stat, u32 request_mask,
+                              unsigned int flags)
+{
+       if (nosec)
+               return vfs_getattr_nosec(path, stat, request_mask, flags);
+       return vfs_getattr(path, stat, request_mask, flags);
+}
+
 static int ecryptfs_getattr(struct mnt_idmap *idmap,
                            const struct path *path, struct kstat *stat,
                            u32 request_mask, unsigned int flags)
 {
        struct dentry *dentry = path->dentry;
        struct kstat lower_stat;
+       bool nosec = flags & AT_GETATTR_NOSEC;
        int rc;

-       rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
-                        request_mask, flags);
+       flags &= ~AT_INTERNAL_MASK;
+
+       rc = ecryptfs_do_getattr(nosec,
ecryptfs_dentry_to_lower_path(dentry),
+                                &lower_stat, request_mask, flags);
        if (!rc) {
                fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 83ef66644c21..ec4ceb5b4ebf 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -166,12 +166,15 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
        int fsid = 0;
        int err;
        bool metacopy_blocks = false;
+       bool nosec = flags & AT_GETATTR_NOSEC;
+
+       flags &= ~AT_INTERNAL_MASK;

        metacopy_blocks = ovl_is_metacopy_dentry(dentry);

        type = ovl_path_real(dentry, &realpath);
        old_cred = ovl_override_creds(dentry->d_sb);
-       err = vfs_getattr(&realpath, stat, request_mask, flags);
+       err = ovl_do_getattr(nosec, &realpath, stat, request_mask, flags);
        if (err)
                goto out;

@@ -196,8 +199,8 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
                                        (!is_dir ? STATX_NLINK : 0);

                        ovl_path_lower(dentry, &realpath);
-                       err = vfs_getattr(&realpath, &lowerstat,
-                                         lowermask, flags);
+                       err = ovl_do_getattr(nosec, &realpath, &lowerstat,
+                                            lowermask, flags);
                        if (err)
                                goto out;

@@ -249,8 +252,9 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,

                        ovl_path_lowerdata(dentry, &realpath);
                        if (realpath.dentry) {
-                               err = vfs_getattr(&realpath, &lowerdatastat,
-                                                 lowermask, flags);
+                               err = ovl_do_getattr(nosec, &realpath,
+ &lowerdatastat, lowermask,
+                                                    flags);
                                if (err)
                                        goto out;
                        } else {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9817b2dcb132..cbee3ff3bab7 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -397,6 +397,15 @@ static inline bool ovl_open_flags_need_copy_up(int
flags)
        return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
 }

+static inline int ovl_do_getattr(bool nosec, const struct path *path,
+                                struct kstat *stat, u32 request_mask,
+                                unsigned int flags)
+{
+       if (nosec)
+               return vfs_getattr_nosec(path, stat, request_mask, flags);
+       return vfs_getattr(path, stat, request_mask, flags);
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
diff --git a/fs/stat.c b/fs/stat.c
index d43a5cc1bfa4..3250e427e1aa 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -133,7 +133,8 @@ int vfs_getattr_nosec(const struct path *path,
struct kstat *stat,
        idmap = mnt_idmap(path->mnt);
        if (inode->i_op->getattr)
                return inode->i_op->getattr(idmap, path, stat,
-                                           request_mask, query_flags);
+                                           request_mask,
+                                           query_flags | AT_GETATTR_NOSEC);

        generic_fillattr(idmap, request_mask, inode, stat);
        return 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..9069d6a301f0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2027,6 +2027,12 @@ struct super_operations {
        void (*shutdown)(struct super_block *sb);
 };

+/*
+ * Internal query flags. See fcntl.h AT_xxx flags for the rest.
+ */
+#define AT_GETATTR_NOSEC               0x80000000
+#define AT_INTERNAL_MASK               0x80000000
+
 /*
  * Inode flags - they have no relation to superblock flags now
  */



Amir Goldstein

unread,
Oct 1, 2023, 10:52:03 AM10/1/23
to Stefan Berger, Christian Brauner, Mimi Zohar, Jeff Layton, Casey Schaufler, syzbot, linux-...@vger.kernel.org, linux-i...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mik...@szeredi.hu, syzkall...@googlegroups.com
I don't understand why you need the nosec helper arg.
What's wrong with:

static int ovl_do_getattr(const struct path *path,
struct kstat *stat, u32 request_mask,
unsigned int flags)
{
if (flags & AT_GETATTR_NOSEC)
return vfs_getattr_nosec(path, stat, request_mask, flags);
return vfs_getattr(path, stat, request_mask, flags);
}

likewise for ecryptfs.
> idmap = mnt_idmap(path->mnt);)
> if (inode->i_op->getattr)
> return inode->i_op->getattr(idmap, path, stat,
> - request_mask, query_flags);
> + request_mask,
> + query_flags | AT_GETATTR_NOSEC);
>

You also need in vfs_getattr():

if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC)
return -EPERM;

> generic_fillattr(idmap, request_mask, inode, stat);
> return 0;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b528f063e8ff..9069d6a301f0 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2027,6 +2027,12 @@ struct super_operations {
> void (*shutdown)(struct super_block *sb);
> };
>
> +/*
> + * Internal query flags. See fcntl.h AT_xxx flags for the rest.
> + */
> +#define AT_GETATTR_NOSEC 0x80000000
> +#define AT_INTERNAL_MASK 0x80000000
> +

Yeh, the problem is that people adding flags to fcntl.h
won't be seeing this comment and we don't want to put those
"expose" those flags in uapi header either.

One possible compromise is to put them in fcntl.h under
#ifdef __KERNEL__

Very controversial, yes, I know.
The whole concept of mixing functional flags (i.e. AT_STATX_*)
with lookup AT_* flags is controversial to begin with, not to
mention flag overload for different syscalls (i.e. AT_EACCESS/
AT_REMOVEDIR/AT_HANDLE_FID).

But since we have accepted this necessary evil, I think that at least
we could explicitly partition the AT_ flags namespace and declare:

#ifdef __KERNEL__
AT_LOOKUP_FLAGS_MASK ...
AT_MOUNT_FLAGS_MASK AT_RECURSIVE
AT_SYSCALL_PRIVATE_MASK AT_EACCESS
AT_SYNC_TYPE_MASK AT_STATX_SYNC_TYPE
AT_KERNEL_INTERNAL_MASK 0x80000000
#endif

Sfefan,

I feel that I have to stress the point that this is only *my* opinion and
I accept that others (like some vfs co-maintains..) may passionately
disagree to further pollute the AT_ flags namespace.

The advantage of the AT_KERNEL_INTERNAL_MASK is that it is
in no way exposed to users via ABI, so if we decide to undo this
decision anytime in the future and reclaim the internal AT_ flags,
we could do that.

IMO, this is a decent compromise compared to the very noisy
patch that adds another flags argument to ->getattr() just to fix
this IMA/overlayfs corner case.

Thanks,
Amir.
Reply all
Reply to author
Forward
0 new messages