perf: use-after-free in perf

Dmitry Vyukov

Mar 6, 2017, 4:57:28 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Hello,

I've got the following use-after-free report while running syzkaller
fuzzer on 86292b33d4b79ee03e2f43ea0381ef85f077c760. Note that the task
is freed right in copy_process due to some error, but it's referenced
by another thread in perf subsystem.

==================================================================
BUG: KASAN: use-after-free in atomic_dec_and_test
arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
BUG: KASAN: use-after-free in put_task_struct
include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
kernel/events/core.c:1131 at addr ffff880079c30158
Write of size 4 by task syz-executor6/25698
CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
print_address_description mm/kasan/report.c:208 [inline]
kasan_report_error mm/kasan/report.c:292 [inline]
kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
kasan_report+0x21/0x30 mm/kasan/report.c:301
check_memory_region_inline mm/kasan/kasan.c:326 [inline]
check_memory_region+0x139/0x190 mm/kasan/kasan.c:333
kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
put_task_struct include/linux/sched/task.h:93 [inline]
put_ctx+0xcf/0x110 kernel/events/core.c:1131
perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
perf_release+0x37/0x50 kernel/events/core.c:4338
__fput+0x332/0x800 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:245
task_work_run+0x197/0x260 kernel/task_work.c:116
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0xb38/0x29c0 kernel/exit.c:880
do_group_exit+0x149/0x420 kernel/exit.c:984
get_signal+0x7e0/0x1820 kernel/signal.c:2318
do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x4458d9
RSP: 002b:00007f3f07187cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004458d9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3f071889c0 R15: 00007f3f07188700
Object at ffff880079c30140, in cache task_struct size: 5376
Allocated:
PID = 25681
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:616
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555
kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
alloc_task_struct_node kernel/fork.c:153 [inline]
dup_task_struct kernel/fork.c:495 [inline]
copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
copy_process kernel/fork.c:1531 [inline]
_do_fork+0x200/0x1010 kernel/fork.c:1994
SYSC_clone kernel/fork.c:2104 [inline]
SyS_clone+0x37/0x50 kernel/fork.c:2098
do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
return_from_SYSCALL_64+0x0/0x7a
Freed:
PID = 25681
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
__cache_free mm/slab.c:3514 [inline]
kmem_cache_free+0x71/0x240 mm/slab.c:3774
free_task_struct kernel/fork.c:158 [inline]
free_task+0x151/0x1d0 kernel/fork.c:370
copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
copy_process kernel/fork.c:1531 [inline]
_do_fork+0x200/0x1010 kernel/fork.c:1994
SYSC_clone kernel/fork.c:2104 [inline]
SyS_clone+0x37/0x50 kernel/fork.c:2098
do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
return_from_SYSCALL_64+0x0/0x7a

Peter Zijlstra

Mar 6, 2017, 7:13:12 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Mon, Mar 06, 2017 at 10:57:07AM +0100, Dmitry Vyukov wrote:
> Hello,
>
> I've got the following use-after-free report while running syzkaller
> fuzzer on 86292b33d4b79ee03e2f43ea0381ef85f077c760. Note that the task
> is freed right in copy_process due to some error, but it's referenced
> by another thread in perf subsystem.

Weird... you don't happen to have a reproduction case available?

Dmitry Vyukov

Mar 6, 2017, 7:18:03 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Unfortunately no. I've looked at both logs that I have and there are
no memory allocation failures preceding the crash (however maybe
somebody used NOWARN?). But probably if you inject an error into
copy_process somewhere after perf_event_init_task, it should reproduce
the bug with KASAN I think.

Peter Zijlstra

Mar 6, 2017, 7:23:27 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

I'll try. Thanks!

Dmitry Vyukov

Mar 6, 2017, 7:28:02 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

I think you will also need the attached patch. It seems that it was
found due to it. Going to send it out soon.

atomic.patch

Peter Zijlstra

Mar 6, 2017, 7:47:23 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Mon, Mar 06, 2017 at 01:27:41PM +0100, Dmitry Vyukov wrote:

> I think you will also need the attached patch. It seems that it was
> found due to it. Going to send it out soon.

Yuck, that's nasty. Although I don't see an alternative there.

You might also want to do the bitops, they suffer the same problem.

Peter Zijlstra

Mar 6, 2017, 8:14:57 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Mon, Mar 06, 2017 at 10:57:07AM +0100, Dmitry Vyukov wrote:

> ==================================================================
> BUG: KASAN: use-after-free in atomic_dec_and_test
> arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
> BUG: KASAN: use-after-free in put_task_struct
> include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
> BUG: KASAN: use-after-free in put_ctx+0xcf/0x110

FWIW, this output is very confusing, is this a result of your
post-processing replicating the line for every 'inlined' part?

> kernel/events/core.c:1131 at addr ffff880079c30158
> Write of size 4 by task syz-executor6/25698

> atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
> put_task_struct include/linux/sched/task.h:93 [inline]
> put_ctx+0xcf/0x110 kernel/events/core.c:1131
> perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
> perf_release+0x37/0x50 kernel/events/core.c:4338
> __fput+0x332/0x800 fs/file_table.c:209
> ____fput+0x15/0x20 fs/file_table.c:245
> task_work_run+0x197/0x260 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0xb38/0x29c0 kernel/exit.c:880
> do_group_exit+0x149/0x420 kernel/exit.c:984
> get_signal+0x7e0/0x1820 kernel/signal.c:2318
> do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
> syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
> do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
> entry_SYSCALL64_slow_path+0x25/0x25

So this is fput()..

> Freed:
> PID = 25681
> save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
> __cache_free mm/slab.c:3514 [inline]
> kmem_cache_free+0x71/0x240 mm/slab.c:3774
> free_task_struct kernel/fork.c:158 [inline]
> free_task+0x151/0x1d0 kernel/fork.c:370
> copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
> copy_process kernel/fork.c:1531 [inline]
> _do_fork+0x200/0x1010 kernel/fork.c:1994
> SYSC_clone kernel/fork.c:2104 [inline]
> SyS_clone+0x37/0x50 kernel/fork.c:2098
> do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
> return_from_SYSCALL_64+0x0/0x7a

and this is a failed fork().

However, inherited events don't have a filedesc to fput(), and
similarly, a task that fails fork() has never been visible to attach a
perf event to because it never hits the pid-hash.

Or so it is assumed.

I'm forever getting lost in the PID code. Oleg, is there any way
find_task_by_vpid() can return a task that can still fail fork() ?

Dmitry Vyukov

Mar 6, 2017, 8:35:11 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Mon, Mar 6, 2017 at 2:14 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Mon, Mar 06, 2017 at 10:57:07AM +0100, Dmitry Vyukov wrote:
>
>> ==================================================================
>> BUG: KASAN: use-after-free in atomic_dec_and_test
>> arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
>> BUG: KASAN: use-after-free in put_task_struct
>> include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
>> BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
>
> FWIW, this output is very confusing, is this a result of your
> post-processing replicating the line for every 'inlined' part?

Yes.
We probably should not do this inlining in the header line. But the
problem is that it is very difficult to understand that it is a header
line in general.

FWIW here are 2 syzkaller programs that triggered the bug:
https://gist.githubusercontent.com/dvyukov/d67f980050589775237a7fbdff226bec/raw/4bca72861cb2ede64059b6dad403e19f425a361f/gistfile1.txt
They look very similar, so most likely they are a mutation of the same
program. Which may suggest that there is something in that program
that provokes the bug. Note that the calls in these programs are
executed potentially in multiple threads. But at least it can give
some idea wrt e.g. flags passed to perf_event_open.

Peter Zijlstra

Mar 7, 2017, 4:08:21 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Mon, Mar 06, 2017 at 02:34:50PM +0100, Dmitry Vyukov wrote:
> FWIW here are 2 syzkaller programs that triggered the bug:
> https://gist.githubusercontent.com/dvyukov/d67f980050589775237a7fbdff226bec/raw/4bca72861cb2ede64059b6dad403e19f425a361f/gistfile1.txt

Hurm, previously your gistfile thingies were actual C, but this thing is
gibberish. How do I run it?

Dmitry Vyukov

Mar 7, 2017, 4:26:41 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

The same way we did it here:
https://groups.google.com/d/msg/syzkaller/MHXa-o8foyc/yrGfDOrwAQAJ
This will run it in infinite loop in 10 parallel processes:
./syz-execprog -repeat=0 -procs=10 -sandbox=namespace gistfile1.txt
-sandbox=namespace will require CONFIG_USER_NS=y, I am not sure if it
is actually required, but that's how bots triggered it. You can do
-sandbox=none as well.

Peter Zijlstra

Mar 7, 2017, 4:37:58 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Tue, Mar 07, 2017 at 10:26:20AM +0100, Dmitry Vyukov wrote:
> On Tue, Mar 7, 2017 at 10:08 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> > On Mon, Mar 06, 2017 at 02:34:50PM +0100, Dmitry Vyukov wrote:
> >> FWIW here are 2 syzkaller programs that triggered the bug:
> >> https://gist.githubusercontent.com/dvyukov/d67f980050589775237a7fbdff226bec/raw/4bca72861cb2ede64059b6dad403e19f425a361f/gistfile1.txt
> >
> > Hurm, previously your gistfile thingies were actual C, but this thing is
> > gibberish. How do I run it?
>
> The same way we did it here:
> https://groups.google.com/d/msg/syzkaller/MHXa-o8foyc/yrGfDOrwAQAJ

Oh right, completely forgot about that. The last gistfile stuff I found
in my history were actual C files.

> This will run it in infinite loop in 10 parallel processes:
> ./syz-execprog -repeat=0 -procs=10 -sandbox=namespace gistfile1.txt
> -sandbox=namespace will require CONFIG_USER_NS=y, I am not sure if it
> is actually required, but that's how bots triggered it. You can do
> -sandbox=none as well.

I still have an ancient syzkaller; -sandbox doesn't exist and I needed
to add -executor=bin/syz-executor, but now it appears to run.

I'll bump up procs; 10 is somewhat low, I feel.

---

root@ivb-ep:~/gopath/src/github.com/google/syzkaller# ./bin/syz-execprog -repeat=0 -procs=10 -executor=bin/syz-executor gistfile2.txt
2017/03/07 10:35:14 parsed 2 programs
2017/03/07 10:35:14 executed 0 programs
result: failed=false hanged=false err=executor is not serving

2017/03/07 10:36:14 executed 10 programs
result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

result: failed=false hanged=false err=executor is not serving

Dmitry Vyukov

Mar 7, 2017, 4:43:53 AM

to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

An old syzkaller may not understand part of syscalls in the program
and silently drop them, you need a new one.

Here is a straightforward conversion of the syzkaller program to C
(with/without namespace sandbox):

https://gist.githubusercontent.com/dvyukov/b6540bed50b7da1dff3d7373ba570c77/raw/fd5f2f3aaa52b70b2bb9f114cf8a3226d8a30960/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/dbd8ec38bcb50df4bdc95210d4247b09/raw/f9cbb5e17cd4ff4a7a7881c97dea7d6cd6dd8bf1/gistfile1.txt

That's also with -procs=10; you can change the number of procs in the main function.

But I wasn't able to reproduce the crash using these programs (nor
the syzkaller program), which is why I did not provide them initially.

Peter Zijlstra

Mar 7, 2017, 5:00:24 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Tue, Mar 07, 2017 at 10:43:32AM +0100, Dmitry Vyukov wrote:

> An old syzkaller may not understand part of syscalls in the program
> and silently drop them, you need a new one.

That's yucky semantics, better to at least warn on that occasion.

> Here is a straightforward conversion of the syzkaller program to C
> (with/without namespace sandbox):
>
> https://gist.githubusercontent.com/dvyukov/b6540bed50b7da1dff3d7373ba570c77/raw/fd5f2f3aaa52b70b2bb9f114cf8a3226d8a30960/gistfile1.txt
> https://gist.githubusercontent.com/dvyukov/dbd8ec38bcb50df4bdc95210d4247b09/raw/f9cbb5e17cd4ff4a7a7881c97dea7d6cd6dd8bf1/gistfile1.txt

Thanks!

> That's also with -procs=10; you can change the number of procs in the main function.
>
> But I wasn't able to reproduce the crash using these programs (nor
> the syzkaller program), which is why I did not provide them initially.

Right, I'll run them while at the same time trying to see what it is
they're doing to find clues.

Peter Zijlstra

Mar 7, 2017, 8:16:53 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

So I _think_ find_task_by_vpid() can return an already dead task; and
we'll happily increase task->usage.

Dmitry; I have no idea how easy it is for you to reproduce the thing;
but so far I've not had much success. Could you perhaps stick the below
in?

Once we convert task_struct to refcount_t that should generate a WARN of
its own I suppose.

---

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 000fdb2..612d652 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -763,6 +763,7 @@ struct perf_event_context {
#ifdef CONFIG_CGROUP_PERF
int nr_cgroups; /* cgroup evts */
#endif
+ int switches;
void *task_ctx_data; /* pmu specific data */
struct rcu_head rcu_head;
};
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6f41548f..6455b7a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2902,6 +2902,8 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
if (!parent && !next_parent)
goto unlock;

+ ctx->switches++;
+
if (next_parent == ctx || next_ctx == parent || next_parent == parent) {
/*
* Looks like the two contexts are clones, so we might be
@@ -3780,6 +3782,12 @@ find_lively_task_by_vpid(pid_t vpid)
task = current;
else
task = find_task_by_vpid(vpid);
+
+ if (task) {
+ if (WARN_ON_ONCE(task->flags & PF_EXITING))
+ task = NULL;
+ }
+
if (task)
get_task_struct(task);
rcu_read_unlock();
@@ -10432,6 +10440,10 @@ void perf_event_free_task(struct task_struct *task)

mutex_unlock(&ctx->mutex);

+ WARN_ON_ONCE(ctx->switches);
+ WARN_ON_ONCE(atomic_read(&ctx->refcount) != 1);
+ WARN_ON_ONCE(ctx->task != task);
+
put_ctx(ctx);
}
}

Peter Zijlstra

Mar 7, 2017, 8:27:46 AM

to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller, Oleg Nesterov

On Tue, Mar 07, 2017 at 02:16:49PM +0100, Peter Zijlstra wrote:
> So I _think_ find_task_by_vpid() can return an already dead task; and
> we'll happily increase task->usage.

Hurm, so find_get_context() already does the PF_EXITING test. And then
the put_ctx would've been from find_get_context(), not fput().

So still puzzled.

Oleg Nesterov

Mar 7, 2017, 9:04:17 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/06, Peter Zijlstra wrote:
>
> and this is a failed fork().
>
>
> However, inherited events don't have a filedesc to fput(), and
> similarly, a task that fails fork() has never been visible to attach a
> perf event to because it never hits the pid-hash.

Yes, it is not visible to find_task_by_vpid() until copy_process() does
attach_pid(PIDTYPE_PID), and copy_process() can't fail after that.

Oleg.

Dmitry Vyukov

Mar 7, 2017, 9:17:59 AM

to Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

I wonder what it is that failed in copy_process. Could it be
perf_event_init_task itself? Maybe it leaves a pointer to p in some
shared state on some error conditions?

Oleg Nesterov

Mar 7, 2017, 11:51:35 AM

to Dmitry Vyukov, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

I am looking at perf_event_init_task() too and I can't understand the
error handling...

perf_event_init_context() can return success even if inherit_task_group() in
the first list_for_each_entry(pinned_groups) fails, "ret" will be overwritten
by the 2nd list_for_each_entry(flexible_groups) loop. "inherited_all" should
be cleared, still this looks confusing at least.

inherit_event() returns NULL under is_orphaned_event() check, not ERR_PTR().
Is it correct?

Oleg.
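
The first point above can be seen in isolation with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: inherit_one(), init_context_buggy() and init_context_fixed() are hypothetical names, and plain arrays of return values stand in for the pinned_groups and flexible_groups lists. It shows how an error from the first loop is lost once the second loop reassigns "ret".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for inherit_task_group(): 0 on success, <0 on error. */
static int inherit_one(int result)
{
	return result;
}

/* The questioned pattern: 'break' leaves only the first loop. */
static int init_context_buggy(const int *pinned, size_t np,
			      const int *flexible, size_t nf)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < np; i++) {
		ret = inherit_one(pinned[i]);
		if (ret)
			break;
	}

	/* Still runs after a failure above and reassigns ret. */
	for (i = 0; i < nf; i++) {
		ret = inherit_one(flexible[i]);
		if (ret)
			break;
	}

	return ret;
}

/* One possible shape of a fix: never enter the second loop after an error. */
static int init_context_fixed(const int *pinned, size_t np,
			      const int *flexible, size_t nf)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < np; i++) {
		ret = inherit_one(pinned[i]);
		if (ret)
			goto out;
	}

	for (i = 0; i < nf; i++) {
		ret = inherit_one(flexible[i]);
		if (ret)
			goto out;
	}
out:
	/* Any cleanup shared by both paths would have to go here. */
	return ret;
}
```

With pinned results {0, -12, 0} and flexible results {0, 0}, the buggy variant returns 0 while the fixed one returns -12. If the flexible list is empty, the buggy variant happens to preserve the error, which may be part of why the pattern goes unnoticed.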

Peter Zijlstra

Mar 7, 2017, 12:30:03 PM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Urgh, there was something tricky there, but I cannot remember, and it
seems we didn't put a comment in either :/

Alexander, can you remember?

But yes, this all looks a tad dodgy, I'll try and have a look, but I
feel like I'm coming down with something :-(

Peter Zijlstra

Mar 14, 2017, 8:55:14 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Tue, Mar 07, 2017 at 05:51:32PM +0100, Oleg Nesterov wrote:

> I am looking at perf_event_init_task() too and I can't understand the
> error handling...
>
> perf_event_init_context() can return success even if inherit_task_group() in
> the first list_for_each_entry(pinned_groups) fails, "ret" will be overwritten
> by the 2nd list_for_each_entry(flexible_groups) loop. "inherited_all" should
> be cleared, still this looks confusing at least.

Yes, this looks buggy. But I cannot explain how that would result in the
observed use-after-free. If we 'lose' an error this way,
perf_event_init_context() will not fail, and hence we'll not hit that
perf_event_free_task() path.

The task would continue living with an incorrectly copied context, and
counter behaviour would be affected.

> inherit_event() returns NULL under is_orphaned_event() check, not ERR_PTR().
> Is it correct?

Yes. This is all a tad tricky, but it seems to be correct.

By returning NULL, not an error, we effect the silent discard of
orphaned events. This is correct, because otherwise
perf_event_release_kernel() would have come by and explicitly discarded
those events for us anyway.
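
The three-way return convention described here (valid pointer, NULL, error pointer) can be sketched in userspace as follows. This is an illustration only: the model_* helpers imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are hypothetical names, as are struct model_event and its fields.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer helpers (assumed names). */
#define MODEL_MAX_ERRNO 4095

static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct model_event {
	int orphaned;		/* parent event is being released */
	int alloc_fails;	/* simulate an allocation failure */
};

/* Valid pointer on success, NULL for an orphaned parent, error pointer on failure. */
static struct model_event *model_inherit_event(const struct model_event *parent,
					       struct model_event *slot)
{
	if (parent->orphaned)
		return NULL;
	if (parent->alloc_fails)
		return model_err_ptr(-ENOMEM);
	*slot = *parent;
	return slot;
}

/* Caller: only an error pointer is a failure; NULL is skipped quietly. */
static int model_inherit_group(const struct model_event *parent,
			       struct model_event *slot, int *inherited)
{
	struct model_event *child = model_inherit_event(parent, slot);

	if (model_is_err(child))
		return (int)model_ptr_err(child);
	if (child)
		(*inherited)++;
	return 0;
}
```

An orphaned parent therefore yields a return value of 0 with nothing inherited, while an allocation failure propagates -ENOMEM to the caller.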

My current patch is below; won't do much good I suspect..

---
kernel/events/core.c | 55 +++++++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a17ed56c8ce1..3cd907e00377 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4256,7 +4256,7 @@ int perf_event_release_kernel(struct perf_event *event)

raw_spin_lock_irq(&ctx->lock);
/*
- * Mark this even as STATE_DEAD, there is no external reference to it
+ * Mark this event as STATE_DEAD, there is no external reference to it
* anymore.
*
* Anybody acquiring event->child_mutex after the below loop _must_
@@ -10417,21 +10417,9 @@ void perf_event_free_task(struct task_struct *task)
continue;

mutex_lock(&ctx->mutex);
-again:
- list_for_each_entry_safe(event, tmp, &ctx->pinned_groups,
- group_entry)
- perf_free_event(event, ctx);
-
- list_for_each_entry_safe(event, tmp, &ctx->flexible_groups,
- group_entry)
+ list_for_each_entry_safe(event, tmp, &ctx->event_list, event_entry)
perf_free_event(event, ctx);
-
- if (!list_empty(&ctx->pinned_groups) ||
- !list_empty(&ctx->flexible_groups))
- goto again;
-
mutex_unlock(&ctx->mutex);
-
put_ctx(ctx);
}
}
@@ -10469,7 +10457,12 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
}

/*
- * inherit a event from parent task to child task:
+ * Inherit a event from parent task to child task.
+ *
+ * Returns:
+ * - valid pointer on success
+ * - NULL for orphaned events
+ * - IS_ERR() on error
*/
static struct perf_event *
inherit_event(struct perf_event *parent_event,
@@ -10563,6 +10556,16 @@ inherit_event(struct perf_event *parent_event,
return child_event;
}

+/*
+ * Inherits an event group.
+ *
+ * This will quietly suppress orphaned events; !inherit_event() is not an error.
+ * This matches with perf_event_release_kernel() removing all child events.
+ *
+ * Returns:
+ * - 0 on success
+ * - <0 on error
+ */
static int inherit_group(struct perf_event *parent_event,
struct task_struct *parent,
struct perf_event_context *parent_ctx,
@@ -10577,6 +10580,11 @@ static int inherit_group(struct perf_event *parent_event,
child, NULL, child_ctx);
if (IS_ERR(leader))
return PTR_ERR(leader);
+ /*
+ * @leader can be NULL here because of is_orphaned_event(). In this
+ * case inherit_event() will create individual events, similar to what
+ * perf_group_detach() would do anyway.
+ */
list_for_each_entry(sub, &parent_event->sibling_list, group_entry) {
child_ctr = inherit_event(sub, parent, parent_ctx,
child, leader, child_ctx);
@@ -10586,6 +10594,17 @@ static int inherit_group(struct perf_event *parent_event,
return 0;
}

+/*
+ * Creates the child task context and tries to inherit the event-group.
+ *
+ * Clears @inherited_all on !attr.inherited or error. Note that we'll leave
+ * inherited_all set when we 'fail' to inherit an orphaned event; this is
+ * consistent with perf_event_release_kernel() removing all child events.
+ *
+ * Returns:
+ * - 0 on success
+ * - <0 on error
+ */
static int
inherit_task_group(struct perf_event *event, struct task_struct *parent,
struct perf_event_context *parent_ctx,
@@ -10608,7 +10627,6 @@ inherit_task_group(struct perf_event *event, struct task_struct *parent,
* First allocate and initialize a context for the
* child.
*/
-
child_ctx = alloc_perf_context(parent_ctx->pmu, child);
if (!child_ctx)
return -ENOMEM;
@@ -10670,7 +10688,7 @@ static int perf_event_init_context(struct task_struct *child, int ctxn)
ret = inherit_task_group(event, parent, parent_ctx,
child, ctxn, &inherited_all);
if (ret)
- break;
+ goto out_unlock;
}

/*
@@ -10686,7 +10704,7 @@ static int perf_event_init_context(struct task_struct *child, int ctxn)
ret = inherit_task_group(event, parent, parent_ctx,
child, ctxn, &inherited_all);
if (ret)
- break;
+ goto out_unlock;
}

raw_spin_lock_irqsave(&parent_ctx->lock, flags);
@@ -10714,6 +10732,7 @@ static int perf_event_init_context(struct task_struct *child, int ctxn)
}

raw_spin_unlock_irqrestore(&parent_ctx->lock, flags);
+out_unlock:
mutex_unlock(&parent_ctx->mutex);

perf_unpin_context(parent_ctx);

Oleg Nesterov

Mar 14, 2017, 9:26:23 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/14, Peter Zijlstra wrote:
>
> On Tue, Mar 07, 2017 at 05:51:32PM +0100, Oleg Nesterov wrote:
>
> > inherit_event() returns NULL under is_orphaned_event() check, not ERR_PTR().
> > Is it correct?
>
> Yes. This is all a tad tricky, but it seems to be correct.
>
> By returning NULL, not an error, we effect the silent discard of
> orphaned events. This is correct, because otherwise
> perf_event_release_kernel() would have come by and explicitly discarded
> those events for us anyway.

Thanks... I'll try to understand this later.

> @@ -10608,7 +10627,6 @@ inherit_task_group(struct perf_event *event, struct task_struct *parent,
> * First allocate and initialize a context for the
> * child.
> */
> -
> child_ctx = alloc_perf_context(parent_ctx->pmu, child);
> if (!child_ctx)
> return -ENOMEM;
> @@ -10670,7 +10688,7 @@ static int perf_event_init_context(struct task_struct *child, int ctxn)
> ret = inherit_task_group(event, parent, parent_ctx,
> child, ctxn, &inherited_all);
> if (ret)
> - break;
> + goto out_unlock;
> }
>
> /*
> @@ -10686,7 +10704,7 @@ static int perf_event_init_context(struct task_struct *child, int ctxn)
> ret = inherit_task_group(event, parent, parent_ctx,
> child, ctxn, &inherited_all);
> if (ret)
> - break;
> + goto out_unlock;

With this change you can also simplify inherit_task_group() a little bit,
it no longer needs to nullify *inherited_all if inherit_group() fails.

Oleg.

Peter Zijlstra

Mar 14, 2017, 9:47:19 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Ah, that last one is broken because then we forget to re-enable
parent_ctx->rotate_disable.

So if we keep that a break, we still need that inherited_all thing as
well.
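
A small userspace model of why the second "goto out_unlock" is a problem. It assumes, as the message implies, that rotate_disable is set before the flexible_groups loop and cleared after it; struct model_ctx and both walk_* functions are hypothetical names, and an array of return values stands in for the list.

```c
#include <assert.h>
#include <stddef.h>

struct model_ctx {
	int rotate_disable;
};

/* Jumping straight to the label skips the re-enable step. */
static int walk_flexible_goto(struct model_ctx *ctx, const int *results, size_t n)
{
	int ret = 0;
	size_t i;

	ctx->rotate_disable = 1;
	for (i = 0; i < n; i++) {
		ret = results[i];
		if (ret)
			goto out_unlock;
	}
	ctx->rotate_disable = 0;
out_unlock:
	return ret;
}

/* Keeping 'break' falls through to the re-enable step on every path. */
static int walk_flexible_break(struct model_ctx *ctx, const int *results, size_t n)
{
	int ret = 0;
	size_t i;

	ctx->rotate_disable = 1;
	for (i = 0; i < n; i++) {
		ret = results[i];
		if (ret)
			break;
	}
	ctx->rotate_disable = 0;
	return ret;
}
```

Both variants report the error, but only the 'break' variant leaves rotate_disable cleared afterwards.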

Oleg Nesterov

Mar 14, 2017, 10:04:59 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/14, Peter Zijlstra wrote:
>

> Yes, this looks buggy. But I cannot explain how that would result in the
> observed use-after-free.

Yes...

Suppose that copy_process() fails after perf_event_init_task(). In this
case perf_event_free_task() does put_ctx(), but if this ctx has another
reference (ctx->refcount > 1) then ctx->task will point to the already
freed task, copy_process() does free_task() at the end of error path.
And we can't replace it with put_task_struct().

I am looking at TASK_TOMBSTONE, perhaps perf_event_free_task() should
use it too?

Oleg.

Peter Zijlstra

Mar 14, 2017, 10:07:59 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

The idea was that the task isn't visible when we use
perf_event_free_task(). But I'll have a look.

Oleg Nesterov

Mar 14, 2017, 10:32:07 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

I can easily be wrong, I do not understand this code.

But. perf_event_init_task() adds child_event to parent_event->child_list.

If perf_event_release_kernel(parent_event) is called before copy_process()
does perf_event_free_task() which (in particular) removes it from child_list,
perf_event_release_kernel() can find this child_event and do get_ctx(ctx)
(under the list_for_each_entry(child, &event->child_list, child_list) loop).

Then it does put_ctx(ctx), but ctx->task can be already freed by
copy_process()->free_task() in this case.

No?

Oleg.

Peter Zijlstra

Mar 14, 2017, 11:02:53 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Tue, Mar 14, 2017 at 03:30:11PM +0100, Oleg Nesterov wrote:

> But. perf_event_init_task() adds child_event to parent_event->child_list.
>
> If perf_event_release_kernel(parent_event) is called before copy_process()
> does perf_event_free_task() which (in particular) removes it from child_list,
> perf_event_release_kernel() can find this child_event and do get_ctx(ctx)
> (under the list_for_each_entry(child, &event->child_list, child_list) loop).

Right; the child_list is the only thing that is exposed. And yes, it
looks like that can interleave just right.

> Then it does put_ctx(ctx), but ctx->task can be already freed by
> copy_process()->free_task() in this case.

Task1                                   Task2

fork()
  perf_event_init_task()
  /* ... */
  goto bad_fork_$foo;
  /* ... */
  perf_event_free_task()
    mutex_lock(ctx->lock)
    perf_free_event(B)

                                        perf_event_release_kernel(A)
                                          mutex_lock(A->child_mutex)
                                          list_for_each_entry(child, ...) {
                                            /* child == B */
                                            ctx = B->ctx;
                                            get_ctx(ctx);
                                            mutex_unlock(A->child_mutex);

      mutex_lock(A->child_mutex)
      list_del_init(B->child_list)
      mutex_unlock(A->child_mutex)

      /* ... */

    mutex_unlock(ctx->lock);
    put_ctx() /* >0 */
  free_task();
                                            mutex_lock(ctx->lock);
                                            mutex_lock(A->child_mutex);
                                            /* ... */
                                            mutex_unlock(A->child_mutex);
                                            mutex_unlock(ctx->lock)
                                            put_ctx() /* 0 */
                                              ctx->task && !TOMBSTONE
                                                put_task_struct() /* UAF */

Something like that, right?

Let me see if it makes sense to retain perf_event_free_task() at all;
maybe we should always do perf_event_exit_task().
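
The interleaving above can be replayed deterministically in a single-threaded userspace model. This is a sketch of the TASK_TOMBSTONE idea raised earlier in the thread, not a claim about what the final patch looks like. All model_* names are hypothetical, and it is an assumption of the model that the context holds one reference on its task. A "freed" flag is used instead of actually freeing memory, so the model can detect the bad access safely.

```c
#include <assert.h>
#include <stddef.h>

struct model_task {
	int usage;	/* reference count on the task */
	int freed;	/* set once free_task() has run */
};

struct model_ctx {
	int refcount;
	struct model_task *task;
};

/* Sentinel meaning "the task is gone"; modelled on TASK_TOMBSTONE. */
#define MODEL_TOMBSTONE ((struct model_task *)-1L)

/* Returns 1 if dropping the last ctx reference touches a freed task. */
static int model_put_ctx(struct model_ctx *ctx)
{
	if (--ctx->refcount > 0)
		return 0;
	if (ctx->task && ctx->task != MODEL_TOMBSTONE) {
		if (ctx->task->freed)
			return 1;	/* put_task_struct() on freed memory */
		ctx->task->usage--;
	}
	return 0;
}

/* Replays the interleaving in program order; returns 1 on use-after-free. */
static int model_replay(int use_tombstone)
{
	struct model_task child = { .usage = 1, .freed = 0 };
	struct model_ctx ctx = { .refcount = 1, .task = &child };

	child.usage++;			/* assumed: ctx holds a task reference */
	ctx.refcount++;			/* Task2: get_ctx(ctx) */

	if (use_tombstone) {		/* Task1: detach the task from ctx first */
		ctx.task = MODEL_TOMBSTONE;
		child.usage--;
	}
	(void)model_put_ctx(&ctx);	/* Task1: put_ctx(), count stays > 0 */
	child.freed = 1;		/* Task1: free_task() */

	return model_put_ctx(&ctx);	/* Task2: final put_ctx() */
}
```

In this model, model_replay(0) reports the use-after-free and model_replay(1) does not, because the final put_ctx() sees the tombstone instead of a stale task pointer.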

Peter Zijlstra

Mar 14, 2017, 11:07:50 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Do we want a WARN_ON_ONCE(atomic_read(&tsk->usage)); in free_task()?
Because in the above scenario we're freeing it with references on.

Oleg Nesterov

Mar 14, 2017, 11:21:07 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/14, Peter Zijlstra wrote:
>

>     mutex_unlock(ctx->lock);
>     put_ctx() /* >0 */
>   free_task();
>                                             mutex_lock(ctx->lock);
>                                             mutex_lock(A->child_mutex);
>                                             /* ... */
>                                             mutex_unlock(A->child_mutex);
>                                             mutex_unlock(ctx->lock)
>                                             put_ctx() /* 0 */
>                                               ctx->task && !TOMBSTONE
>                                                 put_task_struct() /* UAF */
>
>
> Something like that, right?

Yes, exactly.

> Let me see if it makes sense to retain perf_event_free_task() at all;
> maybe we should always do perf_event_exit_task().

Yes, perhaps... but this needs changes too. Say, WARN_ON_ONCE(child != current)
in perf_event_exit_task_context(). And even perf_event_task(new => F) does not
look right in this case. In fact it would be simply buggy to do this, this task
was not fully constructed yet, so even perf_event_pid(task) is not safe.

Oleg.

Peter Zijlstra

Mar 14, 2017, 11:26:33 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

Yeah; there's a fair amount of stuff like that. I'm afraid crafting
exceptions for all that will just end up with more of a mess than we
save by merging the two :/

Ah well.. I'll go do the 'trivial' patch then.

Oleg Nesterov

Mar 14, 2017, 11:39:01 AM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/14, Peter Zijlstra wrote:
>

> Do we want a WARN_ON_ONCE(atomic_read(&tsk->usage)); in free_task()?
> Because in the above scenario we're freeing it with references on.

Not sure, in this case copy_process() should decrement tsk->usage
before free_task(), note the atomic_set(&tsk->usage, 2) in
dup_task_struct().

Perhaps we should just add WARN_ON(tsk->usage != 2) into copy_process()
right before free_task() ?

On the other hand, WARN_ON(atomic_read(&tsk->usage)) looks pointless,
the only caller is put_task_struct().

Oleg.
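
As a rough illustration of the accounting behind that suggestion, here is a small userspace model. It assumes, as an illustration only, that a task context holds one extra reference on the task on top of the 2 set in dup_task_struct(); all model_* names are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the usage counter in task_struct. */
struct model_task {
	int usage;
};

/* Models the atomic_set(&tsk->usage, 2) in dup_task_struct(). */
static void model_dup_task_struct(struct model_task *tsk)
{
	tsk->usage = 2;
}

/* Models a subsystem (here: a perf context) taking a task reference. */
static void model_get_task_struct(struct model_task *tsk)
{
	tsk->usage++;
}

/* Models that subsystem dropping its reference again. */
static void model_put_task_struct(struct model_task *tsk)
{
	assert(tsk->usage > 0);
	tsk->usage--;
}

/* The proposed check: true means the WARN_ON() would fire. */
static bool model_warn_before_free(const struct model_task *tsk)
{
	return tsk->usage != 2;
}
```

In this model the check fires only while something other than copy_process() still holds a reference at the point where free_task() would run.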

Peter Zijlstra

Mar 14, 2017, 11:46:44 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Tue, Mar 14, 2017 at 04:37:05PM +0100, Oleg Nesterov wrote:
> On 03/14, Peter Zijlstra wrote:
> >
> > Do we want a WARN_ON_ONCE(atomic_read(&tsk->usage)); in free_task()?
> > Because in the above scenario we're freeing it with references on.
>
> Not sure, in this case copy_process() should decrement tsk->usage
> before free_task(), note the atomic_set(&tsk->usage, 2) in
> dup_task_struct().
>
> Perhaps we should just add WARN_ON(tsk->usage != 2) into copy_process()
> right before free_task() ?

Sure; that works. I'll try that once I'm back home again, to see if
there's unexpected fail because other things increment it.

Peter Zijlstra

Mar 14, 2017, 1:37:05 PM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Tue, Mar 14, 2017 at 04:26:25PM +0100, Peter Zijlstra wrote:
> Ah well.. I'll go do the 'trivial' patch then.

A little like so; completely untested.

---
kernel/events/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 110b38a58493..6576449b6029 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10346,6 +10346,17 @@ void perf_event_free_task(struct task_struct *task)
 			continue;

 		mutex_lock(&ctx->mutex);
+		raw_spin_lock_irq(&ctx->lock);
+		/*
+		 * Destroy the task <-> ctx relation and mark the context dead.
+		 *
+		 * This is important because even though the task hasn't been
+		 * exposed yet the context has been (through child_list).
+		 */
+		RCU_INIT_POINTER(task->perf_event_ctxp[ctxn], NULL);
+		WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
+		put_task_struct(task); /* cannot be last */
+		raw_spin_unlock_irq(&ctx->lock);
 again:
 		list_for_each_entry_safe(event, tmp, &ctx->pinned_groups,
 				group_entry)
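
The effect of the added hunk can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the kernel code: MODEL_TOMBSTONE stands in for TASK_TOMBSTONE, the model_* names are hypothetical, and locking and RCU are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the usage counter in task_struct. */
struct model_task {
	int usage;
};

/* Sentinel meaning "this context is dead"; it is never dereferenced. */
#define MODEL_TOMBSTONE ((struct model_task *)-1L)

struct model_ctx {
	int refcount;
	struct model_task *task;
};

/* Mirrors the hunk: sever task <-> ctx and drop the ctx's task reference. */
static void model_detach_task(struct model_ctx *ctx)
{
	struct model_task *task = ctx->task;

	assert(task != NULL && task != MODEL_TOMBSTONE);
	ctx->task = MODEL_TOMBSTONE;
	task->usage--;	/* "cannot be last": the forking path still holds refs */
}

/* Returns true when this put would go on to call put_task_struct(). */
static bool model_put_ctx_touches_task(struct model_ctx *ctx)
{
	assert(ctx->refcount > 0);
	if (--ctx->refcount > 0)
		return false;
	return ctx->task != NULL && ctx->task != MODEL_TOMBSTONE;
}
```

Once the context is marked dead, a late final put from another thread no longer follows ctx->task, so it does not matter that the task has been freed in the meantime.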

Oleg Nesterov

Mar 15, 2017, 12:45:00 PM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/14, Peter Zijlstra wrote:
>

> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10346,6 +10346,17 @@ void perf_event_free_task(struct task_struct *task)
>  			continue;
>
>  		mutex_lock(&ctx->mutex);
> +		raw_spin_lock_irq(&ctx->lock);
> +		/*
> +		 * Destroy the task <-> ctx relation and mark the context dead.
> +		 *
> +		 * This is important because even though the task hasn't been
> +		 * exposed yet the context has been (through child_list).
> +		 */
> +		RCU_INIT_POINTER(task->perf_event_ctxp[ctxn], NULL);
> +		WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
> +		put_task_struct(task); /* cannot be last */
> +		raw_spin_unlock_irq(&ctx->lock);

Agreed, this is what I had in mind. Although you know, I spent 3
hours looking at your patch and I still can't convince myself I am
really sure it closes all races ;)

OK, I believe this is correct. And iiuc both RCU_INIT_POINTER(NULL)
and put_task_struct() are not strictly necessary? At least until we
add WARN_ON(tsk->usage != 2) before free_task() in copy_process().

---------------------------------------------------------------------
This is off-topic, but to me list_for_each_entry(event->child_list)
in perf_event_release_kernel() looks very confusing and misleading.
As for list_first_entry_or_null(): we do not really need NULL if the list
is empty, tmp == child should be false even if we use list_first_entry().
And given that we already have list_is_last(), it would be nice to
add list_is_first() and cleanup perf_event_release_kernel() a bit:

--- x/kernel/events/core.c
+++ x/kernel/events/core.c
@@ -4152,7 +4152,7 @@ static void put_event(struct perf_event
 int perf_event_release_kernel(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
-	struct perf_event *child, *tmp;
+	struct perf_event *child;

 	/*
 	 * If we got here through err_file: fput(event_file); we will not have
@@ -4190,8 +4190,9 @@ int perf_event_release_kernel(struct per

 again:
 	mutex_lock(&event->child_mutex);
-	list_for_each_entry(child, &event->child_list, child_list) {
-
+	if (!list_empty(&event->child_list)) {
+		child = list_first_entry(&event->child_list,
+					 struct perf_event, child_list);
 		/*
 		 * Cannot change, child events are not migrated, see the
 		 * comment with perf_event_ctx_lock_nested().
@@ -4221,9 +4222,7 @@ again:
 		 * state, if child is still the first entry, it didn't get freed
 		 * and we can continue doing so.
 		 */
-		tmp = list_first_entry_or_null(&event->child_list,
-					       struct perf_event, child_list);
-		if (tmp == child) {
+		if (list_is_first(child, &event->child_list)) {
 			perf_remove_from_context(child, DETACH_GROUP);
 			list_del(&child->child_list);
 			free_event(child);

But we can't, because

	static inline int list_is_first(const struct list_head *list,
					const struct list_head *head)
	{
		return list->prev == head;
	}

won't work, "child" can be freed so we can't dereference it, and

	static inline int list_is_first(const struct list_head *list,
					const struct list_head *head)
	{
		return head->next == list;
	}

won't be symmetrical with list_is_last() we already have.

Oleg.
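
To make the difference between the two variants concrete, here is a minimal userspace sketch. The list_head layout follows the kernel's, but init_list_head(), this list_add_tail() and the two list_is_first_via_*() helpers are local, hypothetical names written just for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace copy of the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Variant 1: reads entry->prev, so the entry must still be valid memory. */
static bool list_is_first_via_entry(const struct list_head *entry,
				    const struct list_head *head)
{
	return entry->prev == head;
}

/* Variant 2: reads only head->next; entry is compared, never dereferenced. */
static bool list_is_first_via_head(const struct list_head *entry,
				   const struct list_head *head)
{
	return head->next == entry;
}
```

On a well-formed list both variants agree for every linked entry. They differ only in which object they read, which is what matters when the entry may already have been freed.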

Peter Zijlstra

Mar 16, 2017, 8:05:37 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Wed, Mar 15, 2017 at 05:43:02PM +0100, Oleg Nesterov wrote:
> On 03/14, Peter Zijlstra wrote:
> >
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -10346,6 +10346,17 @@ void perf_event_free_task(struct task_struct *task)
> >  			continue;
> >
> >  		mutex_lock(&ctx->mutex);
> > +		raw_spin_lock_irq(&ctx->lock);
> > +		/*
> > +		 * Destroy the task <-> ctx relation and mark the context dead.
> > +		 *
> > +		 * This is important because even though the task hasn't been
> > +		 * exposed yet the context has been (through child_list).
> > +		 */
> > +		RCU_INIT_POINTER(task->perf_event_ctxp[ctxn], NULL);
> > +		WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
> > +		put_task_struct(task); /* cannot be last */
> > +		raw_spin_unlock_irq(&ctx->lock);
>
> Agreed, this is what I had in mind. Although you know, I spent 3
> hours looking at your patch and I still can't convince myself I am
> really sure it closes all races ;)

Ha; yes I know that feeling. I used to have a few sheets of paper filled
with diagrams. Sadly I could not find them again. Must've been
overeager cleaning my desk at some point.

>
> OK, I believe this is correct. And iiuc both RCU_INIT_POINTER(NULL)
> and put_task_struct() are not strictly necessary? At least until we
> add WARN_ON(tsk->usage != 2) before free_task() in copy_process().

Right; I just kept the code similar to the other location. I even
considered making a helper function to not duplicate, but in the end
decided against it.

> ---------------------------------------------------------------------
> This is off-topic, but to me list_for_each_entry(event->child_list)
> in perf_event_release_kernel() looks very confusing and misleading.
> As for list_first_entry_or_null(): we do not really need NULL if the list
> is empty, tmp == child should be false even if we use list_first_entry().
> And given that we already have list_is_last(), it would be nice to
> add list_is_first() and cleanup perf_event_release_kernel() a bit:
>

Agreed; it's a bit of a weird one.

Let me go write proper patches for the things we have so far though.

Peter Zijlstra

Mar 16, 2017, 9:57:39 AM

to Oleg Nesterov, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On Wed, Mar 15, 2017 at 05:43:02PM +0100, Oleg Nesterov wrote:
> 	static inline int list_is_first(const struct list_head *list,
> 					const struct list_head *head)
> 	{
> 		return head->next == list;
> 	}
>
> won't be symmetrical with list_is_last() we already have.

This is the one that makes sense to me though; that is, the current
list_is_last() doesn't make sense to me.

I would expect:

	static inline int list_is_last(const struct list_head *list,
				       const struct list_head *head)
	{
		return head->prev == list;
	}

because @head is the list argument (yes, I know, horrible naming!).

Oleg Nesterov

Mar 16, 2017, 12:43:23 PM

to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, LKML, Mathieu Desnoyers, syzkaller

On 03/16, Peter Zijlstra wrote:
>
> On Wed, Mar 15, 2017 at 05:43:02PM +0100, Oleg Nesterov wrote:
> > 	static inline int list_is_first(const struct list_head *list,
> > 					const struct list_head *head)
> > 	{
> > 		return head->next == list;
> > 	}
> >
> > won't be symmetrical with list_is_last() we already have.
>
> This is the one that makes sense to me though; that is, the current
> list_is_last() doesn't make sense to me.
>
> I would expect:
>
> 	static inline int list_is_last(const struct list_head *list,
> 				       const struct list_head *head)
> 	{
> 		return head->prev == list;
> 	}

Yes!

> because @head is the list argument (yes, I know, horrible naming!).

and perhaps it could have more users if we redefine it to dereference
"head" which is likely more "stable", iow less likely to go away.

But after a quick grep I came to the conclusion that it is not possible
to audit the users it already has.

Oleg.
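
A small userspace sketch of why dereferencing "head" is the more robust choice for list_is_last(). The helpers below are local, hypothetical re-implementations; in particular this list_del_entry() deliberately does not poison the removed entry's pointers, which is an assumption of the model and not how the kernel's list_del() behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace copy of the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink entry; its own next/prev are left stale (no poisoning here). */
static void list_del_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Entry-based form: reads entry->next, so the entry must still be valid. */
static bool list_is_last_via_entry(const struct list_head *entry,
				   const struct list_head *head)
{
	return entry->next == head;
}

/* Head-based form from the thread: reads only head->prev. */
static bool list_is_last_via_head(const struct list_head *entry,
				  const struct list_head *head)
{
	return head->prev == entry;
}
```

With entries a and b added in that order, both forms report b as last. After list_del_entry(&b), the head-based form correctly says b is no longer last, while the entry-based form still reads b's stale next pointer and answers from that.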
