[PATCH 0/5] kasan: add workqueue and timer stack for generic KASAN

Walter Wu

Aug 10, 2020, 3:21:25 AM
to Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org, Walter Wu
Syzbot reports many UAF issues for workqueues or timers, see [1] and [2].
In some of these, the access/allocation happened in process_one_work(),
and the free stack in the KASAN report is useless: it doesn't help
programmers solve a UAF on a workqueue. The same may hold for timers.

This patchset improves KASAN reports by adding the workqueue queueing
stack and the timer queueing stack to them. This information is useful
for programmers solving use-after-free or double-free memory issues.

Generic KASAN will record the last two workqueue and timer stacks and
print them in the KASAN report. This is only suitable for generic KASAN.
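
As a rough illustration of "record the last two stacks": the sketch below
is plain userspace C, not the code from this series. The type and function
names are made up for the example, and the stack handle is assumed to be a
4-byte value (which matches two stacks taking 8 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a stack is identified by a 4-byte handle. */
typedef uint32_t stack_handle_t;

/* Hypothetical slot pair: slot[0] is the newest stack, slot[1] the one before. */
struct last_two_stacks {
	stack_handle_t slot[2];
};

/* Record a new queueing stack, keeping only the two most recent handles. */
static void record_stack(struct last_two_stacks *s, stack_handle_t handle)
{
	s->slot[1] = s->slot[0];	/* previous newest becomes second newest */
	s->slot[0] = handle;		/* store the newest handle */
}
```

Older handles simply fall off the end, so the per-object cost stays fixed
no matter how often the work or timer is queued.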

In order to print the last two workqueue and timer stacks, we add new
members to struct kasan_alloc_meta:
- two workqueue queueing stacks, total size is 8 bytes.
- two timer queueing stacks, total size is 8 bytes.

The original struct kasan_alloc_meta size is 16 bytes. After adding the
new members, the total size of struct kasan_alloc_meta is 32 bytes,
which is a well-aligned size and should give better memory consumption.
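
The size arithmetic can be checked with a small userspace sketch. This is
not the kernel definition: the existing 16 bytes are represented by a
placeholder array, the new member names are hypothetical, and each stack
handle is assumed to be 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a stack is identified by a 4-byte handle. */
typedef uint32_t stack_handle_t;

/* Placeholder for the 16 bytes the structure already occupies. */
struct alloc_meta_before {
	uint32_t existing[4];
};

/* Same placeholder plus the four new handles (member names are hypothetical). */
struct alloc_meta_after {
	uint32_t existing[4];
	stack_handle_t wq_stack[2];	/* last two workqueue queueing stacks, 8 bytes */
	stack_handle_t timer_stack[2];	/* last two timer queueing stacks, 8 bytes */
};

/* 16 + 8 + 8 = 32; all members are 4-byte aligned, so no padding is added. */
_Static_assert(sizeof(struct alloc_meta_before) == 16, "expected 16 bytes before");
_Static_assert(sizeof(struct alloc_meta_after) == 32, "expected 32 bytes after");
```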

[1]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
[2]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22%20expire_timers
[3]https://bugzilla.kernel.org/show_bug.cgi?id=198437

Walter Wu (5):
  timer: kasan: record and print timer stack
  workqueue: kasan: record and print workqueue stack
  lib/test_kasan.c: add timer test case
  lib/test_kasan.c: add workqueue test case
  kasan: update documentation for generic kasan

 Documentation/dev-tools/kasan.rst |  4 ++--
 include/linux/kasan.h             |  4 ++++
 kernel/time/timer.c               |  2 ++
 kernel/workqueue.c                |  3 +++
 lib/test_kasan.c                  | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/kasan/generic.c                | 42 ++++++++++++++++++++++++++++++++++++++++++
 mm/kasan/kasan.h                  |  6 +++++-
 mm/kasan/report.c                 | 22 ++++++++++++++++++++++
 8 files changed, 134 insertions(+), 3 deletions(-)

Qian Cai

Aug 10, 2020, 7:19:39 AM
to Walter Wu, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org


> On Aug 10, 2020, at 3:21 AM, Walter Wu <walter...@mediatek.com> wrote:
>
> Syzbot reports many UAF issues for workqueues or timers, see [1] and [2].
> In some of these, the access/allocation happened in process_one_work(),
> and the free stack in the KASAN report is useless: it doesn't help
> programmers solve a UAF on a workqueue. The same may hold for timers.
>
> This patchset improves KASAN reports by adding the workqueue queueing
> stack and the timer queueing stack to them. This information is useful
> for programmers solving use-after-free or double-free memory issues.
>
> Generic KASAN will record the last two workqueue and timer stacks and
> print them in the KASAN report. This is only suitable for generic KASAN.
>
> In order to print the last two workqueue and timer stacks, we add new
> members to struct kasan_alloc_meta:
> - two workqueue queueing stacks, total size is 8 bytes.
> - two timer queueing stacks, total size is 8 bytes.
>
> The original struct kasan_alloc_meta size is 16 bytes. After adding the
> new members, the total size of struct kasan_alloc_meta is 32 bytes,
> which is a well-aligned size and should give better memory consumption.

Making a debugging tool complicated is surely the best way to kill it. I would argue that it only makes sense to complicate it if the addition is useful most of the time, which I have never felt or heard to be the case. This reminds me of your recent call_rcu() stacks, which most of the time just make parsing the report cumbersome. Thus, I urge that this exercise of over-engineering for special cases stop entirely.

>
> [1]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
> [2]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22%20expire_timers
> [3]https://bugzilla.kernel.org/show_bug.cgi?id=198437
>
> Walter Wu (5):
> timer: kasan: record and print timer stack
> workqueue: kasan: record and print workqueue stack
> lib/test_kasan.c: add timer test case
> lib/test_kasan.c: add workqueue test case
> kasan: update documentation for generic kasan
>
> Documentation/dev-tools/kasan.rst | 4 ++--
> include/linux/kasan.h | 4 ++++
> kernel/time/timer.c | 2 ++
> kernel/workqueue.c | 3 +++
> lib/test_kasan.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> mm/kasan/generic.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> mm/kasan/kasan.h | 6 +++++-
> mm/kasan/report.c | 22 ++++++++++++++++++++++
> 8 files changed, 134 insertions(+), 3 deletions(-)

Walter Wu

Aug 10, 2020, 7:51:11 AM
to Qian Cai, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org
On Mon, 2020-08-10 at 07:19 -0400, Qian Cai wrote:
>
> > On Aug 10, 2020, at 3:21 AM, Walter Wu <walter...@mediatek.com> wrote:
> >
> > Syzbot reports many UAF issues for workqueues or timers, see [1] and [2].
> > In some of these, the access/allocation happened in process_one_work(),
> > and the free stack in the KASAN report is useless: it doesn't help
> > programmers solve a UAF on a workqueue. The same may hold for timers.
> >
> > This patchset improves KASAN reports by adding the workqueue queueing
> > stack and the timer queueing stack to them. This information is useful
> > for programmers solving use-after-free or double-free memory issues.
> >
> > Generic KASAN will record the last two workqueue and timer stacks and
> > print them in the KASAN report. This is only suitable for generic KASAN.
> >
> > In order to print the last two workqueue and timer stacks, we add new
> > members to struct kasan_alloc_meta:
> > - two workqueue queueing stacks, total size is 8 bytes.
> > - two timer queueing stacks, total size is 8 bytes.
> >
> > The original struct kasan_alloc_meta size is 16 bytes. After adding the
> > new members, the total size of struct kasan_alloc_meta is 32 bytes,
> > which is a well-aligned size and should give better memory consumption.
>
> Making a debugging tool complicated is surely the best way to kill it. I would argue that it only makes sense to complicate it if the addition is useful most of the time, which I have never felt or heard to be the case. This reminds me of your recent call_rcu() stacks, which most of the time just make parsing the report cumbersome. Thus, I urge that this exercise of over-engineering for special cases stop entirely.
>

A good debug tool should provide complete information in order to solve
an issue. We should focus on whether KASAN reports always show this debug
information, or whether to create an option that decides if it is shown.
This feature is Dimitry's suggestion, see [1], so I think it needs to be
implemented. Maybe we can wait for his response.

[1]https://lkml.org/lkml/2020/6/23/256

Thanks.

Walter Wu

Aug 10, 2020, 8:12:38 AM
to Qian Cai, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org
On Mon, 2020-08-10 at 19:50 +0800, Walter Wu wrote:
> On Mon, 2020-08-10 at 07:19 -0400, Qian Cai wrote:
> >
> > > On Aug 10, 2020, at 3:21 AM, Walter Wu <walter...@mediatek.com> wrote:
> > >
> > > Syzbot reports many UAF issues for workqueues or timers, see [1] and [2].
> > > In some of these, the access/allocation happened in process_one_work(),
> > > and the free stack in the KASAN report is useless: it doesn't help
> > > programmers solve a UAF on a workqueue. The same may hold for timers.
> > >
> > > This patchset improves KASAN reports by adding the workqueue queueing
> > > stack and the timer queueing stack to them. This information is useful
> > > for programmers solving use-after-free or double-free memory issues.
> > >
> > > Generic KASAN will record the last two workqueue and timer stacks and
> > > print them in the KASAN report. This is only suitable for generic KASAN.
> > >
> > > In order to print the last two workqueue and timer stacks, we add new
> > > members to struct kasan_alloc_meta:
> > > - two workqueue queueing stacks, total size is 8 bytes.
> > > - two timer queueing stacks, total size is 8 bytes.
> > >
> > > The original struct kasan_alloc_meta size is 16 bytes. After adding the
> > > new members, the total size of struct kasan_alloc_meta is 32 bytes,
> > > which is a well-aligned size and should give better memory consumption.
> >
> > Making a debugging tool complicated is surely the best way to kill it. I would argue that it only makes sense to complicate it if the addition is useful most of the time, which I have never felt or heard to be the case. This reminds me of your recent call_rcu() stacks, which most of the time just make parsing the report cumbersome. Thus, I urge that this exercise of over-engineering for special cases stop entirely.
> >
>
> A good debug tool should provide complete information in order to solve
> an issue. We should focus on whether KASAN reports always show this debug
> information, or whether to create an option that decides if it is shown.
> This feature is Dmitry's suggestion, see [1], so I think it needs to be
> implemented. Maybe we can wait for his response.
>
> [1]https://lkml.org/lkml/2020/6/23/256
>
> Thanks.
>

Fixed the name typo; my apologies to him.
I have also added a bugzilla link that shows why this is needed, please see [1].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=198437

Qian Cai

Aug 10, 2020, 8:44:39 AM
to Walter Wu, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org
I don't know if it is Dmitry's pipe dream that every KASAN report would enable
developers to fix the bug without reproducing it. There is always an ongoing
struggle between making the kernel easier to debug and keeping things less
cumbersome.

On the other hand, Dmitry's suggestion makes sense only if the price we are
going to pay is fair. Given the current diffstat, and my recent experience as a
heavy KASAN user of call_rcu() stacks "wasting" screen space, I can't really
get excited about pushing the limit again.

Walter Wu

Aug 10, 2020, 10:31:30 AM
to Qian Cai, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org
If you are concerned that the report is long, maybe we can create an
option that lets the user decide whether to print them (including call_rcu).
Would this satisfy everyone?
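
To make the idea concrete, here is a toy userspace sketch of such a switch.
Nothing like it exists in the posted series: the option name, the section
labels and the function are all hypothetical, and the real report code
looks different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical switch; no such option is part of the posted patches. */
struct report_opts {
	bool print_aux_stacks;
};

/* Print a mock report and return the number of stack sections emitted. */
static int print_report(const struct report_opts *opts)
{
	int sections = 0;

	/* The allocation and free stacks are always shown. */
	puts("[alloc stack]");
	sections++;
	puts("[free stack]");
	sections++;

	/* The auxiliary stacks are shown only when the user asks for them. */
	if (opts->print_aux_stacks) {
		puts("[call_rcu stack]");
		sections++;
		puts("[workqueue stack]");
		sections++;
		puts("[timer stack]");
		sections++;
	}

	return sections;
}
```

With the switch off, the report keeps its current length; with it on, the
three extra stacks are appended.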

Qian Cai

Aug 10, 2020, 10:51:10 AM
to Walter Wu, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Matthias Brugger, John Stultz, Stephen Boyd, Andrew Morton, Tejun Heo, Lai Jiangshan, kasa...@googlegroups.com, linu...@kvack.org, linux-...@vger.kernel.org, linux-ar...@lists.infradead.org, wsd_upstream, linux-m...@lists.infradead.org
Adding kernel config options is just another way to add complication with a
real cost. The only other way I can think of right now is to create some kind
of plugin system for KASAN that can run eBPF scripts (for example) to deal
with those special cases.