[PATCH v2 2/6] exit: Put an upper limit on how often we can oops

Kees Cook

unread,

Nov 9, 2022, 3:00:53 PM11/9/22

to Jann Horn, Kees Cook, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

From: Jann Horn <ja...@google.com>

Many Linux systems are configured to not panic on oops; but allowing an
attacker to oops the system **really** often can make even bugs that look
completely unexploitable exploitable (like NULL dereferences and such) if
each crash elevates a refcount by one or a lock is taken in read mode, and
this causes a counter to eventually overflow.

The most interesting counters for this are 32 bits wide (like open-coded
refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
platforms is just 16 bits, but probably nobody cares about 32-bit platforms
that much nowadays.)

So let's panic the system if the kernel is constantly oopsing.

The speed of oopsing 2^32 times probably depends on several factors, like
how long the stack trace is and which unwinder you're using; an empirically
important one is whether your console is showing a graphical environment or
a text console that oopses will be printed to.
In a quick single-threaded benchmark, it looks like oopsing in a vfork()
child with a very short stack trace only takes ~510 microseconds per run
when a graphical console is active; but switching to a text console that
oopses are printed to slows it down around 87x, to ~45 milliseconds per
run.
(Adding more threads makes this faster, but the actual oops printing
happens under &die_lock on x86, so you can maybe speed this up by a factor
of around 2 and then any further improvement gets eaten up by lock
contention.)

It looks like it would take around 8-12 days to overflow a 32-bit counter
with repeated oopsing on a multi-core X86 system running a graphical
environment; both me (in an X86 VM) and Seth (with a distro kernel on
normal hardware in a standard configuration) got numbers in that ballpark.

12 days aren't *that* short on a desktop system, and you'd likely need much
longer on a typical server system (assuming that people don't run graphical
desktop environments on their servers), and this is a *very* noisy and
violent approach to exploiting the kernel; and it also seems to take orders
of magnitude longer on some machines, probably because stuff like EFI
pstore will slow it down a ton if that's active.

[Moved sysctl into kernel/exit.c -kees]

Signed-off-by: Jann Horn <ja...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
Link: https://lore.kernel.org/r/20221107201317...@google.com
---
Documentation/admin-guide/sysctl/kernel.rst | 8 ++++
kernel/exit.c | 42 +++++++++++++++++++++
2 files changed, 50 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 98d1b198b2b4..09f3fb2f8585 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -667,6 +667,14 @@ This is the default behavior.
an oops event is detected.

+oops_limit
+==========
+
+Number of kernel oopses after which the kernel should panic when
+``panic_on_oops`` is not set. Setting this to 0 or 1 has the same effect
+as setting ``panic_on_oops=1``.
+
+
osrelease, ostype & version
===========================

diff --git a/kernel/exit.c b/kernel/exit.c
index 35e0a31a0315..892f38aeb0a4 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -72,6 +72,33 @@
#include <asm/unistd.h>
#include <asm/mmu_context.h>

+/*
+ * The default value should be high enough to not crash a system that randomly
+ * crashes its kernel from time to time, but low enough to at least not permit
+ * overflowing 32-bit refcounts or the ldsem writer count.
+ */
+static unsigned int oops_limit = 10000;
+
+#if CONFIG_SYSCTL
+static struct ctl_table kern_exit_table[] = {
+ {
+ .procname = "oops_limit",
+ .data = &oops_limit,
+ .maxlen = sizeof(oops_limit),
+ .mode = 0644,
+ .proc_handler = proc_douintvec,
+ },
+ { }
+};
+
+static __init int kernel_exit_sysctls_init(void)
+{
+ register_sysctl_init("kernel", kern_exit_table);
+ return 0;
+}
+late_initcall(kernel_exit_sysctls_init);
+#endif
+
static void __unhash_process(struct task_struct *p, bool group_dead)
{
nr_threads--;
@@ -874,6 +901,8 @@ void __noreturn do_exit(long code)

void __noreturn make_task_dead(int signr)
{
+ static atomic_t oops_count = ATOMIC_INIT(0);
+
/*
* Take the task off the cpu after something catastrophic has
* happened.
@@ -897,6 +926,19 @@ void __noreturn make_task_dead(int signr)
preempt_count_set(PREEMPT_ENABLED);
}

+ /*
+ * Every time the system oopses, if the oops happens while a reference
+ * to an object was held, the reference leaks.
+ * If the oops doesn't also leak memory, repeated oopsing can cause
+ * reference counters to wrap around (if they're not using refcount_t).
+ * This means that repeated oopsing can make unexploitable-looking bugs
+ * exploitable through repeated oopsing.
+ * To make sure this can't happen, place an upper bound on how often the
+ * kernel may oops without panic().
+ */
+ if (atomic_inc_return(&oops_count) >= READ_ONCE(oops_limit))
+ panic("Oopsed too often (oops_limit is %d)", oops_limit);
+
/*
* We're taking recursive faults here in make_task_dead. Safest is to just
* leave this task alone and wait for reboot.
--
2.34.1

Kees Cook

unread,

Nov 9, 2022, 3:00:55 PM11/9/22

to Jann Horn, Kees Cook, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Hi,

This builds on Jann's v1 patch[1]. Changes in v2:
- move sysctl into kernel/exit.c (where it belongs)
- expand Documentation slightly

New stuff in v2:
- expose oops_count to sysfs
- consolidate panic_on_warn usage
- introduce warn_limit
- expose warn_count to sysfs

[1] https://lore.kernel.org/lkml/20221107201317...@google.com

Jann Horn (1):
exit: Put an upper limit on how often we can oops

Kees Cook (5):
panic: Separate sysctl logic from CONFIG_SMP
exit: Expose "oops_count" to sysfs
panic: Consolidate open-coded panic_on_warn checks
panic: Introduce warn_limit
panic: Expose "warn_count" to sysfs

.../ABI/testing/sysfs-kernel-oops_count | 6 ++
.../ABI/testing/sysfs-kernel-warn_count | 6 ++
Documentation/admin-guide/sysctl/kernel.rst | 17 ++++++
MAINTAINERS | 2 +
include/linux/panic.h | 1 +
kernel/exit.c | 60 +++++++++++++++++++
kernel/kcsan/report.c | 3 +-
kernel/panic.c | 44 +++++++++++++-
kernel/sched/core.c | 3 +-
lib/ubsan.c | 3 +-
mm/kasan/report.c | 4 +-
mm/kfence/report.c | 3 +-
12 files changed, 139 insertions(+), 13 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-oops_count
create mode 100644 Documentation/ABI/testing/sysfs-kernel-warn_count

--
2.34.1

Kees Cook

unread,

Nov 9, 2022, 3:00:56 PM11/9/22

to Jann Horn, Kees Cook, Jonathan Corbet, Andrew Morton, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Petr Mladek, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, linu...@vger.kernel.org, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linux-h...@vger.kernel.org

Like oops_limit, add warn_limit for limiting the number of warnings when
panic_on_warn is not set.

Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Baolin Wang <baoli...@linux.alibaba.com>
Cc: "Jason A. Donenfeld" <Ja...@zx2c4.com>
Cc: Eric Biggers <ebig...@google.com>
Cc: Huang Ying <ying....@intel.com>
Cc: Petr Mladek <pml...@suse.com>
Cc: tangmeng <tang...@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpic...@igalia.com>
Cc: Tiezhu Yang <yangt...@loongson.cn>
Cc: Sebastian Andrzej Siewior <big...@linutronix.de>
Cc: linu...@vger.kernel.org
Signed-off-by: Kees Cook <kees...@chromium.org>
---
Documentation/admin-guide/sysctl/kernel.rst | 9 +++++++++
kernel/panic.c | 13 +++++++++++++
2 files changed, 22 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 09f3fb2f8585..c385d5319cdf 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1508,6 +1508,15 @@ entry will default to 2 instead of 0.
2 Unprivileged calls to ``bpf()`` are disabled
= =============================================================

+
+warn_limit
+==========
+
+Number of kernel warnings after which the kernel should panic when
+``panic_on_warn`` is not set. Setting this to 0 or 1 has the same effect
+as setting ``panic_on_warn=1``.
+
+
watchdog
========

diff --git a/kernel/panic.c b/kernel/panic.c
index 3afd234767bc..b235fa4a6fc8 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -58,6 +58,7 @@ bool crash_kexec_post_notifiers;
int panic_on_warn __read_mostly;
unsigned long panic_on_taint;
bool panic_on_taint_nousertaint = false;
+static unsigned int warn_limit __read_mostly = 10000;

int panic_timeout = CONFIG_PANIC_TIMEOUT;
EXPORT_SYMBOL_GPL(panic_timeout);
@@ -88,6 +89,13 @@ static struct ctl_table kern_panic_table[] = {
.extra2 = SYSCTL_ONE,
},
#endif
+ {
+ .procname = "warn_limit",
+ .data = &warn_limit,
+ .maxlen = sizeof(warn_limit),

+ .mode = 0644,
+ .proc_handler = proc_douintvec,
+ },

{ }
};

@@ -203,8 +211,13 @@ static void panic_print_sys_info(bool console_flush)

void check_panic_on_warn(const char *reason)
{
+ static atomic_t warn_count = ATOMIC_INIT(0);
+
if (panic_on_warn)
panic("%s: panic_on_warn set ...\n", reason);
+
+ if (atomic_inc_return(&warn_count) >= READ_ONCE(warn_limit))
+ panic("Warned too often (warn_limit is %d)", warn_limit);
}

/**
--
2.34.1

Kees Cook

unread,

Nov 9, 2022, 3:00:56 PM11/9/22

to Jann Horn, Kees Cook, Eric W. Biederman, Arnd Bergmann, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Since Oops count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/oops_count to expose it to userspace.

Cc: "Eric W. Biederman" <ebie...@xmission.com>
Cc: Jann Horn <ja...@google.com>
Cc: Arnd Bergmann <ar...@arndb.de>

Signed-off-by: Kees Cook <kees...@chromium.org>
---

.../ABI/testing/sysfs-kernel-oops_count | 6 +++++
MAINTAINERS | 1 +
kernel/exit.c | 22 +++++++++++++++++--
3 files changed, 27 insertions(+), 2 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-oops_count

diff --git a/Documentation/ABI/testing/sysfs-kernel-oops_count b/Documentation/ABI/testing/sysfs-kernel-oops_count
new file mode 100644
index 000000000000..156cca9dbc96
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-oops_count
@@ -0,0 +1,6 @@
+What: /sys/kernel/oops_count
+Date: November 2022
+KernelVersion: 6.2.0
+Contact: Linux Kernel Hardening List <linux-h...@vger.kernel.org>
+Description:
+ Shows how many times the system has Oopsed since last boot.
diff --git a/MAINTAINERS b/MAINTAINERS
index 1cd80c113721..0a1e95a58e54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11106,6 +11106,7 @@ M: Kees Cook <kees...@chromium.org>
L: linux-h...@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening
+F: Documentation/ABI/testing/sysfs-kernel-oops_count
F: include/linux/overflow.h
F: include/linux/randomize_kstack.h
F: mm/usercopy.c
diff --git a/kernel/exit.c b/kernel/exit.c
index 892f38aeb0a4..4bffef9f3f46 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -67,6 +67,7 @@
#include <linux/io_uring.h>
#include <linux/kprobes.h>
#include <linux/rethook.h>
+#include <linux/sysfs.h>

#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -99,6 +100,25 @@ static __init int kernel_exit_sysctls_init(void)
late_initcall(kernel_exit_sysctls_init);
#endif

+static atomic_t oops_count = ATOMIC_INIT(0);
+
+#ifdef CONFIG_SYSFS
+static ssize_t oops_count_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *page)
+{
+ return sysfs_emit(page, "%d\n", atomic_read(&oops_count));
+}
+
+static struct kobj_attribute oops_count_attr = __ATTR_RO(oops_count);
+
+static __init int kernel_exit_sysfs_init(void)
+{
+ sysfs_add_file_to_group(kernel_kobj, &oops_count_attr.attr, NULL);
+ return 0;
+}
+late_initcall(kernel_exit_sysfs_init);

+#endif
+
static void __unhash_process(struct task_struct *p, bool group_dead)
{
nr_threads--;

@@ -901,8 +921,6 @@ void __noreturn do_exit(long code)

void __noreturn make_task_dead(int signr)
{
- static atomic_t oops_count = ATOMIC_INIT(0);
-

/*
* Take the task off the cpu after something catastrophic has
* happened.

--
2.34.1

Kees Cook

unread,

Nov 9, 2022, 3:00:56 PM11/9/22

to Jann Horn, Kees Cook, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Sebastian Andrzej Siewior, Tiezhu Yang, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Since Warn count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/warn_count to expose it to userspace.

Cc: Petr Mladek <pml...@suse.com>
Cc: Andrew Morton <ak...@linux-foundation.org>

Cc: tangmeng <tang...@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpic...@igalia.com>

Cc: Sebastian Andrzej Siewior <big...@linutronix.de>

Cc: Tiezhu Yang <yangt...@loongson.cn>

Signed-off-by: Kees Cook <kees...@chromium.org>
---

.../ABI/testing/sysfs-kernel-warn_count | 6 +++++
MAINTAINERS | 1 +
kernel/panic.c | 22 +++++++++++++++++--

3 files changed, 27 insertions(+), 2 deletions(-)

create mode 100644 Documentation/ABI/testing/sysfs-kernel-warn_count

diff --git a/Documentation/ABI/testing/sysfs-kernel-warn_count b/Documentation/ABI/testing/sysfs-kernel-warn_count
new file mode 100644
index 000000000000..08f083d2fd51
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-warn_count

@@ -0,0 +1,6 @@
+What: /sys/kernel/oops_count
+Date: November 2022
+KernelVersion: 6.2.0
+Contact: Linux Kernel Hardening List <linux-h...@vger.kernel.org>
+Description:

+ Shows how many times the system has Warned since last boot.
diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1e95a58e54..282cd8a513fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11107,6 +11107,7 @@ L: linux-h...@vger.kernel.org

S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening

F: Documentation/ABI/testing/sysfs-kernel-oops_count
+F: Documentation/ABI/testing/sysfs-kernel-warn_count

F: include/linux/overflow.h
F: include/linux/randomize_kstack.h
F: mm/usercopy.c

diff --git a/kernel/panic.c b/kernel/panic.c
index b235fa4a6fc8..ddf0f8956d6e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -32,6 +32,7 @@
#include <linux/bug.h>
#include <linux/ratelimit.h>
#include <linux/debugfs.h>
+#include <linux/sysfs.h>
#include <trace/events/error_report.h>
#include <asm/sections.h>

@@ -107,6 +108,25 @@ static __init int kernel_panic_sysctls_init(void)
late_initcall(kernel_panic_sysctls_init);
#endif

+static atomic_t warn_count = ATOMIC_INIT(0);
+
+#ifdef CONFIG_SYSFS
+static ssize_t warn_count_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *page)
+{
+ return sysfs_emit(page, "%d\n", atomic_read(&warn_count));
+}
+
+static struct kobj_attribute warn_count_attr = __ATTR_RO(warn_count);
+
+static __init int kernel_panic_sysfs_init(void)
+{
+ sysfs_add_file_to_group(kernel_kobj, &warn_count_attr.attr, NULL);
+ return 0;
+}
+late_initcall(kernel_panic_sysfs_init);
+#endif
+
static long no_blink(int state)
{
return 0;
@@ -211,8 +231,6 @@ static void panic_print_sys_info(bool console_flush)

void check_panic_on_warn(const char *reason)
{
- static atomic_t warn_count = ATOMIC_INIT(0);
-

if (panic_on_warn)
panic("%s: panic_on_warn set ...\n", reason);

--
2.34.1

Luis Chamberlain

unread,

Nov 9, 2022, 4:17:43 PM11/9/22

to Kees Cook, Jann Horn, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Marco Elver, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

On Wed, Nov 09, 2022 at 12:00:43PM -0800, Kees Cook wrote:
> Hi,
>
> This builds on Jann's v1 patch[1]. Changes in v2:
> - move sysctl into kernel/exit.c (where it belongs)
> - expand Documentation slightly
>
> New stuff in v2:
> - expose oops_count to sysfs
> - consolidate panic_on_warn usage
> - introduce warn_limit
> - expose warn_count to sysfs
>
> [1] https://lore.kernel.org/lkml/20221107201317...@google.com
>
> Jann Horn (1):
> exit: Put an upper limit on how often we can oops
>
> Kees Cook (5):
> panic: Separate sysctl logic from CONFIG_SMP
> exit: Expose "oops_count" to sysfs
> panic: Consolidate open-coded panic_on_warn checks
> panic: Introduce warn_limit
> panic: Expose "warn_count" to sysfs

For all:

Reviewed-by: Luis Chamberlain <mcg...@kernel.org>

Luis

Marco Elver

unread,

Nov 14, 2022, 4:49:15 AM11/14/22

to Kees Cook, Jann Horn, Jonathan Corbet, Andrew Morton, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Petr Mladek, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, linu...@vger.kernel.org, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linux-h...@vger.kernel.org

Shouldn't this also include the "reason", like above? (Presumably a
warning had just been generated to console so the reason is easy
enough to infer from the log, although in that case "reason" also
seems redundant above.)

Kees Cook

unread,

Nov 17, 2022, 6:27:58 PM11/17/22

to Marco Elver, Jann Horn, Jonathan Corbet, Andrew Morton, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Petr Mladek, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, linu...@vger.kernel.org, Greg KH, Linus Torvalds, Seth Jenkins, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linux-h...@vger.kernel.org

Yeah, that makes sense. I had been thinking that since it was an action
due to repeated prior actions, the current "reason" didn't matter here.
But thinking about it more, I see what you mean. :)

--
Kees Cook

Kees Cook

unread,

Nov 17, 2022, 6:43:31 PM11/17/22

to Jann Horn, Kees Cook, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Luis Chamberlain, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Hi,

This builds on Jann's v1 patch[1]. Changes in v3:
- fix #if/#ifdef confusion (Bill)
- rename from "reason" or "origin" and add it to the warn output (Marco)

v2: https://lore.kernel.org/lkml/20221109194404...@kernel.org/

Thanks,

-Kees

[1] https://lore.kernel.org/lkml/20221107201317...@google.com

Jann Horn (1):
exit: Put an upper limit on how often we can oops

Kees Cook (5):
panic: Separate sysctl logic from CONFIG_SMP
exit: Expose "oops_count" to sysfs
panic: Consolidate open-coded panic_on_warn checks
panic: Introduce warn_limit
panic: Expose "warn_count" to sysfs

kernel/panic.c | 45 +++++++++++++-

kernel/sched/core.c | 3 +-
lib/ubsan.c | 3 +-
mm/kasan/report.c | 4 +-
mm/kfence/report.c | 3 +-

12 files changed, 140 insertions(+), 13 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-oops_count

Kees Cook

unread,

Nov 17, 2022, 6:43:31 PM11/17/22

to Jann Horn, Kees Cook, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Signed-off-by: Jann Horn <ja...@google.com>
Link: https://lore.kernel.org/r/20221107201317...@google.com
Reviewed-by: Luis Chamberlain <mcg...@kernel.org>

Signed-off-by: Kees Cook <kees...@chromium.org>
---

Documentation/admin-guide/sysctl/kernel.rst | 8 ++++
kernel/exit.c | 42 +++++++++++++++++++++

2 files changed, 50 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 98d1b198b2b4..09f3fb2f8585 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst

@@ -667,6 +667,14 @@ This is the default behavior.
an oops event is detected.

+oops_limit

+==========
+

+Number of kernel oopses after which the kernel should panic when
+``panic_on_oops`` is not set. Setting this to 0 or 1 has the same effect
+as setting ``panic_on_oops=1``.
+
+
osrelease, ostype & version
===========================

diff --git a/kernel/exit.c b/kernel/exit.c
index 35e0a31a0315..799c5edd6be6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c

@@ -72,6 +72,33 @@
#include <asm/unistd.h>
#include <asm/mmu_context.h>

+/*
+ * The default value should be high enough to not crash a system that randomly
+ * crashes its kernel from time to time, but low enough to at least not permit
+ * overflowing 32-bit refcounts or the ldsem writer count.
+ */
+static unsigned int oops_limit = 10000;
+
+#if CONFIG_SYSCTL
+static struct ctl_table kern_exit_table[] = {
+ {
+ .procname = "oops_limit",
+ .data = &oops_limit,
+ .maxlen = sizeof(oops_limit),

+ .mode = 0644,
+ .proc_handler = proc_douintvec,
+ },

+ { }
+};
+
+static __init int kernel_exit_sysctls_init(void)
+{
+ register_sysctl_init("kernel", kern_exit_table);
+ return 0;
+}
+late_initcall(kernel_exit_sysctls_init);

+#endif
+
static void __unhash_process(struct task_struct *p, bool group_dead)
{
nr_threads--;

@@ -874,6 +901,8 @@ void __noreturn do_exit(long code)

void __noreturn make_task_dead(int signr)
{
+ static atomic_t oops_count = ATOMIC_INIT(0);
+

/*
* Take the task off the cpu after something catastrophic has
* happened.

@@ -897,6 +926,19 @@ void __noreturn make_task_dead(int signr)
preempt_count_set(PREEMPT_ENABLED);
}

+ /*
+ * Every time the system oopses, if the oops happens while a reference
+ * to an object was held, the reference leaks.
+ * If the oops doesn't also leak memory, repeated oopsing can cause
+ * reference counters to wrap around (if they're not using refcount_t).
+ * This means that repeated oopsing can make unexploitable-looking bugs
+ * exploitable through repeated oopsing.
+ * To make sure this can't happen, place an upper bound on how often the
+ * kernel may oops without panic().
+ */
+ if (atomic_inc_return(&oops_count) >= READ_ONCE(oops_limit))

+ panic("Oopsed too often (kernel.oops_limit is %d)", oops_limit);

Kees Cook

unread,

Nov 17, 2022, 6:43:32 PM11/17/22

to Jann Horn, Kees Cook, Eric W. Biederman, Arnd Bergmann, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Since Oops count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/oops_count to expose it to userspace.

Cc: "Eric W. Biederman" <ebie...@xmission.com>
Cc: Jann Horn <ja...@google.com>
Cc: Arnd Bergmann <ar...@arndb.de>

Reviewed-by: Luis Chamberlain <mcg...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---

.../ABI/testing/sysfs-kernel-oops_count | 6 +++++
MAINTAINERS | 1 +
kernel/exit.c | 22 +++++++++++++++++--
3 files changed, 27 insertions(+), 2 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-oops_count

diff --git a/Documentation/ABI/testing/sysfs-kernel-oops_count b/Documentation/ABI/testing/sysfs-kernel-oops_count
new file mode 100644

index 000000000000..156cca9dbc96
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-oops_count

@@ -0,0 +1,6 @@
+What: /sys/kernel/oops_count
+Date: November 2022
+KernelVersion: 6.2.0
+Contact: Linux Kernel Hardening List <linux-h...@vger.kernel.org>
+Description:

+ Shows how many times the system has Oopsed since last boot.
diff --git a/MAINTAINERS b/MAINTAINERS
index 1cd80c113721..0a1e95a58e54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS

@@ -11106,6 +11106,7 @@ M: Kees Cook <kees...@chromium.org>

L: linux-h...@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening

+F: Documentation/ABI/testing/sysfs-kernel-oops_count

F: include/linux/overflow.h
F: include/linux/randomize_kstack.h
F: mm/usercopy.c

diff --git a/kernel/exit.c b/kernel/exit.c
index 799c5edd6be6..bc62bfe75bc7 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c

@@ -67,6 +67,7 @@
#include <linux/io_uring.h>
#include <linux/kprobes.h>
#include <linux/rethook.h>
+#include <linux/sysfs.h>

#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -99,6 +100,25 @@ static __init int kernel_exit_sysctls_init(void)
late_initcall(kernel_exit_sysctls_init);
#endif

+static atomic_t oops_count = ATOMIC_INIT(0);
+
+#ifdef CONFIG_SYSFS
+static ssize_t oops_count_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *page)
+{
+ return sysfs_emit(page, "%d\n", atomic_read(&oops_count));
+}
+

+static struct kobj_attribute oops_count_attr = __ATTR_RO(oops_count);
+
+static __init int kernel_exit_sysfs_init(void)
+{

+ sysfs_add_file_to_group(kernel_kobj, &oops_count_attr.attr, NULL);
+ return 0;
+}
+late_initcall(kernel_exit_sysfs_init);

+#endif
+
static void __unhash_process(struct task_struct *p, bool group_dead)
{
nr_threads--;

@@ -901,8 +921,6 @@ void __noreturn do_exit(long code)

void __noreturn make_task_dead(int signr)
{
- static atomic_t oops_count = ATOMIC_INIT(0);
-

/*
* Take the task off the cpu after something catastrophic has
* happened.

--
2.34.1

Kees Cook

unread,

Nov 17, 2022, 6:43:32 PM11/17/22

to Jann Horn, Kees Cook, Jonathan Corbet, Andrew Morton, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Petr Mladek, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, linu...@vger.kernel.org, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linux-h...@vger.kernel.org

Like oops_limit, add warn_limit for limiting the number of warnings when
panic_on_warn is not set.

Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Baolin Wang <baoli...@linux.alibaba.com>
Cc: "Jason A. Donenfeld" <Ja...@zx2c4.com>
Cc: Eric Biggers <ebig...@google.com>
Cc: Huang Ying <ying....@intel.com>
Cc: Petr Mladek <pml...@suse.com>
Cc: tangmeng <tang...@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpic...@igalia.com>
Cc: Tiezhu Yang <yangt...@loongson.cn>
Cc: Sebastian Andrzej Siewior <big...@linutronix.de>
Cc: linu...@vger.kernel.org

Reviewed-by: Luis Chamberlain <mcg...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---

Documentation/admin-guide/sysctl/kernel.rst | 9 +++++++++
kernel/panic.c | 14 ++++++++++++++
2 files changed, 23 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 09f3fb2f8585..c385d5319cdf 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst

@@ -1508,6 +1508,15 @@ entry will default to 2 instead of 0.
2 Unprivileged calls to ``bpf()`` are disabled
= =============================================================

+
+warn_limit

+==========
+

+Number of kernel warnings after which the kernel should panic when
+``panic_on_warn`` is not set. Setting this to 0 or 1 has the same effect
+as setting ``panic_on_warn=1``.
+
+
watchdog
========

diff --git a/kernel/panic.c b/kernel/panic.c

index cfa354322d5f..e5aab27496d7 100644

--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -58,6 +58,7 @@ bool crash_kexec_post_notifiers;
int panic_on_warn __read_mostly;
unsigned long panic_on_taint;
bool panic_on_taint_nousertaint = false;
+static unsigned int warn_limit __read_mostly = 10000;

int panic_timeout = CONFIG_PANIC_TIMEOUT;
EXPORT_SYMBOL_GPL(panic_timeout);
@@ -88,6 +89,13 @@ static struct ctl_table kern_panic_table[] = {
.extra2 = SYSCTL_ONE,
},
#endif
+ {
+ .procname = "warn_limit",
+ .data = &warn_limit,
+ .maxlen = sizeof(warn_limit),

+ .mode = 0644,
+ .proc_handler = proc_douintvec,
+ },

{ }
};

@@ -203,8 +211,14 @@ static void panic_print_sys_info(bool console_flush)

void check_panic_on_warn(const char *origin)

{
+ static atomic_t warn_count = ATOMIC_INIT(0);
+
if (panic_on_warn)

panic("%s: panic_on_warn set ...\n", origin);

+
+ if (atomic_inc_return(&warn_count) >= READ_ONCE(warn_limit))

+ panic("%s: system warned too often (kernel.warn_limit is %d)",
+ warn_limit);
}

/**
--
2.34.1

Kees Cook

unread,

Nov 17, 2022, 6:43:32 PM11/17/22

to Jann Horn, Kees Cook, Marco Elver, Dmitry Vyukov, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Andrew Morton, David Gow, tangmeng, Shuah Khan, Petr Mladek, Paul E. McKenney, Sebastian Andrzej Siewior, Guilherme G. Piccoli, Tiezhu Yang, kasa...@googlegroups.com, linu...@kvack.org, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Several run-time checkers (KASAN, UBSAN, KFENCE, KCSAN, sched) roll
their own warnings, and each check "panic_on_warn". Consolidate this
into a single function so that future instrumentation can be added in
a single location.

Cc: Marco Elver <el...@google.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Juri Lelli <juri....@redhat.com>
Cc: Vincent Guittot <vincent...@linaro.org>
Cc: Dietmar Eggemann <dietmar....@arm.com>
Cc: Steven Rostedt <ros...@goodmis.org>
Cc: Ben Segall <bse...@google.com>
Cc: Mel Gorman <mgo...@suse.de>
Cc: Daniel Bristot de Oliveira <bri...@redhat.com>
Cc: Valentin Schneider <vsch...@redhat.com>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
Cc: Vincenzo Frascino <vincenzo...@arm.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: David Gow <davi...@google.com>
Cc: tangmeng <tang...@uniontech.com>
Cc: Jann Horn <ja...@google.com>
Cc: Shuah Khan <sk...@linuxfoundation.org>
Cc: Petr Mladek <pml...@suse.com>
Cc: "Paul E. McKenney" <pau...@kernel.org>

Cc: Sebastian Andrzej Siewior <big...@linutronix.de>
Cc: "Guilherme G. Piccoli" <gpic...@igalia.com>
Cc: Tiezhu Yang <yangt...@loongson.cn>

Cc: kasa...@googlegroups.com
Cc: linu...@kvack.org

Reviewed-by: Luis Chamberlain <mcg...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---

include/linux/panic.h | 1 +
kernel/kcsan/report.c | 3 +--
kernel/panic.c | 9 +++++++--
kernel/sched/core.c | 3 +--
lib/ubsan.c | 3 +--
mm/kasan/report.c | 4 ++--
mm/kfence/report.c | 3 +--
7 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/linux/panic.h b/include/linux/panic.h
index c7759b3f2045..979b776e3bcb 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -11,6 +11,7 @@ extern long (*panic_blink)(int state);
__printf(1, 2)
void panic(const char *fmt, ...) __noreturn __cold;
void nmi_panic(struct pt_regs *regs, const char *msg);
+void check_panic_on_warn(const char *origin);
extern void oops_enter(void);
extern void oops_exit(void);
extern bool oops_may_print(void);
diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c
index 67794404042a..e95ce7d7a76e 100644
--- a/kernel/kcsan/report.c
+++ b/kernel/kcsan/report.c
@@ -492,8 +492,7 @@ static void print_report(enum kcsan_value_change value_change,
dump_stack_print_info(KERN_DEFAULT);
pr_err("==================================================================\n");

- if (panic_on_warn)
- panic("panic_on_warn set ...\n");
+ check_panic_on_warn("KCSAN");
}

static void release_report(unsigned long *flags, struct other_info *other_info)
diff --git a/kernel/panic.c b/kernel/panic.c
index d843d036651e..cfa354322d5f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -201,6 +201,12 @@ static void panic_print_sys_info(bool console_flush)
ftrace_dump(DUMP_ALL);
}

+void check_panic_on_warn(const char *origin)
+{
+ if (panic_on_warn)
+ panic("%s: panic_on_warn set ...\n", origin);
+}
+
/**
* panic - halt the system
* @fmt: The text string to print
@@ -619,8 +625,7 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
if (regs)
show_regs(regs);

- if (panic_on_warn)
- panic("panic_on_warn set ...\n");
+ check_panic_on_warn("kernel");

if (!regs)
dump_stack();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5800b0623ff3..285ef8821b4f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5729,8 +5729,7 @@ static noinline void __schedule_bug(struct task_struct *prev)
pr_err("Preemption disabled at:");
print_ip_sym(KERN_ERR, preempt_disable_ip);
}
- if (panic_on_warn)
- panic("scheduling while atomic\n");
+ check_panic_on_warn("scheduling while atomic");

dump_stack();
add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
diff --git a/lib/ubsan.c b/lib/ubsan.c
index 36bd75e33426..60c7099857a0 100644
--- a/lib/ubsan.c
+++ b/lib/ubsan.c
@@ -154,8 +154,7 @@ static void ubsan_epilogue(void)

current->in_ubsan--;

- if (panic_on_warn)
- panic("panic_on_warn set ...\n");
+ check_panic_on_warn("UBSAN");
}

void __ubsan_handle_divrem_overflow(void *_data, void *lhs, void *rhs)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index df3602062bfd..cc98dfdd3ed2 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -164,8 +164,8 @@ static void end_report(unsigned long *flags, void *addr)
(unsigned long)addr);
pr_err("==================================================================\n");
spin_unlock_irqrestore(&report_lock, *flags);
- if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
- panic("panic_on_warn set ...\n");
+ if (!test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
+ check_panic_on_warn("KASAN");
if (kasan_arg_fault == KASAN_ARG_FAULT_PANIC)
panic("kasan.fault=panic set ...\n");
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
diff --git a/mm/kfence/report.c b/mm/kfence/report.c
index 7e496856c2eb..110c27ca597d 100644
--- a/mm/kfence/report.c
+++ b/mm/kfence/report.c
@@ -268,8 +268,7 @@ void kfence_report_error(unsigned long address, bool is_write, struct pt_regs *r

lockdep_on();

- if (panic_on_warn)
- panic("panic_on_warn set ...\n");
+ check_panic_on_warn("KFENCE");

/* We encountered a memory safety error, taint the kernel! */
add_taint(TAINT_BAD_PAGE, LOCKDEP_STILL_OK);
--
2.34.1

Kees Cook

unread,

Nov 17, 2022, 6:43:33 PM11/17/22

to Jann Horn, Kees Cook, Petr Mladek, Andrew Morton, tangmeng, Guilherme G. Piccoli, Sebastian Andrzej Siewior, Tiezhu Yang, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Since Warn count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/warn_count to expose it to userspace.

Cc: Petr Mladek <pml...@suse.com>

Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: tangmeng <tang...@uniontech.com>

Cc: "Guilherme G. Piccoli" <gpic...@igalia.com>
Cc: Sebastian Andrzej Siewior <big...@linutronix.de>

Cc: Tiezhu Yang <yangt...@loongson.cn>

Reviewed-by: Luis Chamberlain <mcg...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---

.../ABI/testing/sysfs-kernel-warn_count | 6 +++++
MAINTAINERS | 1 +
kernel/panic.c | 22 +++++++++++++++++--

3 files changed, 27 insertions(+), 2 deletions(-)

create mode 100644 Documentation/ABI/testing/sysfs-kernel-warn_count

diff --git a/Documentation/ABI/testing/sysfs-kernel-warn_count b/Documentation/ABI/testing/sysfs-kernel-warn_count

new file mode 100644

index 000000000000..08f083d2fd51
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-warn_count

@@ -0,0 +1,6 @@
+What: /sys/kernel/oops_count
+Date: November 2022
+KernelVersion: 6.2.0
+Contact: Linux Kernel Hardening List <linux-h...@vger.kernel.org>
+Description:

+ Shows how many times the system has Warned since last boot.
diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1e95a58e54..282cd8a513fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11107,6 +11107,7 @@ L: linux-h...@vger.kernel.org

S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening

F: Documentation/ABI/testing/sysfs-kernel-oops_count
+F: Documentation/ABI/testing/sysfs-kernel-warn_count

F: include/linux/overflow.h
F: include/linux/randomize_kstack.h
F: mm/usercopy.c

diff --git a/kernel/panic.c b/kernel/panic.c
index e5aab27496d7..d718531d8bf4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c

@@ -32,6 +32,7 @@
#include <linux/bug.h>
#include <linux/ratelimit.h>
#include <linux/debugfs.h>
+#include <linux/sysfs.h>
#include <trace/events/error_report.h>
#include <asm/sections.h>

@@ -107,6 +108,25 @@ static __init int kernel_panic_sysctls_init(void)
late_initcall(kernel_panic_sysctls_init);
#endif

+static atomic_t warn_count = ATOMIC_INIT(0);
+
+#ifdef CONFIG_SYSFS
+static ssize_t warn_count_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *page)
+{
+ return sysfs_emit(page, "%d\n", atomic_read(&warn_count));
+}
+

+static struct kobj_attribute warn_count_attr = __ATTR_RO(warn_count);
+
+static __init int kernel_panic_sysfs_init(void)
+{

+ sysfs_add_file_to_group(kernel_kobj, &warn_count_attr.attr, NULL);
+ return 0;
+}

+late_initcall(kernel_panic_sysfs_init);
+#endif
+
static long no_blink(int state)
{
return 0;

@@ -211,8 +231,6 @@ static void panic_print_sys_info(bool console_flush)

void check_panic_on_warn(const char *origin)
{
- static atomic_t warn_count = ATOMIC_INIT(0);
-
if (panic_on_warn)

panic("%s: panic_on_warn set ...\n", origin);

--
2.34.1

Kees Cook

unread,

Nov 17, 2022, 7:27:45 PM11/17/22

to Jann Horn, Jonathan Corbet, Andrew Morton, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Petr Mladek, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, linu...@vger.kernel.org, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linux-h...@vger.kernel.org

Bah. This should be: origin, warn_limit.

--
Kees Cook

Marco Elver

unread,

Nov 18, 2022, 3:34:14 AM11/18/22

to Kees Cook, Jann Horn, Dmitry Vyukov, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Andrew Morton, David Gow, tangmeng, Shuah Khan, Petr Mladek, Paul E. McKenney, Sebastian Andrzej Siewior, Guilherme G. Piccoli, Tiezhu Yang, kasa...@googlegroups.com, linu...@kvack.org, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Reviewed-by: Marco Elver <el...@google.com>

Andrey Konovalov

unread,

Nov 26, 2022, 12:09:41 PM11/26/22

to Kees Cook, Jann Horn, Marco Elver, Dmitry Vyukov, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Vincenzo Frascino, Andrew Morton, David Gow, tangmeng, Shuah Khan, Petr Mladek, Paul E. McKenney, Sebastian Andrzej Siewior, Guilherme G. Piccoli, Tiezhu Yang, kasa...@googlegroups.com, linu...@kvack.org, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Eric W. Biederman, Arnd Bergmann, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Reviewed-by: Andrey Konovalov <andre...@gmail.com>

SeongJae Park

unread,

Jan 19, 2023, 3:10:30 PM1/19/23

to Kees Cook, Jann Horn, Luis Chamberlain, Seth Jenkins, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Hello,

I found a blog article[1] recommending LTS kernels to backport this as below.

While this patch is already upstream, it is important that distributed
kernels also inherit this oops limit and backport it to LTS releases if we
want to avoid treating such null-dereference bugs as full-fledged security
issues in the future.

Do you have a plan to backport this into upstream LTS kernels?

[1] https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences-in-linux.html

Thanks,
SJ

Seth Jenkins

unread,

Jan 19, 2023, 3:19:33 PM1/19/23

to SeongJae Park, Kees Cook, Jann Horn, Luis Chamberlain, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

> Do you have a plan to backport this into upstream LTS kernels?

As I understand, the answer is "hopefully yes" with the big
presumption that all stakeholders are on board for the change. There
is *definitely* a plan to *submit* backports to the stable trees, but
ofc it will require some approvals.

Kees Cook

unread,

Jan 19, 2023, 7:28:44 PM1/19/23

to Seth Jenkins, SeongJae Park, Jann Horn, Luis Chamberlain, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Eric Biggers, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

On Thu, Jan 19, 2023 at 03:19:21PM -0500, Seth Jenkins wrote:
> > Do you have a plan to backport this into upstream LTS kernels?
>
> As I understand, the answer is "hopefully yes" with the big
> presumption that all stakeholders are on board for the change. There
> is *definitely* a plan to *submit* backports to the stable trees, but
> ofc it will require some approvals.

I've asked for at least v6.1.x (it's a clean cherry-pick). Earlier
kernels will need some non-trivial backporting. Is there anyone that
would be interested in stepping up to do that?

https://lore.kernel.org/lkml/202301191532.AEEC765@keescook

-Kees

--
Kees Cook

Eric Biggers

unread,

Jan 24, 2023, 1:54:58 PM1/24/23

to Kees Cook, Seth Jenkins, SeongJae Park, Jann Horn, Luis Chamberlain, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

I've sent out a backport to 5.15:
https://lore.kernel.org/stable/20230124185110.1...@kernel.org/T/#t

- Eric

Eric Biggers

unread,

Jan 24, 2023, 2:38:10 PM1/24/23

to Kees Cook, Seth Jenkins, SeongJae Park, Jann Horn, Luis Chamberlain, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Also 5.10, which wasn't too hard after doing 5.15:
https://lore.kernel.org/stable/20230124193004.2...@kernel.org/T/#t

- Eric

Kees Cook

unread,

Jan 24, 2023, 6:10:00 PM1/24/23

to Eric Biggers, Kees Cook, Seth Jenkins, SeongJae Park, Jann Horn, Luis Chamberlain, Greg KH, Linus Torvalds, Andy Lutomirski, Andrew Morton, tangmeng, Guilherme G. Piccoli, Tiezhu Yang, Sebastian Andrzej Siewior, Eric W. Biederman, Arnd Bergmann, Dmitry Vyukov, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira, Valentin Schneider, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, David Gow, Paul E. McKenney, Jonathan Corbet, Baolin Wang, Jason A. Donenfeld, Huang Ying, Anton Vorontsov, Mauro Carvalho Chehab, Laurent Dufour, Rob Herring, linux-...@vger.kernel.org, kasa...@googlegroups.com, linu...@kvack.org, linu...@vger.kernel.org, linux-h...@vger.kernel.org

Oh excellent! Thank you very much!

-Kees

--
Kees Cook

Reply all

Reply to author

Forward