INFO: rcu detected stall in bitmap_parselist

12 views
Skip to first unread message

syzbot

unread,
Apr 1, 2018, 1:13:02ā€ÆPM4/1/18
to cgr...@vger.kernel.org, linux-...@vger.kernel.org, liz...@huawei.com, syzkall...@googlegroups.com
Hello,

syzbot hit the following crash on upstream commit
3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +0000)
Linux 4.16-rc7
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=6887cbb011c8054e8a3d

So far this crash happened 3 times on upstream.
Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5674881425342464
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-8440362230543204781
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6887cb...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

INFO: rcu_sched self-detected stall on CPU
1-....: (124999 ticks this GP) idle=0da/1/4611686018427387906
softirq=49340/49340 fqs=31180
(t=125000 jiffies g=24134 c=24133 q=654)
NMI backtrace for cpu 1
CPU: 1 PID: 14671 Comm: syz-executor3 Not tainted 4.16.0-rc7+ #368
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
rcu_dump_cpu_stacks+0x186/0x1de kernel/rcu/tree.c:1375
print_cpu_stall kernel/rcu/tree.c:1524 [inline]
check_cpu_stall.isra.61+0xbb8/0x15b0 kernel/rcu/tree.c:1592
__rcu_pending kernel/rcu/tree.c:3361 [inline]
rcu_pending kernel/rcu/tree.c:3423 [inline]
rcu_check_callbacks+0x238/0xd20 kernel/rcu/tree.c:2763
update_process_times+0x30/0x60 kernel/time/timer.c:1636
tick_sched_handle+0x85/0x160 kernel/time/tick-sched.c:162
tick_sched_timer+0x42/0x120 kernel/time/tick-sched.c:1194
__run_hrtimer kernel/time/hrtimer.c:1349 [inline]
__hrtimer_run_queues+0x39c/0xec0 kernel/time/hrtimer.c:1411
hrtimer_interrupt+0x2a5/0x6f0 kernel/time/hrtimer.c:1469
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
smp_apic_timer_interrupt+0x14a/0x700 arch/x86/kernel/apic/apic.c:1050
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:857
</IRQ>
RIP: 0010:__bitmap_parselist+0x2f0/0x4b0 lib/bitmap.c:612
RSP: 0018:ffff88019ef0f6d8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
RAX: 0000000000010000 RBX: 0000000000000001 RCX: ffffffff82af9d1d
RDX: 0000000000010000 RSI: ffffc900042fb000 RDI: ffff8801b5a023e0
RBP: ffff88019ef0f750 R08: ffffed0036b4047d R09: ffff8801b5a023e0
R10: 0000000000000001 R11: ffffed0036b4047c R12: 0000000000000008
R13: 0000000000000000 R14: 0000000000000000 R15: dffffc0000000000
bitmap_parselist+0x3a/0x50 lib/bitmap.c:628
cpulist_parse include/linux/cpumask.h:639 [inline]
update_cpumask kernel/cgroup/cpuset.c:974 [inline]
cpuset_write_resmask+0x1694/0x2850 kernel/cgroup/cpuset.c:1724
cgroup_file_write+0x2ae/0x710 kernel/cgroup/cgroup.c:3429
kernfs_fop_write+0x2bc/0x440 fs/kernfs/file.c:316
__vfs_write+0xef/0x970 fs/read_write.c:480
vfs_write+0x189/0x510 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x220 fs/read_write.c:581
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x454879
RSP: 002b:00007f01ef5b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f01ef5b56d4 RCX: 0000000000454879
RDX: 0000000000000002 RSI: 0000000020000040 RDI: 0000000000000014
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000006a1 R14: 00000000006fbfb8 R15: 0000000000000000


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.

Tetsuo Handa

unread,
Apr 4, 2018, 8:21:46ā€ÆAM4/4/18
to Yury Norov, syzbot, cgr...@vger.kernel.org, linux-...@vger.kernel.org, liz...@huawei.com, syzkall...@googlegroups.com, Noam Camus, Rasmus Villemoes, Matthew Wilcox, Mauro Carvalho Chehab, Andrew Morton
Yury, are you OK with this patch?


From 7f21827cdfe9780b4949b22bcd19efa721b463d2 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Date: Wed, 4 Apr 2018 21:12:10 +0900
Subject: [PATCH] lib/bitmap: Rewrite __bitmap_parselist().

syzbot is catching stalls at __bitmap_parselist() [1]. The trigger is

unsigned long v = 0;
bitmap_parselist("7:,", &v, BITS_PER_LONG);

which results in hitting infinite loop at

while (a <= b) {
off = min(b - a + 1, used_size);
bitmap_set(maskp, a, off);
a += group_size;
}

due to used_size == group_size == 0.

Current code is difficult to read due to too many flag variables.
Let's rewrite it. My understanding of "range:used_size/group_size"
is "start[-end[:used_size/group_size]]" format.
Please check whether my understanding is correct...

[1] https://syzkaller.appspot.com/bug?id=ad7e0351fbc90535558514a71cd3edc11681997a

Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+6887cb...@syzkaller.appspotmail.com>
Fixes: 0a5ce0831d04382a ("lib/bitmap.c: make bitmap_parselist() thread-safe and much faster")
Cc: Yury Norov <yno...@caviumnetworks.com>
Cc: Noam Camus <noa...@mellanox.com>
Cc: Rasmus Villemoes <li...@rasmusvillemoes.dk>
Cc: Matthew Wilcox <mawi...@microsoft.com>
Cc: Mauro Carvalho Chehab <mch...@kernel.org>
Cc: Andrew Morton <ak...@linux-foundation.org>
---
lib/bitmap.c | 183 +++++++++++++++++++++++++----------------------------------
1 file changed, 78 insertions(+), 105 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 9e498c7..9cef440 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -485,6 +485,58 @@ int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
}
EXPORT_SYMBOL(bitmap_print_to_pagebuf);

+static bool get_uint(const char **buf, unsigned int *res)
+{
+ const char *p = *buf;
+
+ if (!isdigit(*p))
+ return false;
+ *res = simple_strtoul(p, (char **) buf, 10);
+ return p < *buf;
+}
+
+static int __bitmap_parse_one_chunk(const char *buf, unsigned long *maskp,
+ const unsigned int nmaskbits)
+{
+ unsigned int start;
+ unsigned int end;
+ unsigned int group_size;
+ unsigned int used_size;
+
+ while (*buf && isspace(*buf))
+ buf++;
+ if (!get_uint(&buf, &start))
+ return -EINVAL;
+ if (*buf == '-') {
+ buf++;
+ if (!get_uint(&buf, &end) || start > end)
+ return -EINVAL;
+ if (*buf == ':') {
+ buf++;
+ if (!get_uint(&buf, &used_size) || *buf++ != '/' ||
+ !get_uint(&buf, &group_size) ||
+ used_size > group_size)
+ return -EINVAL;
+ } else {
+ group_size = used_size = end - start + 1;
+ }
+ } else {
+ end = start;
+ group_size = used_size = 1;
+ }
+ if (end >= nmaskbits)
+ return -ERANGE;
+ while (start <= end) {
+ const unsigned int bits = min(end - start + 1, used_size);
+
+ bitmap_set(maskp, start, bits);
+ start += group_size;
+ }
+ while (*buf && isspace(*buf))
+ buf++;
+ return *buf ? -EINVAL : 0;
+}
+
/**
* __bitmap_parselist - convert list format ASCII string to bitmap
* @buf: read nul-terminated user string from this buffer
@@ -511,113 +563,34 @@ int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
* - ``-ERANGE``: bit number specified too large for mask
*/
static int __bitmap_parselist(const char *buf, unsigned int buflen,
- int is_user, unsigned long *maskp,
- int nmaskbits)
+ const int is_user, unsigned long *maskp,
+ const int nmaskbits)
{
- unsigned int a, b, old_a, old_b;
- unsigned int group_size, used_size, off;
- int c, old_c, totaldigits, ndigits;
- const char __user __force *ubuf = (const char __user __force *)buf;
- int at_start, in_range, in_partial_range;
-
- totaldigits = c = 0;
- old_a = old_b = 0;
- group_size = used_size = 0;
+ int err = 0;
bitmap_zero(maskp, nmaskbits);
- do {
- at_start = 1;
- in_range = 0;
- in_partial_range = 0;
- a = b = 0;
- ndigits = totaldigits;
-
- /* Get the next cpu# or a range of cpu#'s */
- while (buflen) {
- old_c = c;
- if (is_user) {
- if (__get_user(c, ubuf++))
- return -EFAULT;
- } else
- c = *buf++;
- buflen--;
- if (isspace(c))
- continue;
-
- /* A '\0' or a ',' signal the end of a cpu# or range */
- if (c == '\0' || c == ',')
- break;
- /*
- * whitespaces between digits are not allowed,
- * but it's ok if whitespaces are on head or tail.
- * when old_c is whilespace,
- * if totaldigits == ndigits, whitespace is on head.
- * if whitespace is on tail, it should not run here.
- * as c was ',' or '\0',
- * the last code line has broken the current loop.
- */
- if ((totaldigits != ndigits) && isspace(old_c))
- return -EINVAL;
-
- if (c == '/') {
- used_size = a;
- at_start = 1;
- in_range = 0;
- a = b = 0;
- continue;
- }
-
- if (c == ':') {
- old_a = a;
- old_b = b;
- at_start = 1;
- in_range = 0;
- in_partial_range = 1;
- a = b = 0;
- continue;
- }
-
- if (c == '-') {
- if (at_start || in_range)
- return -EINVAL;
- b = 0;
- in_range = 1;
- at_start = 1;
- continue;
- }
-
- if (!isdigit(c))
- return -EINVAL;
-
- b = b * 10 + (c - '0');
- if (!in_range)
- a = b;
- at_start = 0;
- totaldigits++;
- }
- if (ndigits == totaldigits)
- continue;
- if (in_partial_range) {
- group_size = a;
- a = old_a;
- b = old_b;
- old_a = old_b = 0;
- } else {
- used_size = group_size = b - a + 1;
- }
- /* if no digit is after '-', it's wrong*/
- if (at_start && in_range)
- return -EINVAL;
- if (!(a <= b) || !(used_size <= group_size))
- return -EINVAL;
- if (b >= nmaskbits)
- return -ERANGE;
- while (a <= b) {
- off = min(b - a + 1, used_size);
- bitmap_set(maskp, a, off);
- a += group_size;
- }
- } while (buflen && c == ',');
- return 0;
+ while (buflen && !err) {
+ char *cp;
+ char tmpbuf[256];
+ unsigned int size = min(buflen,
+ (unsigned int) sizeof(tmpbuf) - 1);
+
+ if (!is_user)
+ memcpy(tmpbuf, buf, size);
+ else if (copy_from_user(tmpbuf, (const char __user __force *)
+ buf, size))
+ return -EFAULT;
+ tmpbuf[size] = '\0';
+ cp = strchr(tmpbuf, ',');
+ if (cp) {
+ *cp = '\0';
+ size = cp - tmpbuf + 1;
+ } else if (size != buflen)
+ return -EINVAL; /* Chunk too long. */
+ buflen -= size;
+ buf += size;
+ err = __bitmap_parse_one_chunk(tmpbuf, maskp, nmaskbits);
+ }
+ return err;
}

int bitmap_parselist(const char *bp, unsigned long *maskp, int nmaskbits)
--
1.8.3.1


Yury Norov

unread,
Apr 4, 2018, 11:41:55ā€ÆAM4/4/18
to Tetsuo Handa, syzbot, cgr...@vger.kernel.org, linux-...@vger.kernel.org, liz...@huawei.com, syzkall...@googlegroups.com, Noam Camus, Rasmus Villemoes, Matthew Wilcox, Mauro Carvalho Chehab, Andrew Morton
Hi Tetsuo,

Thanks for the patch.

On Wed, Apr 04, 2018 at 09:21:43PM +0900, Tetsuo Handa wrote:
> Yury, are you OK with this patch?
>
>
> >From 7f21827cdfe9780b4949b22bcd19efa721b463d2 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
> Date: Wed, 4 Apr 2018 21:12:10 +0900
> Subject: [PATCH] lib/bitmap: Rewrite __bitmap_parselist().
>
> syzbot is catching stalls at __bitmap_parselist() [1]. The trigger is
>
> unsigned long v = 0;
> bitmap_parselist("7:,", &v, BITS_PER_LONG);

Could you add this case to the test_bitmap_parselist()?

> which results in hitting infinite loop at
>
> while (a <= b) {
> off = min(b - a + 1, used_size);
> bitmap_set(maskp, a, off);
> a += group_size;
> }
>
> due to used_size == group_size == 0.
>
> Current code is difficult to read due to too many flag variables.
> Let's rewrite it.

I also don't like current implementation of bitmap_parselist(), but
discussion on new code may take some time. Can you submit minimal
fix in separated patch to let people discuss your new implementation
without rush?

> My understanding of "range:used_size/group_size"
> is "start[-end[:used_size/group_size]]" format.
> Please check whether my understanding is correct...

My understanding is same. Can you add it to documentation or comment
to function?
In comment to simple_strtoul(): "This function is obsolete. Please
use kstrtoul instead."
So this is still not safe against "1-10:0/0", or I miss something?
(This is another testcase we should add to test_bitmap.c)

> + } else {
> + group_size = used_size = end - start + 1;
> + }
> + } else {
> + end = start;
> + group_size = used_size = 1;
> + }
> + if (end >= nmaskbits)
> + return -ERANGE;

This should be checked before we start parsing group, to avoid useless work.
This is not safe against this:
"[250 whitespaces] 567-890:123/456"

And it will be Schlemiel the painter's-styled algorithm for input like:
"1,2,3,4, ... ,98,99,100".

I think we need something like __bitmap_parse_get_chunk() to copy
coma-separated substrings.

> + tmpbuf[size] = '\0';
> + cp = strchr(tmpbuf, ',');
> + if (cp) {
> + *cp = '\0';
> + size = cp - tmpbuf + 1;
> + } else if (size != buflen)
> + return -EINVAL; /* Chunk too long. */
> + buflen -= size;
> + buf += size;
> + err = __bitmap_parse_one_chunk(tmpbuf, maskp, nmaskbits);
> + }
> + return err;
> }
>
> int bitmap_parselist(const char *bp, unsigned long *maskp, int nmaskbits)
> --
> 1.8.3.1

Thanks,
Yury

Tetsuo Handa

unread,
Apr 4, 2018, 11:58:58ā€ÆAM4/4/18
to yno...@caviumnetworks.com, syzbot+6887cb...@syzkaller.appspotmail.com, cgr...@vger.kernel.org, linux-...@vger.kernel.org, liz...@huawei.com, syzkall...@googlegroups.com, noa...@mellanox.com, li...@rasmusvillemoes.dk, mawi...@microsoft.com, mch...@kernel.org, ak...@linux-foundation.org
OK. Then you can write the patch. You know current code better than I.

> > @@ -485,6 +485,58 @@ int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
> > }
> > EXPORT_SYMBOL(bitmap_print_to_pagebuf);
> >
> > +static bool get_uint(const char **buf, unsigned int *res)
> > +{
> > + const char *p = *buf;
> > +
> > + if (!isdigit(*p))
> > + return false;
> > + *res = simple_strtoul(p, (char **) buf, 10);
>
> In comment to simple_strtoul(): "This function is obsolete. Please
> use kstrtoul instead."

I intentionally choose simple_strtoul() because next delimiter (e.g. '-')
starts at returned address. kstrtoul() fails if next letter starts.

>
> > + return p < *buf;

I think I should limit to "0 <= *res <= INT_MAX" range in order to avoid
overflow at start += group_size.
Indeed. We need to make more testcases.

> > + while (buflen && !err) {
> > + char *cp;
> > + char tmpbuf[256];
> > + unsigned int size = min(buflen,
> > + (unsigned int) sizeof(tmpbuf) - 1);
> > +
> > + if (!is_user)
> > + memcpy(tmpbuf, buf, size);
> > + else if (copy_from_user(tmpbuf, (const char __user __force *)
> > + buf, size))
> > + return -EFAULT;
>
> This is not safe against this:
> "[250 whitespaces] 567-890:123/456"

Do we need to accept such insane entry?

Yury Norov

unread,
Apr 4, 2018, 12:53:26ā€ÆPM4/4/18
to Tetsuo Handa, syzbot+6887cb...@syzkaller.appspotmail.com, cgr...@vger.kernel.org, linux-...@vger.kernel.org, liz...@huawei.com, syzkall...@googlegroups.com, noa...@mellanox.com, li...@rasmusvillemoes.dk, mawi...@microsoft.com, mch...@kernel.org, ak...@linux-foundation.org
Done.

> > > @@ -485,6 +485,58 @@ int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
> > > }
> > > EXPORT_SYMBOL(bitmap_print_to_pagebuf);
> > >
> > > +static bool get_uint(const char **buf, unsigned int *res)
> > > +{
> > > + const char *p = *buf;
> > > +
> > > + if (!isdigit(*p))
> > > + return false;
> > > + *res = simple_strtoul(p, (char **) buf, 10);
> >
> > In comment to simple_strtoul(): "This function is obsolete. Please
> > use kstrtoul instead."
>
> I intentionally choose simple_strtoul() because next delimiter (e.g. '-')
> starts at returned address. kstrtoul() fails if next letter starts.

OK, but then it should be explained in comment, I think.
This is how current implementation works - no limit on number of whitespaces
before and after the cunk. It's userspace interface, and we should be careful
adding new limitations. God forbid us break userspace. :-)

It looks insane, but this kind of things is quite possible if input string
is the result of heavy scripting.
Reply all
Reply to author
Forward
0 new messages