INFO: rcu detected stall in __process_echoes

52 views
Skip to first unread message

syzbot

unread,
Mar 28, 2018, 3:01:02 AM3/28/18
to ak...@linux-foundation.org, arya...@virtuozzo.com, dvy...@google.com, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, tch...@google.com, tg...@linutronix.de
Hello,

syzbot hit the following crash on upstream commit
3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +0000)
Linux 4.16-rc7
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=108696293d7a21ab688f

So far this crash happened 10 times on upstream.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4941573204738048
syzkaller reproducer:
https://syzkaller.appspot.com/x/repro.syz?id=5314352743710720
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5561943247028224
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-8440362230543204781
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+108696...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

INFO: rcu_sched self-detected stall on CPU
0-....: (124999 ticks this GP) idle=6aa/1/4611686018427387906
softirq=9528/9528 fqs=31245
(t=125000 jiffies g=4714 c=4713 q=8)
NMI backtrace for cpu 0
CPU: 0 PID: 51 Comm: kworker/u4:2 Not tainted 4.16.0-rc7+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events_unbound flush_to_ldisc
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
rcu_dump_cpu_stacks+0x186/0x1de kernel/rcu/tree.c:1375
print_cpu_stall kernel/rcu/tree.c:1524 [inline]
check_cpu_stall.isra.61+0xbb8/0x15b0 kernel/rcu/tree.c:1592
__rcu_pending kernel/rcu/tree.c:3361 [inline]
rcu_pending kernel/rcu/tree.c:3423 [inline]
rcu_check_callbacks+0x238/0xd20 kernel/rcu/tree.c:2763
update_process_times+0x30/0x60 kernel/time/timer.c:1636
tick_sched_handle+0x85/0x160 kernel/time/tick-sched.c:162
tick_sched_timer+0x42/0x120 kernel/time/tick-sched.c:1194
__run_hrtimer kernel/time/hrtimer.c:1349 [inline]
__hrtimer_run_queues+0x39c/0xec0 kernel/time/hrtimer.c:1411
hrtimer_interrupt+0x2a5/0x6f0 kernel/time/hrtimer.c:1469
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
smp_apic_timer_interrupt+0x14a/0x700 arch/x86/kernel/apic/apic.c:1050
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:857
</IRQ>
RIP: 0010:__write_once_size include/linux/compiler.h:215 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x4b/0x50 kernel/kcov.c:109
RSP: 0018:ffff8801d96170a8 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff12
RAX: ffff8801d960a5c0 RBX: ffffc90001fb8000 RCX: ffffffff82f38349
RDX: 0000000000000000 RSI: 1ffff1003b2c15e4 RDI: ffffc90001fb9963
RBP: ffff8801d96170a8 R08: 1ffff1003b2c2dbd R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 000000057efa5704
R13: dffffc0000000000 R14: 000000057efa5704 R15: 0000000000003321
echo_buf drivers/tty/n_tty.c:144 [inline]
__process_echoes+0x5c9/0x770 drivers/tty/n_tty.c:732
flush_echoes drivers/tty/n_tty.c:799 [inline]
__receive_buf drivers/tty/n_tty.c:1615 [inline]
n_tty_receive_buf_common+0x1380/0x2520 drivers/tty/n_tty.c:1709
n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456
tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38
receive_buf drivers/tty/tty_buffer.c:475 [inline]
flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.

Tetsuo Handa

unread,
Apr 21, 2018, 10:05:12 PM4/21/18
to syzbot, dvy...@google.com, gre...@linuxfoundation.org, syzkall...@googlegroups.com, ak...@linux-foundation.org, arya...@virtuozzo.com, linux-...@vger.kernel.org, tch...@google.com, tg...@linutronix.de
On 2018/03/28 16:01, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +0000)
> Linux 4.16-rc7
> syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=108696293d7a21ab688f
>
> So far this crash happened 10 times on upstream.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4941573204738048
> syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5314352743710720
> Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5561943247028224
> Kernel config: https://syzkaller.appspot.com/x/.config?id=-8440362230543204781
> compiler: gcc (GCC) 7.1.1 20170620

Reproducer and result are essentially same with another report. Thus,

#syz dup: INFO: rcu detected stall in n_tty_receive_char_special

Tetsuo Handa

unread,
May 17, 2018, 9:45:18 AM5/17/18
to syzbot, dvy...@google.com, gre...@linuxfoundation.org, syzkall...@googlegroups.com, ak...@linux-foundation.org, arya...@virtuozzo.com, tch...@google.com, tg...@linutronix.de
I assumed that this is a dup because I got both results using same reproducer.
But it seems that I need to handle separately. Thus,

#syz undup

Tetsuo Handa

unread,
May 17, 2018, 9:49:10 AM5/17/18
to syzbot, gre...@linuxfoundation.org, syzkall...@googlegroups.com, Peter Hurley, jsl...@suse.com, dvy...@google.com, ak...@linux-foundation.org, arya...@virtuozzo.com, tch...@google.com, tg...@linutronix.de
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

From ada7279f7f034c5fd79fc04e1120069ea5f6cef2 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Date: Thu, 17 May 2018 20:42:50 +0900
Subject: [PATCH] n_tty: Access echo_* variables carefully.

syzbot is reporting stalls at __process_echoes() [1]. This is because
since ldata->echo_commit < ldata->echo_tail becomes true for some reason,
the discard loop is serving as almost infinite loop. This patch tries to
avoid falling into ldata->echo_commit < ldata->echo_tail situation by
making access to echo_* variables more carefully.

Since reset_buffer_flags() is called without output_lock held, it should
not touch echo_* variables. And omit a call to reset_buffer_flags() from
n_tty_open() by using vzalloc().

Since add_echo_byte() is called without output_lock held, it needs memory
barrier between storing into echo_buf[] and incrementing echo_head counter.
echo_buf() needs corresponding memory barrier before reading echo_buf[].
Lack of handling the possibility of not-yet-stored multi-byte operation
might be the reason of falling into ldata->echo_commit < ldata->echo_tail
situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to
echo_buf(ldata, tail + 1), the WARN_ON() fires.

The echo_* variables are now updated with output_lock held. Thus, I guess
that ldata->echo_commit < ldata->echo_tail is no longer possible. But to
become more carefully, use ldata->echo_commit > ldata->echo_tail as
the preventive condition to continue the "while" loops.

[1] https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40

Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+108696...@syzkaller.appspotmail.com>
Cc: Peter Hurley <pe...@hurleysoftware.com>
---
drivers/tty/n_tty.c | 51 ++++++++++++++++++++++++++++++---------------------
1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index cbe98bc..060ef91 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -90,9 +90,6 @@ struct n_tty_data {
size_t read_head;
size_t commit_head;
size_t canon_head;
- size_t echo_head;
- size_t echo_commit;
- size_t echo_mark;
DECLARE_BITMAP(char_map, 256);

/* private to n_tty_receive_overrun (single-threaded) */
@@ -119,6 +116,9 @@ struct n_tty_data {
unsigned int column;
unsigned int canon_column;
size_t echo_tail;
+ size_t echo_head;
+ size_t echo_commit;
+ size_t echo_mark;

struct mutex atomic_read_lock;
struct mutex output_lock;
@@ -141,6 +141,7 @@ static inline unsigned char *read_buf_addr(struct n_tty_data *ldata, size_t i)

static inline unsigned char echo_buf(struct n_tty_data *ldata, size_t i)
{
+ smp_rmb(); /* Matches smp_wmb() in add_echo_byte(). */
return ldata->echo_buf[i & (N_TTY_BUF_SIZE - 1)];
}

@@ -316,9 +317,7 @@ static inline void put_tty_queue(unsigned char c, struct n_tty_data *ldata)
static void reset_buffer_flags(struct n_tty_data *ldata)
{
ldata->read_head = ldata->canon_head = ldata->read_tail = 0;
- ldata->echo_head = ldata->echo_tail = ldata->echo_commit = 0;
ldata->commit_head = 0;
- ldata->echo_mark = 0;
ldata->line_start = 0;

ldata->erasing = 0;
@@ -617,13 +616,22 @@ static size_t __process_echoes(struct tty_struct *tty)
old_space = space = tty_write_room(tty);

tail = ldata->echo_tail;
- while (ldata->echo_commit != tail) {
+ if (WARN_ON(ldata->echo_commit < tail))
+ return 0;
+ while (ldata->echo_commit > tail) {
c = echo_buf(ldata, tail);
if (c == ECHO_OP_START) {
unsigned char op;
int no_space_left = 0;

/*
+ * Since add_echo_byte() is called without holding
+ * output_lock, we might see only portion of multi-byte
+ * operation.
+ */
+ if (ldata->echo_commit == tail + 1)
+ goto not_yet_stored;
+ /*
* If the buffer byte is the start of a multi-byte
* operation, get the next byte, which is either the
* op code or a control character value.
@@ -634,6 +642,8 @@ static size_t __process_echoes(struct tty_struct *tty)
unsigned int num_chars, num_bs;

case ECHO_OP_ERASE_TAB:
+ if (ldata->echo_commit == tail + 2)
+ goto not_yet_stored;
num_chars = echo_buf(ldata, tail + 2);

/*
@@ -728,7 +738,9 @@ static size_t __process_echoes(struct tty_struct *tty)
/* If the echo buffer is nearly full (so that the possibility exists
* of echo overrun before the next commit), then discard enough
* data at the tail to prevent a subsequent overrun */
- while (ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) {
+ WARN_ON(ldata->echo_commit < tail);
+ while (ldata->echo_commit > tail &&
+ ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) {
if (echo_buf(ldata, tail) == ECHO_OP_START) {
if (echo_buf(ldata, tail + 1) == ECHO_OP_ERASE_TAB)
tail += 3;
@@ -738,6 +750,7 @@ static size_t __process_echoes(struct tty_struct *tty)
tail++;
}

+ not_yet_stored:
ldata->echo_tail = tail;
return old_space - space;
}
@@ -748,6 +761,7 @@ static void commit_echoes(struct tty_struct *tty)
size_t nr, old, echoed;
size_t head;

+ mutex_lock(&ldata->output_lock);
head = ldata->echo_head;
ldata->echo_mark = head;
old = ldata->echo_commit - ldata->echo_tail;
@@ -756,10 +770,12 @@ static void commit_echoes(struct tty_struct *tty)
* is over the threshold (and try again each time another
* block is accumulated) */
nr = head - ldata->echo_tail;
- if (nr < ECHO_COMMIT_WATERMARK || (nr % ECHO_BLOCK > old % ECHO_BLOCK))
+ if (nr < ECHO_COMMIT_WATERMARK ||
+ (nr % ECHO_BLOCK > old % ECHO_BLOCK)) {
+ mutex_unlock(&ldata->output_lock);
return;
+ }

- mutex_lock(&ldata->output_lock);
ldata->echo_commit = head;
echoed = __process_echoes(tty);
mutex_unlock(&ldata->output_lock);
@@ -810,7 +826,9 @@ static void flush_echoes(struct tty_struct *tty)

static inline void add_echo_byte(unsigned char c, struct n_tty_data *ldata)
{
- *echo_buf_addr(ldata, ldata->echo_head++) = c;
+ *echo_buf_addr(ldata, ldata->echo_head) = c;
+ smp_wmb(); /* Matches smp_rmb() in echo_buf(). */
+ ldata->echo_head++;
}

/**
@@ -1878,30 +1896,21 @@ static int n_tty_open(struct tty_struct *tty)
struct n_tty_data *ldata;

/* Currently a malloc failure here can panic */
- ldata = vmalloc(sizeof(*ldata));
+ ldata = vzalloc(sizeof(*ldata));
if (!ldata)
- goto err;
+ return -ENOMEM;

ldata->overrun_time = jiffies;
mutex_init(&ldata->atomic_read_lock);
mutex_init(&ldata->output_lock);

tty->disc_data = ldata;
- reset_buffer_flags(tty->disc_data);
- ldata->column = 0;
- ldata->canon_column = 0;
- ldata->num_overrun = 0;
- ldata->no_room = 0;
- ldata->lnext = 0;
tty->closing = 0;
/* indicate buffer work may resume */
clear_bit(TTY_LDISC_HALTED, &tty->flags);
n_tty_set_termios(tty, NULL);
tty_unthrottle(tty);
-
return 0;
-err:
- return -ENOMEM;
}

static inline int input_available_p(struct tty_struct *tty, int poll)
--
1.8.3.1

syzbot

unread,
May 17, 2018, 10:11:02 AM5/17/18
to ak...@linux-foundation.org, arya...@virtuozzo.com, dvy...@google.com, gre...@linuxfoundation.org, jsl...@suse.com, penguin...@i-love.sakura.ne.jp, pe...@hurleysoftware.com, syzkall...@googlegroups.com, tch...@google.com, tg...@linutronix.de
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by:
syzbot+108696...@syzkaller.appspotmail.com

Tested on:

commit: e6506eb24187 Merge tag 'trace-v4.17-rc4-2' of git://git.ke..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
patch: https://syzkaller.appspot.com/x/patch.diff?x=16cdde97800000

Note: testing is done by a robot and is best-effort only.

Greg KH

unread,
May 25, 2018, 12:13:11 PM5/25/18
to syzbot, ak...@linux-foundation.org, arya...@virtuozzo.com, dvy...@google.com, jsl...@suse.com, penguin...@i-love.sakura.ne.jp, pe...@hurleysoftware.com, syzkall...@googlegroups.com, tch...@google.com, tg...@linutronix.de
On Thu, May 17, 2018 at 07:11:01AM -0700, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger
> crash:

Great! Is the patch going to be submitted "properly" so that I can
queue it up? :)

thanks,

greg k-h
Reply all
Reply to author
Forward
0 new messages