tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc

Dmitry Vyukov

Dec 30, 2015, 5:44:21 AM
to Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hello,

I've got the following lock order inversion report on
8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
failed to find a reproducer for it.


======================================================
[ INFO: possible circular locking dependency detected ]
4.4.0-rc7+ #181 Not tainted
-------------------------------------------------------
kworker/u9:2/35 is trying to acquire lock:
(routelock){+.+...}, at: [<ffffffff82bc77c0>]
n_tracerouter_receivebuf+0x20/0x40 drivers/tty/n_tracerouter.c:175

but task is already holding lock:
(&buf->lock){+.+...}, at: [<ffffffff82baf791>]
flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&buf->lock){+.+...}:
[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
[< inline >] __raw_spin_lock_irqsave
include/linux/spinlock_api_smp.h:112
[<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
kernel/locking/spinlock.c:159
[<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
[<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
[<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
[<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
drivers/tty/n_tty.c:1243
[<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
drivers/tty/n_tty.c:1289
[< inline >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
[< inline >] __receive_buf drivers/tty/n_tty.c:1647
[<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
drivers/tty/n_tty.c:1745
[<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1780
[< inline >] receive_buf drivers/tty/tty_buffer.c:450
[<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
[<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
[<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
[<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #2 (&o_tty->termios_rwsem/1){++++..}:
[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
[<ffffffff85c899cc>] down_read+0x4c/0x70 kernel/locking/rwsem.c:22
[<ffffffff82ba635d>] n_tty_receive_buf_common+0x8d/0x2450
drivers/tty/n_tty.c:1708
[<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1780
[< inline >] receive_buf drivers/tty/tty_buffer.c:450
[<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
[<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
[<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
[<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #1 (&port->buf.lock/1){+.+...}:
[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
[<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
kernel/locking/mutex.c:618
[<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
drivers/tty/n_tracerouter.c:75
[<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
drivers/tty/tty_ldisc.c:447
[<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
drivers/tty/tty_ldisc.c:567
[< inline >] tiocsetd drivers/tty/tty_io.c:2650
[<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
[< inline >] SYSC_ioctl fs/ioctl.c:622
[<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
[<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

-> #0 (routelock){+.+...}:
[< inline >] check_prev_add kernel/locking/lockdep.c:1853
[< inline >] check_prevs_add kernel/locking/lockdep.c:1958
[< inline >] validate_chain kernel/locking/lockdep.c:2144
[<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
kernel/locking/lockdep.c:3206
[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
[<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
kernel/locking/mutex.c:618
[<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
[< inline >] receive_buf drivers/tty/tty_buffer.c:454
[<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
[<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
[<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
[<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

other info that might help us debug this:

Chain exists of:
routelock --> &o_tty->termios_rwsem/1 --> &buf->lock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&buf->lock);
lock(&o_tty->termios_rwsem/1);
lock(&buf->lock);
lock(routelock);

*** DEADLOCK ***

4 locks held by kworker/u9:2/35:
#0: ("events_unbound"){.+.+.+}, at: [< inline >]
__write_once_size include/linux/compiler.h:247
#0: ("events_unbound"){.+.+.+}, at: [< inline >]
atomic64_set ./arch/x86/include/asm/atomic64_64.h:33
#0: ("events_unbound"){.+.+.+}, at: [< inline >]
atomic_long_set include/asm-generic/atomic-long.h:56
#0: ("events_unbound"){.+.+.+}, at: [< inline >]
set_work_data kernel/workqueue.c:614
#0: ("events_unbound"){.+.+.+}, at: [< inline >]
set_work_pool_and_clear_pending kernel/workqueue.c:641
#0: ("events_unbound"){.+.+.+}, at: [<ffffffff81338242>]
process_one_work+0x672/0x13e0 kernel/workqueue.c:2023
#1: ((&buf->work)){+.+...}, at: [<ffffffff81338272>]
process_one_work+0x6a2/0x13e0 kernel/workqueue.c:2027
#2: (&tty->ldisc_sem){++++++}, at: [<ffffffff82bac99b>]
tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:283
#3: (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487

stack backtrace:
CPU: 0 PID: 35 Comm: kworker/u9:2 Not tainted 4.4.0-rc7+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events_unbound flush_to_ldisc
00000000ffffffff ffff88003d8e77f8 ffffffff8289d9dd ffffffff8866fbf0
ffffffff8863f050 ffffffff886422f0 ffff88003d8e7840 ffffffff813e7081
ffff88003d8e7930 000000003d8b1780 ffff88003d8b1fc0 ffff88003d8b1fe2
Call Trace:
[< inline >] validate_chain kernel/locking/lockdep.c:2144
[<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0 kernel/locking/lockdep.c:3206
[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
[< inline >] __mutex_lock_common kernel/locking/mutex.c:518
[<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940 kernel/locking/mutex.c:618
[<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
[< inline >] receive_buf drivers/tty/tty_buffer.c:454
[<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0 drivers/tty/tty_buffer.c:517
[<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
[<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
[<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
======================================================

Dmitry Vyukov

Jan 15, 2016, 2:52:00 AM
to Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> Hello,
>
> I've got the following lock order inversion report on
> 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
> failed to find a reproducer for it.

Ping. This is the most frequent failure during my testing now. J
Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
a look. Thanks.

One Thousand Gnomes

Jan 15, 2016, 11:33:09 AM
to Dmitry Vyukov, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Fri, 15 Jan 2016 08:51:39 +0100
Dmitry Vyukov <dvy...@google.com> wrote:

> On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> > Hello,
> >
> > I've got the following lock order inversion report on
> > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
> > failed to find a reproducer for it.
>
> Ping. This is the most frequent failure during my testing now. J
> Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
> a look. Thanks.

This driver hasn't changed since forever, nor does it look like a bug in
n_tracerouter.

The first two threads of execution are for the old line discipline, so
they don't occur at the same time as the tracerouter code.

This one is opening the port with the tracerouter discipline as you
change the line discipline. The core code locks the tty and then takes
the ldisc lock, closes the old ldisc, and opens the new one; opening the
new one takes the router lock.

(tty_lock, tty_ldisc_lock, termios_sem, route_lock)


> > -> #1 (&port->buf.lock/1){+.+...}:
> > [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> > kernel/locking/lockdep.c:3585
> > [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> > [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> > kernel/locking/mutex.c:618
> > [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
> > drivers/tty/n_tracerouter.c:75
> > [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
> > drivers/tty/tty_ldisc.c:447
> > [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
> > drivers/tty/tty_ldisc.c:567
> > [< inline >] tiocsetd drivers/tty/tty_io.c:2650
> > [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
> > [< inline >] vfs_ioctl fs/ioctl.c:43
> > [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> > [< inline >] SYSC_ioctl fs/ioctl.c:622
> > [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> > [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
> > arch/x86/entry/entry_64.S:185

This is data being received by the new line discipline.

(buf->lock, routelock)

> >
> > -> #0 (routelock){+.+...}:
> > [< inline >] check_prev_add kernel/locking/lockdep.c:1853
> > [< inline >] check_prevs_add kernel/locking/lockdep.c:1958
> > [< inline >] validate_chain kernel/locking/lockdep.c:2144
> > [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
> > kernel/locking/lockdep.c:3206
> > [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> > kernel/locking/lockdep.c:3585
> > [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> > [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> > kernel/locking/mutex.c:618
> > [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> > drivers/tty/n_tracerouter.c:175
> > [< inline >] receive_buf drivers/tty/tty_buffer.c:454
> > [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
> > drivers/tty/tty_buffer.c:517
> > [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> > kernel/workqueue.c:2030
> > [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> > [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> > [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> > arch/x86/entry/entry_64.S:468
> >

The lock inversion the log shows seems to be a problem in n_tty not in
the tracerouter code.

However that seems to be

tty_set_ldisc to N_TTY
takes tty lock
takes tty_ldisc lock
down termios_rwsem (R)
tty->ops->set_ldisc
down termios rwsem (W)

versus

receive_driver_bufs
buf->lock
termios rwsem (R)


which also looks fine

So I think it's just a false report caused by other changes in the tty
layer, and nothing to do with the tracerouter code at all.

Alan

Dmitry Vyukov

Jan 15, 2016, 12:23:13 PM
to One Thousand Gnomes, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Thanks for looking into this.

The report says:

CPU0 CPU1
---- ----
lock(&buf->lock);
lock(&o_tty->termios_rwsem/1);
lock(&buf->lock);
lock(routelock);

I don't see routelock in your stacks.
But I also I don't understand why this is a deadlock. I would
understand if one thread takes A then B; while another takes B then A.
But here are 3 non-circular lock acquisitions. Does it have something
to do with spinlocks vs mutex/rwsem?

J Freyensee

Jan 19, 2016, 9:09:07 PM
to One Thousand Gnomes, Dmitry Vyukov, Greg Kroah-Hartman, Jiri Slaby, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Fri, 2016-01-15 at 16:33 +0000, One Thousand Gnomes wrote:
> On Fri, 15 Jan 2016 08:51:39 +0100
> Dmitry Vyukov <dvy...@google.com> wrote:
>
> > On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvy...@google.com
> > > wrote:
> > > Hello,
> > >
> > > I've got the following lock order inversion report on
> > > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately
> > > I
> > > failed to find a reproducer for it.
> >
> > Ping. This is the most frequent failure during my testing now. J
> > Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please
> > take
> > a look. Thanks.

I did look at the report but didn't see how n_tracerouter could be
causing the issue. I apologize for not responding, and I thank you so
very much for providing the thorough explanation, Alan.

Jay

>
> Alan

Dmitry Vyukov

Jan 20, 2016, 4:36:21 AM
to One Thousand Gnomes, Peter Zijlstra, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
+Peter, Ingo, can you please suggest a way to annotate this locking
pattern to avoid lockdep false positive?
Is it caused by the fact that the old and new ldisc locks are aliased?

Peter Zijlstra

Jan 20, 2016, 6:44:34 AM
to Dmitry Vyukov, One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Jan 20, 2016 at 10:36:01AM +0100, Dmitry Vyukov wrote:

> +Peter, Ingo, can you please suggest a way to annotate this locking
> pattern to avoid lockdep false positive?

No, because I haven't a clue what the actual problem is.

The report is fairly clear on how:

routelock
&port->buf.lock/1
&o_tty->termios_rwsem/1
&buf->lock

gets established, and shows where we try:

&buf->lock
routelock

Which gives a circle, ergo deadlock.

Show which link is wrong and why, and I can suggest ways of annotating
that.

Dmitry Vyukov

Jan 20, 2016, 6:55:16 AM
to Peter Zijlstra, One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Alan provided an explanation calling this a "false report":
https://groups.google.com/d/msg/syzkaller/YrV0bzdfa-g/cCVoUf1OFQAJ


And I don't understand how the following is a deadlock, since there is
no cycle...

Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&buf->lock);
lock(&o_tty->termios_rwsem/1);
lock(&buf->lock);
lock(routelock);


Shouldn't it look like:

Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&buf->lock);
lock(routelock);
lock(&buf->lock);
lock(routelock);

?

Peter Zijlstra

Jan 20, 2016, 7:08:01 AM
to Dmitry Vyukov, One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Jan 20, 2016 at 12:54:55PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2016 at 12:44 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> > On Wed, Jan 20, 2016 at 10:36:01AM +0100, Dmitry Vyukov wrote:
> >
> >> +Peter, Ingo, can you please suggest a way to annotate this locking
> >> pattern to avoid lockdep false positive?
> >
> > No, because I haven't a clue what the actual problem is.
> >
> > The report is fairly clear on how:
> >
> > routelock
> > &port->buf.lock/1
> > &o_tty->termios_rwsem/1
> > &buf->lock
> >
> > gets established, and shows where we try:
> >
> > &buf->lock
> > routelock
> >
> > Which gives a circle, ergo deadlock.
> >
> > Show which link is wrong and why, and I can suggest ways of annotating
> > that.
>
>
> Alan provided an explanation calling this "false report":
> https://groups.google.com/d/msg/syzkaller/YrV0bzdfa-g/cCVoUf1OFQAJ

I read that, I didn't understand it. Which link is wrong and why?

> And I don't understand how the following is a deadlock, since there is
> no cycle...
>
> Possible unsafe locking scenario:
> CPU0 CPU1
> ---- ----
> lock(&buf->lock);
> lock(&o_tty->termios_rwsem/1);
> lock(&buf->lock);
> lock(routelock);

Ignore the stupid picture, it only really works for simple cases.

Jiri Slaby

Jan 20, 2016, 7:47:42 AM
to Dmitry Vyukov, J Freyensee, Greg Kroah-Hartman, LKML, Eric Dumazet, Alexander Potapenko, Kostya Serebryany, syzkaller, Sasha Levin, Peter Zijlstra
On 12/30/2015, 11:44 AM, Dmitry Vyukov wrote:
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&buf->lock){+.+...}:
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
> [< inline >] __raw_spin_lock_irqsave
> include/linux/spinlock_api_smp.h:112
> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
> kernel/locking/spinlock.c:159
> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502

But at this line, there is &tty->ctrl_lock, not &buf->lock. I am
confused now :/.
spin_lock_irqsave(&tty->ctrl_lock, flags);

> [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
> [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
> [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
> drivers/tty/n_tty.c:1243
> [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
> drivers/tty/n_tty.c:1289
> [< inline >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
> [< inline >] __receive_buf drivers/tty/n_tty.c:1647
> [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
> drivers/tty/n_tty.c:1745
> [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
> [< inline >] receive_buf drivers/tty/tty_buffer.c:450
> [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
> [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
> [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468


--
js
suse labs

Peter Zijlstra

Jan 20, 2016, 8:02:09 AM
to Dmitry Vyukov, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, gno...@lxorguk.ukuu.org.uk
On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
> -> #3 (&buf->lock){+.+...}:
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502

So in any recent code that I look at this function tries to acquire
tty->ctrl_lock, not buf->lock. Am I missing something ?!

Dmitry Vyukov

Jan 20, 2016, 8:08:02 AM
to Peter Zijlstra, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, One Thousand Gnomes
On Wed, Jan 20, 2016 at 2:02 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>> -> #3 (&buf->lock){+.+...}:
>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>
> So in any recent code that I look at this function tries to acquire
> tty->ctrl_lock, not buf->lock. Am I missing something ?!

I am pretty sure that I did not alter the report... let me collect
another report on a new kernel.

One Thousand Gnomes

Jan 20, 2016, 9:58:53 AM
to Peter Zijlstra, Dmitry Vyukov, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
> I read that, I didn't understand it. Which link is wrong and why?
>
> > And I don't understand how the following is a deadlock, since there is
> > no cycle...
> >
> > Possible unsafe locking scenario:
> > CPU0 CPU1
> > ---- ----
> > lock(&buf->lock);
> > lock(&o_tty->termios_rwsem/1);
> > lock(&buf->lock);
> > lock(routelock);
>
> Ignore the stupid picture, it only really works for simple cases.

There are two line disciplines using two different locking orders.

The two line disciplines never execute at once. A given tty is either
using one or the other and there is a clear and correctly locked
changeover.


Semantically it's something a bit like:


foo(x)
{
        if (x == 1) {
                lock(A)
                lock(B)
        } else {
                lock(B)
                lock(A)
        }

        Do stuff();

        if (x == 1) {
                unlock(B)
                unlock(A)
        } else {
                unlock(A)
                unlock(B)
        }
}

with the guarantee made elsewhere that no instances of foo(1) and foo(0)
are ever executing at the same time.

That's not by dumb design - it's an interesting "nobody ever noticed
this" turned up by the lock detector between two totaly unrelated bits of
code.

Alan

Dmitry Vyukov

Jan 20, 2016, 10:17:12 AM
to One Thousand Gnomes, Peter Zijlstra, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
In our user-space deadlock detector we have an annotation along the
lines of "forget all info about this particular mutex" for such cases
(between foo(0) and foo(1)). Is there something similar in lockdep?

Peter Hurley

Jan 20, 2016, 11:08:52 AM
to Peter Zijlstra, Dmitry Vyukov, Jiri Slaby, gno...@lxorguk.ukuu.org.uk, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>> -> #3 (&buf->lock){+.+...}:
>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>
> So in any recent code that I look at this function tries to acquire
> tty->ctrl_lock, not buf->lock. Am I missing something ?!

Yes.

The tty locks were annotated with __lockfunc so were being elided from lockdep
stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
annotation ("tty: Remove __lockfunc annotation from tty lock functions").

Unfortunately, I think syzkaller's post-processing stack trace isn't helping
either, giving the impression that the stack is still inside tty_get_pgrp().

It's not.

It's in pty_flush_buffer(), which is taking the 'other tty' buf->lock.

Looks to me like the lock inversion is caused by the tty_driver_flush_buffer()
in n_tracerouter_open()/_close(), but I need to look at this mess a little
closer.

Regards,
Peter Hurley

Peter Zijlstra

Jan 20, 2016, 11:33:01 AM
to Dmitry Vyukov, One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
It's a fairly common thing; the typical solution uses lock_class_key
instances per type. See for example inodes; they have per-file_system_type
classes exactly because of this. See inode_init_always()'s use of
lockdep_set_class() and struct file_system_type.

The thing with lock_class_key's is that they must reside in static
storage; a quick grep shows that all tty_ldisc_ops structures are indeed
in static storage.

I'm just not sure when to apply the lock classes and to which locks though.
The one thing to keep in mind is that lockdep_set_class() cannot be used
on a held lock (I'm not sure we have debug msgs for that, /me makes a
note).
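
For illustration, a minimal sketch of that approach with hypothetical
names -- only struct lock_class_key, lockdep_set_class() and mutex_init()
are the real API here; this is not the actual tty fix:

#include <linux/mutex.h>
#include <linux/lockdep.h>

/* Hypothetical example: give the locks used by one line discipline their
 * own lockdep class so they are not aliased with the same lock used by
 * another discipline.  The key must live in static storage.
 */
static struct lock_class_key example_ldisc_lock_key;

static void example_ldisc_lock_init(struct mutex *lock)
{
        mutex_init(lock);
        /* must not be done while the lock is held */
        lockdep_set_class(lock, &example_ldisc_lock_key);
}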

> In our user-space deadlock detector we have an annotation along the
> lines of "forget all info about this particular mutex" for such cases
> (between foo(0) and foo(1)). Is there something similar in lockdep?

No, since we don't track this data per instance, and the lock chains
that were recorded are still valid, just not for this instance.


Peter Hurley

Jan 20, 2016, 3:47:19 PM
to Dmitry Vyukov, Jiri Slaby, gno...@lxorguk.ukuu.org.uk, Peter Zijlstra, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On 01/20/2016 08:08 AM, Peter Hurley wrote:
> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>> -> #3 (&buf->lock){+.+...}:
>>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>
>> So in any recent code that I look at this function tries to acquire
>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>
> Yes.
>
> The tty locks were annotated with __lockfunc so were being elided from lockdep
> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>
> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> either, giving the impression that the stack is still inside tty_get_pgrp().
>
> It's not.
>
> It's in pty_flush_buffer(), which is taking the 'other tty' buf->lock.
>
> Looks to me like the lock inversion is caused by the tty_driver_flush_buffer()
> in n_tracerouter_open()/_close(), but I need to look at this mess a little
> closer.

Unfortunately, there's not enough information in the lockdep report to
conclusively determine if there really is a potential for deadlock
(for one, the reports don't show what locks are held by each trace, which
is a problem if one of the traces terminates at a global mutex while holding
subclassed locks).

Nevertheless, the patch below should "fix" the problem.

Regards,
Peter Hurley

--- >% ---
Subject: [PATCH] tty: Fix lock inversion in N_TRACEROUTER

n_tracerouter_open()/_close() may cause a lock inversion when
N_TRACEROUTER is set as the ldisc for a master pty.

Unfortunately, the original lockdep report [1] does not contain
enough information to conclusively show a deadlock is possible.

However, the call to tty_driver_flush_buffer() is completely pointless
as this line discipline does not allow and never performs output to
this tty driver; remove.

[1] Email subject "tty: deadlock between n_tracerouter_receivebuf and
flush_to_ldisc" https://lkml.org/lkml/2015/12/30/71

Reported-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
drivers/tty/n_tracerouter.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/tty/n_tracerouter.c b/drivers/tty/n_tracerouter.c
index ac57169..a5fd45d 100644
--- a/drivers/tty/n_tracerouter.c
+++ b/drivers/tty/n_tracerouter.c
@@ -82,7 +82,6 @@ static int n_tracerouter_open(struct tty_struct *tty)
tr_data->opencalled = 1;
tty->disc_data = tr_data;
tty->receive_room = RECEIVE_ROOM;
- tty_driver_flush_buffer(tty);
retval = 0;
}
}
@@ -102,7 +101,6 @@ static void n_tracerouter_close(struct tty_struct *tty)

mutex_lock(&routelock);
WARN_ON(tptr->kref_tty != tr_data->kref_tty);
- tty_driver_flush_buffer(tty);
tty_kref_put(tr_data->kref_tty);
tr_data->kref_tty = NULL;
tr_data->opencalled = 0;
--
2.7.0


Dmitry Vyukov

Jan 21, 2016, 5:07:06 AM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <pe...@hurleysoftware.com> wrote:
> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>> -> #3 (&buf->lock){+.+...}:
>>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>
>> So in any recent code that I look at this function tries to acquire
>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>
> Yes.
>
> The tty locks were annotated with __lockfunc so were being elided from lockdep
> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>
> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> either, giving the impression that the stack is still inside tty_get_pgrp().
>
> It's not.

I've got a new report on commit
a200dcb34693084e56496960d855afdeaaf9578f (Jan 18).
Here is unprocessed version:
https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
and here is processed one:
https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt

Peter, what exactly is wrong with the post-processed version? I would
be interested in fixing the processing script.

As far as I can see, it contains the same stacks, just with line numbers
and inlined frames. I am using a significantly different compilation mode
(kasan + kcov + very recent gcc), so nobody except me would be able to
figure out the line numbers based on offsets.

Peter Zijlstra

Jan 21, 2016, 5:20:17 AM
to Dmitry Vyukov, Peter Hurley, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
I'm not sure anything is wrong with the post-processing. My confusion
(and Jiri's) was that the stack trace lists
tty_get_pgrp()->_raw_spin_lock_irqsave(), but that function acquires
tty->ctrl_lock, not buf->lock as the report claims.

PeterH says that lockdep omits functions tagged with __lockfunc, but my
reading disagrees with that.

kernel/locking/lockdep.c:save_trace()
save_stack_trace()
dump_trace(.ops = &save_stack_ops)

which has: .address = save_stack_address
__save_stack_address(.nosched = false)

So lockdep should very much include lock functions.


But this confusion is part of the original report, not because of the
post-processing.

Peter Hurley

Jan 21, 2016, 12:43:34 PM
to Dmitry Vyukov, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Yeah, ok, I assumed the problem with this report was post-processing
because of the other report that had mixed-up info.

However, the #3 stacktrace is obviously wrong, as others have already noted.
Plus, the #1 stacktrace is wrong as well.

> I would be interested in fixing the processing script.

Not that it's related (since the original, unedited report has bogus
stacktraces), but how are you doing debug symbol lookup?

Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)

[<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303


> As far as I see it contains the same stacks just with line numbers and
> inlined frames.

Agree, now that I see the original report.

> I am using a significantly different compilation mode
> (kasan + kcov + very recent gcc), so nobody except me would be able to
> figure out the line numbers based on offsets.

Weird. Maybe something to do with the compiler.

Can you get me the dmesg output running the patch below?

Please first just get me the output shortly after boot (please do a
ctrl+C in a terminal window to trigger the expected stacktrace
corresponding to #3 stacktrace from your report).

Then please run your test that generates the circular lockdep report.

--- >% ---
Subject: [PATCH] debug: dump stacktraces for tty lock dependencies

---
kernel/locking/lockdep.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 60ace56..b67cafb 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
* Debugging switches:
*/

-#define VERBOSE 0
+#define VERBOSE 1
#define VERY_VERBOSE 0

#if VERBOSE
@@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
*/
static int class_filter(struct lock_class *class)
{
-#if 0
- /* Example */
+#if 1
if (class->name_version == 1 &&
- !strcmp(class->name, "lockname"))
+ !strcmp(class->name, "&buf->lock"))
return 1;
if (class->name_version == 1 &&
- !strcmp(class->name, "&struct->lockfield"))
+ !strcmp(class->name, "&port->buf.lock/1"))
+ return 1;
+ if (class->name_version == 1 &&
+ !strcmp(class->name, "routelock"))
return 1;
#endif
/* Filter everything else. 1 would be to allow everything else */
--
2.7.0


Peter Hurley

Jan 21, 2016, 12:51:16 PM
to Peter Zijlstra, Dmitry Vyukov, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
I think this is the other way around; that lockdep has graphed a circular
dependency chain, but that something is wrong with the stack traces.

> PeterH says that lockdep omits functions tagged with __lockfunc, but my
> reading disagrees with that.
>
> kernel/locking/lockdep.c:save_trace()
> save_stack_trace()
> dump_trace(.ops = &save_stack_ops)
>
> which has: .address = save_stack_address
> __save_stack_address(.nosched = false)
>
> So lockdep should very much include lock functions.

Wild hypothesis on my part.

> But this confusion is part of the original report, not because of the
> post-processing.

Yeah, agreed. The original report is very odd.


Dmitry Vyukov

Jan 22, 2016, 9:10:59 AM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Peter H, I have not yet applied your patch, but I got a slightly
different-looking report (n_tracesink_datadrain wasn't mentioned
before):

[ 1483.472724] ======================================================
[ 1483.473033] [ INFO: possible circular locking dependency detected ]
[ 1483.473033] 4.4.0+ #275 Not tainted
[ 1483.473033] -------------------------------------------------------
[ 1483.473033] kworker/u9:0/10808 is trying to acquire lock:
[ 1483.473033] (writelock){+.+...}, at: [<ffffffff82d1ee14>]
n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]
[ 1483.473033] but task is already holding lock:
[ 1483.473033] (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]
[ 1483.473033] which lock already depends on the new lock.
[ 1483.473033]
[ 1483.473033]
[ 1483.473033] the existing dependency chain (in reverse order) is:
[ 1483.473033]
-> #4 (routelock){+.+...}:
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033] [<ffffffff82d1e9b0>] n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033] [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033] [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033] [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033] [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033] [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #3 (&buf->lock){+.+...}:
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff8633255f>] _raw_spin_lock_irqsave+0x9f/0xd0
[ 1483.473033] [<ffffffff82ce4210>] tty_get_pgrp+0x20/0x80
[ 1483.473033] [<ffffffff82cf28da>] __isig+0x1a/0x50
[ 1483.473033] [<ffffffff82cf49ae>] isig+0xae/0x2c0
[ 1483.473033] [<ffffffff82cf7762>] n_tty_receive_signal_char+0x22/0xf0
[ 1483.473033] [<ffffffff82cfb332>]
n_tty_receive_char_special+0x10d2/0x2b30
[ 1483.473033] [<ffffffff82cfe733>]
n_tty_receive_buf_common+0x19a3/0x2400
[ 1483.473033] [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
[ 1483.473033] [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
[ 1483.473033] [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033] [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033] [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033] [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #2 (&o_tty->termios_rwsem/1){++++..}:
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff8632e337>] down_read+0x47/0x60
[ 1483.473033] [<ffffffff82cfce1d>] n_tty_receive_buf_common+0x8d/0x2400
[ 1483.473033] [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
[ 1483.473033] [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
[ 1483.473033] [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033] [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033] [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033] [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #1 (&port->buf.lock/1){+.+...}:
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033] [<ffffffff82d1ef53>] n_tracesink_open+0x23/0xf0
[ 1483.473033] [<ffffffff82d03978>] tty_ldisc_open.isra.2+0x78/0xd0
[ 1483.473033] [<ffffffff82d03f62>] tty_set_ldisc+0x292/0x8a0
[ 1483.473033] [<ffffffff82cecbbe>] tty_ioctl+0xb2e/0x2160
[ 1483.473033] [<ffffffff817ebfbc>] do_vfs_ioctl+0x18c/0xfb0
[ 1483.473033] [<ffffffff817ece6f>] SyS_ioctl+0x8f/0xc0
[ 1483.473033] [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 1483.473033]
-> #0 (writelock){+.+...}:
[ 1483.473033] [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033] [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
[ 1483.473033] [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
[ 1483.473033] [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033] [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033] [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033] [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033] [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
[ 1483.473033] other info that might help us debug this:
[ 1483.473033]
[ 1483.473033] Chain exists of:
writelock --> &buf->lock --> routelock

[ 1483.473033] Possible unsafe locking scenario:
[ 1483.473033]
[ 1483.473033] CPU0 CPU1
[ 1483.473033] ---- ----
[ 1483.473033] lock(routelock);
[ 1483.473033] lock(&buf->lock);
[ 1483.473033] lock(routelock);
[ 1483.473033] lock(writelock);
[ 1483.473033]
[ 1483.473033] *** DEADLOCK ***
[ 1483.473033]
[ 1483.473033] 5 locks held by kworker/u9:0/10808:
[ 1483.473033] #0: ("events_unbound"){.+.+.+}, at:
[<ffffffff8139fc3a>] process_one_work+0x69a/0x1440
[ 1483.473033] #1: ((&buf->work)){+.+...}, at: [<ffffffff8139fc6a>]
process_one_work+0x6ca/0x1440
[ 1483.473033] #2: (&tty->ldisc_sem){++++++}, at:
[<ffffffff82d0339b>] tty_ldisc_ref+0x1b/0x80
[ 1483.473033] #3: (&buf->lock){+.+...}, at: [<ffffffff82d05fe1>]
flush_to_ldisc+0xe1/0x7f0
[ 1483.473033] #4: (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]
[ 1483.473033] stack backtrace:
[ 1483.473033] CPU: 1 PID: 10808 Comm: kworker/u9:0 Not tainted 4.4.0+ #275
[ 1483.473033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 1483.473033] Workqueue: events_unbound flush_to_ldisc
[ 1483.473033] 00000000ffffffff ffff8800334ff7a0 ffffffff82994c8d
ffffffff88faf110
[ 1483.473033] ffffffff88fdabb0 ffffffff88ff1ef0 ffff8800334ff7f0
ffffffff8144fb58
[ 1483.473033] ffff88003a290000 ffff88003a2908ea 0000000000000000
ffff88003a2908c8
[ 1483.473033] Call Trace:
[ 1483.473033] [<ffffffff82994c8d>] dump_stack+0x6f/0xa2
[ 1483.473033] [<ffffffff8144fb58>] print_circular_bug+0x288/0x340
[ 1483.473033] [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
[ 1483.473033] [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033] [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033] [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033] [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033] [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
[ 1483.473033] [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033] [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
[ 1483.473033] [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
[ 1483.473033] [<ffffffff82d1e9b0>] ? n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033] [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
[ 1483.473033] [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033] [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033] [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033] [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
[ 1483.473033] [<ffffffff82d1e990>] ? n_tracerouter_write+0x20/0x20
[ 1483.473033] [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
[ 1483.473033] [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033] [<ffffffff82d05f00>] ? tty_buffer_unlock_exclusive+0x120/0x120
[ 1483.473033] [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033] [<ffffffff8139fc6a>] ? process_one_work+0x6ca/0x1440
[ 1483.473033] [<ffffffff8139f5a0>] ? pwq_dec_nr_in_flight+0x2e0/0x2e0
[ 1483.473033] [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033] [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033] [<ffffffff813a09e0>] ? process_one_work+0x1440/0x1440
[ 1483.473033] [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
[ 1483.473033] [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
[ 1483.473033] [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033] [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0

Peter Hurley

Jan 25, 2016, 11:56:31 AM
to Dmitry Vyukov, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
May or may not be related. And this report suffers from the same
deficiency as the previous one, which is that the saved
stack trace doesn't match the lock dependency.

Without understanding why the stack trace doesn't match the lock
dependency, any solution is just based on conjecture.

Regards,
Peter Hurley

Peter Hurley

Feb 2, 2016, 11:24:30 PM
to Dmitry Vyukov, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hi Dmitry,
Wondering if this is still the priority it was not so long ago?
If not, that's fine and I'll drop this from my followup list.

Dmitry Vyukov

Feb 3, 2016, 12:33:07 PM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Yes, it is still the priority for me.
I've tried to apply your debugging patch, but I noticed that it prints
dependency stacks as it discovers them. But in my setup I don't have
all the output from machine start (there is just too much of it). And I
don't have a localized reproducer for this.

I will try again.

Do you want me to debug with your "tty: Fix lock inversion in
N_TRACEROUTER" patch applied or not (I still see slightly different
deadlock reports with it)?

Peter Hurley

Feb 3, 2016, 2:09:10 PM
to Dmitry Vyukov, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Yeah, that's the point; I need to understand why lockdep doesn't
store the correct stack trace at dependency discovery.

Since the correct stack trace will be printed instead, it will help
debug the lockdep problem.

Hopefully, once the problem with the bad stacktraces is fixed, the
actual circular lock dependencies will be clear.

> But in my setup I don't have
> all output from machine start (there is just too many of it).

Kernel parameter:

log_buf_len=1G


> And I don't have a localized reproducer for this.

I really just need the lockdep dependency stacks generated during boot,
and the ctrl+C in a terminal window to trigger one of the dependency
stacks.

> I will try again.

Ok.

> Do you want me to debug with your "tty: Fix lock inversion in
> N_TRACEROUTER" patch applied or not (I still see slightly different
> deadlock reports with it)?

Not.

I think that probably does fix at least one circular dependency, but
I want to figure out the bad stack trace problem first.

There's probably another circular dependency there, as indicated by
your other report.

Regards,
Peter Hurley

Dmitry Vyukov

Feb 4, 2016, 7:39:46 AM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Here is debug output:
https://gist.githubusercontent.com/dvyukov/b18181c849fdd3d51c80/raw/e91ead683fec020f64eed6750aa9f6347d43b9f9/gistfile1.txt

In particular the ctrl+C dependency is:

new dependency: (&o_tty->termios_rwsem/1){++++..} => (&buf->lock){+.+...}
[ 216.817400] Call Trace:
[ 216.817400] [<ffffffff82be450d>] dump_stack+0x6f/0xa2
[ 216.817400] [<ffffffff8145b149>] __lock_acquire+0x4859/0x5710
[ 216.817400] [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
[ 216.817400] [<ffffffff86656871>] mutex_lock_nested+0xb1/0xa50
[ 216.817400] [<ffffffff82f9f08f>] tty_buffer_flush+0xbf/0x3c0
[ 216.817400] [<ffffffff82fa330c>] pty_flush_buffer+0x5c/0x180
[ 216.817400] [<ffffffff82f97a05>] tty_driver_flush_buffer+0x65/0x80
[ 216.817400] [<ffffffff82f8d162>] isig+0x172/0x2c0
[ 216.817400] [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
[ 216.817400] [<ffffffff82f93a4e>] n_tty_receive_char_special+0x126e/0x2b30
[ 216.817400] [<ffffffff82f96cb3>] n_tty_receive_buf_common+0x19a3/0x2400
[ 216.817400] [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
[ 216.817400] [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
[ 216.817400] [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
[ 216.817400] [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
[ 216.817400] [<ffffffff813b7edf>] kthread+0x23f/0x2d0
[ 216.817400] [<ffffffff866608ef>] ret_from_fork+0x3f/0x70

While in report it still looks as:

-> #3 (&buf->lock){+.+...}:
[ 1544.187872] [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
[ 1544.187872] [<ffffffff8665fecf>] _raw_spin_lock_irqsave+0x9f/0xd0
[ 1544.187872] [<ffffffff82f7c810>] tty_get_pgrp+0x20/0x80
[ 1544.187872] [<ffffffff82f8afca>] __isig+0x1a/0x50
[ 1544.187872] [<ffffffff82f8d09e>] isig+0xae/0x2c0
[ 1544.187872] [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
[ 1544.187872] [<ffffffff82f93a6d>]
n_tty_receive_char_special+0x128d/0x2b30
[ 1544.187872] [<ffffffff82f96cb3>]
n_tty_receive_buf_common+0x19a3/0x2400
[ 1544.187872] [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
[ 1544.187872] [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
[ 1544.187872] [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
[ 1544.187872] [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
[ 1544.187872] [<ffffffff813b7edf>] kthread+0x23f/0x2d0
[ 1544.187872] [<ffffffff866608ef>] ret_from_fork+0x3f/0x70


It seems to me that tty_get_pgrp is a red herring. The ctrl lock is not
mentioned in the reports, and isig indeed calls __isig/tty_get_pgrp just
before tty_driver_flush_buffer, so it looks like a stack unwinding bug.

Dmitry Vyukov

Feb 4, 2016, 8:18:01 AM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Found a bug in lockdep. Yes, the first stack is correct, and the saved
stack is just a random, unrelated stack. Will mail a fix.

Peter Hurley

Feb 4, 2016, 1:46:21 PM
to Dmitry Vyukov, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On 02/04/2016 05:17 AM, Dmitry Vyukov wrote:
> On Thu, Feb 4, 2016 at 1:39 PM, Dmitry Vyukov <dvy...@google.com> wrote:
>> On Wed, Feb 3, 2016 at 8:09 PM, Peter Hurley <pe...@hurleysoftware.com> wrote:

>>> Hopefully, once the problem with the bad stacktraces are fixed, the
>>> actual circular lock dependencies will be clear.

> Found a bug in lockdep. Yes, the first stack is correct, and the saved
> stack is just a random, unrelated stack. Will mail a fix.

Ok, so with the lockdep bug fixed, those reports should now have the
correct stack traces.

I'd appreciate it if you could re-generate the original report, so I can
examine whether my speculative fix was appropriate.

And then if you would re-generate the more recent report, I'll relook at
that.

Regards,
Peter Hurley



Dmitry Vyukov

Feb 4, 2016, 1:48:33 PM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Working hard on it now.

Dmitry Vyukov

Feb 5, 2016, 4:22:30 PM
to Peter Hurley, Peter Zijlstra, Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Thu, Feb 4, 2016 at 7:48 PM, Dmitry Vyukov <dvy...@google.com> wrote:
>>>>> Hopefully, once the problem with the bad stacktraces are fixed, the
>>>>> actual circular lock dependencies will be clear.
>>
>>> Found a bug in lockdep. Yes, the first stack is correct, and the saved
>>> stack is just a random, unrelated stack. Will mail a fix.
>>
>> Ok, so with the lockdep bug fixed, those reports should now have the
>> correct stack traces.
>>
>> I'd appreciate if you could re-generate the original report, so I can examine
>> if my speculative fix was appropriate.
>>
>> And then if you would re-generate the more recent report, I'll relook at
>> that.
>
> Working hard on it now.

No luck so far. I will leave it running, but I will be OOO the whole of next week.