
NBD: Disconnecting the connection/killing the NBD server causes a kernel BUG, even a kernel hang


Sheng Yang

Sep 16, 2015, 9:20:05 PM
Hi, Markus,

I've experienced a kernel BUG and system hang when using
nbd-client/nbd-server on the same machine. The kernel version is v4.2.

I know http://nbd.sourceforge.net/ said:
"Please note that read-write nbd with client and server on the same
machine is a bad idea: expect deadlock within seconds". But it's
mentioned at http://thread.gmane.org/gmane.linux.kernel/869784 that
NBD should be able to handle client/server on the same machine.

The bug can easily be reproduced like this:

truncate -s 1G nbd.vol
mkfs.ext4 nbd.vol
<add nbd.vol to nbd-server's config, named "nbd-vol">
service nbd-server restart

nbd-client localhost /dev/nbd0 -N nbd-vol
mount /dev/nbd0 /mnt/tmp
wget https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.2.tar.xz
--limit-rate=100k

<then in another console>
pkill nbd-server

Then the following would be triggered:

[ 2886.348891] block nbd0: Attempted send on closed socket
[ 2886.350579] blk_update_request: I/O error, dev nbd0, sector 1016
[ 2886.352210] EXT4-fs warning (device nbd0):
htree_dirblock_to_tree:959: inode #2: lblock 0: comm ls: error -5
reading directory block
[ 2886.352236] block nbd0: Attempted send on closed socket
[ 2886.353646] blk_update_request: I/O error, dev nbd0, sector 524
[ 2886.355267] block nbd0: Attempted send on closed socket
[ 2886.356665] blk_update_request: I/O error, dev nbd0, sector 526
[ 2886.358271] block nbd0: Attempted send on closed socket
[ 2886.359654] blk_update_request: I/O error, dev nbd0, sector 528
[ 2886.361253] block nbd0: Attempted send on closed socket
[ 2886.362597] blk_update_request: I/O error, dev nbd0, sector 530
[ 2886.364106] block nbd0: Attempted send on closed socket
[ 2886.365400] blk_update_request: I/O error, dev nbd0, sector 532
[ 2886.366905] block nbd0: Attempted send on closed socket
[ 2886.368205] blk_update_request: I/O error, dev nbd0, sector 534
[ 2886.369714] block nbd0: Attempted send on closed socket
[ 2886.371025] blk_update_request: I/O error, dev nbd0, sector 536
[ 2886.372560] block nbd0: Attempted send on closed socket
[ 2886.373871] blk_update_request: I/O error, dev nbd0, sector 538
[ 2886.375283] block nbd0: Attempted send on closed socket
[ 2886.376504] blk_update_request: I/O error, dev nbd0, sector 540
...
[ 2886.405309] block nbd0: Attempted send on closed socket
[ 2886.406440] EXT4-fs error (device nbd0): __ext4_get_inode_loc:3927:
inode #2: block 261: comm ls: unable to read itable block
[ 2886.408816] ------------[ cut here ]------------
[ 2886.409800] kernel BUG at fs/buffer.c:3025!
[ 2886.409800] invalid opcode: 0000 [#1] SMP
[ 2886.409800] Modules linked in:
[ 2886.409800] CPU: 3 PID: 17137 Comm: ls Not tainted 4.2.0 #11
[ 2886.409800] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20150306_163512-brownie 04/01/2014
[ 2886.409800] task: ffff8800b8a50000 ti: ffff8800b7b64000 task.ti:
ffff8800b7b64000
[ 2886.409800] RIP: 0010:[<ffffffff811a778e>] [<ffffffff811a778e>]
submit_bh_wbc+0x16e/0x180
[ 2886.409800] RSP: 0018:ffff8800b7b67b58 EFLAGS: 00010246
[ 2886.409800] RAX: 0000000000000005 RBX: ffff8800adb849c0 RCX: 0000000000000000
[ 2886.409800] RDX: 0000000000000000 RSI: ffff8800adb849c0 RDI: 0000000000000411
[ 2886.409800] RBP: ffff8800b7b67b88 R08: fffffffffffffff0 R09: 0000000000000000
[ 2886.409800] R10: 0000000000000218 R11: 0000000000000006 R12: 0000000000000411
[ 2886.409800] R13: ffff8800b88fc400 R14: 000000000002facc R15: ffff88013a390000
[ 2886.409800] FS: 00007f83cf856840(0000) GS:ffff88013fd80000(0000)
knlGS:0000000000000000
[ 2886.409800] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2886.409800] CR2: 0000000000f75388 CR3: 0000000139896000 CR4: 00000000000006e0
[ 2886.409800] Stack:
[ 2886.409800] ffff88013a395358 ffff8800adb849c0 0000000000000411
ffff8800b88fc400
[ 2886.409800] 000000000002facc ffff88013a390000 ffff8800b7b67ba8
ffffffff811a9449
[ 2886.409800] ffffffff81f209e0 ffff8800adb849c0 ffff8800b7b67bb8
ffffffff811a94ee
[ 2886.409800] Call Trace:
[ 2886.409800] [<ffffffff811a9449>] __sync_dirty_buffer+0x59/0xf0
[ 2886.409800] [<ffffffff811a94ee>] sync_dirty_buffer+0xe/0x10
[ 2886.409800] [<ffffffff8120de31>] ext4_commit_super+0x181/0x200
[ 2886.409800] [<ffffffff8120e255>] __ext4_error_inode+0x85/0x140
[ 2886.409800] [<ffffffff811f0fdd>] __ext4_get_inode_loc+0x1fd/0x3d0
[ 2886.409800] [<ffffffff811f2a78>] ext4_get_inode_loc+0x18/0x20
[ 2886.409800] [<ffffffff811f442f>] ext4_reserve_inode_write+0x1f/0x80
[ 2886.409800] [<ffffffff811f7773>] ? ext4_dirty_inode+0x43/0x60
[ 2886.409800] [<ffffffff811f44d8>] ext4_mark_inode_dirty+0x48/0x1e0
[ 2886.409800] [<ffffffff811f7773>] ext4_dirty_inode+0x43/0x60
[ 2886.409800] [<ffffffff8119e603>] __mark_inode_dirty+0x163/0x310
[ 2886.409800] [<ffffffff8118d932>] generic_update_time+0x72/0xc0
[ 2886.409800] [<ffffffff810a8bd1>] ? current_fs_time+0x11/0x60
[ 2886.409800] [<ffffffff8118f143>] touch_atime+0x83/0xa0
[ 2886.409800] [<ffffffff81187096>] iterate_dir+0xd6/0x120
[ 2886.409800] [<ffffffff8118749c>] SyS_getdents+0x7c/0xe0
[ 2886.409800] [<ffffffff811871a0>] ? fillonedir+0xc0/0xc0
[ 2886.409800] [<ffffffff810485fc>] ? trace_do_page_fault+0x3c/0xb0
[ 2886.409800] [<ffffffff818cc8ae>] entry_SYSCALL_64_fastpath+0x12/0x71
[ 2886.409800] Code: 09 e7 14 00 48 83 c4 08 31 c0 5b 41 5c 41 5d 41
5e 41 5f 5d c3 40 f6 c7 01 0f 84 ff fe ff ff f0 80 63 01 f7 e9 f5 fe
ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 66 2e 0f 1f 84 00 00 00 00 00 55
31 c9
[ 2886.409800] RIP [<ffffffff811a778e>] submit_bh_wbc+0x16e/0x180
[ 2886.409800] RSP <ffff8800b7b67b58>
[ 2886.461658] ---[ end trace 46f00e222737016e ]---


I am trying to use NBD in an HA scenario, and I hit the BUG above using
the default nbd-server. After this, any access to /mnt/tmp or /dev/nbd0
hangs.

We also have a customized nbd-server, and in that case killing the
connection results in a recursive fault and completely hangs the
system. Something like this happened:

[ 170.552688] block nbd0: Receive control failed (result -32)
[ 171.253234] block nbd0: Attempted send on closed socket
[ 171.254772] blk_update_request: I/O error, dev nbd0, sector 8464
[ 171.256586] block nbd0: Attempted send on closed socket
[ 171.258044] blk_update_request: I/O error, dev nbd0, sector 8720
[ 171.259953] block nbd0: Attempted send on closed socket
[ 171.261415] blk_update_request: I/O error, dev nbd0, sector 8976
[ 171.263300] block nbd0: Attempted send on closed socket
[ 171.264719] blk_update_request: I/O error, dev nbd0, sector 9232
[ 171.266570] block nbd0: Attempted send on closed socket
[ 171.268016] blk_update_request: I/O error, dev nbd0, sector 9488
[ 171.269604] block nbd0: Attempted send on closed socket
[ 171.270903] blk_update_request: I/O error, dev nbd0, sector 9744
[ 171.272470] block nbd0: Attempted send on closed socket
[ 171.273904] blk_update_request: I/O error, dev nbd0, sector 10000
[ 171.275523] block nbd0: Attempted send on closed socket
[ 171.276815] blk_update_request: I/O error, dev nbd0, sector 10256
[ 171.278499] block nbd0: Attempted send on closed socket
[ 171.279851] blk_update_request: I/O error, dev nbd0, sector 10512
[ 171.281383] block nbd0: Attempted send on closed socket
[ 171.282592] blk_update_request: I/O error, dev nbd0, sector 10768
[ 171.284021] block nbd0: Attempted send on closed socket
[ 171.285231] block nbd0: Attempted send on closed socket
[ 171.286568] block nbd0: Attempted send on closed socket
[ 171.287973] block nbd0: Attempted send on closed socket
[ 171.289283] block nbd0: Attempted send on closed socket
[ 171.290565] block nbd0: Attempted send on closed socket
[ 173.711085] ------------[ cut here ]------------
[ 173.712036] kernel BUG at fs/buffer.c:3025!
[ 173.712036] invalid opcode: 0000 [#1] SMP
[ 173.712036] Modules linked in:
[ 173.712036] CPU: 0 PID: 2328 Comm: jbd2/nbd0-8 Not tainted 4.2.0 #10
[ 173.712036] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20150306_163512-brownie 04/01/2014
[ 173.712036] task: ffff8800bb8d0000 ti: ffff8800ad154000 task.ti:
ffff8800ad154000
[ 173.712036] RIP: 0010:[<ffffffff811a778e>] [<ffffffff811a778e>]
submit_bh_wbc+0x16e/0x180
[ 173.712036] RSP: 0018:ffff8800ad157bf8 EFLAGS: 00010246
[ 173.712036] RAX: 0000000004000005 RBX: ffff880139eb6e38 RCX: 0000000000000000
[ 173.712036] RDX: 0000000000000000 RSI: ffff880139eb6e38 RDI: 0000000000000411
[ 173.712036] RBP: ffff8800ad157c28 R08: ffff8800ad154000 R09: 0000000000000001
[ 173.712036] R10: 00000000fffe12ab R11: 0000000000000020 R12: ffff880135f04000
[ 173.712036] R13: 0000000000000411 R14: ffff880139eb6e38 R15: ffff8800b7855000
[ 173.712036] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000)
knlGS:0000000000000000
[ 173.712036] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 173.712036] CR2: 00007fe561c38000 CR3: 0000000001e0c000 CR4: 00000000000006f0
[ 173.712036] Stack:
[ 173.712036] 0000000000000000 ffff880135f04000 ffff880135f04000
0000000000000411
[ 173.712036] ffff880139eb6e38 ffff8800b7855000 ffff8800ad157c38
ffffffff811a77bd
[ 173.712036] ffff8800ad157c78 ffffffff81241c17 0000000000000000
ffff880135f04000
[ 173.712036] Call Trace:
[ 173.712036] [<ffffffff811a77bd>] submit_bh+0xd/0x10
[ 173.712036] [<ffffffff81241c17>] jbd2_write_superblock+0x87/0x170
[ 173.712036] [<ffffffff812425f2>] jbd2_journal_update_sb_log_tail+0x32/0x80
[ 173.712036] [<ffffffff818cab91>] ? mutex_lock+0x11/0x40
[ 173.712036] [<ffffffff8123b4ef>]
jbd2_journal_commit_transaction+0x5ff/0x1680
[ 173.712036] [<ffffffff81080513>] ? dequeue_entity+0x163/0x710
[ 173.712036] [<ffffffff81083b7d>] ? dequeue_task_fair+0x50d/0x7b0
[ 173.712036] [<ffffffff810a9c33>] ? lock_timer_base.isra.35+0x53/0x70
[ 173.712036] [<ffffffff810a9e0a>] ? try_to_del_timer_sync+0x4a/0x60
[ 173.712036] [<ffffffff8123fa36>] kjournald2+0xb6/0x230
[ 173.712036] [<ffffffff8108c3c0>] ? wait_woken+0x80/0x80
[ 173.712036] [<ffffffff8123f980>] ? commit_timeout+0x10/0x10
[ 173.712036] [<ffffffff8106ec44>] kthread+0xc4/0xe0
[ 173.712036] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 173.712036] [<ffffffff818ccc9f>] ret_from_fork+0x3f/0x70
[ 173.712036] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 173.712036] Code: 09 e7 14 00 48 83 c4 08 31 c0 5b 41 5c 41 5d 41
5e 41 5f 5d c3 40 f6 c7 01 0f 84 ff fe ff ff f0 80 63 01 f7 e9 f5 fe
ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 66 2e 0f 1f 84 00 00 00 00 00 55
31 c9
[ 173.712036] RIP [<ffffffff811a778e>] submit_bh_wbc+0x16e/0x180
[ 173.712036] RSP <ffff8800ad157bf8>
[ 173.776031] ---[ end trace d41ec1552544a207 ]---
[ 199.615574] block nbd0: Attempted send on closed socket
[ 199.617102] blk_update_request: 6 callbacks suppressed
[ 199.618090] blk_update_request: I/O error, dev nbd0, sector 8456
[ 199.620187] EXT4-fs error (device nbd0): __ext4_get_inode_loc:3927:
inode #12: block 1057: comm kworker/u8:1: unable to read itable block
[ 199.623876] ------------[ cut here ]------------
[ 199.624839] kernel BUG at fs/buffer.c:3025!
[ 199.624839] invalid opcode: 0000 [#2] SMP
[ 199.624839] Modules linked in:
[ 199.624839] CPU: 2 PID: 25 Comm: kworker/u8:1 Tainted: G D
4.2.0 #10
[ 199.624839] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20150306_163512-brownie 04/01/2014
[ 199.624839] Workqueue: writeback wb_workfn (flush-43:0)
[ 199.624839] task: ffff88013abbd240 ti: ffff88013a424000 task.ti:
ffff88013a424000
[ 199.624839] RIP: 0010:[<ffffffff811a778e>] [<ffffffff811a778e>]
submit_bh_wbc+0x16e/0x180
[ 199.624839] RSP: 0018:ffff88013a4274e8 EFLAGS: 00010246
[ 199.624839] RAX: 0000000000620005 RBX: ffff8800bb78af70 RCX: 0000000000000000
[ 199.624839] RDX: 0000000000000000 RSI: ffff8800bb78af70 RDI: 0000000000000411
[ 199.624839] RBP: ffff88013a427518 R08: fffffffffffffff0 R09: 0000000000000000
[ 199.624839] R10: 0000000000000226 R11: 0000000000000006 R12: 0000000000000411
[ 199.624839] R13: ffff8800b8d70400 R14: 0000000000043220 R15: ffff880135f02800
[ 199.624839] FS: 0000000000000000(0000) GS:ffff88013fd00000(0000)
knlGS:0000000000000000
[ 199.624839] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 199.624839] CR2: 0000000002024808 CR3: 00000000b8630000 CR4: 00000000000006e0
[ 199.624839] Stack:
[ 199.624839] ffff880135f06358 ffff8800bb78af70 0000000000000411
ffff8800b8d70400
[ 199.624839] 0000000000043220 ffff880135f02800 ffff88013a427538
ffffffff811a9449
[ 199.624839] ffffffff81f209e0 ffff8800bb78af70 ffff88013a427548
ffffffff811a94ee
[ 199.624839] Call Trace:
[ 199.624839] [<ffffffff811a9449>] __sync_dirty_buffer+0x59/0xf0
[ 199.624839] [<ffffffff811a94ee>] sync_dirty_buffer+0xe/0x10
[ 199.624839] [<ffffffff8120de31>] ext4_commit_super+0x181/0x200
[ 199.624839] [<ffffffff8120e255>] __ext4_error_inode+0x85/0x140
[ 199.624839] [<ffffffff8108c400>] ? autoremove_wake_function+0x40/0x40
[ 199.624839] [<ffffffff811f0fdd>] __ext4_get_inode_loc+0x1fd/0x3d0
[ 199.624839] [<ffffffff811f2a78>] ext4_get_inode_loc+0x18/0x20
[ 199.624839] [<ffffffff811f442f>] ext4_reserve_inode_write+0x1f/0x80
[ 199.624839] [<ffffffff81218fdd>] ? __ext4_ext_dirty+0x6d/0x90
[ 199.624839] [<ffffffff811f44d8>] ext4_mark_inode_dirty+0x48/0x1e0
[ 199.624839] [<ffffffff81218fdd>] __ext4_ext_dirty+0x6d/0x90
[ 199.624839] [<ffffffff8121abdc>] ext4_ext_insert_extent+0x6ac/0x1310
[ 199.624839] [<ffffffff8121e098>] ext4_ext_map_blocks+0x258/0xdc0
[ 199.624839] [<ffffffff811f1ba4>] ext4_map_blocks+0x124/0x4f0
[ 199.624839] [<ffffffff811f4b51>] ext4_writepages+0x4e1/0xc50
[ 199.624839] [<ffffffff8132ab85>] ? find_next_bit+0x15/0x20
[ 199.624839] [<ffffffff81084cbb>] ? find_busiest_group+0x11b/0x980
[ 199.624839] [<ffffffff81123199>] do_writepages+0x19/0x30
[ 199.624839] [<ffffffff8119e7f9>] __writeback_single_inode+0x49/0x2b0
[ 199.624839] [<ffffffff8119ef64>] writeback_sb_inodes+0x274/0x4c0
[ 199.624839] [<ffffffff8119f234>] __writeback_inodes_wb+0x84/0xb0
[ 199.624839] [<ffffffff8119f47f>] wb_writeback+0x21f/0x290
[ 199.624839] [<ffffffff8119fc9e>] wb_workfn+0x2fe/0x480
[ 199.624839] [<ffffffff810698a6>] process_one_work+0x156/0x3e0
[ 199.624839] [<ffffffff81069e59>] worker_thread+0x49/0x450
[ 199.624839] [<ffffffff81069e10>] ? rescuer_thread+0x2e0/0x2e0
[ 199.624839] [<ffffffff81069e10>] ? rescuer_thread+0x2e0/0x2e0
[ 199.624839] [<ffffffff8106ec44>] kthread+0xc4/0xe0
[ 199.624839] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 199.624839] [<ffffffff818ccc9f>] ret_from_fork+0x3f/0x70
[ 199.624839] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 199.624839] Code: 09 e7 14 00 48 83 c4 08 31 c0 5b 41 5c 41 5d 41
5e 41 5f 5d c3 40 f6 c7 01 0f 84 ff fe ff ff f0 80 63 01 f7 e9 f5 fe
ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 66 2e 0f 1f 84 00 00 00 00 00 55
31 c9
[ 199.624839] RIP [<ffffffff811a778e>] submit_bh_wbc+0x16e/0x180
[ 199.624839] RSP <ffff88013a4274e8>
[ 199.706502] ---[ end trace d41ec1552544a208 ]---
[ 199.707614] kworker/u8:1 (25) used greatest stack depth: 11104 bytes left
[ 199.708873] BUG: unable to handle kernel paging request at ffffffffffffffd8
[ 199.709861] IP: [<ffffffff8106f1cb>] kthread_data+0xb/0x20
[ 199.709861] PGD 1e0d067 PUD 1e0f067 PMD 0
[ 199.709861] Oops: 0000 [#3] SMP
[ 199.709861] Modules linked in:
[ 199.709861] CPU: 2 PID: 25 Comm: kworker/u8:1 Tainted: G D
4.2.0 #10
[ 199.709861] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20150306_163512-brownie 04/01/2014
[ 199.709861] task: ffff88013abbd240 ti: ffff88013a424000 task.ti:
ffff88013a424000
[ 199.709861] RIP: 0010:[<ffffffff8106f1cb>] [<ffffffff8106f1cb>]
kthread_data+0xb/0x20
[ 199.709861] RSP: 0018:ffff88013a427188 EFLAGS: 00010092
[ 199.709861] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000000d
[ 199.709861] RDX: 000000000000000d RSI: 0000000000000002 RDI: ffff88013abbd240
[ 199.709861] RBP: ffff88013a427188 R08: 0000000000000000 R09: 0000000000000000
[ 199.709861] R10: 0000000000000000 R11: 0000000000000006 R12: 0000000000014fc0
[ 199.709861] R13: ffff88013fd14fc0 R14: ffff88013abbd240 R15: 0000000000000002
[ 199.709861] FS: 0000000000000000(0000) GS:ffff88013fd00000(0000)
knlGS:0000000000000000
[ 199.709861] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 199.709861] CR2: 0000000000000028 CR3: 00000000b8630000 CR4: 00000000000006e0
[ 199.709861] Stack:
[ 199.709861] ffff88013a4271a8 ffffffff8106a2d0 ffff88013a4271a8
ffff88013fd14fc0
[ 199.709861] ffff88013a4271f8 ffffffff818c8dac ffff880100000000
ffff88013abbd240
[ 199.709861] ffff88013a4271f8 ffff88013a428000 0000000000002b60
ffff88013a426dd8
[ 199.709861] Call Trace:
[ 199.709861] [<ffffffff8106a2d0>] wq_worker_sleeping+0x10/0x90
[ 199.709861] [<ffffffff818c8dac>] __schedule+0x70c/0x900
[ 199.709861] [<ffffffff818c8fd2>] schedule+0x32/0x80
[ 199.709861] [<ffffffff810561b8>] do_exit+0x718/0xb00
[ 199.709861] [<ffffffff81006c79>] oops_end+0x99/0xd0
[ 199.709861] [<ffffffff81007036>] die+0x46/0x60
[ 199.709861] [<ffffffff81003d38>] do_trap+0x138/0x140
[ 199.709861] [<ffffffff810043a5>] do_error_trap+0x75/0xe0
[ 199.709861] [<ffffffff811a778e>] ? submit_bh_wbc+0x16e/0x180
[ 199.709861] [<ffffffff813d1e1f>] ? univ8250_console_write+0x19f/0x2e0
[ 199.709861] [<ffffffff81099597>] ? print_time.part.13+0x67/0x90
[ 199.709861] [<ffffffff81099625>] ? print_prefix+0x65/0xa0
[ 199.709861] [<ffffffff8100451b>] do_invalid_op+0x1b/0x20
[ 199.709861] [<ffffffff818ce19e>] invalid_op+0x1e/0x30
[ 199.709861] [<ffffffff811a778e>] ? submit_bh_wbc+0x16e/0x180
[ 199.709861] [<ffffffff810a9bcc>] ? internal_add_timer+0x6c/0x80
[ 199.709861] [<ffffffff811a9449>] __sync_dirty_buffer+0x59/0xf0
[ 199.709861] [<ffffffff811a94ee>] sync_dirty_buffer+0xe/0x10
[ 199.709861] [<ffffffff8120de31>] ext4_commit_super+0x181/0x200
[ 199.709861] [<ffffffff8120e255>] __ext4_error_inode+0x85/0x140
[ 199.709861] [<ffffffff8108c400>] ? autoremove_wake_function+0x40/0x40
[ 199.709861] [<ffffffff811f0fdd>] __ext4_get_inode_loc+0x1fd/0x3d0
[ 199.709861] [<ffffffff811f2a78>] ext4_get_inode_loc+0x18/0x20
[ 199.709861] [<ffffffff811f442f>] ext4_reserve_inode_write+0x1f/0x80
[ 199.709861] [<ffffffff81218fdd>] ? __ext4_ext_dirty+0x6d/0x90
[ 199.709861] [<ffffffff811f44d8>] ext4_mark_inode_dirty+0x48/0x1e0
[ 199.709861] [<ffffffff81218fdd>] __ext4_ext_dirty+0x6d/0x90
[ 199.709861] [<ffffffff8121abdc>] ext4_ext_insert_extent+0x6ac/0x1310
[ 199.709861] [<ffffffff8121e098>] ext4_ext_map_blocks+0x258/0xdc0
[ 199.709861] [<ffffffff811f1ba4>] ext4_map_blocks+0x124/0x4f0
[ 199.709861] [<ffffffff811f4b51>] ext4_writepages+0x4e1/0xc50
[ 199.709861] [<ffffffff8132ab85>] ? find_next_bit+0x15/0x20
[ 199.709861] [<ffffffff81084cbb>] ? find_busiest_group+0x11b/0x980
[ 199.709861] [<ffffffff81123199>] do_writepages+0x19/0x30
[ 199.709861] [<ffffffff8119e7f9>] __writeback_single_inode+0x49/0x2b0
[ 199.709861] [<ffffffff8119ef64>] writeback_sb_inodes+0x274/0x4c0
[ 199.709861] [<ffffffff8119f234>] __writeback_inodes_wb+0x84/0xb0
[ 199.709861] [<ffffffff8119f47f>] wb_writeback+0x21f/0x290
[ 199.709861] [<ffffffff8119fc9e>] wb_workfn+0x2fe/0x480
[ 199.709861] [<ffffffff810698a6>] process_one_work+0x156/0x3e0
[ 199.709861] [<ffffffff81069e59>] worker_thread+0x49/0x450
[ 199.709861] [<ffffffff81069e10>] ? rescuer_thread+0x2e0/0x2e0
[ 199.709861] [<ffffffff81069e10>] ? rescuer_thread+0x2e0/0x2e0
[ 199.709861] [<ffffffff8106ec44>] kthread+0xc4/0xe0
[ 199.709861] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 199.709861] [<ffffffff818ccc9f>] ret_from_fork+0x3f/0x70
[ 199.709861] [<ffffffff8106eb80>] ? kthread_create_on_node+0x180/0x180
[ 199.709861] Code: 8b 03 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48
8b 03 48 85 c0 75 eb e9 32 ff ff ff 0f 1f 40 00 48 8b 87 f8 04 00 00
55 48 89 e5 <48> 8b 40 d8 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00
00 00
[ 199.709861] RIP [<ffffffff8106f1cb>] kthread_data+0xb/0x20
[ 199.709861] RSP <ffff88013a427188>
[ 199.709861] CR2: ffffffffffffffd8
[ 199.709861] ---[ end trace d41ec1552544a209 ]---
[ 199.709861] Fixing recursive fault but reboot is needed!


It looks like when the underlying device is gone, the filesystem keeps
trying to write to it, and this isn't handled gracefully.

Is there any way we can fix it, or perhaps add a queue that holds
pending requests for a relatively long timeout, in case the connection
comes back later?

I would be glad to help work on this issue if needed.

Thanks!


--Sheng

Sheng Yang

Sep 21, 2015, 6:20:11 PM
Thank you Wouter!

(Adding lkml back since I hit another kernel issue...)

In fact I just hit another issue with nbd-server/client on the same
machine. The following command hangs after a while:

dd if=/dev/zero of=./test

When the processes hang, dd/nbd-server no longer show high CPU
utilization. Ctrl-C or any operation on the NBD device's mount point
doesn't work and hangs immediately.

I enabled kernel lockup detection and found this:

[ 240.127132] INFO: task kworker/u8:1:28 blocked for more than 120 seconds.
[ 240.129350] Not tainted 4.3.0-rc2+ #13
[ 240.131392] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.135290] kworker/u8:1 D ffff88013fc95f58 12160 28 2 0x00000000
[ 240.138358] Workqueue: writeback wb_workfn (flush-43:0)
[ 240.140886] ffff88013a46b5d8 0000000000000082 ffffffff8106bd3a
ffffffff00000000
[ 240.144799] ffff8800bafe3d80 ffff88013a460000 ffff88013a46c000
ffff88013fc95f40
[ 240.148712] 7fffffffffffffff ffff880139524e30 ffff880139524e60
ffff88013a46b5f0
[ 240.152568] Call Trace:
[ 240.153915] [<ffffffff8106bd3a>] ? __queue_delayed_work+0x9a/0x190
[ 240.156202] [<ffffffff8191734e>] schedule+0x2e/0x70
[ 240.158188] [<ffffffff8191b41c>] schedule_timeout+0x1bc/0x2b0
[ 240.160373] [<ffffffff810f677a>] ? __delayacct_blkio_start+0x1a/0x30
[ 240.162714] [<ffffffff810f677a>] ? __delayacct_blkio_start+0x1a/0x30
[ 240.164964] [<ffffffff810446de>] ? kvm_clock_get_cycles+0x1e/0x20
[ 240.167126] [<ffffffff810be5b0>] ? ktime_get+0x90/0x110
[ 240.169074] [<ffffffff810f677a>] ? __delayacct_blkio_start+0x1a/0x30
[ 240.171301] [<ffffffff8191677f>] io_schedule_timeout+0x9f/0x110
[ 240.173409] [<ffffffff813193d5>] get_request+0x425/0x7d0
[ 240.175385] [<ffffffff81318fb0>] ?
trace_event_raw_event_block_get_rq+0xe0/0xe0
[ 240.178256] [<ffffffff813201c8>] ? ioc_lookup_icq+0x88/0xd0
[ 240.180206] [<ffffffff81092990>] ? wait_woken+0x80/0x80
[ 240.182038] [<ffffffff8131ceeb>] blk_queue_bio+0xcb/0x2f0
[ 240.183948] [<ffffffff8131870f>] generic_make_request+0xaf/0xf0
[ 240.185932] [<ffffffff813187b2>] submit_bio+0x62/0x140
[ 240.187778] [<ffffffff8113583b>] ? __test_set_page_writeback+0x15b/0x240
[ 240.189837] [<ffffffff81216699>] ext4_io_submit+0x39/0x50
[ 240.191648] [<ffffffff8121685a>] ext4_bio_write_page+0x18a/0x3b0
[ 240.193548] [<ffffffff8120da18>] mpage_submit_page+0x58/0x80
[ 240.195405] [<ffffffff8120dc75>] mpage_map_and_submit_buffers+0x145/0x260
[ 240.197481] [<ffffffff81212d5a>] ext4_writepages+0x5ba/0xcb0
[ 240.199335] [<ffffffff813d5291>] ? vp_notify+0x11/0x20
[ 240.201590] [<ffffffff811367cc>] do_writepages+0x1c/0x30
[ 240.203935] [<ffffffff811b9820>] __writeback_single_inode+0x50/0x300
[ 240.206589] [<ffffffff8191c012>] ? _raw_spin_unlock+0x22/0x30
[ 240.209054] [<ffffffff811ba108>] writeback_sb_inodes+0x298/0x520
[ 240.211589] [<ffffffff811ba414>] __writeback_inodes_wb+0x84/0xb0
[ 240.214146] [<ffffffff811ba730>] wb_writeback+0x250/0x2c0
[ 240.216427] [<ffffffff811bae43>] wb_workfn+0x213/0x390
[ 240.218507] [<ffffffff8106d113>] process_one_work+0x1c3/0x4c0
[ 240.220757] [<ffffffff8106d0a1>] ? process_one_work+0x151/0x4c0
[ 240.223061] [<ffffffff8106d736>] worker_thread+0x46/0x440
[ 240.225225] [<ffffffff8106d6f0>] ? rescuer_thread+0x2e0/0x2e0
[ 240.227508] [<ffffffff81072aff>] kthread+0xdf/0x100
[ 240.229433] [<ffffffff81072a20>] ? kthread_create_on_node+0x200/0x200
[ 240.231809] [<ffffffff8191cb9f>] ret_from_fork+0x3f/0x70
[ 240.233888] [<ffffffff81072a20>] ? kthread_create_on_node+0x200/0x200
[ 240.236301] 4 locks held by kworker/u8:1/28:
[ 240.238074] #0: ("writeback"){......}, at: [<ffffffff8106d0a1>]
process_one_work+0x151/0x4c0
[ 240.241916] #1: ((&(&wb->dwork)->work)){......}, at:
[<ffffffff8106d0a1>] process_one_work+0x151/0x4c0
[ 240.245960] #2: (&type->s_umount_key#31){......}, at:
[<ffffffff8118e9e6>] trylock_super+0x16/0x50
[ 240.249995] #3: (jbd2_handle){......}, at: [<ffffffff81256f72>]
start_this_handle+0x112/0x440
[ 240.253729] INFO: task jbd2/nbd0-8:2235 blocked for more than 120 seconds.
[ 240.256081] Not tainted 4.3.0-rc2+ #13
[ 240.257692] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.260691] jbd2/nbd0-8 D ffff88013fd95f58 13488 2235 2 0x00000000
[ 240.263250] ffff8800ba29bc58 0000000000000086 ffff8800ba29bd40
0000000000000000
[ 240.266402] ffff8801394c9480 ffff8800ba156680 ffff8800ba29c000
ffff8800ba29bd58
[ 240.269519] ffff8800ba156680 ffff8800ba29bd40 ffff8800bb4cea00
ffff8800ba29bc70
[ 240.272677] Call Trace:
[ 240.273805] [<ffffffff8191734e>] schedule+0x2e/0x70
[ 240.275565] [<ffffffff81259d35>]
jbd2_journal_commit_transaction+0x245/0x1760
[ 240.278277] [<ffffffff8108001c>] ? build_sched_domain+0x24c/0x2e0
[ 240.280335] [<ffffffff81092990>] ? wait_woken+0x80/0x80
[ 240.282200] [<ffffffff810b644a>] ? try_to_del_timer_sync+0x4a/0x60
[ 240.284285] [<ffffffff8125e780>] kjournald2+0xc0/0x260
[ 240.286059] [<ffffffff81092990>] ? wait_woken+0x80/0x80
[ 240.287865] [<ffffffff8125e6c0>] ? commit_timeout+0x10/0x10
[ 240.289761] [<ffffffff81072aff>] kthread+0xdf/0x100
[ 240.291485] [<ffffffff81072a20>] ? kthread_create_on_node+0x200/0x200
[ 240.293585] [<ffffffff8191cb9f>] ret_from_fork+0x3f/0x70
[ 240.295404] [<ffffffff81072a20>] ? kthread_create_on_node+0x200/0x200
[ 240.297513] no locks held by jbd2/nbd0-8/2235.
[ 240.299098] INFO: task dd:2255 blocked for more than 120 seconds.
[ 240.301078] Not tainted 4.3.0-rc2+ #13
[ 240.302586] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 240.305422] dd D ffff88013fd95f58 13240 2255 2254 0x00000004
[ 240.307864] ffff880139b3ba38 0000000000000086 0000000000000035
0000000000000000
[ 240.310966] ffff88013ab68000 ffff8801394c9480 ffff880139b3c000
0000000000000000
[ 240.314027] ffff8800bacc8900 0000000000000035 0000000000000000
ffff880139b3ba50
[ 240.317130] Call Trace:
[ 240.318222] [<ffffffff8191734e>] schedule+0x2e/0x70
[ 240.319979] [<ffffffff81256b0f>] wait_transaction_locked+0x7f/0xb0
[ 240.322010] [<ffffffff81092990>] ? wait_woken+0x80/0x80
[ 240.323872] [<ffffffff81256d82>] add_transaction_credits+0x212/0x2a0
[ 240.325980] [<ffffffff81256f65>] start_this_handle+0x105/0x440
[ 240.327882] [<ffffffff81079cf5>] ? finish_task_switch+0xa5/0x1b0
[ 240.329948] [<ffffffff81256a5b>] ? new_handle+0x1b/0x50
[ 240.331771] [<ffffffff812574ea>] jbd2__journal_start+0xba/0x190
[ 240.333716] [<ffffffff8121416c>] ? ext4_da_write_begin+0x10c/0x2d0
[ 240.335806] [<ffffffff8123f13c>] __ext4_journal_start_sb+0x6c/0xd0
[ 240.337835] [<ffffffff8121416c>] ext4_da_write_begin+0x10c/0x2d0
[ 240.339875] [<ffffffff8123f1dd>] ? __ext4_journal_stop+0x3d/0xb0
[ 240.341940] [<ffffffff81128f62>] generic_perform_write+0xa2/0x1e0
[ 240.343930] [<ffffffff811a66d0>] ? file_update_time+0xa0/0xe0
[ 240.345849] [<ffffffff8112b13b>] __generic_file_write_iter+0x18b/0x1e0
[ 240.347949] [<ffffffff81209d60>] ext4_file_write_iter+0x210/0x420
[ 240.349988] [<ffffffff8118e26c>] ? __sb_start_write+0xec/0x100
[ 240.351963] [<ffffffff8118b195>] __vfs_write+0xa5/0xe0
[ 240.353733] [<ffffffff8118b7e4>] vfs_write+0xa4/0x190
[ 240.355512] [<ffffffff8118b71f>] ? vfs_read+0x10f/0x130
[ 240.357310] [<ffffffff8118c4a4>] SyS_write+0x44/0xa0
[ 240.359014] [<ffffffff8191c832>] entry_SYSCALL_64_fastpath+0x12/0x76
[ 240.361142] 2 locks held by dd/2255:
[ 240.362552] #0: (sb_writers#5){......}, at: [<ffffffff8118e26c>]
__sb_start_write+0xec/0x100
[ 240.366119] #1: (&sb->s_type->i_mutex_key#9){......}, at:
[<ffffffff81209cdf>] ext4_file_write_iter+0x18f/0x420

--Sheng

On Sun, Sep 20, 2015 at 11:52 PM, Wouter Verhelst <w...@uter.be> wrote:
> Hi,
>
> [linux-kernel CC dropped, since this is irrelevant for them]
>
> On Wed, Sep 16, 2015 at 06:18:50PM -0700, Sheng Yang wrote:
>> Hi, Markus,
>>
>> I've experienced a kernel BUG and system hang when using
>> nbd-client/nbd-server on the same machine. The kernel version is v4.2.
>>
>> I know http://nbd.sourceforge.net/ said:
>> "Please note that read-write nbd with client and server on the same
>> machine is a bad idea: expect deadlock within seconds". But it's
>> mentioned at http://thread.gmane.org/gmane.linux.kernel/869784 that
>> NBD should be able to handle client/server on the same machine.
>
> Yes, that bug has been fixed. I need to figure out how I can update
> those pages again (sourceforge changed the procedure a few times), and
> remove that note.
>
> (the rest is kernel, so for Markus, not me)
>
> --
> It is easy to love a country that is famous for chocolate and beer
>
> -- Barack Obama, speaking in Brussels, Belgium, 2014-03-26

Sheng Yang

Sep 21, 2015, 8:40:13 PM
Thank you Paul! That's exactly the issue I hit. I've read the whole
thread and got a general idea of the issue.

Let me try to summarize it; please correct me if I'm wrong:

1. The issue is the result of kill_bdev() being called when the
connection is cut while I/O is still in flight.
2. Other block device drivers don't have this issue because they
normally keep the buffers and the device, deny all requests, and only
kill the device once all requests have settled and the device has been
unmounted.
3. NBD cannot handle it the same way because NBD depends on the
userspace nbd-client, which handles the reconnection/retry. If the
DO_IT ioctl doesn't return, there is no way for userspace to
reconnect. Coordinating between kernel and userspace on reconnecting,
while leaving the device open until it's safe to close, allows several
choices and would lead to quite a lot of change.

Goswin's suggestion really makes sense
( http://sourceforge.net/p/nbd/mailman/message/31678340/ ). This issue
has been there for years and impacts the stability of NBD a lot. I
think we should get it fixed.

--Sheng

On Mon, Sep 21, 2015 at 4:06 PM, Paul Clements
<paul.c...@us.sios.com> wrote:
> Hi Sheng,
>
> this is a known issue (not related to nbd-client/nbd-server on the same box,
> but any termination of the nbd connection with mounted fs and I/O ongoing)
>
> It was discussed at length in Nov 2013 in this thread (along with possible
> solutions):
>
> http://sourceforge.net/p/nbd/mailman/nbd-general/thread/20131114075827.GA13554%40quack.suse.cz/#msg31636917
>
> Maybe Markus has some thoughts on how to fix it. I think there are a couple
> options.
>
> --
> Paul

Markus Pargmann

Sep 24, 2015, 2:20:07 AM
Hi,

Thanks for the pointer Paul.

On Mon, Sep 21, 2015 at 05:33:21PM -0700, Sheng Yang wrote:
> Thank you Paul! That's exactly the issue I met. I've read the whole
> thread and got a general idea of the issue.
>
> I try to summarize it and please correct me if I'm wrong:
>
> 1. The issue is the result of kill_bdev() being called when the
> connection is cut while I/O is still in flight.
> 2. Other block device drivers don't have this issue because they
> normally keep the buffers and the device, deny all requests, and only
> kill the device once all requests have settled and the device has been
> unmounted.
> 3. NBD cannot handle it the same way because NBD depends on the
> userspace nbd-client, which handles the reconnection/retry. If the
> DO_IT ioctl doesn't return, there is no way for userspace to
> reconnect. Coordinating between kernel and userspace on reconnecting,
> while leaving the device open until it's safe to close, allows several
> choices and would lead to quite a lot of change.
>
> Goswin's suggestion really makes sense
> ( http://sourceforge.net/p/nbd/mailman/message/31678340/ ). This issue
> has been there for years and impacts the stability of NBD a lot. I
> think we should get it fixed.

It may make sense as well to create a kernel thread to replace the
userspace thread which is blocking in the DO_IT ioctl.

This way we could even react to degraded (long-RTT) but working
connections. The userspace thread could return while the nbd device in
general still works. As soon as all users have closed the block device,
we can kill the threads that keep it alive. Otherwise the userspace
thread can update the socket while the nbd device is still working and
jump into DO_IT again.

To force a disconnect we still have the NBD_DISCONNECT ioctl.

This would require some mechanism to switch sockets on the fly, so it
may be necessary to listen on multiple sockets for a time, or to have a
small inactive phase where we wait on the remaining requests and switch
afterwards. However, this could allow a softer mechanism than the
currently available 'timeout', which completely removes the device.

Best Regards,

Markus
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Pavel Machek

Oct 7, 2015, 4:20:08 AM
On Mon 2015-09-21 17:33:21, Sheng Yang wrote:
> Thank you Paul! That's exactly the issue I met. I've read the whole
> thread and got a general idea of the issue.
>
> I try to summarize it and please correct me if I'm wrong:
>
> 1. The issue is the result of kill_bdev() being called when the
> connection is cut while I/O is still in flight.
> 2. Other block device drivers don't have this issue because they
> normally keep the buffers and the device, deny all requests, and only
> kill the device once all requests have settled and the device has been
> unmounted.
> 3. NBD cannot handle it the same way because NBD depends on the
> userspace nbd-client, which handles the reconnection/retry. If the
> DO_IT ioctl doesn't return, there is no way for userspace to
> reconnect. Coordinating between kernel and userspace on reconnecting,
> while leaving the device open until it's safe to close, allows several
> choices and would lead to quite a lot of change.
>
> Goswin's suggestion really makes sense
> ( http://sourceforge.net/p/nbd/mailman/message/31678340/ ). This issue
> has been there for years and impacts the stability of NBD a lot. I
> think we should get it fixed.

You can fix this one, but getting memory management right so that
client & server on the same machine is safe is not going to be easy.

Think "nbd-server swapped out, because memory was needed by dirty buffers".

Think "nbd-server can't allocate memory because it is all in the dirty buffers".

Pavel