BUG: soft lockup - CPU#0 stuck for 10s! Could someone help me work out the cause of this error? Thanks, everyone.

亚峰

Jun 17, 2011, 5:18:28 AM
to linux-...@zh-kernel.org
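I wrote my own file system (it bypasses the file's address space and relies heavily on sb_bread() and brelse()). When I copy files, I get "BUG: soft lockup - CPU#0 stuck for 10s!". Is this a deadlock?

Also, if I mount a file or a partition with -o loop, there is no problem: even after 600 seconds the machine does not hang.

Could you please help me analyze this?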
_______________________________________________
Linux kernel development Chinese mailing list
Linux-...@zh-kernel.org
http://zh-kernel.org/mailman/listinfo/linux-kernel
Linux kernel development Chinese community: http://zh-kernel.org
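
The post mentions heavy use of sb_bread() and brelse(). In the kernel, sb_bread() returns a buffer_head holding a reference (or NULL on failure), and brelse() drops that reference and ignores NULL. The poster's code is not shown, so the following is only a user-space model of that pairing, not the actual file system. The names bh_model, dev_model, model_bread(), model_brelse(), read_indirect_entry() and the 1024-byte block size are all hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_BLOCK_SIZE 1024u  /* assumed block size, not taken from the post */

/* Stand-in for the kernel's struct buffer_head (only the fields used here). */
struct bh_model {
    int b_count;                            /* reference count */
    uint32_t b_blocknr;                     /* block number on the "device" */
    unsigned char b_data[MODEL_BLOCK_SIZE]; /* copy of the block contents */
};

/* Stand-in for a block device: a flat in-memory array of blocks. */
struct dev_model {
    const unsigned char *blocks;
    uint32_t nr_blocks;
    int live_refs;                          /* buffers handed out, not yet released */
};

/* Modelled on sb_bread(): returns a referenced buffer, or NULL on failure. */
static struct bh_model *model_bread(struct dev_model *dev, uint32_t blocknr)
{
    struct bh_model *bh;

    if (blocknr >= dev->nr_blocks)
        return NULL;                        /* reading past the device fails */
    bh = malloc(sizeof(*bh));
    if (!bh)
        return NULL;
    bh->b_count = 1;
    bh->b_blocknr = blocknr;
    memcpy(bh->b_data, dev->blocks + (size_t)blocknr * MODEL_BLOCK_SIZE,
           MODEL_BLOCK_SIZE);
    dev->live_refs++;
    return bh;
}

/* Modelled on brelse(): drops one reference; NULL is accepted and ignored. */
static void model_brelse(struct dev_model *dev, struct bh_model *bh)
{
    if (!bh)
        return;
    assert(bh->b_count > 0);
    if (--bh->b_count == 0) {
        dev->live_refs--;
        free(bh);
    }
}

/*
 * Read one 32-bit block pointer out of an indirect block (host byte order,
 * an assumption). Returns 0 on success and -1 on a failed read or bad index.
 */
int read_indirect_entry(struct dev_model *dev, uint32_t indirect_blocknr,
                        uint32_t index, uint32_t *out)
{
    struct bh_model *bh;

    if (index >= MODEL_BLOCK_SIZE / sizeof(uint32_t))
        return -1;
    bh = model_bread(dev, indirect_blocknr);
    if (!bh)
        return -1;                          /* always check for NULL */
    memcpy(out, bh->b_data + index * sizeof(uint32_t), sizeof(*out));
    model_brelse(dev, bh);                  /* one release per successful read */
    return 0;
}
```

Every path through read_indirect_entry() that obtains a buffer releases it exactly once, so live_refs is back to 0 after each call. A kernel version would follow the same shape with the real calls.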

liuchang

Jun 17, 2011, 5:42:41 AM
to linux-...@zh-kernel.org
Which task hit the soft lockup? Please post the call trace.

亚峰

Jun 17, 2011, 5:47:48 AM
to liuchang, linux-...@zh-kernel.org
Will this do? The lines in red are my own printk output.

my-debug:function my_write_super() begin...
function my_prepare_write() be called
function my_commit_write() be called
at the begin of my_get_block(),the create is 1
want get block 62 in file
hit the region:[12]
my-debuy:get the depth from block_to_path() is 2
my_inode_info->[0] is 34728
my_inode_info->[1] is 34729
my_inode_info->[2] is 34730
my_inode_info->[3] is 34731
my_inode_info->[4] is 34732
my_inode_info->[5] is 34750
my_inode_info->[6] is 34751
my_inode_info->[7] is 34733
my_inode_info->[8] is 34770
my_inode_info->[9] is 34771
my_inode_info->[10] is 34772
my_inode_info->[11] is 34773
my_inode_info->[12] is 34706
my_inode_info->[13] is 0
my_inode_info->[14] is 0
offsets[0] is 12
offsets[1] is 50
50 of offsets[1] is a hole!!!
my-debug:get the next_nr is 0
not found the disk-block, want alloc block on disk
my-debug: the real_depth is 1
my-debug:function my_write_super() end.....
BUG: soft lockup - CPU#0 stuck for 10s! [ld:2829]
CPU 0:
Modules linked in: test(U) autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api loop dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac lp floppy pcspkr i2c_piix4 i2c_core 8139too 8139cp mii ide_cd cdrom serio_raw parport_pc parport dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2829, comm: ld Tainted: G 2.6.18-194.el5 #1
RIP: 0010:[<ffffffff800076ea>] [<ffffffff800076ea>] find_get_page+0x4b/0x51
RSP: 0018:ffff81003c1539e0 EFLAGS: 00000216
RAX: 0000000000000000 RBX: ffff810037d4f240 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000008792 RDI: 0000000000000000
RBP: 0000000000000001 R08: ffff8100095789a0 R09: 0000000000000018
R10: 0000000000000018 R11: 0000000000000001 R12: ffff810037d4f240
R13: 0000000000000286 R14: ffffffff8001727d R15: 0000000000000046
FS: 00002b31916ab7e0(0000) GS:ffffffff803cb000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000008eba3c8 CR3: 000000003d496000 CR4: 00000000000006e0

Call Trace:
[<ffffffff80010c84>] __find_get_block_slow+0x2f/0xf7
[<ffffffff8000b338>] __find_get_block+0x97/0x16c
[<ffffffff80025bce>] find_or_create_page+0x22/0x72
[<ffffffff80019c6a>] __getblk+0xc6/0x236
[<ffffffff80025643>] __bread+0x6/0x86
[<ffffffff884da5c3>] :test:my_get_block+0x169/0x28b
[<ffffffff8000e6ed>] __block_prepare_write+0x1ad/0x3a6
[<ffffffff884da45a>] :test:my_get_block+0x0/0x28b
[<ffffffff8000f2ff>] __alloc_pages+0x78/0x308
[<ffffffff8003cf1b>] block_prepare_write+0x1a/0x25
[<ffffffff884db2a1>] :test:my_prepare_write+0x16/0x3c
[<ffffffff8000ffac>] generic_file_buffered_write+0x3b7/0x675
[<ffffffff884da45a>] :test:my_get_block+0x0/0x28b
[<ffffffff8000b99f>] touch_atime+0x67/0xaa
[<ffffffff80016641>] __generic_file_aio_write_nolock+0x369/0x3b6
[<ffffffff800c74f0>] __generic_file_write_nolock+0x8f/0xa8
[<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
[<ffffffff80064b05>] mutex_lock+0xd/0x1d
[<ffffffff800459d9>] generic_file_write+0x49/0xa7
[<ffffffff80016a49>] vfs_write+0xce/0x174
[<ffffffff80017316>] sys_write+0x45/0x6e
[<ffffffff8005e28d>] tracesys+0xd5/0xe0
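
The debug output above reports file block 62 resolving to offsets[0] = 12 and offsets[1] = 50 at depth 2. That matches an ext2-style layout with 12 direct slots followed by single-, double- and triple-indirect slots, which the 15 my_inode_info entries suggest. The poster's block_to_path() is not shown, so this is only a sketch under that assumption. The function block_to_path_sketch() and its ptrs_per_block parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed slot layout, inferred from the 15 entries printed in the log:
 * slots 0..11 direct, 12 single-indirect, 13 double-indirect, 14 triple.
 */
enum { NDIR_BLOCKS = 12, IND_SLOT = 12, DIND_SLOT = 13, TIND_SLOT = 14 };

/*
 * Translate a file-relative block number into a path of slot indices.
 * ptrs_per_block is the number of block pointers that fit in one indirect
 * block (hypothetical parameter). Returns the depth of the path (1..4),
 * or 0 if the block number is too large to map.
 */
int block_to_path_sketch(unsigned long block, unsigned long ptrs_per_block,
                         unsigned long offsets[4])
{
    unsigned long p = ptrs_per_block;

    assert(p > 0 && offsets != NULL);

    if (block < NDIR_BLOCKS) {              /* direct slot */
        offsets[0] = block;
        return 1;
    }
    block -= NDIR_BLOCKS;
    if (block < p) {                        /* single indirect */
        offsets[0] = IND_SLOT;
        offsets[1] = block;
        return 2;
    }
    block -= p;
    if (block < p * p) {                    /* double indirect */
        offsets[0] = DIND_SLOT;
        offsets[1] = block / p;
        offsets[2] = block % p;
        return 3;
    }
    block -= p * p;
    if (block / p / p < p) {                /* triple indirect: block < p^3 */
        offsets[0] = TIND_SLOT;
        offsets[1] = block / (p * p);
        offsets[2] = (block / p) % p;
        offsets[3] = block % p;
        return 4;
    }
    return 0;                               /* beyond the mappable range */
}
```

With ptrs_per_block = 256 (an assumed value; anything above 50 gives the same answer here), block_to_path_sketch(62, 256, offsets) returns 2 and fills offsets with 12 and 50, the values printed in the log.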

亚峰

Jun 17, 2011, 5:55:31 AM
to liuchang, linux-...@zh-kernel.org
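Actually, as a test I copied a source tree onto my own file system
and then ran make clean. The machine may hang during ld, during mv, or when creating a new file
(it is not really a hang; it just keeps printing BUG: soft lockup - CPU#0 stuck for 10s).

So the crash kernel cannot capture where the problem is either...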

At 2011-06-17 17:42:41, liuchang <liuc...@nrchpc.ac.cn> wrote:
