Kernel is 2.6.33-rc8-next-20100217+:
At some point setkey(8) and modprobe hanged:
3541 pts/1 D 0:00 /usr/sbin/setkey -f /root/ipcomp-tunnel.setkey
3737 pts/2 D 0:00 modprobe -r des_generic sha1_generic md5 hmac xfrm_user crypto_null af_packet xfrm4_mode_beet ipcomp xfrm_ipcomp xfrm4_tunnel deflate zlib_deflate zlib_inflate esp4 xfrm4_mode_tran
SysRq : Show Blocked State
task PC stack pid father
ipcomp-tunnel D 0000000000000000 4944 3541 3532 0x00000000
ffff8800b98636f8 0000000000000046 0000000000000082 ffff8800b9862000
ffff8800b9863fd8 ffff8800b9862000 0000000000004000 ffff8800b9863fd8
ffff8800b9863fd8 ffff8800bd40eba0 ffffffff81625020 0000000000000046
Call Trace:
[<ffffffff813f218d>] schedule_timeout+0x2ad/0x320
[<ffffffff813f1239>] ? wait_for_common+0x49/0x180
[<ffffffff813f134b>] wait_for_common+0x15b/0x180
[<ffffffff81038960>] ? default_wake_function+0x0/0x20
[<ffffffff813f142d>] wait_for_completion+0x1d/0x20
[<ffffffff8105c841>] call_usermodehelper_exec+0xe1/0x110
[<ffffffff813ddaf0>] ? xfrm_init_state+0x120/0x3a0
[<ffffffff8105cb24>] __request_module+0x194/0x200
[<ffffffff81060568>] ? __kernel_text_address+0x58/0x80
[<ffffffff813dd68c>] ? xfrm_state_get_afinfo+0x2c/0x60
[<ffffffff813ddaf0>] xfrm_init_state+0x120/0x3a0
[<ffffffff810e7bbe>] ? kmem_cache_alloc_notrace+0xbe/0x130
[<ffffffffa05750e8>] pfkey_add+0x738/0x980 [af_key]
[<ffffffffa0570b86>] ? pfkey_broadcast+0x1e6/0x210 [af_key]
[<ffffffffa05709a0>] ? pfkey_broadcast+0x0/0x210 [af_key]
[<ffffffff8136b90e>] ? __skb_clone+0x2e/0x110
[<ffffffffa0573655>] pfkey_sendmsg+0x375/0x3b0 [af_key]
[<ffffffff81364a60>] sock_sendmsg+0xc0/0xf0
[<ffffffff81364b1a>] ? sockfd_lookup_light+0x2a/0x80
[<ffffffff8136532f>] sys_sendto+0xef/0x120
[<ffffffff813681c4>] ? release_sock+0x34/0x110
[<ffffffff8136828b>] ? release_sock+0xfb/0x110
[<ffffffff8136a30b>] ? sock_setsockopt+0xcb/0x850
[<ffffffff810d4176>] ? remove_vma+0x66/0x80
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
modprobe D ffff8800bd072ba0 5280 3737 3734 0x00000000
ffff8800b9ae1ed8 0000000000000046 0000000000000000 ffff8800b9ae0000
ffff8800b9ae1fd8 ffff8800b9ae0000 0000000000004000 ffff8800b9ae1fd8
ffff8800b9ae1fd8 ffff8800b98bae20 ffff8800bad4dc40 ffffffff81630f68
Call Trace:
[<ffffffff810761dd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81083306>] ? module_refcount+0x46/0xb0
[<ffffffff81084150>] sys_delete_module+0x230/0x250
[<ffffffff813f4522>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: task ipcomp-tunnel.s:3541 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ipcomp-tunnel D 0000000000000000 4944 3541 3532 0x00000000
ffff8800b98636f8 0000000000000046 0000000000000082 ffff8800b9862000
ffff8800b9863fd8 ffff8800b9862000 0000000000004000 ffff8800b9863fd8
ffff8800b9863fd8 ffff8800bd40eba0 ffffffff81625020 0000000000000046
Call Trace:
[<ffffffff813f218d>] schedule_timeout+0x2ad/0x320
[<ffffffff813f1239>] ? wait_for_common+0x49/0x180
[<ffffffff813f134b>] wait_for_common+0x15b/0x180
[<ffffffff81038960>] ? default_wake_function+0x0/0x20
[<ffffffff813f142d>] wait_for_completion+0x1d/0x20
[<ffffffff8105c841>] call_usermodehelper_exec+0xe1/0x110
[<ffffffff813ddaf0>] ? xfrm_init_state+0x120/0x3a0
[<ffffffff8105cb24>] __request_module+0x194/0x200
[<ffffffff81060568>] ? __kernel_text_address+0x58/0x80
[<ffffffff813dd68c>] ? xfrm_state_get_afinfo+0x2c/0x60
[<ffffffff813ddaf0>] xfrm_init_state+0x120/0x3a0
[<ffffffff810e7bbe>] ? kmem_cache_alloc_notrace+0xbe/0x130
[<ffffffffa05750e8>] pfkey_add+0x738/0x980 [af_key]
[<ffffffffa0570b86>] ? pfkey_broadcast+0x1e6/0x210 [af_key]
[<ffffffffa05709a0>] ? pfkey_broadcast+0x0/0x210 [af_key]
[<ffffffff8136b90e>] ? __skb_clone+0x2e/0x110
[<ffffffffa0573655>] pfkey_sendmsg+0x375/0x3b0 [af_key]
[<ffffffff81364a60>] sock_sendmsg+0xc0/0xf0
[<ffffffff81364b1a>] ? sockfd_lookup_light+0x2a/0x80
[<ffffffff8136532f>] sys_sendto+0xef/0x120
[<ffffffff813681c4>] ? release_sock+0x34/0x110
[<ffffffff8136828b>] ? release_sock+0xfb/0x110
[<ffffffff8136a30b>] ? sock_setsockopt+0xcb/0x850
[<ffffffff810d4176>] ? remove_vma+0x66/0x80
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task modprobe:3737 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D ffff8800bd072ba0 5280 3737 3734 0x00000000
ffff8800b9ae1ed8 0000000000000046 0000000000000000 ffff8800b9ae0000
ffff8800b9ae1fd8 ffff8800b9ae0000 0000000000004000 ffff8800b9ae1fd8
ffff8800b9ae1fd8 ffff8800b98bae20 ffff8800bad4dc40 ffffffff81630f68
Call Trace:
[<ffffffff810761dd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81083306>] ? module_refcount+0x46/0xb0
[<ffffffff81084150>] sys_delete_module+0x230/0x250
[<ffffffff813f4522>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task ipcomp-tunnel.s:3541 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ipcomp-tunnel D 0000000000000000 4944 3541 3532 0x00000000
ffff8800b98636f8 0000000000000046 0000000000000082 ffff8800b9862000
ffff8800b9863fd8 ffff8800b9862000 0000000000004000 ffff8800b9863fd8
ffff8800b9863fd8 ffff8800bd40eba0 ffffffff81625020 0000000000000046
Call Trace:
[<ffffffff813f218d>] schedule_timeout+0x2ad/0x320
[<ffffffff813f1239>] ? wait_for_common+0x49/0x180
[<ffffffff813f134b>] wait_for_common+0x15b/0x180
[<ffffffff81038960>] ? default_wake_function+0x0/0x20
[<ffffffff813f142d>] wait_for_completion+0x1d/0x20
[<ffffffff8105c841>] call_usermodehelper_exec+0xe1/0x110
[<ffffffff813ddaf0>] ? xfrm_init_state+0x120/0x3a0
[<ffffffff8105cb24>] __request_module+0x194/0x200
[<ffffffff81060568>] ? __kernel_text_address+0x58/0x80
[<ffffffff813dd68c>] ? xfrm_state_get_afinfo+0x2c/0x60
[<ffffffff813ddaf0>] xfrm_init_state+0x120/0x3a0
[<ffffffff810e7bbe>] ? kmem_cache_alloc_notrace+0xbe/0x130
[<ffffffffa05750e8>] pfkey_add+0x738/0x980 [af_key]
[<ffffffffa0570b86>] ? pfkey_broadcast+0x1e6/0x210 [af_key]
[<ffffffffa05709a0>] ? pfkey_broadcast+0x0/0x210 [af_key]
[<ffffffff8136b90e>] ? __skb_clone+0x2e/0x110
[<ffffffffa0573655>] pfkey_sendmsg+0x375/0x3b0 [af_key]
[<ffffffff81364a60>] sock_sendmsg+0xc0/0xf0
[<ffffffff81364b1a>] ? sockfd_lookup_light+0x2a/0x80
[<ffffffff8136532f>] sys_sendto+0xef/0x120
[<ffffffff813681c4>] ? release_sock+0x34/0x110
[<ffffffff8136828b>] ? release_sock+0xfb/0x110
[<ffffffff8136a30b>] ? sock_setsockopt+0xcb/0x850
[<ffffffff810d4176>] ? remove_vma+0x66/0x80
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task modprobe:3737 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D ffff8800bd072ba0 5280 3737 3734 0x00000000
ffff8800b9ae1ed8 0000000000000046 0000000000000000 ffff8800b9ae0000
ffff8800b9ae1fd8 ffff8800b9ae0000 0000000000004000 ffff8800b9ae1fd8
ffff8800b9ae1fd8 ffff8800b98bae20 ffff8800bad4dc40 ffffffff81630f68
Call Trace:
[<ffffffff810761dd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81083306>] ? module_refcount+0x46/0xb0
[<ffffffff81084150>] sys_delete_module+0x230/0x250
[<ffffffff813f4522>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task ipcomp-tunnel.s:3541 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ipcomp-tunnel D 0000000000000000 4944 3541 3532 0x00000000
ffff8800b98636f8 0000000000000046 0000000000000082 ffff8800b9862000
ffff8800b9863fd8 ffff8800b9862000 0000000000004000 ffff8800b9863fd8
ffff8800b9863fd8 ffff8800bd40eba0 ffffffff81625020 0000000000000046
Call Trace:
[<ffffffff813f218d>] schedule_timeout+0x2ad/0x320
[<ffffffff813f1239>] ? wait_for_common+0x49/0x180
[<ffffffff813f134b>] wait_for_common+0x15b/0x180
[<ffffffff81038960>] ? default_wake_function+0x0/0x20
[<ffffffff813f142d>] wait_for_completion+0x1d/0x20
[<ffffffff8105c841>] call_usermodehelper_exec+0xe1/0x110
[<ffffffff813ddaf0>] ? xfrm_init_state+0x120/0x3a0
[<ffffffff8105cb24>] __request_module+0x194/0x200
[<ffffffff81060568>] ? __kernel_text_address+0x58/0x80
[<ffffffff813dd68c>] ? xfrm_state_get_afinfo+0x2c/0x60
[<ffffffff813ddaf0>] xfrm_init_state+0x120/0x3a0
[<ffffffff810e7bbe>] ? kmem_cache_alloc_notrace+0xbe/0x130
[<ffffffffa05750e8>] pfkey_add+0x738/0x980 [af_key]
[<ffffffffa0570b86>] ? pfkey_broadcast+0x1e6/0x210 [af_key]
[<ffffffffa05709a0>] ? pfkey_broadcast+0x0/0x210 [af_key]
[<ffffffff8136b90e>] ? __skb_clone+0x2e/0x110
[<ffffffffa0573655>] pfkey_sendmsg+0x375/0x3b0 [af_key]
[<ffffffff81364a60>] sock_sendmsg+0xc0/0xf0
[<ffffffff81364b1a>] ? sockfd_lookup_light+0x2a/0x80
[<ffffffff8136532f>] sys_sendto+0xef/0x120
[<ffffffff813681c4>] ? release_sock+0x34/0x110
[<ffffffff8136828b>] ? release_sock+0xfb/0x110
[<ffffffff8136a30b>] ? sock_setsockopt+0xcb/0x850
[<ffffffff810d4176>] ? remove_vma+0x66/0x80
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task modprobe:3737 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D ffff8800bd072ba0 5280 3737 3734 0x00000000
ffff8800b9ae1ed8 0000000000000046 0000000000000000 ffff8800b9ae0000
ffff8800b9ae1fd8 ffff8800b9ae0000 0000000000004000 ffff8800b9ae1fd8
ffff8800b9ae1fd8 ffff8800b98bae20 ffff8800bad4dc40 ffffffff81630f68
Call Trace:
[<ffffffff810761dd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81083306>] ? module_refcount+0x46/0xb0
[<ffffffff81084150>] sys_delete_module+0x230/0x250
[<ffffffff813f4522>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task ipcomp-tunnel.s:3541 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ipcomp-tunnel D 0000000000000000 4944 3541 3532 0x00000000
ffff8800b98636f8 0000000000000046 0000000000000082 ffff8800b9862000
ffff8800b9863fd8 ffff8800b9862000 0000000000004000 ffff8800b9863fd8
ffff8800b9863fd8 ffff8800bd40eba0 ffffffff81625020 0000000000000046
Call Trace:
[<ffffffff813f218d>] schedule_timeout+0x2ad/0x320
[<ffffffff813f1239>] ? wait_for_common+0x49/0x180
[<ffffffff813f134b>] wait_for_common+0x15b/0x180
[<ffffffff81038960>] ? default_wake_function+0x0/0x20
[<ffffffff813f142d>] wait_for_completion+0x1d/0x20
[<ffffffff8105c841>] call_usermodehelper_exec+0xe1/0x110
[<ffffffff813ddaf0>] ? xfrm_init_state+0x120/0x3a0
[<ffffffff8105cb24>] __request_module+0x194/0x200
[<ffffffff81060568>] ? __kernel_text_address+0x58/0x80
[<ffffffff813dd68c>] ? xfrm_state_get_afinfo+0x2c/0x60
[<ffffffff813ddaf0>] xfrm_init_state+0x120/0x3a0
[<ffffffff810e7bbe>] ? kmem_cache_alloc_notrace+0xbe/0x130
[<ffffffffa05750e8>] pfkey_add+0x738/0x980 [af_key]
[<ffffffffa0570b86>] ? pfkey_broadcast+0x1e6/0x210 [af_key]
[<ffffffffa05709a0>] ? pfkey_broadcast+0x0/0x210 [af_key]
[<ffffffff8136b90e>] ? __skb_clone+0x2e/0x110
[<ffffffffa0573655>] pfkey_sendmsg+0x375/0x3b0 [af_key]
[<ffffffff81364a60>] sock_sendmsg+0xc0/0xf0
[<ffffffff81364b1a>] ? sockfd_lookup_light+0x2a/0x80
[<ffffffff8136532f>] sys_sendto+0xef/0x120
[<ffffffff813681c4>] ? release_sock+0x34/0x110
[<ffffffff8136828b>] ? release_sock+0xfb/0x110
[<ffffffff8136a30b>] ? sock_setsockopt+0xcb/0x850
[<ffffffff810d4176>] ? remove_vma+0x66/0x80
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
INFO: task modprobe:3737 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D ffff8800bd072ba0 5280 3737 3734 0x00000000
ffff8800b9ae1ed8 0000000000000046 0000000000000000 ffff8800b9ae0000
ffff8800b9ae1fd8 ffff8800b9ae0000 0000000000004000 ffff8800b9ae1fd8
ffff8800b9ae1fd8 ffff8800b98bae20 ffff8800bad4dc40 ffffffff81630f68
Call Trace:
[<ffffffff810761dd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81083306>] ? module_refcount+0x46/0xb0
[<ffffffff81084150>] sys_delete_module+0x230/0x250
[<ffffffff813f4522>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8100279b>] system_call_fastpath+0x16/0x1b
INFO: lockdep is turned off.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Hmm, looks like sys_delete_module() is stuck trying to re-grab the
module_mutex. But who's holding it? We'd need a full task dump to see.
We do a stop_machine with the module mutex held, but not much else. This
doesn't seem like that. Do you have tracepoints, or dynamic debug enabled?
Puzzled,
Rusty.
--
Away travelling 25Feb-26Mar (6 .de + 1 .pl + 17 .lt + 2 .sg)
I don't know if this kernel has Andi's patchset, which is:
http://lkml.org/lkml/2010/1/4/506
If yes, then please try Neil's patch:
http://lkml.org/lkml/2010/2/12/181
Thanks.
Looking at this, my guess is that your hang is in userspace, and that you have
an old version of module-utils. I fixed this with iptables awhile back. What
happens is that process 3737, the modprobe is removing modules, while the setkey
utility is requesting kernel services that require those modules, so you are in
fact trying to load and unload the same module(s) in parallel. While thats not
a huge deal, it results in the setkey utility forking another copy of modprobe
(using callusermode_helper). The two copies of modprobe actually take file
locks on the module they are workign with, serializing them in user space. If
the setkey path and the modprobe path serialize at all in kernel space (by
taking the module_mutex for instance), you get deadlock.
I fixed this by making modprobe be non-blocking in user space (which is the
default for insmod as well), and it fixed the problem. The gory details are
here:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=16fcec35e7d7c4faaa4709f6434a4a25b06d25e3
My guess is that if you get the latest module-utils you can make this go away.
Neil
I see, excellent! Thank you!