Johannes and I see this on our quad G5s... it -could- be similar to
one reported a short while ago by Xiaotian Feng <xtf...@gmail.com>
under the subject [2.6.33-rc4] sysfs lockdep warnings on cpu hotplug.
Basically, the machine deadlocks right after printing the following
when doing a shutdown:
halt/4071 is trying to acquire lock:
(s_active){++++.+}, at: [<c0000000001ef868>] .sysfs_addrm_finish+0x58/0xc0
but task is already holding lock:
(&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c0000000004cd6ac>] .lock_policy_rwsem_write+0x84/0xf4
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
<nothing else ... machine deadlocked here>
Any idea?
Cheers,
Ben.
This is still happening with -rc8. Any news?
johannes
On Thu, 18 Feb 2010, Johannes Berg wrote:
> >
> > Basically, the machine deadlocks right after printing the following
> > when doing a shutdown:
> >
> > halt/4071 is trying to acquire lock:
> > (s_active){++++.+}, at: [<c0000000001ef868>]
> > .sysfs_addrm_finish+0x58/0xc0
> >
> > but task is already holding lock:
> > (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c0000000004cd6ac>]
> > .lock_policy_rwsem_write+0x84/0xf4
> >
> > which lock already depends on the new lock.
You don't have a full backtrace for these things?
We've had lots of trouble with the cpu governors, and I suspect the
problem isn't new, but the lockdep warning is likely new (see commit
846f99749ab68bbc7f75c74fec305de675b1a1bf: "sysfs: Add lockdep annotations
for the sysfs active reference").
So it is likely to be an old issue that (a) now gets warned about and (b)
might have had timing changes enough to trigger it.
I suspect it is G5-specific (or specific to whatever CPU frequency code
that gets used there), since I think we'd have had lots of reports if this
happened on x86.
Linus
> > > halt/4071 is trying to acquire lock:
> > > (s_active){++++.+}, at: [<c0000000001ef868>]
> > > .sysfs_addrm_finish+0x58/0xc0
> > >
> > > but task is already holding lock:
> > > (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c0000000004cd6ac>]
> > > .lock_policy_rwsem_write+0x84/0xf4
> > >
> > > which lock already depends on the new lock.
>
> You don't have a full backtrace for these things?
No, it deadlocks right there, unfortunately.
> We've had lots of trouble with the cpu governors, and I suspect the
> problem isn't new, but the lockdep warning is likely new (see commit
> 846f99749ab68bbc7f75c74fec305de675b1a1bf: "sysfs: Add lockdep annotations
> for the sysfs active reference").
>
> So it is likely to be an old issue that (a) now gets warned about and
> (b) might have had timing changes enough to trigger it.
Well, it used to not deadlock and actually shut down the machine :) So
in that sense it's definitely new. It might have printed a lockdep
warning before, which you wouldn't normally see since the machine turns
off right after this.
> I suspect it is G5-specific (or specific to whatever CPU frequency code
> that gets used there), since I think we'd have had lots of reports if this
> happened on x86.
Yeah, that's puzzling me as well.
johannes
Right.
This is a real deadlock, found by the lockdep annotations added to s_active.
The problem is that we call kobject_put(&data->kobj) while holding
policy_rwsem, which is the lock used to protect 'data'. It is not easy
to fix; we probably need to do more work on the cpufreq code.
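
To make the inversion concrete, here is a minimal userspace model of the
two paths (pthreads, not the real kernel code; the lock names and the
sleeps are only there to mirror the kernel paths and widen the race window):

/* Build with: gcc -pthread deadlock-model.c */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/*
 * "s_active" modelled as a rwlock: sysfs readers hold it shared, and
 * sysfs_deactivate() (reached from the final kobject_put()) takes it
 * exclusively while waiting for the readers to drain.
 */
static pthread_rwlock_t s_active = PTHREAD_RWLOCK_INITIALIZER;
static pthread_rwlock_t policy_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* Path A: a sysfs store() handler, e.g. writing a cpufreq attribute. */
static void *sysfs_writer(void *arg)
{
        (void)arg;
        pthread_rwlock_rdlock(&s_active);      /* sysfs active reference  */
        sleep(1);                              /* widen the race window   */
        pthread_rwlock_wrlock(&policy_rwsem);  /* lock_policy_rwsem_write */
        /* ... store() body would run here ... */
        pthread_rwlock_unlock(&policy_rwsem);
        pthread_rwlock_unlock(&s_active);
        return NULL;
}

/* Path B: __cpufreq_remove_dev() from cpu_down() during shutdown. */
static void *cpu_remover(void *arg)
{
        (void)arg;
        pthread_rwlock_wrlock(&policy_rwsem);  /* lock_policy_rwsem_write */
        sleep(1);                              /* widen the race window   */
        /* kobject_put() -> sysfs_deactivate(): wait for active refs */
        pthread_rwlock_wrlock(&s_active);
        pthread_rwlock_unlock(&s_active);
        pthread_rwlock_unlock(&policy_rwsem);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, sysfs_writer, NULL);
        pthread_create(&b, NULL, cpu_remover, NULL);
        pthread_join(a, NULL);  /* never returns: A waits for B, B for A */
        pthread_join(b, NULL);
        puts("no deadlock this time");
        return 0;
}

Each thread grabs its first lock, sleeps, and then blocks forever on the
lock the other one holds -- the same s_active vs. cpu_policy_rwsem
ordering that lockdep complains about above.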
Thanks.
Before shutdown, you can do:
echo N > /proc/sys/kernel/printk_delay
to see the printk messages; N is 0-10000, in milliseconds.
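For example, 'echo 500 > /proc/sys/kernel/printk_delay' would hold each
line on the console for half a second, which should be long enough to read
the tail of the report before the box goes down.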
>
>> I suspect it is G5-specific (or specific to whatever CPU frequency code
>> that gets used there), since I think we'd have had lots of reports if this
>> happened on x86.
>
> Yeah, that's puzzling me as well.
>
> johannes
>
--
Regards
dave
> This is a real deadlock, found by the lockdep annotations added to s_active.
>
> The problem is that we call kobject_put(&data->kobj) while holding
> policy_rwsem, which is the lock used to protect 'data'. It is not easy
> to fix; we probably need to do more work on the cpufreq code.
But it doesn't make sense that it's just an existing real deadlock that
is now found -- it never occurred previously!
Anyway, I'll try your patch, thanks for that.
johannes
> > Well, it used to not deadlock and actually shut down the machine :) So
> > in that sense it's definitely new. It might have printed a lockdep
> > warning before, which you wouldn't normally see since the machine turns
> > off right after this.
>
> Before shutdown, you can do:
> echo N > /proc/sys/kernel/printk_delay
> to see the printk messages; N is 0-10000, in milliseconds.
Well, if I understand Américo correctly, it wouldn't have printed
anything (even if it were to deadlock) before the lockdep annotations
were added to s_active, so I guess that theory is out.
johannes
Are you sure there is no difference? :-/
Also, could you please apply the 4 patches from Eric?
You can get them here:
http://lkml.org/lkml/2010/2/11/334
Thanks much for your testing!
Hey, johannes
Not sure if there's a mix-up here, but the one you report in [1]
is _not_ the same as the one reported by Benjamin.
Please make sure we are all talking about the same issue.
Thanks.
1. http://lkml.org/lkml/2010/2/18/33
> >> Does my following untested patch help?
> >
> > Sorry, no. I'll hook up a screen to the box after I return from the
> > fresh market.
> Are you sure there is no difference? :-/
No ... could be a different deadlock now :) Not sure how likely that is
though.
> Also, could you please apply the 4 patches from Eric?
>
> You can get them here:
> http://lkml.org/lkml/2010/2/11/334
Will do.
johannes
> Hey, johannes
>
> Not sure if there's a mix-up here, but the one you report in [1]
> is _not_ the same as the one reported by Benjamin.
>
> Please make sure we are all talking about the same issue.
>
> Thanks.
>
> 1. http://lkml.org/lkml/2010/2/18/33
I'm talking about the problem Ben reported -- that one is completely
different. Was your patch supposed to address _that_ one?
johannes
It deadlocks after
Disabling non-boot CPUs ...
Not sure if that counts as a difference...
johannes
> It deadlocks after
>
> Disabling non-boot CPUs ...
I suspect the "BUG: key not in data!" warning I get now disables lockdep
(it seems to be mostly due to module loading, btw), and then I don't get
any output here.
Seems it's all busted.
johannes
I am not sure either...
That message is displayed before the devices are shut down.
To verify, you can add a printk() at the end of
__cpufreq_remove_dev(), or enable CONFIG_CPU_FREQ_DEBUG.
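Something like this at the very end of __cpufreq_remove_dev() would do
(a hypothetical sketch; 'cpu' stands for however the function names the
CPU being removed in your tree):

        /* hypothetical debug aid: confirm __cpufreq_remove_dev() completed */
        printk(KERN_DEBUG "cpufreq: __cpufreq_remove_dev() finished for cpu %u\n",
               cpu);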
Thanks!
> That message is displayed before the devices are shut down.
>
> To verify, you can add a printk() at the end of
> __cpufreq_remove_dev(), or enable CONFIG_CPU_FREQ_DEBUG.
That is already enabled.
johannes
Ok, I got it.
Could you test the patch below? Thanks!
----------------->
Signed-off-by: WANG Cong <xiyou.w...@gmail.com>
No change, sorry, still hangs right after "Disabling non-boot CPUs ..."
before the machine turns off.
johannes
It might not be related to cpufreq at all. I think something is going
wrong in the _cpu_down() path.
Put more debug printks into _cpu_down(); if we can find out where the
kernel gets stuck in _cpu_down(), that would be helpful.
And it looks like this breakage is only seen on the PowerMac G5, so it
might be arch-specific. Maybe some powermac commit broke the G5's
cpu_down; just a guess ;-)
Hmm, not sure ... it seems to be in __stop_machine(), in this code:
printk("got cpu\n");
for_each_online_cpu(i) {
sm_work = per_cpu_ptr(stop_machine_work, i);
INIT_WORK(sm_work, stop_cpu);
queue_work_on(i, stop_machine_wq, sm_work);
}
/* This will release the thread on our CPU. */
put_cpu();
printk("put cpu\n");
which is weird... the "got cpu" printk is the last thing I see.
johannes
> printk("got cpu\n");
> for_each_online_cpu(i) {
> sm_work = per_cpu_ptr(stop_machine_work, i);
> INIT_WORK(sm_work, stop_cpu);
> queue_work_on(i, stop_machine_wq, sm_work);
> }
> /* This will release the thread on our CPU. */
> put_cpu();
> printk("put cpu\n");
As odd as that may be, it hangs in put_cpu() here.
Oh, I see; then this is another problem.
My previous patch fixes the cpufreq lockdep warning mentioned
in Benjamin's report, so this hang must be caused by something else,
not the cpufreq lockdep problem.
Thanks.
Maybe not; there could be a generic code bug that is only exposed on a
specific arch.
Thanks.
Hmm, does adding synchronize_sched() in _cpu_down() help?
Something like this:
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 677f253..681f5c5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -228,6 +228,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
         cpumask_copy(old_allowed, &current->cpus_allowed);
         set_cpus_allowed_ptr(current, cpu_active_mask);
+        synchronize_sched();
         err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
         if (err) {
                 set_cpu_active(cpu, true);
> >> Could you test the patch below? Thanks!
> >
> > No change, sorry, still hangs right after "Disabling non-boot CPUs ..."
> > before the machine turns off.
> >
>
> Oh, I see; then this is another problem.
>
> My previous patch fixes the cpufreq lockdep warning mentioned
> in Benjamin's report, so this hang must be caused by something else,
> not the cpufreq lockdep problem.
Right, sounds like it -- and I haven't seen that lockdep report during
shutdown any more.
johannes
No luck.
johannes
> Since it hangs in put_cpu(), which is just preempt_enable(), I began
> to suspect that we need a synchronize_sched(), or some barrier perhaps.
> I am not sure at all.
Right.
> Before other experts look at this, I think doing a bisect would be
> very useful.
I was afraid you'd say that; it'll take forever, though, since I need to
walk over to it after every shutdown, see if it turned off, and then turn
it on again (and possibly off)... I guess I'll get started on that.
Ok, thanks.
Since it hangs in put_cpu(), which is just preempt_enable(), I began
to suspect that we need a synchronize_sched(), or some barrier perhaps.
I am not sure at all.
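(For reference -- a simplified view, but as far as I know this is all
put_cpu() is in include/linux/smp.h:

        #define get_cpu()       ({ preempt_disable(); smp_processor_id(); })
        #define put_cpu()       preempt_enable()

so "hanging in put_cpu()" really means hanging in, or right after,
preempt_enable().)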
Before other experts look at this, I think doing a bisect would be very
useful.
Again, thanks for your testing!
No problem.
Feel free to wait for the experts like Linus to take this over. Leave it
as it is. ;)
Thanks.
> Johannes and I see this on our quad G5s... it -could- be similar to
> one reported a short while ago by Xiaotian Feng <xtf...@gmail.com>
> under the subject [2.6.33-rc4] sysfs lockdep warnings on cpu hotplug.
>
> Basically, the machine deadlocks right after printing the following
> when doing a shutdown:
>
> halt/4071 is trying to acquire lock:
> (s_active){++++.+}, at: [<c0000000001ef868>] .sysfs_addrm_finish+0x58/0xc0
>
> but task is already holding lock:
> (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c0000000004cd6ac>] .lock_policy_rwsem_write+0x84/0xf4
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> <nothing else ... machine deadlocked here>
I see this now, with full backtrace:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.34-rc2-cachefs #115
-------------------------------------------------------
halt/2291 is trying to acquire lock:
(s_active#31){++++.+}, at: [<ffffffff81104950>] sysfs_addrm_finish+0x31/0x5a
but task is already holding lock:
(&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff812a3a92>] lock_policy_rwsem_write+0x4a/0x7b
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
[<ffffffff81053da2>] __lock_acquire+0x1343/0x16cd
[<ffffffff81054183>] lock_acquire+0x57/0x6d
[<ffffffff813637b6>] down_write+0x3f/0x62
[<ffffffff812a3a92>] lock_policy_rwsem_write+0x4a/0x7b
[<ffffffff812a3b36>] store+0x39/0x79
[<ffffffff81103589>] sysfs_write_file+0x103/0x13f
[<ffffffff810afddc>] vfs_write+0xad/0x172
[<ffffffff810aff5a>] sys_write+0x45/0x6c
[<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b
-> #0 (s_active#31){++++.+}:
[<ffffffff81053a59>] __lock_acquire+0xffa/0x16cd
[<ffffffff81054183>] lock_acquire+0x57/0x6d
[<ffffffff81104086>] sysfs_deactivate+0x8c/0xc9
[<ffffffff81104950>] sysfs_addrm_finish+0x31/0x5a
[<ffffffff81104a33>] sysfs_remove_dir+0x75/0x88
[<ffffffff811bafae>] kobject_del+0x16/0x37
[<ffffffff811bb00d>] kobject_release+0x3e/0x66
[<ffffffff811bbd71>] kref_put+0x43/0x4d
[<ffffffff811baf29>] kobject_put+0x47/0x4b
[<ffffffff812a39b2>] __cpufreq_remove_dev+0x1da/0x236
[<ffffffff8136178e>] cpufreq_cpu_callback+0x62/0x7a
[<ffffffff81048362>] notifier_call_chain+0x32/0x5e
[<ffffffff810483ed>] __raw_notifier_call_chain+0x9/0xb
[<ffffffff81351736>] _cpu_down+0x90/0x29e
[<ffffffff810311d3>] disable_nonboot_cpus+0x6f/0x105
[<ffffffff8103f845>] kernel_power_off+0x21/0x3b
[<ffffffff8103facd>] sys_reboot+0x103/0x16a
[<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
4 locks held by halt/2291:
#0: (reboot_mutex){+.+.+.}, at: [<ffffffff8103fa5b>] sys_reboot+0x91/0x16a
#1: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81031102>] cpu_maps_update_begin+0x12/0x14
#2: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103113d>] cpu_hotplug_begin+0x27/0x4e
#3: (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff812a3a92>] lock_policy_rwsem_write+0x4a/0x7b
stack backtrace:
Pid: 2291, comm: halt Not tainted 2.6.34-rc2-cachefs #115
Call Trace:
[<ffffffff81052522>] print_circular_bug+0xae/0xbd
[<ffffffff81053a59>] __lock_acquire+0xffa/0x16cd
[<ffffffff81054183>] lock_acquire+0x57/0x6d
[<ffffffff81104950>] ? sysfs_addrm_finish+0x31/0x5a
[<ffffffff81044acf>] ? __init_waitqueue_head+0x35/0x46
[<ffffffff81104086>] sysfs_deactivate+0x8c/0xc9
[<ffffffff81104950>] ? sysfs_addrm_finish+0x31/0x5a
[<ffffffff811044a3>] ? release_sysfs_dirent+0x9e/0xbe
[<ffffffff81104950>] sysfs_addrm_finish+0x31/0x5a
[<ffffffff81104a33>] sysfs_remove_dir+0x75/0x88
[<ffffffff811bafae>] kobject_del+0x16/0x37
[<ffffffff811bb00d>] kobject_release+0x3e/0x66
[<ffffffff811bafcf>] ? kobject_release+0x0/0x66
[<ffffffff811bbd71>] kref_put+0x43/0x4d
[<ffffffff811baf29>] kobject_put+0x47/0x4b
[<ffffffff812a39b2>] __cpufreq_remove_dev+0x1da/0x236
[<ffffffff8136178e>] cpufreq_cpu_callback+0x62/0x7a
[<ffffffff81048362>] notifier_call_chain+0x32/0x5e
[<ffffffff810483ed>] __raw_notifier_call_chain+0x9/0xb
[<ffffffff81351736>] _cpu_down+0x90/0x29e
[<ffffffff810311d3>] disable_nonboot_cpus+0x6f/0x105
[<ffffffff8103f845>] kernel_power_off+0x21/0x3b
[<ffffffff8103facd>] sys_reboot+0x103/0x16a
[<ffffffff813636d9>] ? do_nanosleep+0x78/0xb2
[<ffffffff8104797d>] ? hrtimer_nanosleep+0xab/0x118
[<ffffffff810473a6>] ? hrtimer_wakeup+0x0/0x21
[<ffffffff81364e29>] ? retint_swapgs+0xe/0x13
[<ffffffff81051e4e>] ? trace_hardirqs_on_caller+0x10c/0x130
[<ffffffff8107496a>] ? audit_syscall_entry+0x17d/0x1b0
[<ffffffff81364354>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b
Broke affinity for irq 4
lockdep: fixing up alternatives.
SMP alternatives: switching to UP code
Power down.
acpi_power_off called
David
I already sent a patch for this:
https://patchwork.kernel.org/patch/83558/
It is still in the -mm tree, I think.
Thanks!
> > =======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 2.6.34-rc2-cachefs #115
> > -------------------------------------------------------
> > halt/2291 is trying to acquire lock:
> > (s_active#31){++++.+}, at: [<ffffffff81104950>] sysfs_addrm_finish+0x31/0x5a
> >
> > but task is already holding lock:
> > (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff812a3a92>] lock_policy_rwsem_write+0x4a/0x7b
> >
> > which lock already depends on the new lock.
> >
>
>
> I already sent a patch for this:
>
> https://patchwork.kernel.org/patch/83558/
>
> It is still in the -mm tree, I think.
Tested-by: David Howells <dhow...@redhat.com>