Sample output from a 4-way x86_64 box:
#:/sys/devices/system/cpu/cpu3 # cat online
1
#:/sys/devices/system/cpu/cpu3 # echo 0 > online
Stack:
Call Trace:
<IRQ>
<EOI>
Code: 45 f0 48 89 45 b8 48 8d 45 d0 4c 89 4d f8 c7 45 b0 10 00 00 00 48 89 45 c0 e8 5a ff ff ff c9 c3 89 f0 b9 40 00 00 00 55 99 f7 f9 <48> 89 e5 48 89 f9 48 83 ec 08 31 d2 41 89 c0 eb 12 48 8b 01 48
Stack:
Call Trace:
Code: 20 00 00 00 00 48 89 3e 49 8d 74 24 28 e8 45 75 1d 00 49 c7 44 24 50 00 00 00 00 48 8b 9b 60 01 00 00 48 85 db 0f 85 25 ff ff ff <5b> 41 5c c9 c3 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 81 ec
Stack:
Call Trace:
Code: 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 44 8b 25 82 35 96 00 <41> 39 c4 74 46 41 83 fc 02 74 08 41 83 fc 03 75 12 eb 03 fa eb
Stack:
Call Trace:
Code: 48 c7 c0 30 b5 a0 81 48 0f 45 d8 eb 1a 44 0f a3 20 19 c0 48 c7 c3 10 b5 a0 81 85 c0 48 c7 c0 30 b5 a0 81 48 0f 44 d8 31 c0 f3 90 <44> 8b 25 82 35 96 00 41 39 c4 74 46 41 83 fc 02 74 08 41 83 fc
Uhhuh. NMI received for unknown reason 25 on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
The above messages then repeat. I observed similar hangs
on other architectures as well (s390x, PowerPC, x86_32).
2.6.33-rc1 worked fine. I haven't bisected yet; I will do that
first thing tomorrow morning.
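For anyone who wants to script the reproduction instead of typing the echo by hand, here is a minimal C sketch of the same sysfs write. The helper names `cpu_online_path()` and `set_cpu_online()` are made up for illustration; only the `/sys/devices/system/cpu/cpuN/online` path comes from the session above. Writing to the real file needs root and will actually take the CPU offline.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the sysfs path of a CPU's "online" control file.
 * Returns the path length, or -1 if buf is too small.
 * (Hypothetical helper, not part of any kernel or libc API.)
 */
int cpu_online_path(int cpu, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write '0' (offline) or '1' (online) to the given control file.
 * Returns 0 on success, -1 on failure.
 * (Hypothetical helper; equivalent to "echo 0 > online".)
 */
int set_cpu_online(const char *path, int online)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fputc(online ? '1' : '0', f) == EOF) {
        fclose(f);
        return -1;
    }
    /* The write only reaches the kernel on flush, so check fclose() too. */
    return fclose(f) == 0 ? 0 : -1;
}
```

Building the path for cpu 3 and passing it to `set_cpu_online(path, 0)` should be equivalent to the `echo 0 > online` shown above.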
Thanks
-Sachin
--
---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
Damn it, you're right. It's getting stuck somewhere in stop_machine
during hotplug.
I didn't notice it because it didn't crash.
Sorry for that :-(
Perhaps this is also why suspend doesn't work on current -git... -rc1 is
fine.
--
Jens Axboe
Yep, suspend relies on hotplug.
Exactly, hence the connection. Shall I bisect, or do you already know
what the problem is?
--
Jens Axboe
> Exactly, hence the connection. Shall I bisect, or do you already know
> what the problem is?
I have a definite suspect alright, let me prod at this a bit more.
Signed-off-by: Peter Zijlstra <a.p.zi...@chello.nl>
---
kernel/sched.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 720df10..0ac4fa5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2348,7 +2348,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
 	 * not worry about this generic constraint ]
 	 */
 	if (unlikely(!cpumask_test_cpu(cpu, &p->cpus_allowed) ||
-		     !cpu_active(cpu)))
+		     !cpu_online(cpu)))
 		cpu = select_fallback_rq(task_cpu(p), p);
 
 	return cpu;
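
For readers following along, a toy userspace model of the check this patch touches may help show why the choice between cpu_active() and cpu_online() matters. It is only a sketch, not kernel code. It assumes that hot-unplug first clears the CPU from an "active" mask while the CPU is still online, and then wakes a task that has to run on that CPU. The struct, the enum and every function name below are invented for illustration; `fallback_cpu()` is a crude stand-in for select_fallback_rq(), and the cpus_allowed half of the condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* matches the 4-way box in the report */

/* One bit per CPU. Hypothetical model, not the kernel's cpumask API. */
struct cpu_masks {
    unsigned int online;   /* CPUs that are still up */
    unsigned int active;   /* CPUs still accepting newly placed tasks */
};

enum mask_check { CHECK_ACTIVE, CHECK_ONLINE };

bool mask_test(unsigned int mask, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (mask >> cpu) & 1u;
}

/* Stand-in for select_fallback_rq(): lowest-numbered active CPU, or -1. */
int fallback_cpu(const struct cpu_masks *m)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask_test(m->active, cpu))
            return cpu;
    return -1;
}

/* Keep the wanted CPU only if it passes the chosen mask check. */
int pick_cpu(const struct cpu_masks *m, int wanted, enum mask_check check)
{
    unsigned int mask = (check == CHECK_ACTIVE) ? m->active : m->online;

    return mask_test(mask, wanted) ? wanted : fallback_cpu(m);
}

/*
 * Model the assumed unplug ordering: the dying CPU is deactivated
 * (but still online), then a task that must run there is woken.
 */
int wakeup_during_unplug(int dying, enum mask_check check)
{
    struct cpu_masks m = { .online = 0xf, .active = 0xf };

    m.active &= ~(1u << dying);
    return pick_cpu(&m, dying, check);
}
```

In this model `wakeup_during_unplug(3, CHECK_ACTIVE)` returns 0, so the wakeup is bounced off the CPU being unplugged, while `wakeup_during_unplug(3, CHECK_ONLINE)` returns 3 and the task lands where the unplug path expects it.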
Yep, this works for me!
Tested-by: Jens Axboe <jens....@oracle.com>
--
Jens Axboe
sched: Fix hotplug hang
The hot-unplug kstopmachine usage does a wakeup after
deactivating the cpu, hence we cannot use cpu_active()
here but must rely on the good olde online.
Reported-by: Sachin Sant <sac...@in.ibm.com>
Reported-by: Jens Axboe <jens....@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zi...@chello.nl>
Tested-by: Jens Axboe <jens....@oracle.com>
Cc: Heiko Carstens <heiko.c...@de.ibm.com>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
LKML-Reference: <1261326987.4314.24.camel@laptop>
Signed-off-by: Ingo Molnar <mi...@elte.hu>
---
kernel/sched.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 7ffde2a..87f1f47 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2346,7 +2346,7 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)