2.6.12 hangs on boot

Alexander Y. Fomichev

Jun 22, 2005, 10:16:41 AM

to linux-...@vger.kernel.org, ad...@list.net.ru

G'day

I've been trying to switch from 2.6.12-rc3 to 2.6.12 on a dual EM64T 2.8 GHz
[ MoBo: Intel E7520, intel 82801 ]
but the kernel hangs on boot right after these messages:

Booting processor 2/1 rip 6000 rsp ffff8100023dbf58
Initializing CPU#2

(Below is a link to the full boot trace; it is actually from -git3, but there is no difference.)
http://sysadminday.org.ru/2.6.12-hang-on-boot/2.6.12-git3-hang

An attempt to enable debugging options:
+CONFIG_ACPI_DEBUG=y
+CONFIG_DEBUG_SLAB=y
+CONFIG_DEBUG_PREEMPT=y
+CONFIG_DEBUG_SPINLOCK=y
+CONFIG_DEBUG_SPINLOCK_SLEEP=y
+CONFIG_DEBUG_KOBJECT=y
+CONFIG_DEBUG_INFO=y
+CONFIG_INIT_DEBUG=y
gives a rather strange result: the kernel boots successfully (with a lot of
debugging messages, of course, but I couldn't find anything suspicious)
http://sysadminday.org.ru/2.6.12-hang-on-boot/2.6.12-git3-debug

The config for 2.6.12 was taken from the previous one; only
'make oldconfig' was run.
http://sysadminday.org.ru/2.6.12-hang-on-boot/2.6.12-git3.config

The hang is 100% reproducible on at least two of my EM64T hosts
(actually the same MoBo/CPU configuration).

--
Best regards.
Alexander Y. Fomichev <gl...@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc

Alexey Dobriyan

Jun 24, 2005, 4:53:33 PM

to Alexander Y. Fomichev, linux-...@vger.kernel.org, ad...@list.net.ru

On Wednesday 22 June 2005 18:13, Alexander Y. Fomichev wrote:
> I've been trying to switch from 2.6.12-rc3 to 2.6.12 on Dual EM64T 2.8 GHz
> [ MoBo: Intel E7520, intel 82801 ]
> but kernel hangs on boot right after records:
>
> Booting processor 2/1 rip 6000 rsp ffff8100023dbf58
> Initializing CPU#2

I've filed a bug at kernel bugzilla, so your report won't be lost.
See http://bugme.osdl.org/show_bug.cgi?id=4792

You can register at http://bugme.osdl.org/createaccount.cgi and add yourself
to CC list.

Linus Torvalds

Jun 24, 2005, 6:23:53 PM

to Alexander Y. Fomichev, Kernel Mailing List, ad...@list.net.ru, Git Mailing List

On Wed, 22 Jun 2005, Alexander Y. Fomichev wrote:
>
> I've been trying to switch from 2.6.12-rc3 to 2.6.12 on Dual EM64T 2.8 GHz
> [ MoBo: Intel E7520, intel 82801 ]
> but kernel hangs on boot right after records:
>
> Booting processor 2/1 rip 6000 rsp ffff8100023dbf58
> Initializing CPU#2

Hmm.. Since you seem to be a git user, maybe you could try the git
"bisect" thing to help narrow down exactly where this happened (and help
test that thing too ;).

You can basically use git to find the half-way point between a set of
"known good" points and a "known bad" point ("bisecting" the set of
commits), and doing just a few of those should give us a much better view
of where things started going wrong.

For example, since you know that 2.6.12-rc3 is good, and 2.6.12 is bad,
you'd do

git-rev-list --bisect v2.6.12 ^v2.6.12-rc3

where the "v2.6.12 ^v2.6.12-rc3" thing basically means "everything in
v2.6.12 but _not_ in v2.6.12-rc3" (that's what the ^ marks), and the
"--bisect" flag just asks git-rev-list to list the middle-most commit,
rather than all the commits in between those kernel versions.
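
As a rough illustration of what "everything in v2.6.12 but not in v2.6.12-rc3" means, the range can be pictured as a set difference over commit ancestry. The sketch below uses a made-up six-commit history; the commit numbering, the `parents` table, and the function names are assumptions for illustration only, not git internals.

```c
#include <assert.h>
#include <stdint.h>

/* Toy history, NOT real kernel commits. Commit i lists up to two parents,
 * -1 means "no parent". Commit 0 is the root, commit 1 has two children
 * (2 and 3), commit 4 merges them again, and commit 5 sits on top of 4. */
#define NCOMMITS 6
static const int parents[NCOMMITS][2] = {
    {-1, -1}, {0, -1}, {1, -1}, {1, -1}, {2, 3}, {4, -1},
};

/* Bitmask of every commit reachable from `tip`, including `tip` itself. */
static uint32_t reachable(int tip)
{
    uint32_t seen = 0;
    int stack[NCOMMITS * 2 + 1];   /* at most 1 + 2 * NCOMMITS pushes */
    int top = 0;

    assert(tip >= 0 && tip < NCOMMITS);
    stack[top++] = tip;
    while (top > 0) {
        int c = stack[--top];

        if (seen & (1u << c))
            continue;              /* already visited via another path */
        seen |= 1u << c;
        for (int k = 0; k < 2; k++)
            if (parents[c][k] >= 0)
                stack[top++] = parents[c][k];
    }
    return seen;
}

/* "bad ^good": commits reachable from `bad` but not from `good`. */
static uint32_t range(int bad, int good)
{
    return reachable(bad) & ~reachable(good);
}
```

With commit 5 playing the role of v2.6.12 and commit 1 the role of v2.6.12-rc3, `range(5, 1)` has the bits for commits 2, 3, 4 and 5 set: those are the candidates a bisection would have to search.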

You should get the answer "0e6ef3e02b6f07e37ba1c1abc059f8bee4e0847f", but
before you go any further, just make sure your git index is all clean:

git status

should not print anything else than "nothing to commit". If so, then
you're ready to try the new "mid-point" head:

git-rev-list --bisect v2.6.12 ^v2.6.12-rc3 > .git/refs/heads/try1
git checkout try1

which will create a new branch called "try1", where the head is that
"mid-point", and it will switch to that branch (this requires a fairly
recent "git", btw, so make sure you update your git first).

Then, compile that kernel, and try it out.

Now, there are two possibilities: either "try1" ends up being good, or it
still shows the bug. If it is a buggy kernel, then you now have a new
"bad" point, and you do

git-rev-list --bisect try1 ^v2.6.12-rc3 > .git/refs/heads/try2
git checkout try2

which is all the same thing as you did before, except now we use "try1" as
the known bad one rather than v2.6.12 (and we call the new branch "try2"
of course).

However, if that "try1" is _good_, and doesn't show the bug, then you
shouldn't replace the other "known good" case, but instead you should add
it to the list of good commits (aka commits we don't want to know about):

git-rev-list --bisect v2.6.12 ^v2.6.12-rc3 ^try1 > .git/refs/heads/try2
git checkout try2

ie notice how we now say: want to get the bisection of the commits in
v2.6.12 (known bad) but _not_ in either of v2.6.12-rc3 or the 'try1'
branch (which are known good).
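
The good/bad bookkeeping described above can be sketched as a loop. This is a simplified model that assumes a strictly linear history in which every commit after the first bad one is also bad; real git history is a graph, so the sketch only illustrates the decision rule. `bisect_linear`, `test_fn` and `hangs_from_1000` are hypothetical names, with the predicate standing in for "build the kernel, boot it, and see whether it hangs".

```c
#include <assert.h>

/* Hypothetical test hook: returns nonzero if commit `idx` shows the bug. */
typedef int (*test_fn)(int idx);

/*
 * Linear-history model: commit `good` is known good, commit `bad` is known
 * bad, and good < bad. Returns the index of the first bad commit and stores
 * the number of kernels that had to be tested in *tries.
 */
static int bisect_linear(int good, int bad, test_fn is_bad, int *tries)
{
    assert(good < bad);
    *tries = 0;
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;   /* the "try N" commit */

        (*tries)++;
        if (is_bad(mid))
            bad = mid;    /* new "known bad" point */
        else
            good = mid;   /* one more "known good" commit to exclude */
    }
    return bad;
}

/* Example predicate for illustration: pretend commit 1000 introduced the hang. */
static int hangs_from_1000(int idx)
{
    return idx >= 1000;
}
```

Calling `bisect_linear(0, 1520, hangs_from_1000, &tries)` returns 1000 after at most 11 tested kernels, because each test roughly halves the remaining interval.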

After compiling and testing a few kernels, you will have narrowed the
range down a _lot_, and at some point you can just say

git-rev-list --pretty try4 ^v2.6.12-rc3 ^try1 ^try3

(or however the "success/failure" pattern ends up being - the above
example line assumes that "try1" didn't have the bug, but "try2" did, and
then "try3" was ok again but "try4" was buggy), and you'll get a fairly
small list of commits that are the potential "bad" ones.

After the above four tries, you'd have limited it down to a list of 95
changes (from the original 1520), so it would really be best to try six or
seven different kernels, but at that point you'd have it down to less than
20 commits and then pinpointing the bug is usually much easier.
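
The arithmetic behind those numbers is repeated halving. A tiny sketch, assuming every test splits the remaining candidates exactly in half (on a real commit graph the split is only approximate); `remaining_after` is a hypothetical helper name:

```c
#include <assert.h>

/* Commits left after `tries` ideal halvings of `total` candidates. */
static unsigned remaining_after(unsigned total, unsigned tries)
{
    assert(tries < 32);
    return total >> tries;
}
```

Under that assumption, 1520 candidates shrink to 95 after four tries, to about 23 after six, and to about 11 after seven.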

And when you're done, you can just do

git checkout master

and you're back to where you started.

Linus

Alexander Y. Fomichev

Jul 7, 2005, 10:23:43 AM

to Linus Torvalds, Kernel Mailing List, ad...@list.net.ru, Git Mailing List

Thank you for your answer. I've been on vacation for the last two weeks
and didn't have access to my mail account.
Hmmm... it seems that the 'bisect' method is not applicable to this host: it
is a production server, not so critical that one or two reboots matter, but
'bisect' will require many more, I suspect. I have another, non-critical host
with nearly the same hardware where such tests could be done, but I don't
have a serial console on it right now. It will take some time to hook up a
console because both of these are remote hosts.

--
Best regards.
Alexander Y. Fomichev <gl...@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc

Alexander Y. Fomichev

Jul 18, 2005, 7:29:41 AM

to Linus Torvalds, Kernel Mailing List, ad...@list.net.ru, Git Mailing List, a...@suse.de

On Saturday 25 June 2005 02:20, Linus Torvalds wrote:

> On Wed, 22 Jun 2005, Alexander Y. Fomichev wrote:
> > I've been trying to switch from 2.6.12-rc3 to 2.6.12 on Dual EM64T 2.8
> > GHz [ MoBo: Intel E7520, intel 82801 ]
> > but kernel hangs on boot right after records:
> >
> > Booting processor 2/1 rip 6000 rsp ffff8100023dbf58
> > Initializing CPU#2
>
> Hmm.. Since you seem to be a git user, maybe you could try the git
> "bisect" thing to help narrow down exactly where this happened (and help
> test that thing too ;).

[skipped]

OK, as far as I can see [and as Andi guessed in
http://bugme.osdl.org/show_bug.cgi?id=4792]
the issue was introduced by the new TSC sync algorithm,
git id: dda50e716dc9451f40eebfb2902c260e4f62cf34.

And yes, it seems to depend on timing...
In my case a kludge that inserts a small delay (e.g. a printk) between
cpu_set/mb and tsc_sync_wait() makes the kernel bootable.

diff -urN b/arch/x86_64/kernel/smpboot.c a/arch/x86_64/kernel/smpboot.c
--- b/arch/x86_64/kernel/smpboot.c 2005-07-17 21:55:55.000000000 +0400
+++ a/arch/x86_64/kernel/smpboot.c 2005-07-17 21:57:56.000000000 +0400
@@ -451,6 +451,7 @@
cpu_set(smp_processor_id(), cpu_online_map);
mb();

+ printk(KERN_INFO "We're still here!\n");
/* Wait for TSC sync to not schedule things before.
We still process interrupts, which could see an inconsistent
time in that window unfortunately. */

--
Best regards.
Alexander Y. Fomichev <gl...@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc

Andi Kleen

Jul 18, 2005, 9:00:41 AM

to Alexander Y. Fomichev, Linus Torvalds, Kernel Mailing List, ad...@list.net.ru, Git Mailing List, a...@suse.de

Can you please test if this patch fixes it?

-Andi

Don't compare linux processor index with APICID

Fixes boot up lockups on some machines where CPU apic ids
don't start with 0

Signed-off-by: Andi Kleen <a...@suse.de>

Index: linux/arch/x86_64/kernel/smpboot.c
===================================================================
--- linux.orig/arch/x86_64/kernel/smpboot.c
+++ linux/arch/x86_64/kernel/smpboot.c
@@ -211,7 +211,7 @@ static __cpuinit void sync_master(void *
{
unsigned long flags, i;

- if (smp_processor_id() != boot_cpu_id)
+ if (smp_processor_id() != 0)
return;

go[MASTER] = 0;
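
To illustrate why comparing a logical processor index with an APIC ID can go wrong, here is a small user-space sketch. The machine layout is made up: the `apic_id_of` table, its values, and the `is_master_*` helpers are assumptions for illustration and are not kernel code.

```c
#include <assert.h>

/* Hypothetical machine: logical CPU numbers are 0..3, but the firmware
 * handed out APIC IDs starting at 4. These values are invented. */
#define NR_CPUS_DEMO 4
static const int apic_id_of[NR_CPUS_DEMO] = { 4, 5, 6, 7 };

/* APIC ID of the boot CPU (logical CPU 0) on this made-up machine. */
static int demo_boot_apic_id(void)
{
    return apic_id_of[0];
}

/* Old-style check: compares a logical index against an APIC ID. */
static int is_master_old(int logical_cpu)
{
    assert(logical_cpu >= 0 && logical_cpu < NR_CPUS_DEMO);
    return logical_cpu == demo_boot_apic_id();
}

/* Patched-style check: compares a logical index against logical index 0. */
static int is_master_new(int logical_cpu)
{
    assert(logical_cpu >= 0 && logical_cpu < NR_CPUS_DEMO);
    return logical_cpu == 0;
}
```

On this invented layout no logical CPU ever passes the old check, so nothing would act as the sync master and a CPU waiting for it would spin forever. When APIC IDs do start at 0, the two checks agree.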

Alexander Y. Fomichev

Jul 19, 2005, 7:55:26 AM

to Andi Kleen, Linus Torvalds, Kernel Mailing List, ad...@list.net.ru, Git Mailing List

On Monday 18 July 2005 16:58, Andi Kleen wrote:
> Can you please test if this patch fixes it?
>
> -Andi
>
>
> Don't compare linux processor index with APICID
>
> Fixes boot up lockups on some machines where CPU apic ids
> don't start with 0
>
> Signed-off-by: Andi Kleen <a...@suse.de>
>
> Index: linux/arch/x86_64/kernel/smpboot.c
> ===================================================================
> --- linux.orig/arch/x86_64/kernel/smpboot.c
> +++ linux/arch/x86_64/kernel/smpboot.c
> @@ -211,7 +211,7 @@ static __cpuinit void sync_master(void *
> {
> unsigned long flags, i;
>
> - if (smp_processor_id() != boot_cpu_id)
> + if (smp_processor_id() != 0)
> return;
>
> go[MASTER] = 0;

No, sorry, the same result -- hangs just after:

Booting processor 2/1 rip 6000 rsp ffff8100dff7df58
Initializing CPU#2

(Hmm... judging by the line printed just before that [if I understand correctly],
boot_cpu_id == 0 in my case:
CPU 1: Syncing TSC to CPU 0 )

--
Best regards.
Alexander Y. Fomichev <gl...@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc

Andrew Morton

Jul 29, 2005, 1:14:43 AM

to Alexander Y. Fomichev, linux-...@vger.kernel.org, ad...@list.net.ru

Is this still happening in 2.6.13-rc4?

If so, could you please test 2.6.13-rc4 plus the below fix?

Thanks.

From: ebie...@xmission.com (Eric W. Biederman)

sync_tsc was using smp_call_function to ask the boot processor to report
its tsc value. smp_call_function performs an IPI_send_allbutself which is
a broadcast ipi. There is a window during processor startup after
the target cpu has started but before it has initialized its interrupt
vectors so it can properly process an interrupt. Receiving an interrupt
during that window will triple fault the cpu and do other nasty things.

Why cli does not protect us from that is beyond me.

The simple fix is to match ia64 and provide a smp_call_function_single,
which avoids the broadcast and is more efficient.

This certainly fixes the problem of getting stuck on boot which was very
easy to trigger on my SMP Hyperthreaded Xeon, and I think it fixes it for
the right reasons.

I believe this patch suffers from apicid versus logical cpu number
confusion. I copied the basic logic from smp_send_reschedule and I can't
find where that translates from the logical cpuid to apicid. So it isn't
quite correct yet. It should be close enough that it shouldn't be too hard
to finish it up.

More bug fixes after I have slept but I figured I needed to get this
one out for review.

Signed-off-by: Eric W. Biederman <ebie...@xmission.com>
Signed-off-by: Andrew Morton <ak...@osdl.org>
---

arch/x86_64/kernel/smp.c | 65 +++++++++++++++++++++++++++++++++++++++++++
arch/x86_64/kernel/smpboot.c | 18 +++++++----
include/asm-x86_64/smp.h | 2 +
3 files changed, 79 insertions(+), 6 deletions(-)

diff -puN arch/x86_64/kernel/smpboot.c~x86_64-sync_tsc-fix-the-race-so-we-can-boot arch/x86_64/kernel/smpboot.c
--- devel/arch/x86_64/kernel/smpboot.c~x86_64-sync_tsc-fix-the-race-so-we-can-boot 2005-07-28 22:07:55.000000000 -0700
+++ devel-akpm/arch/x86_64/kernel/smpboot.c 2005-07-28 22:07:55.000000000 -0700
@@ -280,7 +280,7 @@ get_delta(long *rt, long *master)
return tcenter - best_tm;
}

-static __cpuinit void sync_tsc(void)
+static __cpuinit void sync_tsc(unsigned int master)
{
int i, done = 0;
long delta, adj, adjust_latency = 0;
@@ -294,9 +294,17 @@ static __cpuinit void sync_tsc(void)
} t[NUM_ROUNDS] __cpuinitdata;
#endif

+ printk(KERN_INFO "CPU %d: Syncing TSC to CPU %u.\n",
+ smp_processor_id(), master);
+
go[MASTER] = 1;

- smp_call_function(sync_master, NULL, 1, 0);
+ /* It is dangerous to broadcast IPI as cpus are coming up,
+ * as they may not be ready to accept them. So since
+ * we only need to send the ipi to the boot cpu direct
+ * the message, and avoid the race.
+ */
+ smp_call_function_single(master, sync_master, NULL, 1, 0);

while (go[MASTER]) /* wait for master to be ready */
no_cpu_relax();
@@ -340,16 +348,14 @@ static __cpuinit void sync_tsc(void)
printk(KERN_INFO
"CPU %d: synchronized TSC with CPU %u (last diff %ld cycles, "
"maxerr %lu cycles)\n",
- smp_processor_id(), boot_cpu_id, delta, rt);
+ smp_processor_id(), master, delta, rt);
}

static void __cpuinit tsc_sync_wait(void)
{
if (notscsync || !cpu_has_tsc)
return;
- printk(KERN_INFO "CPU %d: Syncing TSC to CPU %u.\n", smp_processor_id(),
- boot_cpu_id);
- sync_tsc();
+ sync_tsc(boot_cpu_id);
}

static __init int notscsync_setup(char *s)
diff -puN arch/x86_64/kernel/smp.c~x86_64-sync_tsc-fix-the-race-so-we-can-boot arch/x86_64/kernel/smp.c
--- devel/arch/x86_64/kernel/smp.c~x86_64-sync_tsc-fix-the-race-so-we-can-boot 2005-07-28 22:07:55.000000000 -0700
+++ devel-akpm/arch/x86_64/kernel/smp.c 2005-07-28 22:07:55.000000000 -0700
@@ -294,6 +294,71 @@ void unlock_ipi_call_lock(void)
}

/*
+ * this function sends a 'generic call function' IPI to one other CPU
+ * in the system.
+ */
+static void __smp_call_function_single (int cpu, void (*func) (void *info), void *info,
+ int nonatomic, int wait)
+{
+ struct call_data_struct data;
+ int cpus = 1;
+
+ data.func = func;
+ data.info = info;
+ atomic_set(&data.started, 0);
+ data.wait = wait;
+ if (wait)
+ atomic_set(&data.finished, 0);
+
+ call_data = &data;
+ wmb();
+ /* Send a message to all other CPUs and wait for them to respond */
+ send_IPI_mask(cpumask_of_cpu(cpu), CALL_FUNCTION_VECTOR);
+
+ /* Wait for response */
+ while (atomic_read(&data.started) != cpus)
+ cpu_relax();
+
+ if (!wait)
+ return;
+
+ while (atomic_read(&data.finished) != cpus)
+ cpu_relax();
+}
+
+/*
+ * Run a function on another CPU
+ * <func> The function to run. This must be fast and non-blocking.
+ * <info> An arbitrary pointer to pass to the function.
+ * <nonatomic> Currently unused.
+ * <wait> If true, wait until function has completed on other CPUs.
+ * [RETURNS] 0 on success, else a negative status code.
+ *
+ * Does not return until the remote CPU is nearly ready to execute <func>
+ * or is or has executed.
+ */
+
+int smp_call_function_single (int cpu, void (*func) (void *info), void *info,
+ int nonatomic, int wait)
+{
+
+ int me = get_cpu(); /* prevent preemption and reschedule on another processor */
+
+ if (cpu == me) {
+ printk("%s: trying to call self\n", __func__);
+ put_cpu();
+ return -EBUSY;
+ }
+ spin_lock_bh(&call_lock);
+
+ __smp_call_function_single(cpu, func,info,nonatomic,wait);
+
+ spin_unlock_bh(&call_lock);
+ put_cpu();
+ return 0;
+}
+
+/*
* this function sends a 'generic call function' IPI to all other CPUs
* in the system.
*/
diff -puN include/asm-x86_64/smp.h~x86_64-sync_tsc-fix-the-race-so-we-can-boot include/asm-x86_64/smp.h
--- devel/include/asm-x86_64/smp.h~x86_64-sync_tsc-fix-the-race-so-we-can-boot 2005-07-28 22:07:55.000000000 -0700
+++ devel-akpm/include/asm-x86_64/smp.h 2005-07-28 22:07:55.000000000 -0700
@@ -48,6 +48,8 @@ extern void unlock_ipi_call_lock(void);
extern int smp_num_siblings;
extern void smp_flush_tlb(void);
extern void smp_message_irq(int cpl, void *dev_id, struct pt_regs *regs);
+extern int smp_call_function_single (int cpuid, void (*func) (void *info), void *info,
+ int retry, int wait);
extern void smp_send_reschedule(int cpu);
extern void smp_invalidate_rcv(void); /* Process an NMI */
extern void zap_low_mappings(void);
_
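
The difference between the broadcast and the directed IPI described in the patch comment can be modelled with plain bitmasks. This is a user-space sketch under stated assumptions: `cpumask_demo_t` and the three helper functions are hypothetical names, and the "started" and "ready" sets are a simplification of the boot-time window, not the kernel's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cpumask_demo_t;   /* bit n set = logical CPU n */

/* Targets of a broadcast "all but self" IPI among the CPUs that have started. */
static cpumask_demo_t targets_allbutself(cpumask_demo_t started, int self)
{
    assert(self >= 0 && self < 32);
    return started & ~((cpumask_demo_t)1 << self);
}

/* Targets of a directed IPI: exactly one CPU. */
static cpumask_demo_t targets_single(int cpu)
{
    assert(cpu >= 0 && cpu < 32);
    return (cpumask_demo_t)1 << cpu;
}

/* CPUs that would receive the IPI before their interrupt vectors are set up. */
static cpumask_demo_t hit_while_unready(cpumask_demo_t targets,
                                        cpumask_demo_t ready)
{
    return targets & ~ready;
}
```

Suppose CPUs 0, 1 and 2 have started (mask 0x7) but only CPUs 0 and 1 are ready for interrupts (mask 0x3), and CPU 1 is the one syncing its TSC. A broadcast from CPU 1 reaches CPUs 0 and 2, so CPU 2 is hit inside the dangerous window. A directed IPI to CPU 0 reaches only a CPU that is ready.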

Alexander Y. Fomichev

Aug 1, 2005, 5:57:36 AM

to Andrew Morton, linux-...@vger.kernel.org, ad...@list.net.ru

> Signed-off-by: Eric W. Biederman <ebie...@xmission.com>

> Signed-off-by: Andrew Morton <ak...@osdl.org>
> ---

[skip]

I haven't tried 2.6.13-rc4 itself because I noticed that the changes have been
committed into Linus' git tree under id: 3d483f47579461a4715db33c68ef8752e5a97a2d
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3d483f47579461a4715db33c68ef8752e5a97a2d
and this tree works well for me, though the previous one
[94d2ac66c12397e2ca7988dbf59f24a966d275cb] hangs. So I guess this is exactly
the problem this patch solves.
Thank you for your help.

--
Best regards.
Alexander Y. Fomichev <gl...@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc