Re: EARLY_AP_STARTUP hangs during boot

12 views
Skip to first unread message

John Baldwin

unread,
Jul 27, 2016, 5:45:31 PM7/27/16
to
On Tuesday, June 07, 2016 12:06:54 PM Gary Jennejohn wrote:
> On Tue, 31 May 2016 13:10:06 -0700
> John Baldwin <j...@freebsd.org> wrote:
>
> > On Saturday, May 28, 2016 02:11:41 PM Gary Jennejohn wrote:
> > > On Fri, 27 May 2016 09:50:05 +0200
> > > Gary Jennejohn <gljen...@gmail.com> wrote:
> > >
> > > > On Thu, 26 May 2016 16:54:35 -0700
> > > > John Baldwin <j...@freebsd.org> wrote:
> > > >
> > > > > On Tuesday, May 17, 2016 06:47:41 PM Gary Jennejohn wrote:
> > > > > > On Mon, 16 May 2016 10:54:19 -0700
> > > > > > John Baldwin <j...@freebsd.org> wrote:
> > > > > >
> > > > > > > On Monday, May 16, 2016 12:22:42 PM Gary Jennejohn wrote:
> > > > > > > > I tried out EARLY_AP_STARTUP, but the kernel hangs and I can't
> > > > > > > > break into DDB.
> > > > > > > >
> > > > > > > > I did a verbose boot and the last lines I see are related to routing
> > > > > > > > MSI-X to various local APIC vectors. I copied the last few lines and
> > > > > > > > they look like this:
> > > > > > > >
> > > > > > > > msi: routing MSI-X IRQ 256 to local APIC 2 vector 48
> > > > > > > > msi: routing MSI-X IRQ 257 to local APIC 3 vector 48
> > > > > > > > msi: routing MSI-X IRQ 258 to local APIC 4 vector 48
> > > > > > > > msi: routing MSI-X IRQ 256 to local APIC 0 vector 49
> > > > > > ^^^^^^^ Assigning
> > > > > > > >
> > > > > > > > I tried disabling msi and msix in /boot/loader.conf, but the settings
> > > > > > > > were ignored (probabaly too early).
> > > > > > >
> > > > > > > No, those settings are not too early. However, the routing to different
> > > > > > > CPUs now happens earlier than it used to. What is the line before the
> > > > > > > MSI lines? You can take a picture with your phone/camera if that's simplest.
> > > > > > >
> > > > > >
> > > > > > Here a few lines before the MSI routing happens:
> > > > > >
> > > > > > hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
> > > > > > hpet0: vendor 0x4353, rev 0x1, 14318180 Hz, 3 timers, legacy route
> > > > > > hpet0: t0 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > hpet0: t1 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > hpet0: t2 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > Timecounter "HPET" frequency 14318180 Hz quality 950
> > > > >
> > > > > The assigning message means it is in the loop using
> > > > > bus_bind_intr() to setup per-CPU timers. Can you please try
> > > > > setting 'hint.hpet.0.per_cpu=0' at the loader prompt to see if
> > > > > disabling the use of per-CPU timers allows you to boot?
> > > > >
> > > >
> > > > Something has changed since the last time I generated a kernel with
> > > > this option.
> > > >
> > > > Now I get a NULL-pointer dereference in the kernel, doesn't matter
> > > > whether I set the hint or not.
> > > >
> > >
> > > OK, now that the startup has been fixed, I tried setting the hint at
> > > the loader prompt, but the kenel hangs in exactly the same place as
> > > before. I actually booted twice to make certain I hadn't made a
> > > typo when setting the hint.
> >
> > Humm, it shouldn't be calling bus_bind_intr() if the hint is set. Actually,
> > I guess it just binds them all to first CPU if per-CPU timers aren't set.
> > Can you add debug printfs to hpet_attach() in sys/dev/acpica/acpi_hpet.c to
> > narrow down which line in that function it hangs after?
> >
> > Another option to try is to add the following to your kernel config:
> >
> > options KTR
> > options KTR_COMPILE=KTR_PROC
> > options KTR_MASK=KTR_PROC
> > options KTR_VERBOSE=1
> >
> > this will spew a lot of crap to the screen, but if it stops spewing when it
> > hangs then it might be tell us where the system is hung. If you have any way
> > to configure a serial console then this would also be useful even if it spews
> > constantly when it is hung (assuming you could log the output of the serial
> > console).
> >
>
> I used the KTR options.
>
> After the Timecounter "HPET" frequency 14318180 Hz quality 950 I see
>
> cpu0 mi_switch: old thread 10000 (swapper)
> cpu0 mi_switch: new thread 10022 (if_config_tqg_0)
> cpu0 sleep_broadcast(0x80002f9a600, 0)
> cpu0 msleep_spin: old thread 100022
> cpu0 mi_switch: old thread 10022
> cpu0 mi_switch: new thread 10016 (if_io_tqg_0)
> cpu0 sleep_broadcast(0x80002f9a780, 0)
> cpu0 msleep_spin: old thread 10016
> cpu0 mi_switch: old thread 10016
> cpu0 fork_exit: new thread 0x80004239510 (td_sched 0x8000042399d8, pid
> 10, idle: cpu0)
>
> And that's all that came out, really not very much at all.

Ok, that seems odd.

Can you apply this patch and run with the KTR output still:

Index: sched_ule.c
===================================================================
--- sched_ule.c (revision 303397)
+++ sched_ule.c (working copy)
@@ -1904,6 +1904,13 @@ sched_switch(struct thread *td, struct thread *new
td->td_owepreempt = 0;
if (!TD_IS_IDLETHREAD(td))
tdq->tdq_switchcnt++;
+
+ if (td == &thread0 && cold) {
+ printf("thread0 switching out (ts_cpu %d):\n", ts->ts_cpu);
+ kdb_backtrace();
+ MPASS(TD_IS_RUNNING(td));
+ }
+
/*
* The lock pointer in an idle thread should never change. Reset it
* to CAN_RUN as well.
@@ -1920,6 +1927,9 @@ sched_switch(struct thread *td, struct thread *new
if (THREAD_CAN_MIGRATE(td) && !THREAD_CAN_SCHED(td, ts->ts_cpu))
ts->ts_cpu = sched_pickcpu(td, 0);
#endif
+ if (td == &thread0 && cold)
+ printf("thread0 ts_cpu %d, cpuid %d\n", ts->ts_cpu,
+ cpuid);
if (ts->ts_cpu == cpuid)
tdq_runq_add(tdq, td, srqflag);
else {


--
John Baldwin
_______________________________________________
freebsd...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-curre...@freebsd.org"

Gary Jennejohn

unread,
Jul 27, 2016, 6:32:06 PM7/27/16
to
On Wed, 27 Jul 2016 14:43:36 -0700
Well, now I know that ULE is a prerequiste for EARLY_AP_STARTUP! I
wasn't aware of that. I prefer BSD and that's the scheduler I did
the first tests with.

But with the ULE scheduler the system comes up all the way.

It would be nice if the BSD scheduler could also be modified to
work with EARLY_AP_STARTUP.

--
Gary Jennejohn

John Baldwin

unread,
Jul 28, 2016, 11:15:49 AM7/28/16
to
No, ULE is not a prerequisite. However, I will do some testing with 4BSD
to see what I need to fix. Thanks! I suspect the early sched_bind that
happens when migrating x86 interrupts due to bus_bind_intr() is causing
the issue.

--
John Baldwin

John Baldwin

unread,
Jul 29, 2016, 4:18:12 PM7/29/16
to
I wasn't able to reproduce your hang with 4BSD, but I think I see a
possible problem. Try this:

diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
index 7de56b6..d53331a 100644
--- a/sys/kern/sched_4bsd.c
+++ b/sys/kern/sched_4bsd.c
@@ -327,7 +327,6 @@ maybe_preempt(struct thread *td)
* - The current thread has a higher (numerically lower) or
* equivalent priority. Note that this prevents curthread from
* trying to preempt to itself.
- * - It is too early in the boot for context switches (cold is set).
* - The current thread has an inhibitor set or is in the process of
* exiting. In this case, the current thread is about to switch
* out anyways, so there's no point in preempting. If we did,
@@ -348,7 +347,7 @@ maybe_preempt(struct thread *td)
("maybe_preempt: trying to run inhibited thread"));
pri = td->td_priority;
cpri = ctd->td_priority;
- if (panicstr != NULL || pri >= cpri || cold /* || dumping */ ||
+ if (panicstr != NULL || pri >= cpri /* || dumping */ ||
TD_IS_INHIBITED(ctd))
return (0);
#ifndef FULL_PREEMPTION
@@ -1127,7 +1126,7 @@ forward_wakeup(int cpunum)
if ((!forward_wakeup_enabled) ||
(forward_wakeup_use_mask == 0 && forward_wakeup_use_loop == 0))
return (0);
- if (!smp_started || cold || panicstr)
+ if (!smp_started || panicstr)
return (0);

forward_wakeups_requested++;

Gary Jennejohn

unread,
Jul 30, 2016, 3:44:56 AM7/30/16
to
Thanks, but with this patch the kernel hangs in exactly the same
place as before - after the HPET output.

Maybe I'm missing some kernel option which ULE works around, or
something like that.

--
Gary Jennejohn

John Baldwin

unread,
Jul 30, 2016, 3:14:36 PM7/30/16
to
Hmm, ok. Please add KTR_RUNQ and KTR_SMP to the KTR masks, that is
'options KTR_COMPILE=(KTR_PROC|KTR_RUNQ|KTR_SMP)' and
'options KTR_MASK=(KTR_PROC|KTR_RUNQ|KTR_SMP)'

Please also add this patch (on top of the previous patch):

diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
index 2973a23..bab2278 100644
--- a/sys/kern/sched_4bsd.c
+++ b/sys/kern/sched_4bsd.c
@@ -1278,6 +1278,8 @@ sched_add(struct thread *td, int flags)
KASSERT(td->td_flags & TDF_INMEM,
("sched_add: thread swapped out"));

+ CTR2(KTR_PROC, "sched_add: thread %d (%s)", td->td_tid,
+ sched_tdname(td));
KTR_STATE2(KTR_SCHED, "thread", sched_tdname(td), "runq add",
"prio:%d", td->td_priority, KTR_ATTR_LINKED,
sched_tdname(curthread));
diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c
index f07b97e..1f418f1 100644
--- a/sys/x86/x86/cpu_machdep.c
+++ b/sys/x86/x86/cpu_machdep.c
@@ -440,6 +440,7 @@ cpu_idle_wakeup(int cpu)
return (0);
if (*state == STATE_MWAIT)
*state = STATE_RUNNING;
+ CTR1(KTR_PROC, "cpu_idle_wakeup: wokeup CPU %d", cpu);
return (1);
}

(I haven't tried compiling it, you might have to add the sys/ktr.h
header to cpu_machdep.c if it doesn't build.)

Hopefully we will get some better trace messages before it hangs
with this added info. The root issue seems to be that 4BSD is
pinning thread0 to some other CPU (due to sched_bind that happens
inside of bus_bind_intr() when the HPET driver pins IRQs to CPUs)
and that other CPU isn't waking up to realize it needs to run thread0.

--
John Baldwin

Gary Jennejohn

unread,
Jul 31, 2016, 5:29:49 AM7/31/16
to
On Sat, 30 Jul 2016 12:03:59 -0700
It compiled with no changes needed.

Even though I set MAXCPU to a mere 2, the boot still hadn't
completed after 90 minutes and I broke it off. I still have
the kernel, so I can try it another time when I have less need
for my FreeBSD box.

--
Gary Jennejohn

John Baldwin

unread,
Jul 31, 2016, 5:25:42 PM7/31/16
to
Did you have the KTR options enabled from before? I don't expect this
to fix the issue, this is more about getting better debug info when it
hangs.

--
John Baldwin

Gary Jennejohn

unread,
Aug 1, 2016, 4:46:13 AM8/1/16
to
On Sun, 31 Jul 2016 14:22:35 -0700
Yes, all the options from before were enabled. Maybe I should have
disabled KTR_VERBOSE=1? I'll try it without that.

I reduced MAXCPU because that reduced the amount of CPU core-related
output by a factor of 3.

I'm aware that you're just trying to get more information, but I use
my FreeBSD box for pretty much everything and, after 90 minutes, I
wanted to use it for something other than booting.

--
Gary Jennejohn

Gary Jennejohn

unread,
Aug 1, 2016, 9:31:47 AM8/1/16
to
KTR_VERBOSE=1 is necessary.

OK, after about 5 hours it landed in an infinite loop emitting:

cpu_0 ipi_cpu: cpu: 1 to 5 ipi: 2 (my CPU has 6 cores)

John Baldwin

unread,
Aug 1, 2016, 4:22:24 PM8/1/16
to
Yes.

> OK, after about 5 hours it landed in an infinite loop emitting:
>
> cpu_0 ipi_cpu: cpu: 1 to 5 ipi: 2 (my CPU has 6 cores)

Humm, can you capture a picture right when it hangs? Those interrupts
are due to clock interrupts (IPI_HARDCLOCK) and are noise. More what
I'm trying to see is if we send IPI_AST/IPI_PREEMPT to the CPU after
binding to it.

--
John Baldwin

Gary Jennejohn

unread,
Aug 2, 2016, 3:03:44 AM8/2/16
to
On Mon, 01 Aug 2016 13:19:16 -0700
I can't tell when it hangs due to the amount of trace coming out.
The hang is hidden in the noise and I have no way to suppress the
trace that I'm aware of. The trace is coming so fast that even
a photo of the screen looks smeared.

Is there a way to limit the trace to IPI_AST/IPI_PREEMPT?

I have, however, now switched to ULE and I'm running with
EARLY_AP_STARTUP, but I'm willing run more tests with 4BSD.

--
Gary Jennejohn

John Baldwin

unread,
Aug 2, 2016, 1:41:50 PM8/2/16
to
Yes:

diff --git a/sys/x86/x86/mp_x86.c b/sys/x86/x86/mp_x86.c
index 91c119a..6c57b20 100644
--- a/sys/x86/x86/mp_x86.c
+++ b/sys/x86/x86/mp_x86.c
@@ -1160,7 +1160,8 @@ ipi_cpu(int cpu, u_int ipi)
if (ipi == IPI_STOP_HARD)
CPU_SET_ATOMIC(cpu, &ipi_stop_nmi_pending);

- CTR3(KTR_SMP, "%s: cpu: %d ipi: %x", __func__, cpu, ipi);
+ if (ipi == IPI_AST || ipi == IPI_PREEMPT)
+ CTR3(KTR_SMP, "%s: cpu: %d ipi: %x", __func__, cpu, ipi);
ipi_send_cpu(cpu, ipi);
}


--
John Baldwin

Gary Jennejohn

unread,
Aug 4, 2016, 2:59:40 AM8/4/16
to
On Tue, 02 Aug 2016 10:41:23 -0700
I limited output to KTR_SMP and used only the patch above.

The last lines which appear are:
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8
on acpi0
cpu0: smp_targeted_tlb_shootdown: cpu: 1 ipi: f6
cpu0: smp_targeted_tlb_shootdown: cpu: 2 ipi: f6
cpu0: smp_targeted_tlb_shootdown: cpu: 3 ipi: f6
cpu0: smp_targeted_tlb_shootdown: cpu: 4 ipi: f6
cpu0: smp_targeted_tlb_shootdown: cpu: 5 ipi: f6
Timecounter "HPET" frequency 14318180 Hz qualty 950

--
Gary Jennejohn

John Baldwin

unread,
Aug 4, 2016, 1:53:41 PM8/4/16
to
Hmm, no lines with mi_switch like you saw before?

--
John Baldwin
Reply all
Reply to author
Forward
0 new messages