
Why do so many machines need "noapic"?


Chuck Ebbert

Sep 5, 2007, 7:40:08 PM
Some systems lock up without the noapic option. I found one
that will freeze while trying to set up the timer interrupt.
Passing 'nolapic' makes it freeze just after:

Setting up timer through ExtINT... works

Sometimes it will boot up and then freeze during the startup
scripts. Passing the noapic option fixes all that, but it
then gets 1000 spurious interrupts per second on IRQ7 (which
only shows ehci using it.) Kernel version is 2.6.22.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Andi Kleen

Sep 6, 2007, 7:40:11 AM
Chuck Ebbert <ceb...@redhat.com> writes:

> Some systems lock up without the noapic option.

Please find patterns: cpu type, chipsets, mainboard vendors etc.

> I found one
> that will freeze while trying to set up the timer interrupt.
> Passing 'nolapic' makes it freeze just after:
>
> Setting up timer through ExtINT... works

Always boot with apic=debug

The message means the primary timer setup methods have already failed.
ExtINT is really a crappy fallback that was originally needed only
for some early SMP systems on which the timer was not wired
according to spec.

But the real problem is that the standard timer access method
through the local APIC didn't work.

I had a rewrite of the timer probing some time ago that tried
more combinations automatically. It had some problems so it
never went in, but perhaps it's worth revisiting.

-Andi

Chuck Ebbert

Sep 7, 2007, 3:40:08 PM
On 09/06/2007 07:31 AM, Andi Kleen wrote:
> Chuck Ebbert <ceb...@redhat.com> writes:
>
>> Some systems lock up without the noapic option.
>
> Please find patterns: cpu type, chipsets, mainboard vendors etc.
>

This is the first one I've actually had in front of me:

HP TX1000 notebook
Nvidia C51/MCP51 mobile chipset

Booting with "noapic" gives some very strange results. These are two
snapshots of /proc/interrupts taken one second apart. It almost looks
like timer interrupts are occurring on IRQ 0 and IRQ 7 on different
CPUs:

           CPU0       CPU1
  0:     446096       6224    XT-PIC-XT    timer
  1:        342          6    XT-PIC-XT    i8042
  2:          0          0    XT-PIC-XT    cascade
  5:       3099        865    XT-PIC-XT    sata_nv
  7:       8145     494718    XT-PIC-XT    ehci_hcd:usb2
  8:          0          0    XT-PIC-XT    rtc0
  9:        323          9    XT-PIC-XT    acpi
 10:        136         36    XT-PIC-XT    HDA Intel
 11:      43884       1091    XT-PIC-XT    ohci_hcd:usb1, eth0
 12:        104         19    XT-PIC-XT    i8042
 14:       1011         25    XT-PIC-XT    libata
 15:          0          0    XT-PIC-XT    libata
NMI:          0          0
LOC:       6212     445951
ERR:     403241
MIS:          0

           CPU0       CPU1
  0:     447098       6233    XT-PIC-XT    timer
  1:        343          6    XT-PIC-XT    i8042
  2:          0          0    XT-PIC-XT    cascade
  5:       3100        865    XT-PIC-XT    sata_nv
  7:       8158     495847    XT-PIC-XT    ehci_hcd:usb2
  8:          0          0    XT-PIC-XT    rtc0
  9:        323          9    XT-PIC-XT    acpi
 10:        136         36    XT-PIC-XT    HDA Intel
 11:      43988       1094    XT-PIC-XT    ohci_hcd:usb1, eth0
 12:        104         19    XT-PIC-XT    i8042
 14:       1032         26    XT-PIC-XT    libata
 15:          0          0    XT-PIC-XT    libata
NMI:          0          0
LOC:       6221     446953
ERR:     404383
MIS:          0
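
The per-second rates implied by the two snapshots above can be worked out
mechanically. What follows is a minimal Python sketch and not part of the
original message: the helper names parse_interrupts and per_interval_delta
are made up for illustration, and only four rows of each snapshot are
copied in.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}."""
    counts = {}
    for line in text.splitlines():
        # Split on the first colon only; device names may contain colons.
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line ("CPU0 CPU1") or blank line
        values = []
        for token in rest.split():
            if not token.isdigit():
                break  # reached the controller/device name columns
            values.append(int(token))
        counts[label.strip()] = values
    return counts


def per_interval_delta(before, after):
    """Return how much each counter grew between two parsed snapshots."""
    return {
        label: [b - a for a, b in zip(before[label], after[label])]
        for label in before
        if label in after
    }


# Rows copied from the two snapshots in the message (one second apart).
FIRST = """\
  0: 446096 6224 XT-PIC-XT timer
  7: 8145 494718 XT-PIC-XT ehci_hcd:usb2
LOC: 6212 445951
ERR: 403241
"""
SECOND = """\
  0: 447098 6233 XT-PIC-XT timer
  7: 8158 495847 XT-PIC-XT ehci_hcd:usb2
LOC: 6221 446953
ERR: 404383
"""

delta = per_interval_delta(parse_interrupts(FIRST), parse_interrupts(SECOND))
for label, values in delta.items():
    print(label, values)
```

For these rows the increases come to 1002 and 9 on IRQ 0, 13 and 1129 on
IRQ 7, 9 and 1002 on LOC, and 1142 on ERR. The ERR increase equals the
combined IRQ 7 increase (13 + 1129), which is at least consistent with the
spurious IRQ 7 interrupts mentioned at the start of the thread.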


>> I found one
>> that will freeze while trying to set up the timer interrupt.
>> Passing 'nolapic' makes it freeze just after:
>>
>> Setting up timer through ExtINT... works
>
> Always boot with apic=debug
>

I can't capture the messages. Even when it boots it doesn't last
long enough to get them.

Al Boldi

Sep 8, 2007, 12:20:08 AM

You may want to try to reconfigure your bios to reserve irq 5/7 for isa only.

Then post /proc/interrupts again.


Thanks!

--
Al

Prakash Punnoor

Sep 8, 2007, 1:20:07 AM
On the day of Friday 07 September 2007 Chuck Ebbert hast written:

> On 09/06/2007 07:31 AM, Andi Kleen wrote:
> > Chuck Ebbert <ceb...@redhat.com> writes:
> >> Some systems lock up without the noapic option.
> >
> > Please find patterns: cpu type, chipsets, mainboard vendors etc.
>
> This is the first one I've actually had in front of me:
>
> HP TX1000 notebook
> Nvidia C51/MCP51 mobile chipset

Do you have a hpet? If not, have you tried using acpi_use_timer_override with
apic?

bye,
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V


Chuck Ebbert

Sep 10, 2007, 3:20:12 PM

Yes, it has an hpet. And I tried every combination of options I could
think of.

But, even stranger, x86_64 works (only i386 fails.)

Andi Kleen

Sep 10, 2007, 3:50:13 PM
> Yes, it has an hpet. And I tried every combination of options I could
> think of.
>
> But, even stranger, x86_64 works (only i386 fails.)

x86-64 has quite different time code (at least until the dyntick patches
currently in mm)

Obvious thing would be to diff the boot messages and see if anything
jumps out (e.g. in interrupt routing).

Or check with mm and if x86-64 is broken there too then it's likely
the new time code.

-Andi
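
The boot-message comparison suggested above could be done along these
lines. This is a minimal sketch in Python using the standard difflib
module, not something posted in the thread: the function names are
hypothetical, the timestamp pattern assumes dmesg-style "[ seconds ]"
prefixes, and the two sample strings are placeholders rather than real
kernel output.

```python
import difflib
import re

# Assumed prefix format, e.g. "[    0.123456] "; lines without it pass through.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(log_text):
    """Strip timestamps so that timing jitter does not show up as a diff."""
    return [TIMESTAMP.sub("", line.rstrip()) for line in log_text.splitlines()]


def diff_boot_logs(good_text, bad_text, good_name="good", bad_name="bad"):
    """Return unified-diff lines between two boot logs."""
    return list(
        difflib.unified_diff(
            normalize(good_text),
            normalize(bad_text),
            fromfile=good_name,
            tofile=bad_name,
            lineterm="",
        )
    )


# Placeholder input, NOT real kernel messages.
SAMPLE_A = "[    0.000000] same line\n[    0.100000] only in first\n"
SAMPLE_B = "[    0.000000] same line\n[    0.250000] only in second\n"

result = diff_boot_logs(SAMPLE_A, SAMPLE_B)
print("\n".join(result))
```

In practice the two inputs would be saved boot logs from the working and
the failing kernel, for example captured over a serial console.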

Chuck Ebbert

Sep 10, 2007, 7:40:11 PM
On 09/10/2007 03:44 PM, Andi Kleen wrote:
>> Yes, it has an hpet. And I tried every combination of options I could
>> think of.
>
>> But, even stranger, x86_64 works (only i386 fails.)
>
> x86-64 has quite different time code (at least until the dyntick patches
> currently in mm)
>
> Obvious thing would be to diff the boot messages and see if anything
> jumps out (e.g. in interrupt routing).
>
> Or check with mm and if x86-64 is broken there too then it's likely
> the new time code.

This is Fedora 8 and it already has the highres-timers code in x86_64.
But I was still comparing 2.6.22 on i386 to 2.6.23-rc5-git1 + highres-timers
on x86_64. 2.6.23-rc5 on i386 seems okay too, so whatever is happening it
only occurs on 2.6.22 here.

Chuck Ebbert

Sep 13, 2007, 12:40:06 PM
On 09/10/2007 03:44 PM, Andi Kleen wrote:
>> Yes, it has an hpet. And I tried every combination of options I could
>> think of.
>
>> But, even stranger, x86_64 works (only i386 fails.)
>
> x86-64 has quite different time code (at least until the dyntick patches
> currently in mm)
>
> Obvious thing would be to diff the boot messages and see if anything
> jumps out (e.g. in interrupt routing).
>
> Or check with mm and if x86-64 is broken there too then it's likely
> the new time code.

I reported too soon that x86_64 works. It does not work, it just takes
a bit longer before it freezes. There are message threads all over the
place discussing this problem with the HP Pavilion tx 1000, and it seems
the best workaround is to use the "nolapic" option instead of "noapic".
Using that, it is totally stable _and_ there are no spurious interrupts
that would otherwise break USB. Interrupt setup is a bit strange, though:

           CPU0       CPU1
  0:        241          0    XT-PIC-XT       timer
  1:          1        736    IO-APIC-edge    i8042
  2:          0          0    XT-PIC-XT       cascade
  5:         14      10028    IO-APIC-edge    sata_nv
  7:          0         57    IO-APIC-edge    ehci_hcd:usb1
  8:          0          0    IO-APIC-edge    rtc0
  9:          4       2463    IO-APIC-edge    acpi
 10:          2       2795    IO-APIC-edge    HDA Intel
 11:        740     478806    IO-APIC-edge    ohci_hcd:usb2, eth0
 12:         42      19911    IO-APIC-edge    i8042
 14:          5       7958    IO-APIC-edge    libata
 15:          0          0    IO-APIC-edge    libata
NMI:          0          0
LOC:    4617310    4617213
ERR:          0
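
Several boot options have come up in this thread (noapic, nolapic, lapic,
apic=debug, acpi_use_timer_override), and it is easy to lose track of which
ones a given boot actually used. A minimal Python sketch for checking, under
the assumption that the running kernel exposes its command line in
/proc/cmdline; the function names are made up for illustration and the
example command line is invented.

```python
# Options named in this thread; "apic" takes a value (e.g. apic=debug).
APIC_FLAGS = ("noapic", "nolapic", "lapic", "acpi_use_timer_override")


def apic_options(cmdline):
    """Return the APIC-related options found in a kernel command line."""
    found = {}
    for word in cmdline.split():
        key, _, value = word.partition("=")
        if key in APIC_FLAGS or key == "apic":
            # Bare flags map to True; key=value options keep their value.
            found[key] = value if value else True
    return found


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Invented example command line, for illustration only.
example = "ro root=/dev/sda1 nolapic apic=debug quiet"
options = apic_options(example)
print(options)
```

On a live system, apic_options(read_cmdline()) would report the options
for the current boot.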

Andrew Morton

Sep 15, 2007, 3:40:11 AM
On 06 Sep 2007 13:31:50 +0200 Andi Kleen <an...@firstfloor.org> wrote:

> Chuck Ebbert <ceb...@redhat.com> writes:
>
> > Some systems lock up without the noapic option.
>
> Please find patterns: cpu type, chipsets, mainboard vendors etc.

There are 48 bugs in bugzilla which mention "noapic"

http://bugzilla.kernel.org/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=&long_desc_type=substring&long_desc=noapic&kernel_version_type=allwordssubstr&kernel_version=&bug_status=NEW&bug_status=REOPENED&bug_status=ASSIGNED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&chfieldfrom=&chfieldto=Now&chfieldvalue=&regression=both&cmdtype=doit&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0=

And there are 173,000 on the internet ;)
http://www.google.com/search?hl=en&q=linux+noapic&btnG=Google+Search

We screwed this pooch a long time ago - years. Perhaps if some of the many
noapic users could run a bisection search to work out when it broke we
could start fixing things. But they all have a workaround so there's no
motivation.
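
For readers unfamiliar with the bisection search mentioned above, the idea
can be sketched in a few lines of Python. This is an illustration only: it
assumes an ordered list of kernels where the oldest boots fine without
noapic, the newest does not, and the behaviour changed exactly once. The
version labels and the needs_noapic predicate are placeholders; in real use
the predicate is a person booting a kernel, and git bisect automates the
bookkeeping over commits.

```python
def first_bad(versions, is_bad):
    """Return the earliest entry of `versions` for which is_bad() is true.

    Assumes versions[0] is good, versions[-1] is bad, and everything
    after the first bad entry is also bad.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid   # breakage is at mid or earlier
        else:
            good = mid  # breakage is after mid
    return versions[bad]


# Placeholder labels, not real kernel releases.
versions = ["v%d" % i for i in range(16)]
tested = []


def needs_noapic(version):
    """Stand-in for 'boot this kernel and see whether it hangs'."""
    tested.append(version)
    return int(version[1:]) >= 11  # pretend the breakage arrived in v11


culprit = first_bad(versions, needs_noapic)
print(culprit, "found after", len(tested), "test boots")
```

The point of the exercise is the cost: 16 candidates need only 4 test
boots here, and the number of boots grows with the logarithm of the number
of candidates.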

Andrew Morton

Sep 15, 2007, 7:10:10 AM
On Sat, 15 Sep 2007 12:58:27 +0200 Ingo Oeser <ioe-...@rameria.de> wrote:

> On Saturday 15 September 2007, Andrew Morton wrote:
> > There are 48 bugs in bugzilla which mention "noapic"
> >
> > http://bugzilla.kernel.org/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=&long_desc_type=substring&long_desc=noapic&kernel_version_type=allwordssubstr&kernel_version=&bug_status=NEW&bug_status=REOPENED&bug_status=ASSIGNED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&chfieldfrom=&chfieldto=Now&chfieldvalue=&regression=both&cmdtype=doit&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0=
> >
> > And there are 173,000 on the internet ;)
> > http://www.google.com/search?hl=en&q=linux+noapic&btnG=Google+Search
> >
> > We screwed this pooch a long time ago - years. Perhaps if some of the many
> > noapic users could run a bisection search to work out when it broke we
> > could start fixing things. But they all have a workaround so there's no
> > motivation.
>

> I have 2 SMP-Boards and both need noapic. One is from 2001 (ASUS CUR-DLS),
> one is from June 2006 (Gigabyte M57SLI-S4).
>
> There are many reasons:
>
> 1. Bugs which have such a simple workaround don't get much attention.
>
> 2. Usually SMP boards are used for machines, which just HAVE to work,
> since they have been expensive. These are not consumer boards.
>
> 3. I usually had only USB problems (no IRQ), if omitting noapic.
> USB technology is a consumer grade technology and enterprise
> grade developers don't have much interest in it (until now?).
>
> 4. IRQ routing setup is often a BIOS issue. You might be able
> to fix that by upgrading your BIOS. That often needs a Windows
> tool. Linux people do not always (want to) have access to Windows :-)
>
> I reported all the problems (starting 2001), no developer
> seemed interested.
>
> I can report them against the latest RC6 kernel tomorrow and put them
> into bugzilla, if we now REALLY care.
>

I believe that about two years ago we broke something which caused quite a
large number of people to need noapic. Is that the case with any of your
machines? Do you know if they run 2.6.ancient without noapic?

Thanks.

Ingo Oeser

Sep 15, 2007, 7:10:11 AM
On Saturday 15 September 2007, Andrew Morton wrote:

I have 2 SMP-Boards and both need noapic. One is from 2001 (ASUS CUR-DLS),
one is from June 2006 (Gigabyte M57SLI-S4).

There are many reasons:

1. Bugs which have such a simple workaround don't get much attention.

2. Usually SMP boards are used for machines, which just HAVE to work,
since they have been expensive. These are not consumer boards.

3. I usually had only USB problems (no IRQ), if omitting noapic.
USB technology is a consumer grade technology and enterprise
grade developers don't have much interest in it (until now?).

4. IRQ routing setup is often a BIOS issue. You might be able
to fix that by upgrading your BIOS. That often needs a Windows
tool. Linux people do not always (want to) have access to Windows :-)

I reported all the problems (starting 2001), no developer
seemed interested.

I can report them against the latest RC6 kernel tomorrow and put them
into bugzilla, if we now REALLY care.


Best Regards

Ingo Oeser

Matthew Garrett

Sep 15, 2007, 8:20:07 AM
On Sat, Sep 15, 2007 at 04:08:02AM -0700, Andrew Morton wrote:

> I believe that about two years ago we broke something which caused quite a
> large number of people to need noapic. Is that the case with any of your
> machines? Do you know if they run 2.6.ancient without noapic?

My recollection is that we shifted from "Enable the apic even if the
BIOS disabled it" to "Only use the apic if the BIOS didn't disable it"
around that time, which meant that distributions could actually turn on
apic-on-up support without breaking everything. That might correspond to
what you're seeing.

--
Matthew Garrett | mj...@srcf.ucam.org

Rafael J. Wysocki

Sep 15, 2007, 2:40:09 PM
On Saturday, 15 September 2007 09:39, Andrew Morton wrote:
> On 06 Sep 2007 13:31:50 +0200 Andi Kleen <an...@firstfloor.org> wrote:
>
> > Chuck Ebbert <ceb...@redhat.com> writes:
> >
> > > Some systems lock up without the noapic option.
> >
> > Please find patterns: cpu type, chipsets, mainboard vendors etc.
>
> There are 48 bugs in bugzilla which mention "noapic"
>
> http://bugzilla.kernel.org/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=&long_desc_type=substring&long_desc=noapic&kernel_version_type=allwordssubstr&kernel_version=&bug_status=NEW&bug_status=REOPENED&bug_status=ASSIGNED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&chfieldfrom=&chfieldto=Now&chfieldvalue=&regression=both&cmdtype=doit&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0=
>
> And there are 173,000 on the internet ;)
> http://www.google.com/search?hl=en&q=linux+noapic&btnG=Google+Search
>
> We screwed this pooch a long time ago - years. Perhaps if some of the many
> noapic users could run a bisection search to work out when it broke we
> could start fixing things. But they all have a workaround so there's no
> motivation.

Well, I think it broke soon after 2.6.9.

Please see http://bugzilla.kernel.org/show_bug.cgi?id=3639#c10

Dave Jones

Sep 24, 2007, 5:40:12 PM
On Sat, Sep 15, 2007 at 01:08:25PM +0100, Matthew Garrett wrote:
> On Sat, Sep 15, 2007 at 04:08:02AM -0700, Andrew Morton wrote:
>
> > I believe that about two years ago we broke something which caused quite a
> > large number of people to need noapic. Is that the case with any of your
> > machines? Do you know if they run 2.6.ancient without noapic?
>
> My recollection is that we shifted from "Enable the apic even if the
> BIOS disabled it" to "Only use the apic if the BIOS didn't disable it"
> around that time, which meant that distributions could actually turn on
> apic-on-up support without breaking everything. That might correspond to
> what you're seeing.

If memory serves correctly, that was circa 2.6.10, back in these commits:

commit a068ea13d1db406e15c346e93530343f6e70184c
Author: Len Brown <len....@intel.com>
Date:   Sun Oct 10 05:21:08 2004 -0400

    [ACPI] If BIOS disabled the LAPIC, believe it by default.
    "lapic" is available to force enabling the LAPIC
    in the event you know more than your BIOS vendor.
    http://bugzilla.kernel.org/show_bug.cgi?id=3238

commit 2fcfece90db9643b6f30a7ad343898a2871e6a81
Author: Len Brown <len....@intel.com>
Date:   Sat Oct 9 20:12:45 2004 -0400

    [ACPI] Don't enable LAPIC when the BIOS disabled it.
    Doing so apparently breaks every Dell on Earth.
    http://bugzilla.kernel.org/show_bug.cgi?id=3238


But those changes relate to the local APIC, which 'noapic' shouldn't
have any effect on, should it?

Dave

--
http://www.codemonkey.org.uk

Thomas Gleixner

Sep 25, 2007, 5:10:18 AM
Chuck,

On Thu, 2007-09-13 at 12:38 -0400, Chuck Ebbert wrote:
> On 09/10/2007 03:44 PM, Andi Kleen wrote:
> >> Yes, it has an hpet. And I tried every combination of options I could
> >> think of.
> >
> >> But, even stranger, x86_64 works (only i386 fails.)
> >
> > x86-64 has quite different time code (at least until the dyntick patches
> > currently in mm)
> >
> > Obvious thing would be to diff the boot messages and see if anything
> > jumps out (e.g. in interrupt routing).
> >
> > Or check with mm and if x86-64 is broken there too then it's likely
> > the new time code.
>
> I reported too soon that x86_64 works. It does not work, it just takes
> a bit longer before it freezes. There are message threads all over the
> place discussing this problem with the HP Pavilion tx 1000, and it seems
> the best workaround is to use the "nolapic" option instead of "noapic".
> Using that, it is totally stable _and_ there are no spurious interrupts
> that would otherwise break USB. Interrupt setup is a bit strange, though:

can you please send me 32 and 64 bit boot logs of mainline and fedora
kernels ?

tglx

Phillip Susi

Sep 27, 2007, 6:10:13 PM
Dave Jones wrote:
> If memory serves correctly, that was circa 2.6.10, back in these commits..
>
> commit a068ea13d1db406e15c346e93530343f6e70184c
> Author: Len Brown <len....@intel.com>
> Date: Sun Oct 10 05:21:08 2004 -0400
>
> [ACPI] If BIOS disabled the LAPIC, believe it by default.
> "lapic" is available to force enabling the LAPIC
> in the event you know more than your BIOS vendor.
> http://bugzilla.kernel.org/show_bug.cgi?id=3238
>
> commit 2fcfece90db9643b6f30a7ad343898a2871e6a81
> Author: Len Brown <len....@intel.com>
> Date: Sat Oct 9 20:12:45 2004 -0400
>
> [ACPI] Don't enable LAPIC when the BIOS disabled it.
> Doing so apparently breaks every Dell on Earth.
> http://bugzilla.kernel.org/show_bug.cgi?id=3238
>
>
> But those changes relate to the local APIC, which 'noapic' shouldn't
> have any effect on should it ?

If the LAPIC is disabled, then you CAN'T use the IO-APIC, right? So then
wouldn't the noapic option have no effect, since the APIC is already
disabled?
