Re: DMA cache consistency bug introduced in 2.6.28

Krzysztof Halasa

Dec 17, 2009, 1:30:03 PM

Linus Torvalds <torv...@linux-foundation.org> writes:

> On x86, where all caches are supposed to be totally coherent (except for
> I$ under very special circumstances),

BTW SWIOTLB is a non-coherent "cache" in some sense, though I'd be
surprised if it's related. Anyway mentioning $CPU and $RAM at the very
least would be a good idea in such cases.
--
Krzysztof Halasa

Krzysztof Halasa

Dec 18, 2009, 10:10:02 AM

Mark Hounschell <dma...@cfl.rr.com> writes:

> harley:/usr/src/linux-2.6-allstable # git bisect good
> Bisecting: 2443 revisions left to test after this (roughly 11 steps)
> [db563fc2e80534f98c7f9121a6f7dfe41f177a79] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
>
> This one doesn't build:
>
> CC [M] fs/ext3/super.o
> fs/ext3/super.c: In function ‘ext3_quota_on’:
> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
> fs/ext3/super.c:2839: error: for each function it appears in.)
> make[2]: *** [fs/ext3/super.o] Error 1
> make[1]: *** [fs/ext3] Error 2
> make: *** [fs] Error 2
>
> I haven't yet determined that I can but, if I were to make a modification to the
> tree now to fix this would that screw up the bisect process?

It won't, in such cases.
But you can also run git reset --hard another_commit_id (while bisecting)
to test a different commit instead, if that avoids the problem (e.g. a
later commit that already has the fix).

And you can skip uninteresting parts of the tree when starting git
bisect (though if the cause is in skipped parts, the results will be
meaningless).

Mark Hounschell

Dec 18, 2009, 10:30:01 AM

On 12/18/2009 10:22 AM, Linus Torvalds wrote:

>
>
> On Fri, 18 Dec 2009, Mark Hounschell wrote:
>>
>> This one doesn't build:
>>
>> CC [M] fs/ext3/super.o
>> fs/ext3/super.c: In function ‘ext3_quota_on’:
>> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
>> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
>> fs/ext3/super.c:2839: error: for each function it appears in.)
>> make[2]: *** [fs/ext3/super.o] Error 1
>> make[1]: *** [fs/ext3] Error 2
>> make: *** [fs] Error 2
>>
>> I haven't yet determined that I can but, if I were to make a modification to the
>> tree now to fix this would that screw up the bisect process?
>

> You can safely fix unrelated problems without screwing up the bisection.
> And in this case you can be pretty sure that this is unrelated, so it's
> all ok.
>
> The fix for that silly problem is
>
> - path_put(&nd.path);
> + path_put(&path);
>
> (it's due to a silent merge failure - it merged cleanly, but semantics had
> changed in a branch and impacted code that was newly introduced in another
> branch).

Yep, thanks. I'm past that now. But haven't done a bisect [good|bad] on the
results of that one yet. Did you see Alain's email response to my bisect
progress report to him?

I'm still at a loss as to how to proceed?

Mark

Linus Torvalds

Dec 18, 2009, 10:30:02 AM

On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> This one doesn't build:
>
> CC [M] fs/ext3/super.o
> fs/ext3/super.c: In function ‘ext3_quota_on’:
> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
> fs/ext3/super.c:2839: error: for each function it appears in.)
> make[2]: *** [fs/ext3/super.o] Error 1
> make[1]: *** [fs/ext3] Error 2
> make: *** [fs] Error 2
>
> I haven't yet determined that I can but, if I were to make a modification to the
> tree now to fix this would that screw up the bisect process?

You can safely fix unrelated problems without screwing up the bisection.
And in this case you can be pretty sure that this is unrelated, so it's
all ok.

The fix for that silly problem is

- path_put(&nd.path);
+ path_put(&path);

(it's due to a silent merge failure - it merged cleanly, but semantics had
changed in a branch and impacted code that was newly introduced in another
branch).

Linus

Linus Torvalds

Dec 18, 2009, 10:50:04 AM

On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> Yep, thanks. I'm past that now. But haven't done a bisect [good|bad] on the
> results of that one yet. Did you see Alain's email response to my bisect
> progress report to him?
>
> I'm still at a loss as to how to proceed?

Ahh, the HPET issue.

That one is actually very interesting information, because we've had
problems with HPET before. But what I would suggest is to try to continue
to bisect with HPET enabled (to see the problem), and the commit that you
couldn't even boot with HPET enabled you should not count as good or bad
because you just don't know.

You can do "git bisect skip" to make git know that some particular commit
is not a commit you can test, and you can also move away from a whole
problematic region to another area by doing

git bisect visualize

to bring up a graphical gitk view of what all you have left to bisect,
pick a good point (still _reasonably_ close to the middle) there, and do

git reset --hard <the-point-you-want-to-test>

and try that kernel instead of the one git bisect suggested.
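
To make the skip semantics concrete, here is a minimal sketch in C of bisection over a linear history in which some commits cannot be tested. It is a simplified model, not git's actual algorithm: the integer commit numbering, the bisect() helper, and the two demo predicates are all hypothetical. When the midpoint is untestable it tries neighbouring commits instead, and if only untestable commits remain it reports a range of candidates rather than a single commit.

```c
#include <assert.h>

/* Verdict for one commit: GOOD, BAD, or SKIP (cannot be tested). */
enum verdict { GOOD, BAD, SKIP };
typedef enum verdict (*test_fn)(int commit);

/* Inclusive range of commits that may be the first bad one. */
struct culprit { int first, last; };

/*
 * Hypothetical model of bisection over a linear history numbered
 * good..bad, where `good` is known good and `bad` is known bad.
 */
static struct culprit bisect(int good, int bad, test_fn test)
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        int pick = -1;
        enum verdict v = SKIP;

        /* Try the midpoint first, then walk outward past skipped commits. */
        for (int d = 0; d < bad - good && pick < 0; ++d) {
            int up = mid + d, down = mid - d;
            if (up < bad && (v = test(up)) != SKIP)
                pick = up;
            else if (d > 0 && down > good && (v = test(down)) != SKIP)
                pick = down;
        }
        if (pick < 0)
            break;              /* only untestable commits are left */
        if (v == GOOD)
            good = pick;
        else
            bad = pick;
    }
    return (struct culprit){ good + 1, bad };
}

/* Demo history: commits 0..6 are good, 7..10 bad, and 5 cannot be built. */
static enum verdict demo_one_skip(int c)
{
    if (c == 5)
        return SKIP;
    return c < 7 ? GOOD : BAD;
}

/* Demo history where the untestable commits 6 and 7 hide the transition. */
static enum verdict demo_skip_region(int c)
{
    if (c == 6 || c == 7)
        return SKIP;
    return c < 6 ? GOOD : BAD;
}
```

With demo_one_skip the search still narrows to a single commit; with demo_skip_region the best it can do is a range, which is why a skipped region around the real culprit makes the result less precise.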

But this floppy DMA inconsistency being somehow HPET-related is
interesting in itself. One thing that HPET obviously does is change how
we read the time - and what that can cause (totally indirectly) is that
now we don't touch the southbridge with IO accesses nearly as much,
because we no longer go to the old 8253 PIT, which touches the same legacy
chip support that implements the floppy controller itself.

So it's entirely possible that the reason a non-HPET setup doesn't show
this is that the accesses to the i8253 PIT part will "synchronize" the old
floppy controller too, and hide some issue.

But still, I assume you had HPET enabled in 2.6.27, so it would be
interesting to see exactly when the problem starts.

Linus

Mark Hounschell

Dec 18, 2009, 3:10:01 PM

It looks like I may have to back up and first find the points that, let me,
and stop me, booting with the HPET enabled. Before I change direction, can
the git-bisect start sequence use the SHA1 id for the starting 'goods' and
'bads'? I don't see reference to that in the doc.

Thanks
Mark

Linus Torvalds

Dec 18, 2009, 3:30:01 PM

On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> It looks like I may have to back up and first find the points that, let me,
> and stop me, booting with the HPET enabled. Before I change direction, can
> the git-bisect start sequence use the SHA1 id for the starting 'goods' and
> 'bads'? I don't see reference to that in the doc.

You can always use a SHA1 id instead of a tag. So when you did

git bisect good v2.6.17.4

you could always have replaced that "v2.6.17.4" with the SHA1 of the
commit.

In git, the SHA1 ID's are the "real" names - the tags and branch names are
purely for human-readable decoration. Git always turns them into SHA1 id's
internally.

Linus

Mark Hounschell

Dec 22, 2009, 10:20:02 AM

On 12/18/2009 03:15 PM, Linus Torvalds wrote:
>
>
> On Fri, 18 Dec 2009, Mark Hounschell wrote:
>>
>> It looks like I may have to back up and first find the points that, let me,
>> and stop me, booting with the HPET enabled. Before I change direction, can
>> the git-bisect start sequence use the SHA1 id for the starting 'goods' and
>> 'bads'? I don't see reference to that in the doc.
>
> You can always use a SHA1 id instead of a tag. So when you did
>
> git bisect good v2.6.17.4
>
> you could always have replaced that "v2.6.17.4" with the SHA1 of the
> commit.
>
> In git, the SHA1 ID's are the "real" names - the tags and branch names are
> purely for human-readable decoration. Git always turns them into SHA1 id's
> internally.
>
> Linus
>

Ok, I may have something that might help.

# git bisect bad
26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
Author: venkatesh...@intel.com <venkatesh...@intel.com>
Date: Fri Sep 5 18:02:18 2008 -0700

x86: HPET_MSI Initialise per-cpu HPET timers

Initialize a per CPU HPET MSI timer when possible. We retain the HPET
timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
setup the remaining HPET timers as per CPU MSI based timers. This per CPU
timer will eliminate the need for timer broadcasting with IRQ 0 when there
is non-functional LAPIC timer across CPU deep C-states.

If there are more CPUs than number of available timers, CPUs that do not
find any timer to use will continue using LAPIC and IRQ 0 broadcast.

Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>
Signed-off-by: Shaohua Li <shaoh...@intel.com>
Signed-off-by: Ingo Molnar <mi...@elte.hu>

:040000 040000 b0a11fa0abdc591427e78236a1f25f26b824140e f2e9b13cf9e2eb7e0fc101660b1e1d499033d78f M arch

And of course this was the first commit that I could not boot if I had hpet
enabled. To get this one to boot (single user mode only) I had to add the
quiet cmdline option and apply the following patch to arch/x86/kernel/hpet.c,
taken from commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a:

@@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
{

if (request_irq(dev->irq, hpet_interrupt_handler,
- IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
+ IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
return -1;

disable_irq(dev->irq);

AND add the quiet cmdline option.

Also, of all the machines it does work on with hpets enabled, I don't see
the HPET2 in /proc/interrupts as below.

cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:         82          0          3          0   IO-APIC-edge     timer
   1:          0          0       1712          6   IO-APIC-edge     i8042
   3:          0          0          6          0   IO-APIC-edge
   4:          0          0          6          0   IO-APIC-edge
   6:          0          0          4          0   IO-APIC-edge     floppy
   8:          0          0         60          0   IO-APIC-edge     rtc0
   9:          0          0          0          0   IO-APIC-fasteoi  acpi
  12:          0          0      37798        179   IO-APIC-edge     i8042
  14:          0          0      16462         71   IO-APIC-edge     pata_atiixp
  15:          0          0       5713         17   IO-APIC-edge     pata_atiixp
  16:          0          0        904          2   IO-APIC-fasteoi  aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel, ni-pci-gpib
  17:          0          0          2          0   IO-APIC-fasteoi  ehci_hcd:usb1, parport0, ni-pci-gpib
  18:          0          0      49940         90   IO-APIC-fasteoi  ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, nvidia
  19:          0          0        703          2   IO-APIC-fasteoi  aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
  22:          0          0       1303         15   IO-APIC-fasteoi  ahci
  24:     261763          0          0          0   HPET_MSI-edge    hpet2
  29:          0          0        220          5   PCI-MSI-edge     sky2@pci:0000:04:00.0
 NMI:          0          0          0          0   Non-maskable interrupts
 LOC:        138     271356     264446     261050   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:          0          0          0          0   Performance monitoring interrupts
 PND:          0          0          0          0   Performance pending work
 RES:       4511       9275       8470       8086   Rescheduling interrupts
 CAL:       3624       8666        523       4543   Function call interrupts
 TLB:        981       1111       1065       1058   TLB shootdowns
 ERR:          0
 MIS:          0

Regards
Mark

Linus Torvalds

Dec 22, 2009, 12:40:02 PM

[ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
details, but Mark is basically chasing down a situation where the floppy
driver seems to have trouble formatting floppies, and it happened
between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
memory block transfers the wrong value for the first byte of the block.

Which should be impossible, but whatever. Some part of the system has a
cached buffer that isn't flushed.

What gets _you_ guys involved is that Mark cannot reproduce the bug if
HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
pure luck while bisecting, because some time during his bisect, his
machine wouldn't even boot with HPET.

So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
2.6.28 (and current -git) does not. Any ideas? ]

On Tue, 22 Dec 2009, Mark Hounschell wrote:
>
> Ok, I may have something that might help.
>
> # git bisect bad
> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
> Author: venkatesh...@intel.com <venkatesh...@intel.com>
> Date: Fri Sep 5 18:02:18 2008 -0700
>
> x86: HPET_MSI Initialise per-cpu HPET timers
>
> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
> timer will eliminate the need for timer broadcasting with IRQ 0 when there
> is non-functional LAPIC timer across CPU deep C-states.
>
> If there are more CPUs than number of available timers, CPUs that do not
> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>
> Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>
> Signed-off-by: Shaohua Li <shaoh...@intel.com>
> Signed-off-by: Ingo Molnar <mi...@elte.hu>
>

> And of course this was the first commit that I could not boot if I had hpet
> enabled. To get this one to boot (single user mode only) I had to add the
> quiet cmdline option and apply the following patch to arch/x86/kernel/hpet.c,
> taken from commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a:
>
> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
> {
>
> if (request_irq(dev->irq, hpet_interrupt_handler,
> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
> return -1;
>
> disable_irq(dev->irq);
>
> AND add the quiet cmdline option.

Ok, so we know why HPET didn't boot for you, and that was fixed later (by
that 5ceb1a04). But is this also when the floppy started mis-behaving?

IOW, _if_ you boot with that fix from commit 5ceb1a04 (and the quiet
option - I wonder what that is about: do you have any ideas?), is the
per-CPU HPET timer commit also the commit that causes floppy problems, or
is this purely a "bisect when HPET became a boot-up problem"?

Linus


Mark Hounschell

Dec 22, 2009, 1:00:03 PM

Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
working and also when I could no longer boot with hpet enabled. Commit
5ceb1a04 is where I found I could boot again with the hpet enabled. It was a
simple patch, so I backported it to where I was in order to be able to boot
with hpet on. I did 2 different bisects: the first to find out when I could
boot again with hpet on, then the second to find which commit caused the
floppy problem, using the patch from the first bisect (5ceb1a04) while doing
the second.

> IOW, _if_ you boot with that fix from commit 5ceb1a04 (and the quiet
> option - I wonder what that is about: do you have any ideas?), is the
> per-CPU HPET timer commit also the commit that causes floppy problems, or
> is this purely a "bisect when HPET became a boot-up problem"?
>

The quiet option was only needed because with that 5ceb1a04 commit applied
to the kernels I was interested in, kernel messages of some kind went on
for hours and I could not get a login prompt. They went by too fast to read
and I didn't have a serial console available to capture them.
They must not have been too important or critical, because the machine acted
as normal as any machine in single user mode.

But once I got to a single user login prompt it was for sure the same
floppy problem.

Regards
Mark

Pallipadi, Venkatesh

Dec 22, 2009, 6:40:02 PM

I am missing something here. Is commit 26afe5f2 where the system does not
boot with HPET, or is it where the floppy stops working when you boot
with HPET enabled?

Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
output in each case. With that option, we should be using local APIC
timer and PIT, HPET or HPET with MSI should not really matter. Does it
still fail with .28 with that option?

Thanks,
Venki

Mark Hounschell

Dec 22, 2009, 7:30:01 PM

As it happens, both happen there. Commit 5ceb1a04 is where it starts
booting _again_ with hpet enabled. So I took that patch (5ceb1a04) and
applied it to (26afe5f2f) to be able to boot with hpet enabled. I had to
use the quiet option to get to a login prompt, but there is where the
floppy format first fails, just as it does in 2.6.28 and up.

> Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
> output in each case. With that option, we should be using local APIC
> timer and PIT, HPET or HPET with MSI should not really matter. Does it
> still fail with .28 with that option?
>

Yes, I will try that for you but will have to wait until the morning. Sorry.

Regards
Mark

Mark Hounschell

Dec 23, 2009, 8:10:03 AM

2.6.28 still fails with that option.

2.6.27.41 /proc/interrupts with idle=halt

            CPU0       CPU1       CPU2       CPU3
   0:        126          0          0          1   IO-APIC-edge     timer
   1:          0          0          1        157   IO-APIC-edge     i8042
   3:          0          0          0          6   IO-APIC-edge
   4:          0          0          0          6   IO-APIC-edge
   6:          0          0          0          4   IO-APIC-edge     floppy
   8:          0          0          0          1   IO-APIC-edge     rtc0
   9:          0          0          0          0   IO-APIC-fasteoi  acpi
  12:          0          0          1        128   IO-APIC-edge     i8042
  14:          0          0         34       4457   IO-APIC-edge     pata_atiixp
  15:          0          0          4        480   IO-APIC-edge     pata_atiixp
  16:          0          0          0        397   IO-APIC-fasteoi  aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel
  17:          0          0          0          2   IO-APIC-fasteoi  ehci_hcd:usb1
  18:          0          0          0          0   IO-APIC-fasteoi  ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
  19:          0          0          0        142   IO-APIC-fasteoi  aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
  22:          0          0          4       1154   IO-APIC-fasteoi  ahci
 219:          0          0          3         63   PCI-MSI-edge     eth0
 NMI:          0          0          0          0   Non-maskable interrupts
 LOC:      91539      91964      92525      91181   Local timer interrupts
 RES:       2888       3873       2434       2721   Rescheduling interrupts
 CAL:        240        245        247         84   function call interrupts
 TLB:        768        628        526        512   TLB shootdowns
 SPU:          0          0          0          0   Spurious interrupts
 ERR:          0
 MIS:          0

2.6.28 /proc/interrupts with idle=halt

            CPU0       CPU1       CPU2       CPU3
   0:        126          0          2          0   IO-APIC-edge     timer
   1:          0          0        192          0   IO-APIC-edge     i8042
   3:          0          0          6          0   IO-APIC-edge
   4:          0          0          6          0   IO-APIC-edge
   6:          0          0          4          0   IO-APIC-edge     floppy
   8:          0          0          1          0   IO-APIC-edge     rtc0
   9:          0          0          0          0   IO-APIC-fasteoi  acpi
  12:          0          0        128          1   IO-APIC-edge     i8042
  14:          0          1     147114        396   IO-APIC-edge     pata_atiixp
  15:          0          0        646          2   IO-APIC-edge     pata_atiixp
  16:          0          0        396          0   IO-APIC-fasteoi  aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel
  17:          0          0          0          0   IO-APIC-fasteoi  ehci_hcd:usb1
  18:          0          0          0          0   IO-APIC-fasteoi  ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
  19:          0          0        362          1   IO-APIC-fasteoi  aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
  22:          0          0        874          1   IO-APIC-fasteoi  ahci
1274:          0          0        193          4   PCI-MSI-edge     eth0
1279:     513207          0          0          0   HPET_MSI-edge    hpet2
 NMI:          0          0          0          0   Non-maskable interrupts
 LOC:        268     513395     513138     522088   Local timer interrupts
 RES:       3262       3679       2573       3746   Rescheduling interrupts
 CAL:        131        166         57        147   Function call interrupts
 TLB:        680        438        450        639   TLB shootdowns
 SPU:          0          0          0          0   Spurious interrupts
 ERR:          0
 MIS:          0

Pallipadi, Venkatesh

Dec 23, 2009, 10:10:02 AM

Hmm. Looks like hpet2 is still getting used instead of local APIC timer in .28 case.

I was expecting some low number in hpet2 and local timer on all CPU to be around the same value. Above shows CPU 0 is depending on hpet2 for some reason even with idle=halt. Can you send the output of below two in case of .28
/proc/timer_list
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*

Thanks,
Venki
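
The check described above (local timer counts roughly equal on every CPU) can be automated. The following is a rough sketch in C, not an existing tool: parse_counts() and starved_cpu() are hypothetical helpers, and the 1% threshold is an arbitrary assumption chosen only for illustration. It pulls the per-CPU counters out of one /proc/interrupts line and flags a CPU whose count is far below the others.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the per-CPU counters from one /proc/interrupts line such as
 * "LOC: 268 513395 ... Local timer interrupts".
 * Returns the number of counters stored in out[] (at most max).
 */
static int parse_counts(const char *line, unsigned long *out, int max)
{
    const char *p;
    int n = 0;

    assert(line && out);
    p = strchr(line, ':');
    if (!p)
        return 0;
    ++p;
    while (n < max) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);

        if (end == p)           /* no more digits: reached the description */
            break;
        out[n++] = v;
        p = end;
    }
    return n;
}

/*
 * Hypothetical heuristic: return the index of the first CPU whose count is
 * below 1% of the largest count, or -1 if the counts are roughly balanced.
 */
static int starved_cpu(const unsigned long *counts, int n)
{
    unsigned long max = 0;

    for (int i = 0; i < n; ++i)
        if (counts[i] > max)
            max = counts[i];
    for (int i = 0; i < n; ++i)
        if (counts[i] < max / 100)
            return i;
    return -1;
}
```

Fed the 2.6.28 LOC line from this thread, the heuristic singles out CPU0; fed the 2.6.27.41 line, it reports the counts as balanced.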

Mark Hounschell

Dec 23, 2009, 10:40:03 AM

Attached.

> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*

I have no /sys/devices/system/cpu/cpu0/cpuidle on this machine.
Maybe because of

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# CONFIG_CPU_IDLE is not set

Would it be OK if when you ask for 2.6.28 info, I use a 2.6.32.2 kernel?
That kernel also fails fdformat with hpet enabled on these machines.

Thanks
Mark

timer_list.txt

Mark Hounschell

Dec 23, 2009, 11:00:01 AM

I do have this on 2.6.32.2 though.

# grep . /sys/devices/system/cpu/cpuidle/current_*
/sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:ladder

Want me to go back to 2.6.28 and show this?

Mark

Linus Torvalds

Dec 23, 2009, 11:40:02 AM

On Wed, 23 Dec 2009, Mark Hounschell wrote:
> >
> > Hmm. Looks like hpet2 is still getting used instead of local APIC
> > timer in .28 case.
> >
> > I was expecting some low number in hpet2 and local timer on all CPU to
> > be around the same value. Above shows CPU 0 is depending on hpet2 for
> > some reason even with idle=halt. Can you send the output of below two
> > in case of .28 /proc/timer_list
>
> Attached.

Oh wow.

That's crazy:

Tick Device: mode: 1
Per CPU device: 0
Clock Event Device: hpet2
max_delta_ns: 2147483647
min_delta_ns: 5000
mult: 61510047
shift: 32
mode: 3
next_event: 123991000000 nsecs
set_next_event: hpet_msi_next_event
set_mode: hpet_msi_set_mode
event_handler: hrtimer_interrupt

Tick Device: mode: 1
Per CPU device: 1
Clock Event Device: lapic
max_delta_ns: 670831998
min_delta_ns: 1199
mult: 53707624
shift: 32
mode: 3
next_event: 123991125000 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt

...

It's not using the lapic for CPU0.

Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
expensive to reprogram (compared to the local apic). And having different
timers for different CPU's is just odd.

The fact that the timer subsystem can do this and it all (mostly) works at
all is nice and impressive, but doesn't make it any less crazy ;)

That said, none of this seems to explain why DMA/fdformat doesn't work.

Linus
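
For readers decoding the timer_list excerpt: the mult and shift fields are the scaled-math factors a clock event device uses to turn a nanosecond delta into device ticks. As far as I understand the clockevents code, the conversion is ticks = (ns * mult) >> shift. The sketch below is a userspace illustration under that assumption; ns_to_ticks() and implied_hz() are hypothetical names, not kernel functions. Plugging in the values above gives roughly 14.3 MHz for hpet2 and roughly 12.5 MHz for the lapic timer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scaled-math conversion: ticks = (ns * mult) >> shift.
 * Note: the 64-bit product can overflow for very large ns values;
 * this sketch does not guard against that.
 */
static uint64_t ns_to_ticks(uint64_t ns, uint32_t mult, uint32_t shift)
{
    assert(shift < 64);
    return (ns * mult) >> shift;
}

/* Device frequency in Hz implied by mult/shift: ticks in one second. */
static uint64_t implied_hz(uint32_t mult, uint32_t shift)
{
    return ns_to_ticks(1000000000ULL, mult, shift);
}
```

For example, implied_hz(61510047, 32) corresponds to the hpet2 entry and implied_hz(53707624, 32) to the lapic entry.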

Andi Kleen

Dec 23, 2009, 11:40:01 AM

Linus Torvalds <torv...@linux-foundation.org> writes:

> It's not using the lapic for CPU0.
>
> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> expensive to reprogram (compared to the local apic). And having different
> timers for different CPU's is just odd.
>
> The fact that the timer subsystem can do this and it all (mostly) works at
> all is nice and impressive, but doesn't make it any less crazy ;)

I suspect it's a system where the APIC timer stops in deeper idle
states and it supports them. In this case CPU #0 does timer broadcasts
when needed to wake the other CPUs up from deep C, but for that it has
to run with HPET. At least the other ones can still enjoy the LAPIC
timer.

This might suggest that Mark's floppy controller doesn't like
deep C? Mark, did you try booting with processor.max_cstate=1
and HPET enabled?

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Linus Torvalds

Dec 23, 2009, 12:00:01 PM

On Wed, 23 Dec 2009, Andi Kleen wrote:
>
> I suspect it's a system where the APIC timer stops in deeper idle
> states and it supports them. In this case CPU #0 does timer broadcasts
> when needed to wake the other CPUs up from deep C, but for that it has
> to run with HPET. At least the other ones can still enjoy the LAPIC
> timer.

Ahh, ok, that makes sense. I was assuming the broadcast timer would act in
that capacity, but..

> This might suggest that Mark's floppy controller doesn't like
> deep C? Mark, did you try booting with processor.max_cstate=1
> and HPET enabled?

We have indeed had historical issues with floppy and sleep states before.

I do note another issue, though - the floppy driver itself seems totally
broken when it comes to using interleaved sectors. Alain, that "place
logical sectors" code is simply _broken_ - the "while" kicks in only if
the first sector we test is busy _and_ we were at the last sector so that
we increment past F_SECT_PER_TRACK.

So shouldn't that sector layout be something like the appended?

Linus
---
 drivers/block/floppy.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 3266b4f..9c9148c 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -2237,13 +2237,10 @@ static void setup_format_params(int track)
         for (count = 1; count <= F_SECT_PER_TRACK; ++count) {
                 here[n].sect = count;
                 n = (n + il) % F_SECT_PER_TRACK;
-                if (here[n].sect) {     /* sector busy, find next free sector */
+                while (here[n].sect) {  /* sector busy, find next free sector */
                         ++n;
-                        if (n >= F_SECT_PER_TRACK) {
+                        if (n >= F_SECT_PER_TRACK)
                                 n -= F_SECT_PER_TRACK;
-                                while (here[n].sect)
-                                        ++n;
-                        }
                 }
         }
         if (_floppy->stretch & FD_SECTBASEMASK) {
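
The sector placement being discussed can be tried outside the kernel. Below is a minimal userspace sketch in C, not the driver's code: layout_sectors() and its parameters are hypothetical stand-ins (slot[] plays the role of here[n].sect, step the role of il, and zero means "free"). It advances by the interleave step and, on a collision, moves to the next free slot with wraparound. The sketch stops searching once the final sector has been placed, because at that point every slot is occupied and a search for a free one could not terminate.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace sketch (not the driver's code): fill slot[0..nsect-1] with
 * logical sector numbers 1..nsect, advancing `step` physical slots per
 * sector and moving on to the next free slot, with wraparound, on a
 * collision.
 */
static void layout_sectors(int nsect, int step, int start, int *slot)
{
    int n;

    assert(nsect > 0 && step > 0 && start >= 0);
    memset(slot, 0, (size_t)nsect * sizeof(*slot));
    n = start % nsect;
    for (int count = 1; count <= nsect; ++count) {
        slot[n] = count;
        if (count == nsect)
            break;              /* track is full: no free slot left to find */
        n = (n + step) % nsect;
        while (slot[n]) {       /* slot busy, find next free slot */
            if (++n >= nsect)
                n -= nsect;
        }
    }
}
```

With 9 sectors and a step of 2 this yields the physical order 1 6 2 7 3 8 4 9 5, i.e. consecutive logical sectors end up two slots apart.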

Andi Kleen

Dec 23, 2009, 12:10:02 PM

On Wed, Dec 23, 2009 at 08:49:38AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 23 Dec 2009, Andi Kleen wrote:
> >
> > I suspect it's a system where the APIC timer stops in deeper idle
> > states and it supports them. In this case CPU #0 does timer broadcasts
> > when needed to wake the other CPUs up from deep C, but for that it has
> > to run with HPET. At least the other ones can still enjoy the LAPIC
> > timer.
>
> Ahh, ok, that makes sense. I was assuming the broadcast timer would act in
> that capacity, but..

The "broadcasts" are done using IPIs from CPU #0, and only when the target
CPU is in deep idle. That's more efficient than letting the hardware
always broadcast.

>
> > This might suggest that Mark's floppy controller doesn't like
> > deep C? Mark, did you try booting with processor.max_cstate=1
> > and HPET enabled?
>
> We have indeed had historical issues with floppy and sleep states before.

I removed that code when moving to 64bit (floppy driver disabling C1),
but perhaps we need some variant of it again (but it's the first such
report in many years). Although it would be sad to have it again on all
systems.

-Andi

Andi Kleen

Dec 23, 2009, 12:20:01 PM

> This is what I was thinking yesterday and asked Mark to try idle=halt.
> This /proc/interrupts is with idle=halt when there should not be any
> C-states and broadcasts involved.

Ah ok, missed that sorry.

Actually I'm glad that the floppy-idle hack is not needed again.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Pallipadi, Venkatesh

Dec 23, 2009, 12:20:01 PM

>-----Original Message-----
>From: Linus Torvalds [mailto:torv...@linux-foundation.org]
>Sent: Wednesday, December 23, 2009 8:50 AM
>To: Andi Kleen
>Cc: Mark Hounschell; Pallipadi, Venkatesh; dma...@cfl.rr.com;
>Alain Knaff; Linux Kernel Mailing List;
>fdu...@fdutils.linux.lu; Li, Shaohua; Ingo Molnar
>Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28
>
>On Wed, 23 Dec 2009, Andi Kleen wrote:
>>
>> I suspect it's a system where the APIC timer stops in deeper idle
>> states and it supports them. In this case CPU #0 does timer broadcasts
>> when needed to wake the other CPUs up from deep C, but for that it has
>> to run with HPET. At least the other ones can still enjoy the LAPIC
>> timer.
>
>Ahh, ok, that makes sense. I was assuming the broadcast timer would act in
>that capacity, but..

This is what I was thinking yesterday and asked Mark to try idle=halt.

This /proc/interrupts is with idle=halt when there should not be any
C-states and broadcasts involved.

>>> HPET_MSI-edge hpet2
>>> NMI: 0 0 0 0 Non-maskable interrupts
>>> LOC: 268 513395 513138 522088 Local timer interrupts

Not sure how this is related to floppy problem. But, we surely
have something wrong with percpu HPET usage here.

Thanks,
Venki

Mark Hounschell

Dec 23, 2009, 12:50:01 PM

On 12/23/2009 11:38 AM, Andi Kleen wrote:
> Linus Torvalds <torv...@linux-foundation.org> writes:
>
>> It's not using the lapic for CPU0.
>>
>> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
>> expensive to reprogram (compared to the local apic). And having different
>> timers for different CPU's is just odd.
>>
>> The fact that the timer subsystem can do this and it all (mostly) works at
>> all is nice and impressive, but doesn't make it any less crazy ;)
>
> I suspect it's a system where the APIC timer stops in deeper idle
> states and it supports them. In this case CPU #0 does timer broadcasts
> when needed to wake the other CPUs up from deep C, but for that it has
> to run with HPET. At least the other ones can still enjoy the LAPIC
> timer.
>
> This might suggest that Mark's floppy controller doesn't like
> deep C? Mark, did you try booting with processor.max_cstate=1
> and HPET enabled?

I just did and /proc/interrupts looks the same and the floppy still does
not format.

I'll try the patch Linus provided now.

Mark

Linus Torvalds

Dec 23, 2009, 1:10:02 PM

On Wed, 23 Dec 2009, Mark Hounschell wrote:
>

> I'll try the patch Linus provided now.

I doubt it matters - because if it did, it would matter for everybody, and
the HPET thing shouldn't make any difference at all.

[ Or rather, it should matter for everybody trying to format a specific
format (without interleave it won't matter, and not all formats have any
interleave - I think it was mainly used on 5.25" floppies and special
formats). ]

Besides, maybe I was just mis-reading the code.

But getting some testing for the patch certainly won't hurt, so I'm not
going to argue against it any more ;)

Linus

Mark Hounschell

Dec 23, 2009, 1:20:01 PM

On 12/23/2009 01:01 PM, Linus Torvalds wrote:
>
>
> On Wed, 23 Dec 2009, Mark Hounschell wrote:
>>
>> I'll try the patch Linus provided now.
>
> I doubt it matters - because if it did, it would matter for everybody, and
> the HPET thing shouldn't make any difference at all.
>
> [ Or rather, it should matter for everybody trying to format a specific
> format (without interleave it won't matter, and not all formats have any
> interleave - I think it was mainly used on 5.25" floppies and special
> formats). ]
>
> Besides, maybe I was just mis-reading the code.
>
> But getting some testing for the patch certainly won't hurt, so I'm not
> going to argue against it any more ;)

Yea, that hosed it up pretty good. The very first track label sent out
caused some sort of timeout.

Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: floppy driver state
Dec 23 13:10:02 harley kernel: -------------------
Dec 23 13:10:02 harley kernel: now=9017 last interrupt=8117 diff=900 last called handler=f73ce27d
Dec 23 13:10:02 harley kernel: timeout_message=lock fdc
Dec 23 13:10:02 harley kernel: last output bytes:
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 1a 90 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 3 90 4294899106
Dec 23 13:10:02 harley kernel: c1 90 4294899106
Dec 23 13:10:02 harley kernel: 10 90 4294899106
Dec 23 13:10:02 harley kernel: 7 80 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 8 81 4294899106
Dec 23 13:10:02 harley kernel: 4 80 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: e6 80 8007
Dec 23 13:10:02 harley kernel: 0 90 8007
Dec 23 13:10:02 harley syslog-ng[2651]: last message repeated 2 times
Dec 23 13:10:02 harley kernel: 1 90 8007
Dec 23 13:10:02 harley kernel: 2 90 8007
Dec 23 13:10:02 harley kernel: 12 90 8007
Dec 23 13:10:02 harley kernel: 1b 90 8007
Dec 23 13:10:02 harley kernel: ff 90 8007
Dec 23 13:10:02 harley kernel: last result at 8117
Dec 23 13:10:02 harley kernel: last redo_fd_request at 8117
Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: status=80
Dec 23 13:10:02 harley kernel: fdc_busy=1
Dec 23 13:10:02 harley kernel: cont=f73d58e4
Dec 23 13:10:02 harley kernel: current_req=(null)
Dec 23 13:10:02 harley kernel: command_status=-1
Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: floppy0: floppy timeout called
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: floppy driver state
Dec 23 13:10:22 harley kernel: -------------------
Dec 23 13:10:22 harley kernel: now=15017 last interrupt=8117 diff=6900 last called handler=f73ce27d
Dec 23 13:10:22 harley kernel: timeout_message=do wakeup
Dec 23 13:10:22 harley kernel: last output bytes:
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 1a 90 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 3 90 4294899106
Dec 23 13:10:22 harley kernel: c1 90 4294899106
Dec 23 13:10:22 harley kernel: 10 90 4294899106
Dec 23 13:10:22 harley kernel: 7 80 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 8 81 4294899106
Dec 23 13:10:22 harley kernel: 4 80 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: e6 80 8007
Dec 23 13:10:22 harley kernel: 0 90 8007
Dec 23 13:10:22 harley syslog-ng[2651]: last message repeated 2 times
Dec 23 13:10:22 harley kernel: 1 90 8007
Dec 23 13:10:22 harley kernel: 2 90 8007
Dec 23 13:10:22 harley kernel: 12 90 8007
Dec 23 13:10:22 harley kernel: 1b 90 8007
Dec 23 13:10:22 harley kernel: ff 90 8007
Dec 23 13:10:22 harley kernel: last result at 8117
Dec 23 13:10:22 harley kernel: last redo_fd_request at 8117
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: status=80
Dec 23 13:10:22 harley kernel: fdc_busy=1
Dec 23 13:10:22 harley kernel: floppy_work.func=f73d03da
Dec 23 13:10:22 harley kernel: cont=f73d5274
Dec 23 13:10:22 harley kernel: current_req=(null)
Dec 23 13:10:22 harley kernel: command_status=-1
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: floppy0: floppy timeout called
Dec 23 13:10:22 harley kernel: floppy.c: no request in request_don

Have to reboot now...

Mark

Pallipadi, Venkatesh

unread,

Dec 23, 2009, 2:20:01 PM12/23/09

to

On Wed, Dec 23, 2009 at 09:41:50AM -0800, Mark Hounschell wrote:
> On 12/23/2009 11:38 AM, Andi Kleen wrote:
> > Linus Torvalds <torv...@linux-foundation.org> writes:
> >
> >> It's not using the lapic for CPU0.
> >>
> >> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> >> expensive to reprogram (compared to the local apic). And having different
> >> timers for different CPU's is just odd.
> >>
> >> The fact that the timer subsystem can do this and it all (mostly) works at
> >> all is nice and impressive, but doesn't make it any less crazy ;)
> >
> > I suspect it's a system where the APIC timer stops in deeper idle
> > states and it supports them. In this case CPU #0 does timer broadcasts
> > when needed to wake the other CPUs up from deep C, but for that it has
> > to run with HPET. At least the other ones can still enjoy the LAPIC
> > timer.
> >
> > This might suggest that Mark's floppy controller doesn't like
> > deep C? Mark, did you try booting with processor.max_cstate=1
> > and HPET enabled?
>
> I just did and /proc/interrupts looks the same and the floppy still does
> not format.
>

Can you try this one line patch either on .28 or .32 (with /proc/interrupts
output).
This disables hpet2 and lapic timer should then be used on CPU 0. If things
work with this test patch, we will know that the failure is somehow related
to HPET usage in MSI mode.

Thanks,
Venki

Reduce the rating of percpu hpet timer

Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>
---
arch/x86/kernel/hpet.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index cafb1c6..f89d17a 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
 	hpet_setup_irq(hdev);
 	evt->irq = hdev->irq;

-	evt->rating = 110;
+	evt->rating = 40;
 	evt->features = CLOCK_EVT_FEAT_ONESHOT;
 	if (hdev->flags & HPET_DEV_PERI_CAP)
 		evt->features |= CLOCK_EVT_FEAT_PERIODIC;
--
1.6.0.6
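
For readers following along: the one-line change above works because the tick device is chosen by rating, so dropping the per-CPU HPET below the LAPIC timer makes CPU 0 fall back to the LAPIC. Below is a minimal userspace sketch of that selection. It assumes the LAPIC clockevent is registered with rating 100 (check your tree), and the struct and function names are illustrative stand-ins, not the kernel's.

```c
#include <stddef.h>

/* Illustrative stand-in for the kernel's struct clock_event_device. */
struct clockevent {
	const char *name;
	int rating;
};

/* Return the candidate with the highest rating; earlier entries win ties. */
const struct clockevent *pick_tick_device(const struct clockevent *devs,
					  size_t n)
{
	const struct clockevent *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || devs[i].rating > best->rating)
			best = &devs[i];
	return best;
}
```

With candidates {"lapic", 100} and {"hpet2", 110} the sketch picks hpet2; with hpet2 lowered to 40 it picks lapic. The real selection code also weighs device features, which this sketch leaves out.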

Mark Hounschell

unread,

Dec 23, 2009, 2:40:02 PM12/23/09

to

That made it work. Used 2.6.32.2

cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:         82          0          0          1   IO-APIC-edge      timer
  1:          0          0          0         67   IO-APIC-edge      i8042
  3:          0          0          0          6   IO-APIC-edge
  4:          0          0          0          4   IO-APIC-edge
  6:          0          0          0          4   IO-APIC-edge      floppy
  8:          0          0          0          8   IO-APIC-edge      rtc0
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0         10       1519   IO-APIC-edge      i8042
 14:          0          0         39      10995   IO-APIC-edge      pata_atiixp
 15:          0          0          3        391   IO-APIC-edge      pata_atiixp
 16:          0          0          2        606   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
 17:          0          0          0          3   IO-APIC-fasteoi   ehci_hcd:usb1, parport0, ni-pci-gpib
 18:          0          0         10       2168   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
 19:          0          0          0        130   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
 22:          0          0          8       1151   IO-APIC-fasteoi   ahci
 24:          0          0          0          0   HPET_MSI-edge     hpet2
 29:          0          0          0         48   PCI-MSI-edge      sky2@pci:0000:04:00.0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      34842      30177      29672      29632   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
PND:          0          0          0          0   Performance pending work
RES:      17501      20449      16670      11224   Rescheduling interrupts
CAL:      10554       2336       1102       1071   Function call interrupts
TLB:        364        562        753        468   TLB shootdowns
ERR:          0
MIS:          0

# fdformat /dev/fd0u1440
Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
Formatting ... done
Verifying ... done

alain

unread,

Dec 23, 2009, 3:20:01 PM12/23/09

to

Linus Torvalds wrote:

> diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
> index 3266b4f..9c9148c 100644
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -2237,13 +2237,10 @@ static void setup_format_params(int track)
>  	for (count = 1; count <= F_SECT_PER_TRACK; ++count) {
>  		here[n].sect = count;
>  		n = (n + il) % F_SECT_PER_TRACK;
> -		if (here[n].sect) { /* sector busy, find next free sector */
> +		while (here[n].sect) { /* sector busy, find next free sector */
>  			++n;
> -			if (n >= F_SECT_PER_TRACK) {
> +			if (n >= F_SECT_PER_TRACK)
>  				n -= F_SECT_PER_TRACK;
> -				while (here[n].sect)
> -					++n;
> -			}
>  		}
>  	}
>  	if (_floppy->stretch & FD_SECTBASEMASK) {

The original code does indeed look a little bit strange... and might
break if there is a long run of "busy" sectors near the end of the
physical track. Or maybe there is a mathematical reason why this
situation cannot occur. I'll have to think about it a little bit more to
come up with a test case that will break either the new or old code.

But in any case, if a bug would occur due to this code, it would only
depend on the format's parameters, and not on the hardware.
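
One way to look for such a test case is to run both loops in userspace over a range of parameters. The sketch below is an approximation, not the driver code: the names place_sectors, layouts_match and MAX_SECT are made up here, the map is a plain int array instead of the format buffer, and the search for a free slot is skipped after the last sector is placed (an assumption of this sketch: by then every slot is taken and the resulting index is not used again).

```c
#include <string.h>

#define MAX_SECT 64	/* arbitrary bound for this sketch */

/*
 * Fill map[0..nsect-1] with logical sectors 1..nsect, starting at physical
 * slot `start` and stepping by interleave `il`. use_new selects the patched
 * loop, otherwise the original logic is used. Returns 0 on success, -1 if
 * a slot would be overwritten or a scan ran past the end of the map.
 */
int place_sectors(int nsect, int il, int start, int use_new, int *map)
{
	int count, n = start % nsect;

	memset(map, 0, nsect * sizeof(*map));
	for (count = 1; count <= nsect; ++count) {
		if (map[n])
			return -1;	/* would overwrite an assigned slot */
		map[n] = count;
		if (count == nsect)
			break;		/* nothing left to place */
		n = (n + il) % nsect;
		if (use_new) {
			while (map[n]) {	/* patched: scan with wraparound */
				++n;
				if (n >= nsect)
					n -= nsect;
			}
		} else if (map[n]) {		/* original: one step, scan after wrap */
			++n;
			if (n >= nsect) {
				n -= nsect;
				while (n < nsect && map[n])
					++n;
				if (n >= nsect)
					return -1;	/* ran off the end */
			}
		}
	}
	return 0;
}

/* Return 1 if both variants succeed and produce the same layout. */
int layouts_match(int nsect, int il, int start)
{
	int a[MAX_SECT], b[MAX_SECT];

	if (nsect < 1 || nsect > MAX_SECT || il < 1 || start < 0)
		return 0;
	if (place_sectors(nsect, il, start, 0, a) ||
	    place_sectors(nsect, il, start, 1, b))
		return 0;
	return memcmp(a, b, nsect * sizeof(a[0])) == 0;
}
```

Calling layouts_match(18, 2, start) for each start slot covers the 1440 kB geometry with an interleave of 2; other sector counts and interleaves can be swept the same way to search for a layout where the two loops diverge.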

Regards,

Alain

alain

unread,

Dec 23, 2009, 3:40:03 PM12/23/09

to

Pallipadi, Venkatesh wrote:
> MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> this can break dma. We can probably try adding some dummy HPET read
> after dma write, to see if that flushes things properly.

Shouldn't that be "... some dummy HPET read _before_ dma write...". In
order to ensure that DMA cache is consistent _before_ dma controller
reads it?

Regards,

Alain

Pallipadi, Venkatesh

unread,

Dec 23, 2009, 3:40:02 PM12/23/09

to

Hmmm.. That's very interesting indeed.

That clearly says that HPET MSI interrupts are somehow causing some
caching side effect in the chipset that results in this floppy dma
failure.

Here is what we have so far.
IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
in cases where LAPIC timer will stop working in deep C-state. As we have
only one HPET channel free for percpu HPET, we only have hpet2 tied to
CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
deep C-state.

One problem here is that percpu hpet should only get used when LAPIC
cannot be used (that is when CPU enters deep C-state). Using hpet2 in
place of LAPIC timer even when deep C-state is not supported is not
right in terms of performance. We need some changes here to fix that
[Problem 1].

But, that still does not explain why we are seeing this problem in the
first place. I mean, using hpet2 is not optimal, but should not have
functionality issues like this. Even fixing [Problem 1] above, we may
see this problem on some other platform that supports deep C-state and
so has hpet2 enabled for a valid reason.

Also, I am not sure whether the problem also happens if legacy HPET
interrupts are used at run time in place of the LAPIC timer (it may be
worth trying this with a simple test patch; let me think about it). In
this case, the legacy HPET interrupt rightly goes quiet after boot, giving
priority to the LAPIC timer.

With hpet MSI interrupts, we do a write followed by read of HPET
memmapped register to set a HPET channel timeout + read of global HPET
timer. This happens on every timer interrupt on CPU 0. And we also have
MSI interrupt being delivered to CPU 0. I cannot think of any reason why
this can break dma. We can probably try adding some dummy HPET read
after dma write, to see if that flushes things properly.

Thanks,
Venki

Pallipadi, Venkatesh

unread,

Dec 23, 2009, 4:40:02 PM12/23/09

to

On Wed, 2009-12-23 at 12:34 -0800, alain wrote:
> Pallipadi, Venkatesh wrote:
> > MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> > this can break dma. We can probably try adding some dummy HPET read
> > after dma write, to see if that flushes things properly.
>
> Shouldn't that be "... some dummy HPET read _before_ dma write...". In
> order to ensure that DMA cache is consistent _before_ dma controller
> reads it?
>

Yes. I meant after the contents of the buffer is changed and before the
DMA transfer and the controller reading it.

Thanks,
Venki

Arjan van de Ven

unread,

Dec 25, 2009, 7:20:02 AM12/25/09

to

On Wed, 23 Dec 2009 18:08:32 +0100
Andi Kleen <an...@firstfloor.org> wrote:

> I removed that code when moving to 64bit (floppy driver disabling C1),
> but perhaps we need some variant of it again (but it's the first such
> report in many years). Although it would be sad to have it again on
> all systems.

at least now we have the pmqos infrastructure, driver just needs to ask
for 0 latency ;)

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

Andi Kleen

unread,

Dec 25, 2009, 3:40:02 PM12/25/09

to

On Fri, Dec 25, 2009 at 01:21:16PM +0100, Arjan van de Ven wrote:
> On Wed, 23 Dec 2009 18:08:32 +0100
> Andi Kleen <an...@firstfloor.org> wrote:
>
> > I removed that code when moving to 64bit (floppy driver disabling C1),
> > but perhaps we need some variant of it again (but it's the first such
> > report in many years). Although it would be sad to have it again on
> > all systems.
>
> at least now we have the pmqos infrastructure, driver just needs to ask
> for 0 latency ;)

Does pmqos work with acpi=off etc.? I didn't think it shut down
the classic "HLT" idle, does it? The old i386 systems apparently needed
that; they long predate any deeper idle states.

Anyways the code is still there for 32bit.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.

Arjan van de Ven

unread,

Dec 26, 2009, 4:40:01 AM12/26/09

to

On Fri, 25 Dec 2009 21:33:04 +0100
Andi Kleen <an...@firstfloor.org> wrote:

> On Fri, Dec 25, 2009 at 01:21:16PM +0100, Arjan van de Ven wrote:
> > On Wed, 23 Dec 2009 18:08:32 +0100
> > Andi Kleen <an...@firstfloor.org> wrote:
> >
> > > I removed that code when moving to 64bit (floppy driver disabling
> > > C1), but perhaps we need some variant of it again (but it's the
> > > first such report in many years). Although it would be sad to
> > > have it again on all systems.
> >
> > at least now we have the pmqos infrastructure, driver just needs to
> > ask for 0 latency ;)
>
> Does pmqos work with acpi=off etc.?

yes

> I didn't think it shut down
> the classic "HLT" idle, does it?

it does if you specify a latency of 0; it will then go into the
spin-only state until you give up your latency requirement

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

Andi Kleen

unread,

Dec 26, 2009, 11:50:02 AM12/26/09

to

> > Does pmqos work with acpi=off etc.?
>
> yes
>
> > I didn't think it shut down
> > the classic "HLT" idle, does it?
>
> it does if you specify a latency of 0; it will then go into the
> spin-only state until you give up your latency requirement

I looked at it this evening, but it seems like pm_qos is not
interrupt safe (e.g. calls blocking notifiers) and floppy currently does
enable/disable_hlt from interrupts and bottom halves.

Would need some more infrastructure work or restructuring
of the floppy driver.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Alain Knaff

unread,

Dec 27, 2009, 7:40:02 AM12/27/09

to

Andi Kleen wrote:
>>> Does pmqos work with acpi=off etc.?
>> yes
>>
>>> I didn't think it shut down
>>> the classic "HLT" idle, does it?
>> it does if you specify a latency of 0; it will then go into the
>> spin-only state until you give up your latency requirement
>
> I looked at it this evening, but it seems like pm_qos is not
> interrupt safe (e.g. calls blocking notifiers) and floppy currently does
> enable/disable_hlt from interrupts and bottom halves.
>
> Would need some more infrastructure work or restructuring
> of the floppy driver.
>
> -Andi

disable_hlt/enable_hlt was only needed to work around a bug on TM4000
(Texas Instruments) Laptops which were popular around 1994 / 1995.
Basically, as soon as the CPU went into hlt() state, so did the DMA
controller, either causing a really slow transfer, or (worse) a buffer
over/underrun which failed the operation.

On hardware unaffected by this particular bug (which would be most
hardware around now, 14 years after the fact...), these calls can safely
be removed.

Regards,

Alain

Andi Kleen

unread,

Dec 27, 2009, 9:00:01 PM12/27/09

to

> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
> (Texas Instruments) Laptops which were popular around 1994 / 1995.

I don't think we can fully drop support for these systems.

Did they have a unique PCI ID or something else that could be tested
for?

Perhaps it could be just a white list like dmi_year > 1995 to disable.
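
A year-based check like this could look roughly as follows. This is a userspace sketch only: it parses a BIOS date string such as the ones DMI reports, the function names are invented here, and the handling of two-digit years (below 80 means 20xx) is an assumption of the sketch rather than anything the kernel's DMI code is known to do.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract the year from a BIOS date string of the form "MM/DD/YYYY" or
 * "MM/DD/YY". Returns 0 if no year can be parsed. Two-digit years below
 * 80 are taken as 20xx and the rest as 19xx (assumed pivot).
 */
int bios_year(const char *date)
{
	const char *slash = date ? strrchr(date, '/') : NULL;
	char *end;
	long year;

	if (!slash)
		return 0;
	year = strtol(slash + 1, &end, 10);
	if (end == slash + 1 || year < 0 || year > 9999)
		return 0;
	if (year < 80)
		year += 2000;
	else if (year < 100)
		year += 1900;
	return (int)year;
}

/* Keep the HLT workaround unless the BIOS is clearly newer than 1995. */
int needs_hlt_workaround(const char *bios_date)
{
	int year = bios_year(bios_date);

	return year == 0 || year <= 1995;
}
```

An unparseable or missing date keeps the workaround enabled, which is the conservative choice for the old machines the workaround was written for.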

Depending on how often floppies are still used this might save
non trivial amounts of power on newer systems :)

Anyways it would be probably good to convert this to the new infrastructure,
and remove the old hooks, but the interrupt-context issue would
need to be fixed first.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Alain Knaff

unread,

Dec 28, 2009, 5:40:02 AM12/28/09

to

Andi Kleen wrote:
>> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
>> (Texas Instruments) Laptops which were popular around 1994 / 1995.
>
> I don't think we can fully drop support for these systems.
>
> Did they have a unique PCI ID or something else that could be tested
> for?

Floppy controllers are not PCI devices and thus have no PCI id
unfortunately... :-(

> Perhaps it could be just a white list like dmi_year > 1995 to disable.
>
> Depending on how often floppies are still used this might save
> non trivial amounts of power on newer systems :)

Removing these calls will indeed save a *tiny* amount of power by
allowing the CPU to go into halt during DMA transfer. But the main
argument should be simplification.

> Anyways it would be probably good to convert this to the new infrastructure,
> and remove the old hooks, but the interrupt-context issue would
> need to be fixed first.
>
> -Andi

Well, at least for testing whether it fixes the new problem (DMA cache
issue), it's useful to know that these calls can be safely removed on
almost all of today's machines. That way, we will know whether this
refactoring will be worth the effort.

Regards,

Alain

Andi Kleen

unread,

Dec 28, 2009, 10:00:02 AM12/28/09

to

On Mon, Dec 28, 2009 at 11:27:56AM +0100, Alain Knaff wrote:
> Andi Kleen wrote:
> >> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
> >> (Texas Instruments) Laptops which were popular around 1994 / 1995.
> >
> > I don't think we can fully drop support for these systems.
> >
> > Did they have a unique PCI ID or something else that could be tested
> > for?
>
> Floppy controllers are not PCI devices and thus have no PCI id
> unfortunately... :-(

Yes, but it's enough to identify any component in the system.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Pavel Machek

unread,

Dec 28, 2009, 3:10:02 PM12/28/09

to

> > > This might suggest that Mark's floppy controller doesn't like
> > > deep C? Mark, did you try booting with processor.max_cstate=1
> > > and HPET enabled?
> >
> > We have indeed had historical issues with floppy and sleep states before.
>
> I removed that code when moving to 64bit (floppy driver disabling C1),
> but perhaps we need some variant of it again (but it's the first such
> report in many years). Although it would be sad to have it again on all
> systems.

C1 is hlt. Are you sure? I could see how C3 could cause problems (DMA
latency), but...

Can Mark simply try with idle=poll?

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Mark Hounschell

unread,

Dec 28, 2009, 4:00:02 PM12/28/09

to

On 12/27/2009 06:09 AM, Pavel Machek wrote:
>
>>>> This might suggest that Mark's floppy controller doesn't like
>>>> deep C? Mark, did you try booting with processor.max_cstate=1
>>>> and HPET enabled?
>>>
>>> We have indeed had historical issues with floppy and sleep states before.
>>
>> I removed that code when moving to 64bit (floppy driver disabling C1),
>> but perhaps we need some variant of it again (but it's the first such
>> report in many years). Although it would be sad to have it again on all
>> systems.
>
> C1 is hlt. Are you sure? I could see how C3 could cause problems (DMA
> latency), but...
>
> Can Mark simply try with idle=poll?
>
> Pavel
>

The floppy still fails with idle=poll

Mark

Mark Hounschell

unread,

Jan 8, 2010, 12:50:02 PM1/8/10

to

Haven't seen any activity on this thread in a while. Just curious, are we
still working this?
Is there anything else I can do to help?

Thanks
Mark

Pallipadi, Venkatesh

unread,

Jan 11, 2010, 7:20:02 PM1/11/10

to

Sorry for not following up on this. We have narrowed this down to HPET
MSI and floppy DMA. I still don't know how HPET MSI interrupts are
breaking floppy DMA.

You are seeing the problem on two different systems. Correct? Do you
have any system where this works with HPET MSI enabled?

A couple of options for how we can go about this one:
1) Change the HPET-MSI code so that it is not activated when there are no
C-states with LAPIC stoppage involved. This will resolve the problem on
the systems you reported, as they have no deep C-states. But I fear that,
with the actual problem unresolved, we may hit it in future with this or
some other platform having the same issue with CPUs that support deep
C-states.
2) Try this testcase on a few other platforms that support HPET-MSI and
deep C-states, check how widespread the problem is, and then add a
whitelist/blacklist for HPET MSI usage.

I think option 1 is better for 2.6.33. I will work on that and send
patches for you to test.
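
The decision described in option 1 can be written down as a small predicate. This is only a sketch of the stated logic, not the eventual patchset: the struct, its fields and the function name are hypothetical, and the "LAPIC timer always running" case is included because the existing HPET code already skips per-CPU HPET when that CPU feature is present.

```c
#include <stdbool.h>

/* Hypothetical summary of what the idle code knows about a CPU. */
struct cpu_idle_info {
	bool lapic_always_running;	/* LAPIC timer keeps ticking in any C-state */
	bool has_lapic_stopping_cstate;	/* some usable C-state stops the LAPIC timer */
};

/* Use a per-CPU HPET channel only when the LAPIC timer can actually stop. */
bool want_percpu_hpet(const struct cpu_idle_info *ci, bool hpet_msi_capable)
{
	if (!hpet_msi_capable)
		return false;
	if (ci->lapic_always_running)
		return false;
	return ci->has_lapic_stopping_cstate;
}
```

On the systems reported in this thread, which have no deep C-states, the predicate would come out false and CPU 0 would stay on the LAPIC timer.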

Thanks,
Venki

Mark Hounschell

unread,

Jan 12, 2010, 4:10:01 AM1/12/10

to

I see the problem on every system in which the HPET2 shows up in
/proc/interrupts. The machines that work with HPET enabled don't show HPET
at all in /proc/interrupts. I have some of each. All the boxes that fail
here use the (AMD) 790x series chip sets.
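
A quick way to classify a machine along these lines is to scan /proc/interrupts for HPET_MSI lines and add up their counts. The sketch below does that on an already-open stream; the function name is made up, and the parsing assumes the usual layout of an IRQ label, a colon, one count per CPU, then the interrupt type. A result of -1 means no HPET_MSI line at all, and 0 means the line exists but the timer never fired, as in the output from the test patch earlier in the thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counts on every HPET_MSI line of a /proc/interrupts
 * style stream. Returns -1 if no such line is present.
 */
long hpet_msi_total(FILE *fp)
{
	char line[512];
	long total = -1;

	while (fgets(line, sizeof(line), fp)) {
		char *p = strchr(line, ':');
		char *end;

		if (!p || !strstr(line, "HPET_MSI"))
			continue;
		if (total < 0)
			total = 0;
		/* Counts follow the colon; stop at the first non-number. */
		for (p++; ; p = end) {
			unsigned long v = strtoul(p, &end, 10);

			if (end == p)
				break;
			total += (long)v;
		}
	}
	return total;
}
```

Typical use is to fopen("/proc/interrupts", "r"), pass the stream in, and compare the result before and after a few seconds of uptime.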

> A couple of options for how we can go about this one:
> 1) Change the HPET-MSI code so that it is not activated when there are no
> C-states with LAPIC stoppage involved. This will resolve the problem on
> the systems you reported, as they have no deep C-states. But I fear that,
> with the actual problem unresolved, we may hit it in future with this or
> some other platform having the same issue with CPUs that support deep
> C-states.
> 2) Try this testcase on a few other platforms that support HPET-MSI and
> deep C-states, check how widespread the problem is, and then add a
> whitelist/blacklist for HPET MSI usage.
>
> I think option 1 is better for 2.6.33. I will work on that and send
> patches for you to test.
>

OK, thanks
Mark

Pallipadi, Venkatesh

unread,

Jan 14, 2010, 9:10:01 PM1/14/10

to

Mark,

I just sent out a patchset that should work around the problem here. Can
you check and let me know whether that's the case?

We will still need a simpler/smaller workaround for .33. Will send a
patch for that soon.

Also, are you testing this with a USB floppy controller? I tried to test
it on my end, but fdformat doesn't seem to like my USB floppy drive. I
tried 'ufiformat -f 1440 <dev>', with which I am not able to reproduce
the failure on any of my boxes. Not sure whether that really means I
don't hit this bug or that it goes through a totally different code path.

Thanks,
Venki

Mark Hounschell

unread,

Jan 15, 2010, 4:40:02 AM1/15/10

to

Yes, I'll try that today. I assume I'll find them on LKML.

> We will still need a simpler/smaller workaround for .33. Will send a
> patch for that soon.
>
> Also, are you testing this with a USB floppy controller? I tried to test
> it on my end, but fdformat doesn't seem to like my USB floppy drive. I
> tried 'ufiformat -f 1440 <dev>', with which I am not able to reproduce
> the failure on any of my boxes. Not sure whether that really means I
> don't hit this bug or that it goes through a totally different code path.
>

No, I've never even seen a USB floppy controller.

Mark

Mark Hounschell

unread,

Jan 15, 2010, 1:10:03 PM1/15/10

to

On 01/14/2010 09:01 PM, Pallipadi, Venkatesh wrote:

Yes, it does seem to fix the issue. The floppy formats and /proc/interrupts
look normal with nothing going on with the hpet2 msi.

Regards
Mark

Pallipadi, Venkatesh

unread,

Jan 21, 2010, 2:20:01 PM1/21/10

to

HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

Tested-by: Mark Hounschell <ma...@compro.net>

Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>
---

This patch needs to go to stable as well. But, there are some conflicts that prevents
the patch from going as is. I can rebase/resubmit to stable once the patch goes upstream.

arch/x86/include/asm/hpet.h | 1 +
arch/x86/kernel/hpet.c | 8 ++++++++
arch/x86/kernel/quirks.c | 13 +++++++++++++
3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 5d89fd2..1d5c08a 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -67,6 +67,7 @@ extern unsigned long hpet_address;
extern unsigned long force_hpet_address;
extern u8 hpet_blockid;
extern int hpet_force_user;
+extern u8 hpet_msi_disable;
extern int is_hpet_enabled(void);
extern int hpet_enable(void);
extern void hpet_disable(void);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ba6e658..ad80a1c 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -34,6 +34,8 @@
*/
unsigned long hpet_address;
u8 hpet_blockid; /* OS timer block num */
+u8 hpet_msi_disable;
+
#ifdef CONFIG_PCI_MSI
static unsigned long hpet_num_timers;
#endif
@@ -596,6 +598,9 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
 	unsigned int num_timers_used = 0;
 	int i;

+	if (hpet_msi_disable)
+		return;
+
 	if (boot_cpu_has(X86_FEATURE_ARAT))
 		return;
 	id = hpet_readl(HPET_ID);
@@ -928,6 +933,9 @@ static __init int hpet_late_init(void)
 	hpet_reserve_platform_timers(hpet_readl(HPET_ID));
 	hpet_print_config();

+	if (hpet_msi_disable)
+		return 0;
+
 	if (boot_cpu_has(X86_FEATURE_ARAT))
 		return 0;

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 18093d7..12e9fea 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -491,6 +491,19 @@ void force_hpet_resume(void)
 		break;
 	}
 }
+
+/*
+ * HPET MSI on some boards (ATI SB700/SB800) has side effect on
+ * floppy DMA. Disable HPET MSI on such platforms.
+ */
+static void force_disable_hpet_msi(struct pci_dev *unused)
+{
+	hpet_msi_disable = 1;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
+			 force_disable_hpet_msi);
+
#endif

#if defined(CONFIG_PCI) && defined(CONFIG_NUMA)
--
1.6.0.6
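
The shape of the patch above is a common quirk pattern: a fixup keyed on a PCI vendor/device pair sets a global flag early, and the HPET setup code returns early when the flag is set. The userspace sketch below models only that control flow. The table, the helper names and the numeric IDs (believed to be ATI, 0x1002, and its SBx00 SMBus device, 0x4385) are assumptions of the sketch; the kernel uses the symbolic constants shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

struct pci_fixup {
	struct pci_id id;
	void (*hook)(void);
};

int hpet_msi_disable;	/* set by a quirk, read by the HPET setup code */

void force_disable_hpet_msi(void)
{
	hpet_msi_disable = 1;
}

/* Assumed numeric IDs for the ATI SBx00 SMBus device. */
const struct pci_fixup fixups[] = {
	{ { 0x1002, 0x4385 }, force_disable_hpet_msi },
};

/* Run every fixup whose vendor/device pair matches a discovered device. */
void run_header_fixups(const struct pci_id *devs, size_t ndevs)
{
	size_t i, j;

	for (i = 0; i < ndevs; i++)
		for (j = 0; j < sizeof(fixups) / sizeof(fixups[0]); j++)
			if (devs[i].vendor == fixups[j].id.vendor &&
			    devs[i].device == fixups[j].id.device)
				fixups[j].hook();
}

/* Mirrors the early return the patch adds before per-CPU HPET setup. */
int percpu_hpet_allowed(void)
{
	return !hpet_msi_disable;
}
```

Keying the quirk on the SMBus function means every board with that southbridge is affected, which is the breadth the later replies in this thread question.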

tip-bot for Pallipadi, Venkatesh

unread,

Jan 22, 2010, 5:10:02 PM1/22/10

to

Commit-ID: 9f0b0ce525f19ef408e877b1c7662b60424c7cdc
Gitweb: http://git.kernel.org/tip/9f0b0ce525f19ef408e877b1c7662b60424c7cdc
Author: Pallipadi, Venkatesh <venkatesh...@intel.com>
AuthorDate: Thu, 21 Jan 2010 11:09:52 -0800
Committer: H. Peter Anvin <h...@zytor.com>
CommitDate: Fri, 22 Jan 2010 13:47:01 -0800

x86: Disable HPET MSI on ATI SB700/SB800

HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

[ This patch needs to go to stable as well. But, there are some
  conflicts that prevents the patch from going as is. I can
  rebase/resubmit to stable once the patch goes upstream.

  hpa: still Cc:'ing stable@ as an FYI. ]

Tested-by: Mark Hounschell <ma...@compro.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>

Cc: <sta...@kernel.org>
LKML-Reference: <20100121190...@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---

tip-bot for Pallipadi, Venkatesh

unread,

Jan 23, 2010, 2:00:03 AM1/23/10

to

Commit-ID: 73472a46b5b28116b145fb5fc05242c1aa8e1461
Gitweb: http://git.kernel.org/tip/73472a46b5b28116b145fb5fc05242c1aa8e1461
Author: Pallipadi, Venkatesh <venkatesh...@intel.com>
AuthorDate: Thu, 21 Jan 2010 11:09:52 -0800
Committer: Ingo Molnar <mi...@elte.hu>
CommitDate: Sat, 23 Jan 2010 06:21:58 +0100

x86: Disable HPET MSI on ATI SB700/SB800

HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

[ This patch needs to go to stable as well. But, there are some
  conflicts that prevents the patch from going as is. I can
  rebase/resubmit to stable once the patch goes upstream.

  hpa: still Cc:'ing stable@ as an FYI. ]

Tested-by: Mark Hounschell <ma...@compro.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh...@intel.com>

Cc: <sta...@kernel.org>
LKML-Reference: <20100121190...@linux-os.sc.intel.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---

Yuhong Bao

unread,

Jan 23, 2010, 2:30:02 AM1/23/10

to

> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> side-effects on floppy DMA. Do not use HPET MSI on such platforms.

I think somebody from AMD should review the situation. Clearly something
is happening inside their southbridge. CCing Andreas Herrmann from AMD.
Yuhong Bao

Andreas Herrmann

unread,

Jan 25, 2010, 12:20:03 PM1/25/10

to

On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>
> > HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> > side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Argh, will see what information I can find about this problem ...

> I think somebody from AMD should review the situation. Clearly
> something is happening inside their southbridge. CCing Andreas
> Herrmann from AMD.

I have the feeling that this problem should rather be fixed with a DMI
quirk instead of disabling HPET MSI for the entire chipset.

Was the latest available BIOS installed on the affected system?

Thanks,
Andreas

--
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632

Mark Hounschell

unread,

Jan 28, 2010, 4:20:01 AM1/28/10

to

On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>
>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>
> Argh, will see what information I can find about this problem ...
>
>> I think somebody from AMD should review the situation. Clearly
>> something is happening inside their southbridge. CCing Andreas
>> Herrmann from AMD.
>
> I have the feeling that this problem should rather be fixed with a DMI
> quirk instead of disabling HPET MSI for the entire chipset.
>
> Was the latest available BIOS installed on the affected system?
>

You mean "systems" of different manufacturers? I will check today. Due to
misconfigured filters I didn't see this until today. Sorry.

Mark

Mark Hounschell

unread,

Jan 28, 2010, 8:30:01 AM1/28/10

to

On 01/28/2010 04:17 AM, Mark Hounschell wrote:
> On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
>> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>>
>>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>>
>> Argh, will see what information I can find about this problem ...
>>
>>> I think somebody from AMD should review the situation. Clearly
>>> something is happening inside their southbridge. CCing Andreas
>>> Herrmann from AMD.
>>
>> I have the feeling that this problem should rather be fixed with a DMI
>> quirk instead of disabling HPET MSI for the entire chipset.
>>
>> Was the latest available BIOS installed on the affected system?
>>
>
> You mean "systems" of different manufacturers? I will check today. Due to
> misconfigured filters I didn't see this until today. Sorry.
>
> Mark
>

My BIOS were below rev on all my affected boards but updating did not help
with the problem.

Andreas, while I have your ear, I am also having another issue with this
chip set doing peer to peer bus transfers between pci buses and pci-e buses
and from pci-e to pci-e buses. I've read the chip set specs and they _seem_
to imply that it may not be allowed due to "Trusted Computing" something or
another. I've posed the issue to the AMD forums with no luck, and
I can't figure out why this doesn't work using these chip sets.

Sorry to change the subject. I just figured I'd ask someone from AMD while
I had the chance.

Thanks and Regards

Borislav Petkov

unread,

Jan 28, 2010, 8:50:02 AM1/28/10

to

On Thu, Jan 28, 2010 at 08:25:23AM -0500, Mark Hounschell wrote:
> On 01/28/2010 04:17 AM, Mark Hounschell wrote:
> > On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
> >> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
> >>>
> >>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> >>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
> >>
> >> Argh, will see what information I can find about this problem ...
> >>
> >>> I think somebody from AMD should review the situation. Clearly
> >>> something is happening inside their southbridge. CCing Andreas
> >>> Herrmann from AMD.
> >>
> >> I have the feeling that this problem should rather be fixed with a DMI
> >> quirk instead of disabling HPET MSI for the entire chipset.
> >>
> >> Was the latest available BIOS installed on the affected system?
> >>
> >
> > You mean "systems" of different manufacturers? I will check today. Due to
> > misconfigured filters I didn't see this until today. Sorry.
> >
> > Mark
> >
>
> My BIOS were below rev on all my affected boards but updating did not help
> with the problem.

Hi,

can you post the BIOS vendors of the boards along with the respective
BIOS versions?

Thanks.

--
Regards/Gruss,
Boris.

--
Advanced Micro Devices, Inc.
Operating Systems Research Center

Mark Hounschell

Jan 28, 2010, 9:50:01 AM

On 01/28/2010 08:41 AM, Borislav Petkov wrote:
> On Thu, Jan 28, 2010 at 08:25:23AM -0500, Mark Hounschell wrote:
>> On 01/28/2010 04:17 AM, Mark Hounschell wrote:
>>> On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
>>>> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>>>>
>>>>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>>>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>>>>
>>>> Argh, will see what information I can find about this problem ...
>>>>
>>>>> I think somebody from AMD should review the situation. Clearly
>>>>> something is happening inside their southbridge. CCing Andreas
>>>>> Herrmann from AMD.
>>>>
>>>> I have the feeling that this problem should rather be fixed with a DMI
>>>> quirk instead of disabling HPET MSI for the entire chipset.
>>>>
>>>> Was the latest available BIOS installed on the affected system?
>>>>
>>>
>>> You mean "systems" of different manufacturers? I will check today. Due to
>>> misconfigured filters I didn't see this until today. Sorry.
>>>
>>> Mark
>>>
>>
>> My BIOS was below the current revision on all my affected boards, but updating
>> did not help with the problem.
>
> Hi,
>
> can you post the BIOS vendors of the boards along with the respective
> BIOS versions?
>
> Thanks.
>

DFI DK-790FXB-M3H5 MB using Award BIOS D7SDA09.BIN (10/09/2009)
BIOSTAR TA790GXB A2+ using AMI BIOS 78DDA928.BST (09/28/09)

Regards
Mark

Andreas Herrmann

May 17, 2010, 11:10:01 AM

On Mon, Jan 25, 2010 at 06:10:59PM +0100, Andreas Herrmann wrote:
> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
> >
> > > HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> > > side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>
> Argh, will see what information I can find about this problem ...

FYI. I've tried to trigger the publication of errata information for that
chipset. Finally this has happened.

The discussed problem is indeed due to an erratum. See erratum #27 in
http://support.amd.com/us/Embedded_TechDocs/46837.pdf

The suggested workaround for this is to disable HPET MSI if LPC
devices are used. I doubt that there is a convenient way for Linux to
find out whether LPC devices are used. Thus I think the only solution
to safely avoid the problem is the currently implemented quirk to
disable HPET MSI on this chipset.
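
To make the shape of such a chipset-wide quirk concrete, here is a minimal
user-space sketch in Python (a model only, not kernel code). The names
`PlatformState`, `QUIRKS`, `apply_quirks` and `force_disable_hpet_msi` are
illustrative, and the PCI IDs used for the SBx00 SMBus function are assumed
here and should be checked against the real ID tables. The point is only that
the decision keys on the chipset being present, not on whether any LPC device
is in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass
class PlatformState:
    """Illustrative stand-in for global platform flags."""
    hpet_msi_disable: bool = False


def force_disable_hpet_msi(state: PlatformState) -> None:
    # Chipset-wide workaround: never use MSI for HPET timers.
    state.hpet_msi_disable = True


# Assumed IDs for the ATI SBx00 SMBus function (illustration only).
ATI_VENDOR_ID = 0x1002
SBX00_SMBUS_DEVICE_ID = 0x4385

# Quirk table keyed on (vendor, device), modelled after a PCI fixup list.
QUIRKS: Dict[Tuple[int, int], Callable[[PlatformState], None]] = {
    (ATI_VENDOR_ID, SBX00_SMBUS_DEVICE_ID): force_disable_hpet_msi,
}


def apply_quirks(devices: Iterable[Tuple[int, int]],
                 state: Optional[PlatformState] = None) -> PlatformState:
    """Run every matching quirk for the enumerated (vendor, device) pairs."""
    state = state if state is not None else PlatformState()
    for vendor, device in devices:
        fixup = QUIRKS.get((vendor, device))
        if fixup is not None:
            fixup(state)
    return state
```

In this model, any system that enumerates the matching southbridge function
ends up with `hpet_msi_disable` set, whether or not a floppy or other LPC DMA
device exists.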

Regards,

Andreas

--
Operating | Advanced Micro Devices GmbH
System | Einsteinring 24, 85609 Dornach b. München, Germany

Yuhong Bao

May 17, 2010, 11:20:02 AM

> The suggested workaround for this is to disable HPET MSI if LPC
> devices are used. I doubt that there is a convenient way for Linux to
> find out whether LPC devices are used.

And don't forget that the Super I/O chip in most motherboards is an LPC
device! (In fact, that was what LPC was invented for.)

> Thus I think the only solution
> to safely avoid the problem is the currently implemented quirk to
> disable HPET MSI on this chipset.

Yuhong Bao


Linus Torvalds

May 17, 2010, 11:20:02 AM

On Mon, 17 May 2010, Andreas Herrmann wrote:
>
> FYI. I've tried to trigger the publication of errata information for that
> chipset. Finally this has happened.
>
> The discussed problem is indeed due to an erratum. See erratum #27 in
> http://support.amd.com/us/Embedded_TechDocs/46837.pdf
>
> The suggested workaround for this is to disable HPET MSI if LPC
> devices are used. I doubt that there is a convenient way for Linux to
> find out whether LPC devices are used. Thus I think the only solution
> to safely avoid the problem is the currently implemented quirk to
> disable HPET MSI on this chipset.

Goodie. It would be good to point this out in the source too. Would you be
willing to send in a patch that documents this quirk as a result of that
erratum #27, so that we don't lose sight of why we're doing that odd MSI
disable?

Linus

Andreas Herrmann

May 17, 2010, 12:50:03 PM

On Mon, May 17, 2010 at 11:12:59AM -0400, Linus Torvalds wrote:
>
>
> On Mon, 17 May 2010, Andreas Herrmann wrote:
> >
> > FYI. I've tried to trigger the publication of errata information for that
> > chipset. Finally this has happened.
> >
> > The discussed problem is indeed due to an erratum. See erratum #27 in
> > http://support.amd.com/us/Embedded_TechDocs/46837.pdf
> >
> > The suggested workaround for this is to disable HPET MSI if LPC
> > devices are used. I doubt that there is a convenient way for Linux to
> > find out whether LPC devices are used. Thus I think the only solution
> > to safely avoid the problem is the currently implemented quirk to
> > disable HPET MSI on this chipset.
>
> Goodie. It would be good to point this out in the source too. Would you be
> willing to send in a patch that documents this quirk as a result of that
> erratum #27, so that we don't lose sight of why we're doing that odd MSI
> disable?

Done that.
See http://marc.info/?l=linux-kernel&m=127411462230838

Andreas

--
Operating | Advanced Micro Devices GmbH
System | Einsteinring 24, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632

Robert Hancock

May 17, 2010, 9:00:02 PM

On 05/17/2010 08:59 AM, Andreas Herrmann wrote:
> On Mon, Jan 25, 2010 at 06:10:59PM +0100, Andreas Herrmann wrote:
>> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>>
>>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>>
>> Argh, will see what information I can find about this problem ...
>
> FYI. I've tried to trigger the publication of errata information for that
> chipset. Finally this has happened.
>
> The discussed problem is indeed due to an erratum. See erratum #27 in
> http://support.amd.com/us/Embedded_TechDocs/46837.pdf
>
> The suggested workaround for this is to disable HPET MSI if LPC
> devices are used. I doubt that there is a convenient way for Linux to
> find out whether LPC devices are used. Thus I think the only solution
> to safely avoid the problem is the currently implemented quirk to
> disable HPET MSI on this chipset.

If one wanted, you could disable HPET MSI on this chipset only when a
driver requests an ISA DMA channel. Then if there's no floppy or other
LPC DMA device present, it can stay enabled. I don't know if it's worth
the trouble, though.

Linus Torvalds

May 17, 2010, 9:10:02 PM

On Mon, 17 May 2010, Robert Hancock wrote:
>
> If one wanted, you could disable HPET MSI on this chipset only when a driver
> requests an ISA DMA channel. Then if there's no floppy or other LPC DMA device
> present, it can stay enabled. I don't know if it's worth the trouble, though.

Nope, that wouldn't work.

Imagine a driver that is already loaded and is already using MSI (say, a
network device). What happens now if you want to access the floppy and
load the floppy module? Oh, you can't? Need to bring down the network
interface, unload that module first? Not practical.

Sure, in theory we can do some crazy callback for "you now need to re-do
your interrupt registration" for all devices. In practice, I can only say
"not going to happen".

Linus

Robert Hancock

May 17, 2010, 9:10:02 PM

On Mon, May 17, 2010 at 7:02 PM, Linus Torvalds
<torv...@linux-foundation.org> wrote:
>
>
> On Mon, 17 May 2010, Robert Hancock wrote:
>>
>> If one wanted, you could disable HPET MSI on this chipset only when a driver
>> requests an ISA DMA channel. Then if there's no floppy or other LPC DMA device
>> present, it can stay enabled. I don't know if it's worth the trouble, though.
>
> Nope, that wouldn't work.
>
> Imagine a driver that is already loaded and is already using MSI (say, a
> network device). What happens now if you want to access the floppy and
> load the floppy module? Oh, you can't? Need to bring down the network
> interface, unload that module first? Not practical.
>
> Sure, in theory we can do some crazy callback for "you now need to re-do
> your interrupt registration" for all devices. In practice, I can only say
> "not going to happen".

It sounds like this bug only affects HPET MSI requests (presumably the
only ones that the southbridge can concern itself with), not any
others. It would require the HPET code to support having its MSI
support yanked away at runtime, though.
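
The idea above can be sketched as a small user-space model in Python (not
kernel code). Everything in it is hypothetical: the `Platform` and
`HpetTimer` classes, the `request_isa_dma` method, and the "msi"/"legacy"
mode strings are invented for illustration and do not correspond to an
existing kernel interface. It only shows the ordering the idea would need:
HPET timers are moved off MSI before an ISA DMA channel is handed to a
driver, and nothing else changes its interrupt setup.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class HpetTimer:
    """Hypothetical HPET comparator with a switchable interrupt mode."""
    index: int
    mode: str = "msi"  # either "msi" or "legacy"


@dataclass
class Platform:
    """Hypothetical platform model for the affected chipset."""
    timers: List[HpetTimer] = field(default_factory=list)
    isa_dma_channels: Set[int] = field(default_factory=set)

    def request_isa_dma(self, channel: int) -> None:
        """Hand out an ISA DMA channel, dropping HPET MSI first."""
        if channel in self.isa_dma_channels:
            raise ValueError(f"DMA channel {channel} already in use")
        # Only the HPET timers are touched; other MSI users are not modelled
        # because the erratum is assumed to concern HPET MSI alone.
        for timer in self.timers:
            if timer.mode == "msi":
                timer.mode = "legacy"
        self.isa_dma_channels.add(channel)
```

In this model, a floppy driver asking for DMA channel 2 would flip every
HPET timer to legacy interrupts before the channel is granted. The model
covers only DMA requested through the kernel.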

Andi Kleen

May 18, 2010, 4:50:02 AM

> If one wanted, you could disable HPET MSI on this chipset only when a
> driver requests an ISA DMA channel. Then if there's no floppy or other LPC
> DMA device present, it can stay enabled. I don't know if it's worth the
> trouble, though.

There can be LPC devices which are not visible to the kernel,
but only used through ACPI or the BIOS. Think of fancy fan
controllers and similar.

-Andi
--
a...@linux.intel.com -- Speaking for myself only.

Robert Hancock

May 18, 2010, 7:30:01 PM

On Tue, May 18, 2010 at 2:45 AM, Andi Kleen <an...@firstfloor.org> wrote:
>> If one wanted, you could disable HPET MSI on this chipset only when a
>> driver requests an ISA DMA channel. Then if there's no floppy or other LPC
>> DMA device present, it can stay enabled. I don't know if it's worth the
>> trouble, though.
>
> There can be LPC devices which are not visible to the kernel,
> but only used through ACPI or the BIOS. Think of fancy fan
> controllers and similar.

I would hope they wouldn't use DMA without kernel knowledge, otherwise
that really would be an abomination.