Info: mapping multiple BARs. Your kernel is fine.

Borislav Petkov

Feb 24, 2014, 11:30:02 AM

This started happening this morning after booting -rc4+tip, let's
add *everybody* to CC :-)

We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
other goodies on the stack.

...
[ 0.488998] software IO TLB [mem 0xcac30000-0xcec30000] (64MB) mapped at [ffff8800cac30000-ffff8800cec2ffff]
[ 0.489975] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
[ 0.490079] ------------[ cut here ]------------
[ 0.490204] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
[ 0.490306] Info: mapping multiple BARs. Your kernel is fine.
[ 0.490371] Modules linked in:
[ 0.490558] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #1
[ 0.490642] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[ 0.490742] 00000000000000ab ffff880213d01ad8 ffffffff816112e3 0000000000000006
[ 0.491032] ffff880213d01b28 ffff880213d01b18 ffffffff8104e9bc ffff880213d01b08
[ 0.491343] ffffc90000c58000 00000000fed10000 00000000fed10000 0000000000006000
[ 0.491631] Call Trace:
[ 0.493337] [<ffffffff816112e3>] dump_stack+0x4f/0x7c
[ 0.493420] [<ffffffff8104e9bc>] warn_slowpath_common+0x8c/0xc0
[ 0.493503] [<ffffffff8104eaa6>] warn_slowpath_fmt+0x46/0x50
[ 0.493588] [<ffffffff8103f1e2>] __ioremap_caller+0x372/0x380
[ 0.493674] [<ffffffff810211a2>] ? snb_uncore_imc_init_box+0x62/0x90
[ 0.493761] [<ffffffff8103f247>] ioremap_nocache+0x17/0x20
[ 0.493846] [<ffffffff810211a2>] snb_uncore_imc_init_box+0x62/0x90
[ 0.493933] [<ffffffff81022925>] uncore_pci_probe+0xe5/0x1e0
[ 0.494020] [<ffffffff812d487e>] local_pci_probe+0x4e/0xa0
[ 0.494104] [<ffffffff81418a59>] ? get_device+0x19/0x20
[ 0.494213] [<ffffffff812d5cd1>] pci_device_probe+0xe1/0x130
[ 0.494300] [<ffffffff8141d3cb>] driver_probe_device+0x7b/0x240
[ 0.494385] [<ffffffff8141d63b>] __driver_attach+0xab/0xb0
[ 0.494469] [<ffffffff8141d590>] ? driver_probe_device+0x240/0x240
[ 0.494551] [<ffffffff8141b71e>] bus_for_each_dev+0x5e/0x90
[ 0.494634] [<ffffffff8141cede>] driver_attach+0x1e/0x20
[ 0.494718] [<ffffffff8141ca57>] bus_add_driver+0x117/0x230
[ 0.494802] [<ffffffff8141dd34>] driver_register+0x64/0xf0
[ 0.494884] [<ffffffff812d4c14>] __pci_register_driver+0x64/0x70
[ 0.494972] [<ffffffff81d0319b>] ? uncore_types_init+0x19c/0x19c
[ 0.495056] [<ffffffff81d03312>] intel_uncore_init+0x177/0x41c
[ 0.495155] [<ffffffff81d0319b>] ? uncore_types_init+0x19c/0x19c
[ 0.495242] [<ffffffff8100029e>] do_one_initcall+0x4e/0x170
[ 0.495326] [<ffffffff81071100>] ? parse_args+0x60/0x360
[ 0.495411] [<ffffffff81cfbfb8>] kernel_init_freeable+0x106/0x19a
[ 0.495497] [<ffffffff81cfb83b>] ? do_early_param+0x86/0x86
[ 0.495582] [<ffffffff81607ef0>] ? rest_init+0xd0/0xd0
[ 0.495666] [<ffffffff81607efe>] kernel_init+0xe/0xf0
[ 0.495749] [<ffffffff81621f6c>] ret_from_fork+0x7c/0xb0
[ 0.495831] [<ffffffff81607ef0>] ? rest_init+0xd0/0xd0
[ 0.495921] ---[ end trace 428f365c054d9a01 ]---
[ 0.496196] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
[ 0.498598] futex hash table entries: 1024 (order: 5, 131072 bytes)
[ 0.498833] audit: initializing netlink subsys (disabled)
[ 0.499024] audit: type=2000 audit(1393259866.477:1): initialized
...

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Borislav Petkov

Feb 24, 2014, 3:20:03 PM

Btw,

I don't know whether the following observation is related or not, but it
so happens that after resume from suspend-to-disk, I see the booting up
of the resume kernel on the console but when it is time for the original
kernel to take over and switch to graphics, the screen remains black but
the machine is responsive over the network.

And this doesn't happen on every resume but only sporadically.

And yep, -rc3 was fine.

H. Peter Anvin

Feb 25, 2014, 10:50:01 AM

On 02/24/2014 12:19 PM, Borislav Petkov wrote:
> Btw,
>
> I don't know whether the following observation is related or not, but it
> so happens that after resume from suspend-to-disk, I see the booting up
> of the resume kernel on the console but when it is time for the original
> kernel to take over and switch to graphics, the screen remains black but
> the machine is responsive over the network.
>
> And this doesn't happen on every resume but only sporadically.
>
> And yep, -rc3 was fine.
>
> On Mon, Feb 24, 2014 at 05:24:00PM +0100, Borislav Petkov wrote:
>> This started happening this morning after booting -rc4+tip, let's
>> add *everybody* to CC :-)
>>
>> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
>> other goodies on the stack.
>>

snb_uncore_imc_init_box() is newly introduced in tip:perf/core by a
relatively recent commit (b9e1ab6d4c0582cad97699285a6b3cf992251b00), so
I suspect that it wasn't in whatever -rc3 mix you were testing.

I am wondering if backing out or disabling that support (perhaps by
removing the relevant PCI ID) fixes the problem?

-hpa

Stephane Eranian

Feb 25, 2014, 11:20:01 AM

Hi,

I am trying to understand your test case.
Were you actually measuring uncore_imc events at the time you suspended?

I tried on my IvyBridge Lenovo and it works fine with 3.14-rc4+ (tip.git).
I used: echo -n disk >/sys/power/state

Borislav Petkov

Feb 25, 2014, 11:40:01 AM

On Tue, Feb 25, 2014 at 05:14:01PM +0100, Stephane Eranian wrote:
> I am trying to understand your test case.
> Were you actually measure uncore_imc events at the time you suspended?

No test case, just the machine booting; look at the printk timestamps.

> I tried on my IvyBridge Lenovo and it works fine with 3.14-rc4+
> (tip.git). I used: echo -n disk >/sys/power/state

That's an x230 too, right? What I do is, I take linus/master, merge
tip/master, Matt's efi/next tree and my edac/for-next tree into it and
then boot that.

I don't think that the edac and efi trees interfere though. I'll do a
fresh merge of only current tip/master into linus/master to test hpa's
suggestion in the other mail.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Stephane Eranian

Feb 25, 2014, 11:40:01 AM

On Tue, Feb 25, 2014 at 5:30 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Tue, Feb 25, 2014 at 05:14:01PM +0100, Stephane Eranian wrote:
>> I am trying to understand your test case.
>> Were you actually measure uncore_imc events at the time you suspended?
>
> No test case, just the machine booting; look at the printk timestamps.
>
>> I tried on my IvyBridge Lenovo and it works fine with 3.14-rc4+
>> (tip.git). I used: echo -n disk >/sys/power/state
>
> That's an x230 too, right? What I do is, I take linus/master, merge
> tip/master, Matt's efi/next tree and my edac/for-next tree into it and
> then boot that.

No, it's a T430s. What happens if you boot vanilla tip.git?

Borislav Petkov

Feb 25, 2014, 12:40:02 PM

On Tue, Feb 25, 2014 at 05:33:13PM +0100, Stephane Eranian wrote:
> No, it's a T430s. What happens if you boot vanilla tip.git?

linus/master + tip/master -> fails
tip/master -> fails

All trees are from today, like an hour ago or so.

Doing what hpa suggested:

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index b262c6124cf3..ec217d2d28dd 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3871,6 +3871,7 @@ static int __init uncore_pci_init(void)
 		pci_uncores = snb_pci_uncores;
 		uncore_pci_driver = &snb_uncore_pci_driver;
 		break;
+#if 0
 	case 58: /* Ivy Bridge */
 		ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_IVB_IMC);
 		if (ret)
@@ -3878,6 +3879,7 @@ static int __init uncore_pci_init(void)
 		pci_uncores = snb_pci_uncores;
 		uncore_pci_driver = &ivb_uncore_pci_driver;
 		break;
+#endif
 	case 60: /* Haswell */
 	case 69: /* Haswell Celeron */
 		ret = snb_pci2phy_map_init(PCI_DEVICE_ID_INTEL_HSW_IMC);

for model 58, IVB, works around the issue.

Stephane Eranian

Feb 25, 2014, 2:00:01 PM

On Tue, Feb 25, 2014 at 6:39 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Tue, Feb 25, 2014 at 05:33:13PM +0100, Stephane Eranian wrote:
>> No, it's a T430s. What happens if you boot vanilla tip.git?
>
> linus/master + tip/master -> fails
> tip/master -> fails
>
> All trees are from today, like an hour ago or so.
>
> Doing what hpa suggested:
>

I am on tip.git at cfbf8d4 Linux 3.14-rc4
and I don't see the problem (using Ubuntu Saucy).

Given what you commented out, it seems like you're saying
something goes wrong with pci_get_device().
Am I missing some pm callbacks?

The uncore IMC is not used internally.

Borislav Petkov

Feb 25, 2014, 5:20:02 PM

On Tue, Feb 25, 2014 at 07:54:53PM +0100, Stephane Eranian wrote:

> I am on tip.git at cfbf8d4 Linux 3.14-rc4.

> and I don't see the problem (using Ubuntu Saucy).

Also IVB, model 58?

> Given what you commented out, it seems like you're saying
> something goes wrong with pci_get_device().

Probably. I'll add some debug printk's tomorrow to shed some more light
on the matter.

> Am I missing some pm callbacks?

Dunno. What do you mean by "pm callbacks" exactly? I don't know that
code so I have to ask.

> The uncore IMC is not used internally.

By IMC I'm assuming this PCI dev:

#define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154

?

And "internally" means by BIOS or something behind the curtains like
SMM...?

Stephane Eranian

Feb 26, 2014, 2:00:01 AM

Hi,

On Tue, Feb 25, 2014 at 11:10 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Tue, Feb 25, 2014 at 07:54:53PM +0100, Stephane Eranian wrote:
>
>> I am on tip.git at cfbf8d4 Linux 3.14-rc4.
>> and I don't see the problem (using Ubuntu Saucy).
>
> Also IVB, model 58?
>

Yes.

>> Given what you commented out, it seems like you're saying
>> something goes wrong with pci_get_device().
>
> Probably. I'll add some debug printk's tomorrow to shed some more light
> on the matter.
>
>> Am I missing some pm callbacks?
>
> Dunno. What do you mean by "pm callbacks" exactly? I don't know that
> code so I have to ask.
>

power management callbacks.

>> The uncore IMC is not used internally.
>
> By IMC I'm assuming this PIC dev:
>
> #define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
>
> ?
>

Yes. Needs to point to the DRAM controller.

> And "internally" means by BIOS or something behind the curtains like
> SMM...?
>

I meant by the kernel.

Borislav Petkov

Feb 26, 2014, 4:30:02 AM

On Wed, Feb 26, 2014 at 07:56:58AM +0100, Stephane Eranian wrote:
> > Also IVB, model 58?
> >
> Yes.

Right, so it must be chipset-specific.

> > Dunno. What do you mean by "pm callbacks" exactly? I don't know that
> > code so I have to ask.
> >
> power management callbacks.

Ok, just as I thought. But why would they be relevant if this happens
very early during boot?

> > #define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154

> Yes. Needs to point to the DRAM controller.

It seems I have it :-)

$ lspci -xxx -s 00.0
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
          ^^^^^

10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00
40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
50: 11 02 00 00 11 00 00 00 07 00 90 df 01 00 00 db
60: 05 00 00 f8 00 00 00 00 01 80 d1 fe 00 00 00 00
70: 00 00 00 fe 01 00 00 00 00 0c 00 fe 7f 00 00 00
80: 10 11 11 11 11 11 11 00 1a 00 00 00 00 00 00 00
90: 01 00 00 fe 01 00 00 00 01 00 50 1e 02 00 00 00
a0: 01 00 00 00 02 00 00 00 01 00 60 1e 02 00 00 00
b0: 01 00 a0 db 01 00 80 db 01 00 00 db 01 00 a0 df
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 09 00 0c 01 9b 61 00 e2 d0 00 e8 76 00 00 00 00
f0: 00 00 00 01 00 00 00 00 c8 0f 09 00 00 00 00 00

Anyway, here's some more debugging output and some more staring:

So we're correctly getting 0x154 and then snb_uncore_imc_init_box()
tries to ioremap 0xfed10000 but this fails the resource map check with:

[ 0.485356] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01

and the pnp 00:01 device already partially occupies that range (from
/proc/iomem):

fed10000-fed13fff : pnp 00:01
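
For anyone following along, the condition behind that message can be modelled roughly as below. This is a simplified Python sketch and not the kernel code: the helper names are made up for illustration, and it assumes the sanity check complains when the requested mapping overlaps a reserved resource without fitting entirely inside it.

```python
import re

# The log line as it appears in dmesg above.
LOG = ("resource map sanity check conflict: "
       "0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01")


def parse_conflict(line):
    """Hypothetical helper: split the log line into
    (map_start, map_end, res_start, res_end, res_name)."""
    m = re.search(r"conflict: (0x[0-9a-f]+) (0x[0-9a-f]+) "
                  r"(0x[0-9a-f]+) (0x[0-9a-f]+) (.+)$", line)
    nums = [int(x, 16) for x in m.groups()[:4]]
    return nums[0], nums[1], nums[2], nums[3], m.group(5)


def overlaps(a_start, a_end, b_start, b_end):
    """True if the two inclusive ranges share at least one address."""
    return a_start <= b_end and b_start <= a_end


def contained(map_start, map_end, res_start, res_end):
    """True if the mapping lies entirely inside the resource."""
    return res_start <= map_start and map_end <= res_end


map_start, map_end, res_start, res_end, name = parse_conflict(LOG)
size = map_end - map_start + 1

# Assumed rule: overlapping but not fully contained is what gets flagged.
conflict = (overlaps(map_start, map_end, res_start, res_end)
            and not contained(map_start, map_end, res_start, res_end))

print(hex(size), name, conflict)  # 0x6000 pnp 00:01 True
```

The 0x6000 size matches the last value in the register dump of the warning; the mapping runs 0x2000 bytes past the end of the pnp 00:01 window.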

Oh, and snb_uncore_imc_init_box() gets that address from
SNB_UNCORE_PCI_IMC_BAR_OFFSET and SNB_UNCORE_PCI_IMC_BAR_OFFSET+4 and
they start at offset 0x48 in the PCI config space above, i.e.

40: 01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00
                            ^^^^^^^^^^^^^^^^^^^^^^^

which is 0x00000000fed10001 (the 0x1 bit disappears after addr &= ~(PAGE_SIZE - 1);)

So I'm guessing it is time to talk to platform guys and ask them why
they're putting SNB_UNCORE_PCI_IMC_BAR_OFFSET{,+4} in an overlapping
range with pnp 00:01.
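
The arithmetic can be double-checked outside the kernel with a few lines of Python. This is only a sketch of the decoding described above, not the driver code: it assumes 4 KiB pages, takes the 0x40 row of the lspci dump as input, and the variable names are illustrative.

```python
# Row 0x40 of the lspci -xxx dump above (config space bytes 0x40..0x4f).
ROW_40 = "01 90 d1 fe 00 00 00 00 01 00 d1 fe 00 00 00 00"

PAGE_SIZE = 4096     # assumption: 4 KiB pages
ROW_BASE = 0x40      # config space offset of the first byte in ROW_40
BAR_OFFSET = 0x48    # where SNB_UNCORE_PCI_IMC_BAR_OFFSET points, per the text above

row = bytes.fromhex(ROW_40)      # fromhex() skips the spaces
i = BAR_OFFSET - ROW_BASE

# PCI config space is little-endian: low dword at 0x48, high dword at 0x4c.
low = int.from_bytes(row[i:i + 4], "little")
high = int.from_bytes(row[i + 4:i + 8], "little")

raw = (high << 32) | low
addr = raw & ~(PAGE_SIZE - 1)    # same masking as addr &= ~(PAGE_SIZE - 1)

print(hex(raw), hex(addr))       # 0xfed10001 0xfed10000
```

The masked value is the same 0xfed10000 that shows up in the "DBG: ioremapping addr" line.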

[ 0.484023] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 0.484108] software IO TLB [mem 0xcac30000-0xcec30000] (64MB) mapped at [ffff8800cac30000-ffff8800cec2ffff]
[ 0.484971] DBG: will get device: 0x8086:154
[ 0.485054] DBG: Got device, bus: 0x0
[ 0.485254] DBG: ioremapping addr: 0xfed10000
[ 0.485356] resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
[ 0.485460] ------------[ cut here ]------------
[ 0.485544] WARNING: CPU: 2 PID: 1 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x372/0x380()
[ 0.485643] Info: mapping multiple BARs. Your kernel is fine.
[ 0.485709] Modules linked in:
[ 0.485935] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc4+ #6
[ 0.486019] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[ 0.486117] 00000000000000ab ffff880213d01ad8 ffffffff81611339 0000000000000006
[ 0.486411] ffff880213d01b28 ffff880213d01b18 ffffffff8104e9cc ffff880213d01b08
[ 0.488308] ffffc90000c58000 00000000fed10000 00000000fed10000 0000000000006000
[ 0.488595] Call Trace:
[ 0.488671] [<ffffffff81611339>] dump_stack+0x4f/0x7c
[ 0.488754] [<ffffffff8104e9cc>] warn_slowpath_common+0x8c/0xc0
[ 0.488877] [<ffffffff8104eab6>] warn_slowpath_fmt+0x46/0x50
[ 0.488966] [<ffffffff8103f1f2>] __ioremap_caller+0x372/0x380
[ 0.489052] [<ffffffff810211b6>] ? snb_uncore_imc_init_box+0x76/0xa0
[ 0.489137] [<ffffffff8103f257>] ioremap_nocache+0x17/0x20
[ 0.489221] [<ffffffff810211b6>] snb_uncore_imc_init_box+0x76/0xa0
[ 0.489307] [<ffffffff81022935>] uncore_pci_probe+0xe5/0x1e0
[ 0.489391] [<ffffffff812d488e>] local_pci_probe+0x4e/0xa0
[ 0.489474] [<ffffffff81418a69>] ? get_device+0x19/0x20
[ 0.489558] [<ffffffff812d5ce1>] pci_device_probe+0xe1/0x130
[ 0.489642] [<ffffffff8141d3db>] driver_probe_device+0x7b/0x240
[ 0.489726] [<ffffffff8141d64b>] __driver_attach+0xab/0xb0
[ 0.489834] [<ffffffff8141d5a0>] ? driver_probe_device+0x240/0x240
[ 0.489920] [<ffffffff8141b72e>] bus_for_each_dev+0x5e/0x90
[ 0.490003] [<ffffffff8141ceee>] driver_attach+0x1e/0x20
[ 0.490086] [<ffffffff8141ca67>] bus_add_driver+0x117/0x230
[ 0.490170] [<ffffffff8141dd44>] driver_register+0x64/0xf0
[ 0.490251] [<ffffffff812d4c24>] __pci_register_driver+0x64/0x70
[ 0.490337] [<ffffffff81d0319b>] ? uncore_types_init+0x19c/0x19c
[ 0.490421] [<ffffffff81d03331>] intel_uncore_init+0x196/0x462
[ 0.490504] [<ffffffff81d0319b>] ? uncore_types_init+0x19c/0x19c
[ 0.490591] [<ffffffff8100029e>] do_one_initcall+0x4e/0x170
[ 0.490676] [<ffffffff81071100>] ? parse_args+0x50/0x360
[ 0.490762] [<ffffffff81cfbfb8>] kernel_init_freeable+0x106/0x19a
[ 0.490863] [<ffffffff81cfb83b>] ? do_early_param+0x86/0x86
[ 0.490948] [<ffffffff81607f00>] ? rest_init+0xd0/0xd0
[ 0.491032] [<ffffffff81607f0e>] kernel_init+0xe/0xf0
[ 0.491116] [<ffffffff81621fac>] ret_from_fork+0x7c/0xb0
[ 0.491199] [<ffffffff81607f00>] ? rest_init+0xd0/0xd0
[ 0.491289] ---[ end trace b31a7f760e34b24a ]---
[ 0.491547] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
[ 0.493962] futex hash table entries: 1024 (order: 5, 131072 bytes)

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Stephane Eranian

Feb 26, 2014, 4:50:02 AM

Hi,

Ok, so I am getting the same error message as you.
I checked my syslog now.

I have my uncore_imc addr=0xfed10000 (after masking)

And I also have pnp 00:01 overlapping the imc range completely.

What pnp device does it really represent? the DRAM controller?

So I think my laptop behaves like yours.

Borislav Petkov

Feb 26, 2014, 5:10:02 AM

Can you please, pretty please, not top-post...

On Wed, Feb 26, 2014 at 10:47:05AM +0100, Stephane Eranian wrote:
> Hi,
>
> Ok, so I am getting the same error message as you.
> I checked my syslog now.
>
> I have my uncore_imc addr=0xfed10000 (after masking)
>
> And I also have pnp 00:01 overlapping the imc range completely.
>
> What pnp device does it really represent? the DRAM controller?
>
> So I think my laptop behaves like yours.

grep -Er . /sys/devices/pnp0/00\:01/* 2>/dev/null
/sys/devices/pnp0/00:01/firmware_node/hid:PNP0C02
...

so this PNP0C02 is

[ 0.363943] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)

@Rafael, can you please make sense of this whole ACPI gunk?

We have a resource conflict with pnp 00:01, analysis here:
http://lkml.kernel.org/r/20140226092...@pd.tnic

This is the rest of the 00:01 info from sysfs:

/sys/devices/pnp0/00:01/firmware_node/uid:0
/sys/devices/pnp0/00:01/firmware_node/path:\_SB_.PCI0.LPC_.SIO_
/sys/devices/pnp0/00:01/firmware_node/power/control:auto
/sys/devices/pnp0/00:01/firmware_node/power/runtime_active_time:0
/sys/devices/pnp0/00:01/firmware_node/power/runtime_status:unsupported
/sys/devices/pnp0/00:01/firmware_node/power/runtime_suspended_time:0
/sys/devices/pnp0/00:01/firmware_node/modalias:acpi:PNP0C02:
/sys/devices/pnp0/00:01/firmware_node/uevent:MODALIAS=acpi:PNP0C02:
/sys/devices/pnp0/00:01/id:PNP0c02
/sys/devices/pnp0/00:01/power/control:auto
/sys/devices/pnp0/00:01/power/runtime_active_time:0
/sys/devices/pnp0/00:01/power/runtime_status:unsupported
/sys/devices/pnp0/00:01/power/runtime_suspended_time:0
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x10-0x1f
/sys/devices/pnp0/00:01/resources:io 0x90-0x9f
/sys/devices/pnp0/00:01/resources:io 0x24-0x25
/sys/devices/pnp0/00:01/resources:io 0x28-0x29
/sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
/sys/devices/pnp0/00:01/resources:io 0x30-0x31
/sys/devices/pnp0/00:01/resources:io 0x34-0x35
/sys/devices/pnp0/00:01/resources:io 0x38-0x39
/sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
/sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
/sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
/sys/devices/pnp0/00:01/resources:io 0xac-0xad
/sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
/sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
/sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
/sys/devices/pnp0/00:01/resources:io 0x50-0x53
/sys/devices/pnp0/00:01/resources:io 0x72-0x77
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:io 0x500-0x57f
/sys/devices/pnp0/00:01/resources:io 0x800-0x80f
/sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
/sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
/sys/devices/pnp0/00:01/resources:mem 0xf8000000-0xfbffffff
/sys/devices/pnp0/00:01/resources:mem 0xfffff000-0xffffffff
/sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1ffff
/sys/devices/pnp0/00:01/resources:mem 0xfed10000-0xfed13fff
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
/sys/devices/pnp0/00:01/resources:mem 0xfed40000-0xfed44fff
/sys/devices/pnp0/00:01/subsystem/drivers_autoprobe:1
/sys/devices/pnp0/00:01/uevent:DRIVER=system

Rafael J. Wysocki

Feb 26, 2014, 8:50:02 AM

On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> This started happening this morning after booting -rc4+tip, let's
> add *everybody* to CC :-)

What about -rc4 without tip?

I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Peter Zijlstra

Feb 26, 2014, 9:00:01 AM

On Wed, Feb 26, 2014 at 02:57:16PM +0100, Rafael J. Wysocki wrote:
> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > This started happening this morning after booting -rc4+tip, let's
> > add *everybody* to CC :-)
>
> What about -rc4 without tip?

The driver causing this is new and lives in -tip.

Borislav Petkov

Feb 26, 2014, 9:00:02 AM

On Wed, Feb 26, 2014 at 02:57:16PM +0100, Rafael J. Wysocki wrote:

> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
> > This started happening this morning after booting -rc4+tip, let's
> > add *everybody* to CC :-)
>
> What about -rc4 without tip?

I don't think it would trigger there, because

commit b9e1ab6d4c0582cad97699285a6b3cf992251b00
Author: Stephane Eranian <era...@google.com>
Date: Tue Feb 11 16:20:12 2014 +0100

perf/x86/uncore: add SNB/IVB/HSW client uncore memory controller support

in -tip introduces that snb_uncore_imc_init_box() thing which causes the
ioremap conflict.

Btw, see my last email on this thread for more details about what I'm
seeing here.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Stephane Eranian

Feb 27, 2014, 5:20:03 AM

On Wed, Feb 26, 2014 at 10:59 AM, Borislav Petkov <b...@alien8.de> wrote:
> Can you please, pretty please, not top-post...
>
> On Wed, Feb 26, 2014 at 10:47:05AM +0100, Stephane Eranian wrote:
>> Hi,
>>
>> Ok, so I am getting the same error message as you.
>> I checked my syslog now.
>>
>> I have my uncore_imc addr=0xfed10000 (after masking)
>>
>> And I also have pnp 00:01 overlapping the imc range completely.
>>
>> What pnp device does it really represent? the DRAM controller?
>>
>> So I think my laptop behaves like yours.
>
> grep -Er . /sys/devices/pnp0/00\:01/* 2>/dev/null
> /sys/devices/pnp0/00:01/firmware_node/hid:PNP0C02
> ...
>
> so this PNP0C02 is
>
> [ 0.363943] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
>

My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
there the BAR is at a completely different address. Same thing on my Haswell
desktop system.

As an aside, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
They hang if I type make in my kernel tree, whereas 3.14-rc3 is stable. I am
not so sure this is all related to the uncore IMC support, though.

Borislav Petkov

Feb 27, 2014, 5:30:02 AM

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> there to BAR is at a completely different address. Same thing on my
> Haswell desktop system.

Hrrm, I'd like to see what Rafael finds out, whether what we're reading
from PCI config space is even sane.

> As a asides, my SNB and HSW desktops with 3.14-rc4 are totally
> unstable. They hang if I type make in my kernel tree. Whereas 3.14-rc3
> is stable. I am not so sure this is all related to the uncore IMC
> support, though.

Easy to test - just disable the uncore thing.

Stephane Eranian

Feb 27, 2014, 5:40:02 AM

On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:

>> As a asides, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
>> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
>> not so sure this is all related to the uncore IMC support, though.
>

> Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> up soon.

Yes, I mean from tip.git.

Peter Zijlstra

Feb 27, 2014, 5:40:03 AM

On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:

> As a asides, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
> not so sure this is all related to the uncore IMC support, though.

Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
up soon.

Peter Zijlstra

Feb 27, 2014, 6:10:01 AM

On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
> On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> >> As a asides, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
> >> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
> >> not so sure this is all related to the uncore IMC support, though.
> >
> > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
> > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
> > up soon.
>
> Yes, I mean from tip.git.

lkml.kernel.org/r/20140224121...@twins.programming.kicks-ass.net

Should cure things; unless there's more borkage.

Stephane Eranian

Feb 27, 2014, 7:30:02 AM

On Thu, Feb 27, 2014 at 12:08 PM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Thu, Feb 27, 2014 at 11:32:58AM +0100, Stephane Eranian wrote:
>> On Thu, Feb 27, 2014 at 11:30 AM, Peter Zijlstra <pet...@infradead.org> wrote:
>> > On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
>> >> As a asides, my SNB and HSW desktops with 3.14-rc4 are totally unstable.
>> >> They hang if I type make in my kernel tree. Whereas 3.14-rc3 is stable. I am
>> >> not so sure this is all related to the uncore IMC support, though.
>> >
>> > Unstable with 3.14-rc4-tip you mean? Yeah, there's a rather crucial
>> > patch missing. I'll try and get Thomas to merge it if Ingo doesn't show
>> > up soon.
>>
>> Yes, I mean from tip.git.
>
> lkml.kernel.org/r/20140224121...@twins.programming.kicks-ass.net
>
> Should cure things; unless there's more borkage.

Works again now with your patch.
Thanks.

Rafael J. Wysocki

Feb 27, 2014, 5:00:03 PM

On Thursday, February 27, 2014 11:27:22 AM Borislav Petkov wrote:
> On Thu, Feb 27, 2014 at 11:12:32AM +0100, Stephane Eranian wrote:
> > My Lenovo IVB is like yours. But I tried on my SandyBridge desktop and
> > there to BAR is at a completely different address. Same thing on my
> > Haswell desktop system.
>
> Hrrm, I'd like to see what Rafael finds out, whether what we're reading
> from PCI config space is even sane.

I won't be able to look at that before Monday I'm afraid (personal stuff).

Rafael

Borislav Petkov

Feb 27, 2014, 5:30:02 PM

On Thu, Feb 27, 2014 at 11:12:17PM +0100, Rafael J. Wysocki wrote:
> I won't be able to look at that before Monday I'm afraid (personal
> stuff).

No worries, sir, whenever. It can wait.

Thanks a lot!

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Stephane Eranian

Mar 5, 2014, 4:10:02 PM

Hi,

Any update on this problem?

Rafael J. Wysocki

Mar 15, 2014, 10:00:02 AM

[CC list rearranged]

On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:

> This started happening this morning after booting -rc4+tip, let's
> add *everybody* to CC :-)
>
> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
> other goodies on the stack.

I've just gone through this.

So the problem is that we have the PNP "system" driver whose only purpose seems
to be to reserve system resources so that the PCI layer doesn't assign them to
new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
comments in there).

It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.

Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
driver on your box. And this doesn't seem to be a coincidence, because the ACPI
device object in question probably *does* correspond to the memory controller
that the uncore driver attempts to use.

I'm not sure how to address that right now to be honest. Arguably, the PNP
"system" driver should be replaced with something saner, but still the
resources it claims need to be kept out of reach of the PCI's resource
allocation code.

I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Borislav Petkov

Mar 16, 2014, 8:00:02 AM

On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
> I've just gone throught this.

Thanks.

> So the problem is that we have the PNP "system" driver whose only purpose seems
> to be to reserve system resources so that the PCI layer doesn't assign them to
> new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
> comments in there).
>
> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.

Right, pnp 00:01 is PNP0C02.

> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
> driver on your box. And this doesn't seem to be a coincidence, because the ACPI
> device object in question probably *does* correspond to the memory controller
> that the uncore driver attempts to use.
>
> I'm not sure how to address that right now to be honest. Arguably, the PNP
> "system" driver should be replaced with something saner, but still the
> resources it claims need to be kept out of reach of the PCI's resource
> allocation code.

Well, I'm only conjecturing here but there should be a way for the
uncore code to tell the PNP "system" driver to free this resource
because uncore is going to use it now. Or something to that effect.

Oh well.

Stephane Eranian

Mar 16, 2014, 9:10:01 AM

Rafael,

Thanks for the analysis.

On Sun, Mar 16, 2014 at 12:55 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Sat, Mar 15, 2014 at 03:15:04PM +0100, Rafael J. Wysocki wrote:
>> I've just gone throught this.
>
> Thanks.
>
>> So the problem is that we have the PNP "system" driver whose only purpose seems
>> to be to reserve system resources so that the PCI layer doesn't assign them to
>> new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
>> comments in there).
>>
>> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.
>
> Right, pnp 00:01 is PNP0C02.
>
>> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
>> driver on your box. And this doesn't seem to be a coincidence, because the ACPI
>> device object in question probably *does* correspond to the memory controller
>> that the uncore driver attempts to use.
>>
>> I'm not sure how to address that right now to be honest. Arguably, the PNP
>> "system" driver should be replaced with something saner, but still the
>> resources it claims need to be kept out of reach of the PCI's resource
>> allocation code.
>
> Well, I'm only conjecturing here but there should be a way for the
> uncore code to tell the PNP "system" driver to free this resource
> because uncore is going to use it now. Or something to that effect.
>

I agree. snb_uncore_imc() is making real (good) use of the device.
It needs to own it. So we need a way to free the resource from the PNP
system driver, or a way to tell PNP not to grab it on systems with
snb_uncore_imc() support. Does that kind of API exist?

Where do I look to prevent PNP from grabbing the IMC?

Rafael J. Wysocki

Mar 16, 2014, 8:00:03 PM

drivers/pnp/system.c is the driver in question and system_pnp_probe() makes
the reservations via reserve_resources_of_dev(), so you'd need to modify that.

I'm not sure what's the right way to go here, though.

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Rafael J. Wysocki

Mar 16, 2014, 8:10:02 PM

Boris, can you please send the acpidump output from that machine?

Aaron Lu

Mar 19, 2014, 10:30:01 PM

On 03/15/2014 10:15 PM, Rafael J. Wysocki wrote:
> [CC list rearranged]
>
> On Monday, February 24, 2014 05:24:00 PM Borislav Petkov wrote:
>> This started happening this morning after booting -rc4+tip, let's
>> add *everybody* to CC :-)
>>
>> We have intel_uncore_init, snb_uncore_imc_init_box, uncore_pci_probe and
>> other goodies on the stack.
>
> I've just gone through this.
>
> So the problem is that we have the PNP "system" driver whose only purpose seems
> to be to reserve system resources so that the PCI layer doesn't assign them to
> new devices on hotplug (disclaimer: I didn't invent it, I only read the code and
> comments in there).

And to PCI devices which have uninitialized BARs.

>
> It does that for ACPI device objects having the "PNP0C02" and "PNP0C01" IDs.
>
> Apparently, snb_uncore_imc_init_box() steps on a range already reserved by that
> driver on your box. And this doesn't seem to be a coincidence, because the ACPI
> device object in question probably *does* correspond to the memory controller
> that the uncore driver attempts to use.
>
> I'm not sure how to address that right now to be honest. Arguably, the PNP
> "system" driver should be replaced with something saner, but still the
> resources it claims need to be kept out of reach of the PCI's resource
> allocation code.

quirk_system_pci_resources() is meant to disable a PNP device's resources if
they collide with any known PCI device's BAR. I'm not sure why it doesn't work
here; perhaps the uncore PCI device doesn't have a BAR that falls in the PNP
device's resource window?

Thanks,
Aaron

Stephane Eranian

Mar 19, 2014, 10:30:01 PM

Another hypothesis I am exploring with Bjorn is that the BIOS does not advertise
this correctly or that this BAR has non-standard size or behavior. So far, we
have observed the collision only on Lenovo IvyBridge laptops. I have tried on
my desktop SNB, IVB, HSW machines and never saw the assertion.

Zhang, Rui

Mar 19, 2014, 11:10:01 PM

I've talked with Yan Zheng, and I was told that this resource, "0xfed10000 - 0xfed15fff",
is obtained directly from a PCI device register and is not in the device's BAR range.
Thus, IMO, it is impossible for the PNP layer to be aware of this resource.

BTW, about drivers/pnp/system.c: if its ONLY purpose is to prevent those
resources from being allocated to uninitialized PCI devices, then IMO
the best way to do this is to make the PCI bus handle those PNP0C01/PNP0C02
resources directly, probably via a platform callback, say:
1. make drivers/pnp/system.c a no-op for PNPACPI, by checking pnp_dev->protocol.
2. introduce acpi_check_reserved_resource() to parse PNP0C01/PNP0C02 resources.
3. in the PCI bus, invoke acpi_check_reserved_resource() when assigning
resources to PCI devices.

Thanks,
rui

Stephane Eranian

Mar 19, 2014, 11:40:01 PM

That is not what the perf_event code does. Nothing is hardcoded except
the IMC PCI device IDs and the BAR offset. The 0xfed10000 base address
is discovered.

Zhang, Rui

Mar 20, 2014, 4:00:02 AM

The resource length is also hardcoded to 0x6000, right?
This is probably a problem, because
only if the resource length read from PCI config space is larger than 0x4000,
drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
resource successfully.
In order to check this, can you please attach the dmesg output after boot?

Thanks,
rui

Yan, Zheng

Mar 20, 2014, 4:20:02 AM

On 03/20/2014 03:53 PM, Zhang, Rui wrote:
> The resource length is also hardcoded to 0x6000, right?
> This is probably a problem, because
> only if the resource length read from PCI config space is larger than 0x4000,
> drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
> resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
> resource successfully.
> In order to check this, can you please attach the dmesg output after boot?

Maybe the issue can be fixed by the untested patch below.

---
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index fd5e883..2b3d834 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = {
#define SNB_UNCORE_PCI_IMC_BAR_OFFSET 0x48

/* page size multiple covering all config regs */
-#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x6000
+#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x8

#define SNB_UNCORE_PCI_IMC_DATA_READS 0x1
#define SNB_UNCORE_PCI_IMC_DATA_READS_BASE 0x5050
@@ -1736,7 +1736,8 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)

addr &= ~(PAGE_SIZE - 1);

- box->io_addr = ioremap(addr, SNB_UNCORE_PCI_IMC_MAP_SIZE);
+ box->io_addr = ioremap(addr + SNB_UNCORE_PCI_IMC_CTR_BASE,
+ SNB_UNCORE_PCI_IMC_MAP_SIZE);
box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
}

@@ -1832,7 +1833,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
}

/* must be done before validate_group */
- event->hw.event_base = base;
+ event->hw.event_base = base - SNB_UNCORE_PCI_IMC_CTR_BASE;
event->hw.config = cfg;
event->hw.idx = idx;

Rafael J. Wysocki

Mar 20, 2014, 8:20:01 AM

Then we can drop drivers/pnp/system.c entirely I think.

> 2. introduce acpi_check_reserved_resource() to parse PNP0C01/PNP0C02 resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
> resources to PCI devices.

Well, sounds reasonable.

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Zhang Rui

Mar 20, 2014, 9:40:01 AM

Sorry, one correction here. I should have said:
if the resource length read from PCI config space is smaller than
0x4000, the problem still exists because drivers/pnp/quirks.c does not
consider this a conflict.
But if the resource length read from PCI config space is larger than
0x4000, drivers/pnp/quirks.c can detect the conflict and prevent
resource 0xfed10000 - 0xfed13fff from being reserved.

thanks,
rui

Zhang Rui

Mar 20, 2014, 9:50:03 AM

You're remapping 0xfed15050 - 0xfed1b04f instead of 0xfed10000 -
0xfed15fff?
I do not quite understand this, but apparently this is not a fix.
If it works for this problem, it is because 0xfed15050 - 0xfed1b04f
happens not to conflict with any resource reserved by the PNP system
driver on this machine.

thanks,
rui

Stephane Eranian

Mar 20, 2014, 12:10:02 PM

On Thu, Mar 20, 2014 at 9:16 AM, Yan, Zheng <zheng...@intel.com> wrote:
> On 03/20/2014 03:53 PM, Zhang, Rui wrote:
>> The resource length is also hardcoded to 0x6000, right?
>> This is probably a problem, because
>> only if the resource length read from PCI config space is larger than 0x4000,
>> drivers/pnp/quirks.c will detect the conflict and disable the PNP0C02
>> resource 0xfed10000 - 0xfed13fff, and the PCI device can request this
>> resource successfully.
>> In order to check this, can you please attach the dmesg output after boot?
>
> maybe the issue can be fixed by below untested patch
>
> ---
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> index fd5e883..2b3d834 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -1701,7 +1701,7 @@ static struct uncore_event_desc snb_uncore_imc_events[] = {
> #define SNB_UNCORE_PCI_IMC_BAR_OFFSET 0x48
>
> /* page size multiple covering all config regs */
> -#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x6000
> +#define SNB_UNCORE_PCI_IMC_MAP_SIZE 0x8
>

I assume ioremap() works on page boundaries.
Eventually we want to expose the other counters too, not just reads and
writes (8 bytes total).

The size of 0x6000 comes from the counter offsets: BAR + 0x5040 to BAR + 0x5054.
Maybe a better way of doing this would be to remap just the one page
holding them instead of the 6 covering the entire BAR + counters. That
would need changes in read_counter(), but that is okay.

So that would be something along the lines of:

addr = (addr + 0x5040) & ~(PAGE_SIZE - 1);
ioremap(addr, 0x1000);
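
The single-page idea can be checked with a little user-space arithmetic. The sketch below assumes 4KB pages and uses the counter offsets quoted in this message; imc_counter_page() and imc_counter_offset() are hypothetical helpers, not functions from the uncore driver.

```c
#include <stdint.h>

#define IMC_PAGE_SIZE	0x1000ULL	/* 4KB pages assumed */
#define IMC_CTR_FIRST	0x5040ULL	/* first counter offset, from this thread */
#define IMC_CTR_LAST	0x5054ULL	/* last counter offset, from this thread */

/* Page-aligned physical address of the page that holds the counters. */
static uint64_t imc_counter_page(uint64_t bar)
{
	return (bar + IMC_CTR_FIRST) & ~(IMC_PAGE_SIZE - 1);
}

/* Offset of a counter register inside that single mapped page. */
static uint64_t imc_counter_offset(uint64_t bar, uint64_t reg)
{
	return (bar + reg) - imc_counter_page(bar);
}
```

With the base seen in this thread (0xfed10000), the page is 0xfed15000 and the counters sit at offsets 0x40 through 0x54 inside it. That page lies past the end of the 0xfed10000-0xfed13fff range reported by pnp 00:01.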

Bjorn Helgaas

Mar 20, 2014, 12:50:02 PM

On Wed, Mar 19, 2014 at 9:03 PM, Zhang, Rui <rui....@intel.com> wrote:

> I've talked with Yan Zheng, and I was told that this resource, "0xfed10000 - 0xfed15fff",
> is obtained directly from a PCI device register and is not in the device's BAR range.
> Thus, IMO, it is impossible for the PNP layer to be aware of this resource.

Slow down, this isn't quite correct. The *base* address (0xfed10000)
is from a PCI config register (MCHBAR, at PCI config offset 0x48) [1].
This is a device-dependent register, so the PCI core knows neither
the base nor the size.

The device consumes address space that is not reported via the
architected PCI mechanism, so the only way to report that space is via
the PNP0C02 ACPI device. The BIOS has to determine the base and size
based on its knowledge of the hardware. On this hardware, per the
spec in [1], the region described by MCHBAR is 32KB in size.

The 0x6000 (24KB) size of the region above comes from the driver and
is actually less than what the device consumes. It is legal for a
driver to request only the area it requires, but the entire area
consumed by the device should be reported via the PNP0C02 device. The
fact that PNP0C02 reports 16KB but the device actually consumes 32KB
is a BIOS defect. This probably happened because previous versions of
this chip consumed only 16KB, and the BIOS didn't get updated for the
change.
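
The three sizes in play are easy to mix up, so here is a small arithmetic sketch. It assumes all three regions start at the 0xfed10000 base seen in the warning; KB(), region_end() and uncovered_bytes() are hypothetical helpers for illustration only.

```c
#include <stdint.h>

#define KB(n)	((uint64_t)(n) * 1024)

/* Last byte of a region that starts at base and is size bytes long. */
static uint64_t region_end(uint64_t base, uint64_t size)
{
	return base + size - 1;
}

/*
 * Bytes of a used-size region that are not covered by a reported-size
 * region starting at the same base.
 */
static uint64_t uncovered_bytes(uint64_t reported, uint64_t used)
{
	return used > reported ? used - reported : 0;
}
```

The 16KB PNP0C02 entry ends at 0xfed13fff, the driver's 0x6000 mapping ends at 0xfed15fff, and the 32KB region ends at 0xfed17fff. The driver therefore reaches 0x2000 bytes past what the BIOS reports, and the device consumes 0x4000 bytes that are unreported.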

> BTW, about drivers/pnp/system.c, if its ONLY purpose is to prevent those
> resources from being allocated to uninitialized PCI devices, then IMO,
> the best way to do this is make PCI bus handle those PNP0C01/PNP0C02
> resources directly, probably via a platform callback, say,
> 1. make drivers/pnp/system.c a no-op for PNPACPI, by checking pnp_dev->protocol.
> 2. introduce acpi_check_reserved_resource() to parse PNP0C01/PNP0C02 resources.
> 3. in PCI bus, invoke acpi_check_reserved_resource() when assigning
> resources to PCI devices.

The purpose of system.c is indeed to prevent resources from being
allocated to other devices. This is really a question for Rafael, but
in my opinion this function (reserving resources of PNP/ACPI devices
to prevent their allocation to other devices) should be done for *all*
PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
system.c.

So I think the best solution would be to move that into the ACPI core
somehow so it happens for all devices. If we had that, we could get
rid of system.c altogether, and we wouldn't have to do anything
special in PCI. This is much easier to say than to do, however,
because there are all kinds of issues with legacy resource
reservations, and we currently can't really deal with overlapping
resources.

Bjorn

[1] https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet,
sec. 3.1.2 on p. 61

Rafael J. Wysocki

Mar 20, 2014, 4:40:02 PM

Well, I think you got to the bottom of this, but that's something we can
do long-term. Still, we need to find a short-term solution for the
particular issue at hand.

> If we had that, we could get
> rid of system.c altogether, and we wouldn't have to do anything
> special in PCI. This is much easier to say than to do, however,
> because there are all kinds of issues with legacy resource
> reservations, and we currently can't really deal with overlapping
> resources.

Indeed.

All above said, appended is the relevant piece of the DSDT from the machine
in question (and that is in the PCI host bridge scope).

So we have a PCI device with an ACPI object called LPC which has a child
called SIO and the _HID of that child is "PNP0C02".

I'm not sure if the way system.c handles this is correct in this particular
case to be honest.

Device (LPC)
{
Name (_ADR, 0x001F0000)
Name (_S3D, 0x03)
Name (RID, 0x00)
Device (SIO)
{
Name (_HID, EisaId ("PNP0C02"))
Name (_UID, 0x00)
Name (SCRS, ResourceTemplate ()
{
IO (Decode16,
0x0010, // Range Minimum
0x0010, // Range Maximum
0x01, // Alignment
0x10, // Length
)
IO (Decode16,
0x0090, // Range Minimum
0x0090, // Range Maximum
0x01, // Alignment
0x10, // Length
)
IO (Decode16,
0x0024, // Range Minimum
0x0024, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x0028, // Range Minimum
0x0028, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x002C, // Range Minimum
0x002C, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x0030, // Range Minimum
0x0030, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x0034, // Range Minimum
0x0034, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x0038, // Range Minimum
0x0038, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x003C, // Range Minimum
0x003C, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x00A4, // Range Minimum
0x00A4, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x00A8, // Range Minimum
0x00A8, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x00AC, // Range Minimum
0x00AC, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x00B0, // Range Minimum
0x00B0, // Range Maximum
0x01, // Alignment
0x06, // Length
)
IO (Decode16,
0x00B8, // Range Minimum
0x00B8, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x00BC, // Range Minimum
0x00BC, // Range Maximum
0x01, // Alignment
0x02, // Length
)
IO (Decode16,
0x0050, // Range Minimum
0x0050, // Range Maximum
0x01, // Alignment
0x04, // Length
)
IO (Decode16,
0x0072, // Range Minimum
0x0072, // Range Maximum
0x01, // Alignment
0x06, // Length
)
IO (Decode16,
0x0400, // Range Minimum
0x0400, // Range Maximum
0x01, // Alignment
0x80, // Length
)
IO (Decode16,
0x0500, // Range Minimum
0x0500, // Range Maximum
0x01, // Alignment
0x80, // Length
)
IO (Decode16,
0x0800, // Range Minimum
0x0800, // Range Maximum
0x01, // Alignment
0x10, // Length
)
IO (Decode16,
0x15E0, // Range Minimum
0x15E0, // Range Maximum
0x01, // Alignment
0x10, // Length
)
IO (Decode16,
0x1600, // Range Minimum
0x1600, // Range Maximum
0x01, // Alignment
0x80, // Length
)
Memory32Fixed (ReadWrite,
0xF8000000, // Address Base
0x04000000, // Address Length
)
Memory32Fixed (ReadWrite,
0x00000000, // Address Base
0x00001000, // Address Length
_Y26)
Memory32Fixed (ReadWrite,
0xFED1C000, // Address Base
0x00004000, // Address Length
)
Memory32Fixed (ReadWrite,
0xFED10000, // Address Base
0x00004000, // Address Length
)
Memory32Fixed (ReadWrite,
0xFED18000, // Address Base
0x00001000, // Address Length
)
Memory32Fixed (ReadWrite,
0xFED19000, // Address Base
0x00001000, // Address Length
)
Memory32Fixed (ReadWrite,
0xFED45000, // Address Base
0x00007000, // Address Length
)
})
CreateDWordField (SCRS, \_SB.PCI0.LPC.SIO._Y26._BAS, TRMB)
Method (_CRS, 0, NotSerialized)
{
Store (\TBAB, TRMB)
If (LEqual (\_SB.PCI0.LPC.TPM._STA (), 0x0F))
{
Return (SCRS)
}
Else
{
Subtract (SizeOf (SCRS), 0x02, Local0)
Name (BUF0, Buffer (Local0) {})
Add (Local0, SizeOf (\_SB.PCI0.LPC.TPM.BUF1), Local0)
Name (BUF1, Buffer (Local0) {})
Store (SCRS, BUF0)
Concatenate (BUF0, \_SB.PCI0.LPC.TPM.BUF1, BUF1)
Return (BUF1)

Bjorn Helgaas

Mar 20, 2014, 4:50:01 PM

On Thu, Mar 20, 2014 at 2:55 PM, Rafael J. Wysocki <r...@rjwysocki.net> wrote:
> On Thursday, March 20, 2014 10:45:52 AM Bjorn Helgaas wrote:
>> The purpose of system.c is indeed to prevent resources from being
>> allocated to other devices. This is really a question for Rafael, but
>> in my opinion this function (reserving resources of PNP/ACPI devices
>> to prevent their allocation to other devices) should be done for *all*
>> PNP and ACPI devices, not just the PNP0C01/PNP0C02 devices handled by
>> system.c.
>>
>> So I think the best solution would be to move that into the ACPI core
>> somehow so it happens for all devices.
>
> Well, I think you got to the bottom of this, but that's something we can
> do long-term. Still, we need to find a short-term solution for the
> particular issue at hand.

Right. Even if we had this long-term solution, we'd still have
Stephane's current problem, because the PNP0C02 _CRS is still wrong.

We do have a drivers/pnp/quirks.c where we could conceivably adjust
the PNP resource if we found the matching PCI device and MCHBAR. That
should solve Stephane's problem even with the current
drivers/pnp/system.c.

Bjorn

Borislav Petkov

Apr 16, 2014, 3:10:02 PM

On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> Right. Even if we had this long-term solution, we'd still have
> Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>
> We do have a drivers/pnp/quirks.c where we could conceivably adjust
> the PNP resource if we found the matching PCI device and MCHBAR. That
> should solve Stephane's problem even with the current
> drivers/pnp/system.c.

Guys, this still triggers in -rc1. Do we have a fix or something
testable at least?

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Zhang, Rui

Apr 16, 2014, 4:30:01 PM

> -----Original Message-----
> From: Borislav Petkov [mailto:b...@alien8.de]
> Sent: Wednesday, April 16, 2014 12:04 PM
> To: Bjorn Helgaas; Rafael J. Wysocki
> Cc: Zhang, Rui; Lu, Aaron; lkml; x...@kernel.org; Linux PCI; ACPI Devel
> Maling List; Yinghai Lu; H. Peter Anvin; Stephane Eranian; Yan, Zheng Z
> Subject: Re: Info: mapping multiple BARs. Your kernel is fine.
> Importance: High
>

> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > Right. Even if we had this long-term solution, we'd still have
> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> >
> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > the PNP resource if we found the matching PCI device and MCHBAR.
> That
> > should solve Stephane's problem even with the current
> > drivers/pnp/system.c.
>
> Guys, this still triggers in -rc1. Do we have a fix or something
> testable at least?
>

Could you please attach the dmesg output after a fresh boot in -rc1?

Thanks,
rui

Bjorn Helgaas

Apr 16, 2014, 4:40:02 PM

On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > Right. Even if we had this long-term solution, we'd still have
> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> >
> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > the PNP resource if we found the matching PCI device and MCHBAR. That
> > should solve Stephane's problem even with the current
> > drivers/pnp/system.c.
>
> Guys, this still triggers in -rc1. Do we have a fix or something
> testable at least?

Hi Boris,

Can you try the patch below?

PNP: Work around Haswell BIOS defect in MCH area reporting

From: Bjorn Helgaas <bhel...@google.com>

Work around a Haswell BIOS defect that causes part of the MCH area to be
unreported.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource. The MCH space was 16KB prior to Haswell, but it is 32KB
in Haswell. Some Haswell BIOSes still report a PNP0C02 resource that is
only 16KB, which means the rest of the MCH space is consumed but
unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01

Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162...@pd.tnic
Reported-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Bjorn Helgaas <bhel...@google.com>
---
drivers/pnp/quirks.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..023edf592371 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,57 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
}
#endif

+static void quirk_intel_haswell_mch(struct pnp_dev *dev)
+{
+ struct pci_dev *host;
+ u32 addr_lo, addr_hi;
+ struct pci_bus_region region;
+ struct resource mch;
+ struct pnp_resource *pnp_res;
+ struct resource *res;
+
+ host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0c00, NULL);
+ if (!host)
+ return;
+
+ /*
+ * MCHBAR is not an architected PCI BAR, so MCH space is usually
+ * reported as a PNP0C02 resource. The MCH space was 16KB prior to
+ * Haswell, but it is 32KB in Haswell. Some Haswell BIOSes still
+ * report a PNP0C02 resource that is only 16KB, which means the
+ * rest of the MCH space is consumed but unreported.
+ */
+
+ /*
+ * Read MCHBAR for Host Member Mapped Register Range Base
+ * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+ * Sec 3.1.12.
+ */
+ pci_read_config_dword(host, 0x48, &addr_lo);
+ region.start = addr_lo & ~0x7fff;
+ pci_read_config_dword(host, 0x4c, &addr_hi);
+ region.start |= (dma_addr_t) addr_hi << 32;
+ region.end = region.start + 32*1024 - 1 ;
+ pcibios_bus_to_resource(host->bus, &mch, &region);
+
+ list_for_each_entry(pnp_res, &dev->resources, list) {
+ res = &pnp_res->res;
+ if (res->end < mch.start || res->start > mch.end)
+ continue; /* no overlap */
+ if (res->start == mch.start && res->end == mch.end)
+ continue; /* exact match */
+
+ dev_info(&dev->dev, FW_BUG
+ "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
+ res, &mch);
+ res->start = mch.start;
+ res->end = mch.end;
+ break;
+ }
+
+ pci_dev_put(host);
+}
+
/*
* PnP Quirks
* Cards or devices that need some tweaking due to incomplete resource info
@@ -364,6 +415,7 @@ static struct pnp_fixup pnp_fixups[] = {
#ifdef CONFIG_AMD_NB
{"PNP0c01", quirk_amd_mmconfig_area},
#endif
+ {"PNP0c02", quirk_intel_haswell_mch},
{""}
};

Bjorn Helgaas

Apr 16, 2014, 7:00:01 PM

On Wed, Apr 16, 2014 at 06:31:22PM -0400, Dave Jones wrote:

> On Wed, Apr 16, 2014 at 02:31:38PM -0600, Bjorn Helgaas wrote:
> > On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
> > > On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > > > Right. Even if we had this long-term solution, we'd still have
> > > > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> > > >
> > > > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > > > the PNP resource if we found the matching PCI device and MCHBAR. That
> > > > should solve Stephane's problem even with the current
> > > > drivers/pnp/system.c.
> > >
> > > Guys, this still triggers in -rc1. Do we have a fix or something
> > > testable at least?
> >
> > Hi Boris,
> >
> > Can you try the patch below?
>

> I'm seeing the exact same message on my thinkpad t430s.
> When I try your patch, modesetting no longer works. When it tries
> to change to the framebuffer I get a black screen and lockup.
> If I boot with nomodeset it locks up when it gets to X.
> It all scrolls by too fast to read, but it looks like there's still
> a backtrace present.

Ouch, sorry about that. I do see a bug in my patch (fixed below), but I
don't see how that could cause what you're seeing. Maybe I could figure
out something from this info (this can be from a kernel without my patch):

- dmesg log
- output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
- output of "sudo lspci -s00:00.0 -xxx"

PNP: Work around Haswell BIOS defect in MCH area reporting

From: Bjorn Helgaas <bhel...@google.com>

Work around a Haswell BIOS defect that causes part of the MCH area to be
unreported.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource. The MCH space was 16KB prior to Haswell, but it is 32KB
in Haswell. Some Haswell BIOSes still report a PNP0C02 resource that is
only 16KB, which means the rest of the MCH space is consumed but
unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162...@pd.tnic
Reported-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Bjorn Helgaas <bhel...@google.com>
---

drivers/pnp/quirks.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..8402088d4145 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,60 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)

+ memset(&mch, 0, sizeof(mch));
+ mch.flags = IORESOURCE_MEM;

+ pcibios_bus_to_resource(host->bus, &mch, &region);
+
+ list_for_each_entry(pnp_res, &dev->resources, list) {
+ res = &pnp_res->res;
+ if (res->end < mch.start || res->start > mch.end)
+ continue; /* no overlap */
+ if (res->start == mch.start && res->end == mch.end)
+ continue; /* exact match */
+
+ dev_info(&dev->dev, FW_BUG
+ "%pR covers only part of Intel Haswell MCH; extending to %pR\n",
+ res, &mch);
+ res->start = mch.start;
+ res->end = mch.end;
+ break;
+ }
+
+ pci_dev_put(host);
+}
+
/*
* PnP Quirks
* Cards or devices that need some tweaking due to incomplete resource info

@@ -364,6 +418,7 @@ static struct pnp_fixup pnp_fixups[] = {

Bjorn Helgaas

Apr 16, 2014, 7:20:01 PM

On Wed, Apr 16, 2014 at 5:08 PM, Stephane Eranian <era...@google.com> wrote:

> On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas <bhel...@google.com> wrote:
>> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>>> > Right. Even if we had this long-term solution, we'd still have
>>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>>> >
>>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>>> > the PNP resource if we found the matching PCI device and MCHBAR. That
>>> > should solve Stephane's problem even with the current
>>> > drivers/pnp/system.c.
>>>
>>> Guys, this still triggers in -rc1. Do we have a fix or something
>>> testable at least?
>>
>> Hi Boris,
>>
>> Can you try the patch below?
>>
>>
>>
>> PNP: Work around Haswell BIOS defect in MCH area reporting
>>
>> From: Bjorn Helgaas <bhel...@google.com>
>>
>> Work around a Haswell BIOS defect that causes part of the MCH area to be
>> unreported.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource. The MCH space was 16KB prior to Haswell, but it is 32KB
>> in Haswell. Some Haswell BIOSes still report a PNP0C02 resource that is
>> only 16KB, which means the rest of the MCH space is consumed but
>> unreported.
>>

> Why are you saying this is Haswell vs. others? I see the problem on my
> IvyBridge laptop, like Boris.

Ah, good question. Somewhere I got pointed to the Haswell docs, which
say 32KB. I don't know what other parts have 32KB MCH spaces. If we
could figure out a list of device IDs with 32KB spaces, we could add
that to the quirk.

But I don't know how to come up with a complete list.

Bjorn

Dave Jones

Apr 16, 2014, 6:40:01 PM

On Wed, Apr 16, 2014 at 02:31:38PM -0600, Bjorn Helgaas wrote:

> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
> > On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
> > > Right. Even if we had this long-term solution, we'd still have
> > > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
> > >
> > > We do have a drivers/pnp/quirks.c where we could conceivably adjust
> > > the PNP resource if we found the matching PCI device and MCHBAR. That
> > > should solve Stephane's problem even with the current
> > > drivers/pnp/system.c.
> >
> > Guys, this still triggers in -rc1. Do we have a fix or something
> > testable at least?
>
> Hi Boris,
>
> Can you try the patch below?

I'm seeing the exact same message on my thinkpad t430s.
When I try your patch, modesetting no longer works. When it tries
to change to the framebuffer I get a black screen and lockup.
If I boot with nomodeset it locks up when it gets to X.
It all scrolls by too fast to read, but it looks like there's still
a backtrace present.

Dave

Dave Jones

Apr 16, 2014, 8:20:01 PM

On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:

> > I'm seeing the exact same message on my thinkpad t430s.
> > When I try your patch, modesetting no longer works. When it tries
> > to change to the framebuffer I get a black screen and lockup.
> > If I boot with nomodeset it locks up when it gets to X.
> > It all scrolls by too fast to read, but it looks like there's still
> > a backtrace present.
>
> Ouch, sorry about that. I do see a bug in my patch (fixed below), but I
> don't see how that could cause what you're seeing.

updated diff made no difference fwiw.

> Maybe I could figure
> out something from this info (this can be from a kernel without my patch):
>
> - dmesg log
> - output of "find /sys/devices/pnp0 -name id -o -name resources | xargs grep ."
> - output of "sudo lspci -s00:00.0 -xxx"

attached from a fedora build of rc1.

Dave

pnp.txt

pci

dmesg

Stephane Eranian

Apr 16, 2014, 7:10:01 PM

On Wed, Apr 16, 2014 at 1:31 PM, Bjorn Helgaas <bhel...@google.com> wrote:

> On Wed, Apr 16, 2014 at 09:04:04PM +0200, Borislav Petkov wrote:
>> On Thu, Mar 20, 2014 at 02:48:30PM -0600, Bjorn Helgaas wrote:
>> > Right. Even if we had this long-term solution, we'd still have
>> > Stephane's current problem, because the PNP0C02 _CRS is still wrong.
>> >
>> > We do have a drivers/pnp/quirks.c where we could conceivably adjust
>> > the PNP resource if we found the matching PCI device and MCHBAR. That
>> > should solve Stephane's problem even with the current
>> > drivers/pnp/system.c.
>>
>> Guys, this still triggers in -rc1. Do we have a fix or something
>> testable at least?
>
> Hi Boris,
>
> Can you try the patch below?
>
>
>
> PNP: Work around Haswell BIOS defect in MCH area reporting
>
> From: Bjorn Helgaas <bhel...@google.com>
>
> Work around a Haswell BIOS defect that causes part of the MCH area to be
> unreported.
>
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource. The MCH space was 16KB prior to Haswell, but it is 32KB
> in Haswell. Some Haswell BIOSes still report a PNP0C02 resource that is
> only 16KB, which means the rest of the MCH space is consumed but
> unreported.
>

Why are you saying this is Haswell vs. others? I see the problem on my
IvyBridge laptop, like Boris.

Borislav Petkov

Apr 17, 2014, 6:50:02 AM

Hi Bjorn,

thanks for the patch, a couple of notes below:

On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:

> PNP: Work around Haswell BIOS defect in MCH area reporting
>
> From: Bjorn Helgaas <bhel...@google.com>
>
> Work around a Haswell BIOS defect that causes part of the MCH area to be
> unreported.

Yep, what Stephane said, this is not HSW only.

And because it is not HSW only, this PCI device ID doesn't match on my
IVB system. On mine the hostbridge is

00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
Subsystem: Lenovo Device 21fa
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: <access denied>
Kernel driver in use: ivb_uncore
00: 86 80 54 01 06 00 90 20 09 00 00 06 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 fa 21
30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00

and from looking at Dave's, it is the same one, so PCI device ID is
0x154.
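
For anyone decoding the dump by hand: PCI config space is little-endian, so the vendor ID sits in bytes 0x00-0x01 and the device ID in bytes 0x02-0x03. A minimal userspace sketch of that decoding follows; the helper name cfg_read16 is made up for illustration and is not a kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine two consecutive little-endian
 * config-space bytes, starting at offset off, into a 16-bit value.
 */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}
```

Fed the first four bytes of the dump above (86 80 54 01), this gives vendor 0x8086 at offset 0 and device 0x0154 at offset 2.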

With that changed to

host = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0154, NULL);

and a bit of debugging code, it says now:

[ 0.235739] quirk_intel_haswell_mch: entry
[ 0.235800] quirk_intel_haswell_mch: got host: 0x0
[ 0.235860] quirk_intel_haswell_mch: mch: [mem 0xfed10000-0xfed17fff]
[ 0.235930] quirk_intel_haswell_mch: res: [mem 0xfed10000-0xfed13fff]
[ 0.235990] pnp 00:01: [Firmware Bug]: [mem 0xfed10000-0xfed13fff] covers only part of Intel Haswell MCH; extending to [mem 0xfed10000-0xfed17fff]
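
The mch and res ranges printed above are a partial overlap, which is the case the quirk reacts to. A rough userspace sketch of that decision, assuming inclusive [start, end] ranges; struct range and extend_if_partial are hypothetical names standing in for the kernel's resource handling, not kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, a stand-in for the kernel's struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * If res overlaps mch without matching it exactly, grow res to cover
 * mch and return true.  Disjoint or identical ranges are left alone.
 */
static bool extend_if_partial(struct range *res, const struct range *mch)
{
	if (res->end < mch->start || res->start > mch->end)
		return false;	/* no overlap */
	if (res->start == mch->start && res->end == mch->end)
		return false;	/* exact match */

	res->start = mch->start;
	res->end = mch->end;
	return true;
}
```

With res = 0xfed10000-0xfed13fff and mch = 0xfed10000-0xfed17fff, the first call extends res; a second call sees an exact match and does nothing.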

So you probably want to have a list of hostbridge pci ids in the quirk
or so.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Bjorn Helgaas

Apr 17, 2014, 2:40:02 PM

Thanks a lot for testing this out and debugging my issues.

Here's a new version that looks for both device IDs I know about.

I'm still nervous about the modeset problem Dave is seeing. Since the
original patch wouldn't find an 8086:0c00 device on Dave's system, it
should have done nothing. But since it caused a modesetting problem,
there's something else going on that I don't understand.

Bjorn

PNP: Work around BIOS defects in Intel MCH area reporting

From: Bjorn Helgaas <bhel...@google.com>

Work around BIOSes that don't report the entire Intel MCH area.

MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
PNP0C02 resource. The MCH space was once 16KB, but is 32KB in newer parts.
Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
the rest of the MCH space is consumed but unreported.

This can cause resource map sanity check warnings or (theoretically) a
device conflict if we assigned the unreported space to another device.

The Intel perf event uncore driver tripped over this when it claimed the
MCH region:

resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
Info: mapping multiple BARs. Your kernel is fine.

To prevent this, if we find a PNP0C02 resource that covers part of the MCH
space, extend it to cover the entire space.

Link: http://lkml.kernel.org/r/20140224162...@pd.tnic
Reported-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Bjorn Helgaas <bhel...@google.com>
---

drivers/pnp/quirks.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index 258fef272ea7..403bd5c42ed1 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -334,6 +334,79 @@ static void quirk_amd_mmconfig_area(struct pnp_dev *dev)
 }
 #endif

+/* Device IDs of parts that have 32KB MCH space */
+static const unsigned int mch_quirk_devices[] = {
+	0x0154,	/* Ivy Bridge */
+	0x0c00,	/* Haswell */
+};
+
+static struct pci_dev *get_intel_host(void)
+{
+	int i;
+	struct pci_dev *host;
+
+	for (i = 0; i < ARRAY_SIZE(mch_quirk_devices); i++) {
+		host = pci_get_device(PCI_VENDOR_ID_INTEL, mch_quirk_devices[i],
+				      NULL);
+		if (host)
+			return host;
+	}
+	return NULL;
+}
+
+static void quirk_intel_mch(struct pnp_dev *dev)
+{
+	struct pci_dev *host;
+	u32 addr_lo, addr_hi;
+	struct pci_bus_region region;
+	struct resource mch;
+	struct pnp_resource *pnp_res;
+	struct resource *res;
+
+	host = get_intel_host();
+	if (!host)
+		return;
+
+	/*
+	 * MCHBAR is not an architected PCI BAR, so MCH space is usually
+	 * reported as a PNP0C02 resource. The MCH space was originally
+	 * 16KB, but is 32KB in newer parts. Some BIOSes still report a
+	 * PNP0C02 resource that is only 16KB, which means the rest of the
+	 * MCH space is consumed but unreported.
+	 */
+
+	/*
+	 * Read MCHBAR for Host Member Mapped Register Range Base
+	 * https://www-ssl.intel.com/content/www/us/en/processors/core/4th-gen-core-family-desktop-vol-2-datasheet
+	 * Sec 3.1.12.
+	 */
+	pci_read_config_dword(host, 0x48, &addr_lo);
+	region.start = addr_lo & ~0x7fff;
+	pci_read_config_dword(host, 0x4c, &addr_hi);
+	region.start |= (dma_addr_t) addr_hi << 32;
+	region.end = region.start + 32*1024 - 1 ;
+
+	memset(&mch, 0, sizeof(mch));
+	mch.flags = IORESOURCE_MEM;
+	pcibios_bus_to_resource(host->bus, &mch, &region);
+
+	list_for_each_entry(pnp_res, &dev->resources, list) {
+		res = &pnp_res->res;
+		if (res->end < mch.start || res->start > mch.end)
+			continue;	/* no overlap */
+		if (res->start == mch.start && res->end == mch.end)
+			continue;	/* exact match */
+
+		dev_info(&dev->dev, FW_BUG "PNP resource %pR covers only part of %s Intel MCH; extending to %pR\n",
+			 res, pci_name(host), &mch);
+		res->start = mch.start;
+		res->end = mch.end;
+		break;
+	}
+
+	pci_dev_put(host);
+}
+
 /*
  * PnP Quirks
  * Cards or devices that need some tweaking due to incomplete resource info
@@ -364,6 +437,7 @@ static struct pnp_fixup pnp_fixups[] = {
 #ifdef CONFIG_AMD_NB
 	{"PNP0c01", quirk_amd_mmconfig_area},
 #endif
+	{"PNP0c02", quirk_intel_mch},
 	{""}
 };

Borislav Petkov

Apr 17, 2014, 3:50:02 PM

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
> Thanks a lot for testing this out and debugging my issues.
>
> Here's a new version that looks for both device IDs I know about.
>
> I'm still nervous about the modeset problem Dave is seeing. Since the
> original patch wouldn't find an 8086:0c00 device on Dave's system, it
> should have done nothing. But since it caused a modesetting problem,
> there's something else going on that I don't understand.

Yeah, this is strange, to put it mildly. This quirk wouldn't have done
anything besides the iteration over the pci devices with pci_get_device.
Which wouldn't do anything (refcount increment or so) if it didn't find
the device, right?

Bah, today is the day of the strange bugs. :-\

> PNP: Work around BIOS defects in Intel MCH area reporting
>
> From: Bjorn Helgaas <bhel...@google.com>
>
> Work around BIOSes that don't report the entire Intel MCH area.
>
> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
> PNP0C02 resource. The MCH space was once 16KB, but is 32KB in newer parts.
> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
> the rest of the MCH space is consumed but unreported.
>
> This can cause resource map sanity check warnings or (theoretically) a
> device conflict if we assigned the unreported space to another device.
>
> The Intel perf event uncore driver tripped over this when it claimed the
> MCH region:
>
> resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
> Info: mapping multiple BARs. Your kernel is fine.
>
> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
> space, extend it to cover the entire space.
>
> Link: http://lkml.kernel.org/r/20140224162...@pd.tnic
> Reported-by: Borislav Petkov <b...@alien8.de>

Yep, this one works fine:

[ 0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]
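
The end addresses in that message follow from the window size: a 16KB window starting at 0xfed10000 ends at 0xfed13fff, and a 32KB one ends at 0xfed17fff. A small sketch of the arithmetic, assuming the register layout used in the patch (low dword at 0x48, high dword at 0x4c, low 15 bits masked off). Here mch_window is a hypothetical userspace helper that takes register values that have already been read, and the example input values are illustrative rather than taken from a real register dump.

```c
#include <stdint.h>

/* Inclusive address window. */
struct window {
	uint64_t start;
	uint64_t end;
};

/*
 * Build the MCH window from the two MCHBAR dwords.  The low 15 bits of
 * the low dword are masked off, as in the patch, which leaves a
 * 32KB-aligned base.  size is the window length in bytes.
 */
static struct window mch_window(uint32_t addr_lo, uint32_t addr_hi,
				uint64_t size)
{
	struct window w;

	w.start = (uint64_t)(addr_lo & ~UINT32_C(0x7fff)) |
		  ((uint64_t)addr_hi << 32);
	w.end = w.start + size - 1;
	return w;
}
```

Calling mch_window(0xfed10001, 0, 32 * 1024) yields 0xfed10000-0xfed17fff, while the same base with 16 * 1024 yields the shorter 0xfed10000-0xfed13fff range that the BIOS reported.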

Acked-by: Borislav Petkov <b...@suse.de>
Tested-by: Borislav Petkov <b...@suse.de>

Just a minor nitpick below.

> + region.end = region.start + 32*1024 - 1 ;

checkpatch complains about a trailing space before the semicolon.

Regards/Gruss,
Boris.


Dave Jones

Apr 17, 2014, 4:00:02 PM

On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:

> Thanks a lot for testing this out and debugging my issues.
>
> Here's a new version that looks for both device IDs I know about.

I can confirm this patch does fix the backtrace.
I disabled lockdep, and now I can get to X each boot, but I still see
a black screen rather than a console between modesetting becoming active, and X starting.

(The lockdep thing turned out to be a known XFS false positive, but for
some reason it actually caused X to lock up)

> I'm still nervous about the modeset problem Dave is seeing. Since the
> original patch wouldn't find an 8086:0c00 device on Dave's system, it
> should have done nothing. But since it caused a modesetting problem,
> there's something else going on that I don't understand.

I don't know if it's relevant, but this laptop (and I suspect many other
thinkpads which seem affected) has dual gfx; both show up on the bus,
even though the nvidia isn't in use.

00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Subsystem: Lenovo Device 2200
Flags: bus master, fast devsel, latency 0, IRQ 44
Memory at f1000000 (64-bit, non-prefetchable) [size=4M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at 6000 [size=64]
Expansion ROM at <unassigned> [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [a4] PCI Advanced Features
Kernel driver in use: i915

01:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)
Subsystem: Lenovo NVS 5200M
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f0000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at 5000 [size=128]
Expansion ROM at <ignored> [disabled]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>

Just as X starts up, I see this in dmesg..

[ 42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Dave

Borislav Petkov

Apr 17, 2014, 4:10:02 PM

On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
> Just as X starts up, I see this in dmesg..
>
> [ 42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

FWIW, I have that too. It should be something i915-related:

[ 0.617673] [drm] Memory usable by graphics device = 2048M
[ 0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
[ 0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 0.694631] [drm] Driver supports precise vblank timestamp query.
[ 0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[ 0.799829] fbcon: inteldrmfb (fb0) is primary device
[ 1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

--
Regards/Gruss,
Boris.


Dave Jones

Apr 17, 2014, 4:10:03 PM

On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
> On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
> > Just as X starts up, I see this in dmesg..
> >
> > [ 42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
>
> FWIW, I have that too. It should be something i915-related:
>
> [ 0.617673] [drm] Memory usable by graphics device = 2048M
> [ 0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
> [ 0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [ 0.694631] [drm] Driver supports precise vblank timestamp query.
> [ 0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [ 0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
> [ 0.799829] fbcon: inteldrmfb (fb0) is primary device
> [ 1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Can you send me your .config off-list ?
I wonder if this is something config specific that's causing me to see
this, and you not, given we've apparently got similar machines.

Dave

Bjorn Helgaas

Apr 17, 2014, 4:20:01 PM

On Thu, Apr 17, 2014 at 1:48 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Thu, Apr 17, 2014 at 12:26:37PM -0600, Bjorn Helgaas wrote:
>> Thanks a lot for testing this out and debugging my issues.
>>
>> Here's a new version that looks for both device IDs I know about.
>>
>> I'm still nervous about the modeset problem Dave is seeing. Since the
>> original patch wouldn't find an 8086:0c00 device on Dave's system, it
>> should have done nothing. But since it caused a modesetting problem,
>> there's something else going on that I don't understand.
>
> Yeah, this is strange, to put it mildly. This quirk wouldn't have done
> anything besides the iteration over the pci devices with pci_get_device.
> Which wouldn't do anything (refcount increment or so) if it didn't find
> the device, right?

Right.

> Bah, today is the day of the strange bugs. :-\
>
>> PNP: Work around BIOS defects in Intel MCH area reporting
>>
>> From: Bjorn Helgaas <bhel...@google.com>
>>
>> Work around BIOSes that don't report the entire Intel MCH area.
>>
>> MCHBAR is not an architected PCI BAR, so MCH space is usually reported as a
>> PNP0C02 resource. The MCH space was once 16KB, but is 32KB in newer parts.
>> Some BIOSes still report a PNP0C02 resource that is only 16KB, which means
>> the rest of the MCH space is consumed but unreported.
>>
>> This can cause resource map sanity check warnings or (theoretically) a
>> device conflict if we assigned the unreported space to another device.
>>
>> The Intel perf event uncore driver tripped over this when it claimed the
>> MCH region:
>>
>> resource map sanity check conflict: 0xfed10000 0xfed15fff 0xfed10000 0xfed13fff pnp 00:01
>> Info: mapping multiple BARs. Your kernel is fine.
>>
>> To prevent this, if we find a PNP0C02 resource that covers part of the MCH
>> space, extend it to cover the entire space.
>>
>> Link: http://lkml.kernel.org/r/20140224162...@pd.tnic
>> Reported-by: Borislav Petkov <b...@alien8.de>
>
> Yep, this one works fine:
>
> [ 0.403855] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]
>
> Acked-by: Borislav Petkov <b...@suse.de>
> Tested-by: Borislav Petkov <b...@suse.de>

>> + region.end = region.start + 32*1024 - 1 ;

> checkpatch complains about a trailing space before the semicolon.

Thanks! I hate typos like that.

I'll fix this, add your tested-by and ack, and send to Rafael.

Bjorn

Dave Jones

Apr 17, 2014, 5:00:02 PM

On Thu, Apr 17, 2014 at 04:03:52PM -0400, Dave Jones wrote:
> On Thu, Apr 17, 2014 at 10:01:27PM +0200, Borislav Petkov wrote:
> > On Thu, Apr 17, 2014 at 03:52:40PM -0400, Dave Jones wrote:
> > > Just as X starts up, I see this in dmesg..
> > >
> > > [ 42.879049] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
> >
> > FWIW, I have that too. It should be something i915-related:
> >
> > [ 0.617673] [drm] Memory usable by graphics device = 2048M
> > [ 0.694445] i915 0000:00:02.0: irq 42 for MSI/MSI-X
> > [ 0.694549] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> > [ 0.694631] [drm] Driver supports precise vblank timestamp query.
> > [ 0.695313] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> > [ 0.788300] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
> > [ 0.799829] fbcon: inteldrmfb (fb0) is primary device
> > [ 1.176845] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
>
> Can you send me your .config off-list ?
> I wonder if this is something config specific that's causing me to see
> this, and you not, given we've apparently got similar machines.

ok, with your config I get back to a console after the modesetting
switch, but then it hangs in USB init.

Hrmm.

Borislav Petkov

Apr 17, 2014, 5:10:02 PM

On Thu, Apr 17, 2014 at 04:53:55PM -0400, Dave Jones wrote:
> ok, with your config I get back to a console after the modesetting
> switch, but then it hangs in USB init.

Maybe because our machines are not that similar there? Can you take
my config but paste in the USB part of yours and see whether it boots fine
then? It could be that yours and mine have different USB hw...

--
Regards/Gruss,
Boris.


Borislav Petkov

Apr 18, 2014, 6:50:02 AM

On Thu, Apr 17, 2014 at 05:30:27PM -0400, Dave Jones wrote:
> I think it's just implicated because that's the next thing that seems
> to init after the modeswitch. The config differences are small, just
> things like =m instead of =y or vice-versa.
>
> I'm about to head into a long weekend, so I'll get back to this on
> Monday, but for now I'm out of ideas.

This is for when you get back: :-)

Can you debug that hang a bit more, like enable some sensible options
under "Kernel Hacking" or somesuch, boot with initcall_debug, add
more printks at key places? If the machine would tell us why exactly
it hangs, we might have an idea, like corruption, transaction stall,
whatever...