[PATCH v1 0/7] PCI: try enabling "pci=use_crs" again


Bjorn Helgaas

Feb 3, 2010, 6:50:03 PM

Historically, Linux has assumed a single PCI host bridge, with that bridge
claiming all the address space left after RAM and legacy devices are taken out.

If the system contains multiple host bridges, we can no longer operate under
that assumption. We have to know what parts of the address space are claimed
by each bridge so that when we assign resources to a PCI device, we take them
from a range claimed by the upstream host bridge.

We use ACPI to enumerate all the PCI host bridges in the system, and part of
the host bridge description is the "_CRS" (current resource settings) property,
which lists the address space used by the bridge. On x86, we currently ignore
most of the _CRS information. This patch series changes this, so we will use
_CRS to learn about the host bridge windows.
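
As a rough illustration of that constraint, here is a userspace sketch (not
code from this series). The struct and function names are hypothetical, and
each window is modeled as a simple inclusive address range, the way one _CRS
entry might be recorded:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one host bridge window (one _CRS range). */
struct window {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Hypothetical model of a host bridge and the windows it claims. */
struct host_bridge {
	const struct window *windows;
	size_t nr_windows;
};

/*
 * Return true if [start, end] lies entirely inside one of the windows
 * claimed by this bridge, i.e. a device below the bridge could be
 * assigned that range.
 */
bool bridge_claims(const struct host_bridge *hb, uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < hb->nr_windows; i++) {
		const struct window *w = &hb->windows[i];

		if (start >= w->start && end <= w->end)
			return true;
	}
	return false;
}
```

With two bridges that claim disjoint windows, a range that fits under one
bridge is rejected for the other, which is the information the
single-host-bridge assumption throws away.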

Since most x86 machines with multiple host bridges are relatively new, this
series only turns this on for machines with BIOS dates of 2010 or newer and for
a few machines that we know need it.

These apply on 0148b041be4e7, which is the current head of the linux-next
branch of Jesse's pci-2.6 git tree. The first patch is just Jeff Garrett's
patch to remove intel_bus.c, so that is only here for people who want to test
the rest of the patches. I expect Jesse will pick up Jeff's patch via Linus'
tree.

Gary and Peter have some of these problem machines, so I'm hoping they can give
this a whirl.

Larry, you reported the problem the last time I tried to turn on "pci=use_crs"
by default. This series shouldn't affect your machine because it's not in the
whitelist, but I expect that if you boot the current kernel with "pci=use_crs",
it should still fail, and if you boot with these patches and "pci=use_crs", it
*should* work. I know it's a lot to ask, but it'd be great if you had a chance
to try that.

Bjorn

---

Bjorn Helgaas (7):
x86/PCI: remove IOH range fetching
PCI: break out primary/secondary/subordinate for readability
PCI: split up pci_read_bridge_bases()
PCI: read bridge windows before filling in subtractive decode resources
PCI: replace bus resource table with a list
x86/PCI: use host bridge _CRS info by default on 2010 and newer machines
PCI: reference bridge window resources explicitly


Documentation/kernel-parameters.txt | 8 +-
arch/ia64/include/asm/acpi.h | 1
arch/ia64/pci/pci.c | 20 ++--
arch/x86/include/asm/pci_x86.h | 1
arch/x86/pci/Makefile | 2
arch/x86/pci/acpi.c | 105 +++++++++++++++-------
arch/x86/pci/bus_numa.c | 9 +-
arch/x86/pci/common.c | 3 +
arch/x86/pci/intel_bus.c | 94 --------------------
drivers/acpi/pci_root.c | 1
drivers/pci/bus.c | 50 ++++++++++-
drivers/pci/hotplug/shpchp_sysfs.c | 15 ++-
drivers/pci/pci.c | 6 +
drivers/pci/probe.c | 102 +++++++++++++++-------
drivers/pci/quirks.c | 4 -
drivers/pci/setup-bus.c | 166 ++++++++++++++++++-----------------
drivers/pcmcia/rsrc_nonstatic.c | 7 +
drivers/pcmcia/yenta_socket.c | 46 ++++++----
include/acpi/acpi_drivers.h | 1
include/linux/pci.h | 27 ++++--
20 files changed, 367 insertions(+), 301 deletions(-)
delete mode 100644 arch/x86/pci/intel_bus.c


Bjorn Helgaas

Feb 3, 2010, 6:50:02 PM

This is Jeff Garrett's patch to remove intel_bus.c. It's already in
Linus' tree (e8e06eae4ffd), but not yet in Jesse's tree. It's only
here so the subsequent patches don't have to update intel_bus.c.
---

arch/x86/pci/Makefile | 2 -
arch/x86/pci/intel_bus.c | 94 ----------------------------------------------
2 files changed, 1 insertions(+), 95 deletions(-)
delete mode 100644 arch/x86/pci/intel_bus.c


diff --git a/arch/x86/pci/Makefile b/arch/x86/pci/Makefile
index 564b008..39fba37 100644
--- a/arch/x86/pci/Makefile
+++ b/arch/x86/pci/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_X86_NUMAQ) += numaq_32.o

obj-y += common.o early.o
obj-y += amd_bus.o
-obj-$(CONFIG_X86_64) += bus_numa.o intel_bus.o
+obj-$(CONFIG_X86_64) += bus_numa.o

ifeq ($(CONFIG_PCI_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
diff --git a/arch/x86/pci/intel_bus.c b/arch/x86/pci/intel_bus.c
deleted file mode 100644
index f81a2fa..0000000
--- a/arch/x86/pci/intel_bus.c
+++ /dev/null
@@ -1,94 +0,0 @@
-/*
- * to read io range from IOH pci conf, need to do it after mmconfig is there
- */
-
-#include <linux/delay.h>
-#include <linux/dmi.h>
-#include <linux/pci.h>
-#include <linux/init.h>
-#include <asm/pci_x86.h>
-
-#include "bus_numa.h"
-
-static inline void print_ioh_resources(struct pci_root_info *info)
-{
- int res_num;
- int busnum;
- int i;
-
- printk(KERN_DEBUG "IOH bus: [%02x, %02x]\n",
- info->bus_min, info->bus_max);
- res_num = info->res_num;
- busnum = info->bus_min;
- for (i = 0; i < res_num; i++) {
- struct resource *res;
-
- res = &info->res[i];
- printk(KERN_DEBUG "IOH bus: %02x index %x %s: [%llx, %llx]\n",
- busnum, i,
- (res->flags & IORESOURCE_IO) ? "io port" :
- "mmio",
- res->start, res->end);
- }
-}
-
-#define IOH_LIO 0x108
-#define IOH_LMMIOL 0x10c
-#define IOH_LMMIOH 0x110
-#define IOH_LMMIOH_BASEU 0x114
-#define IOH_LMMIOH_LIMITU 0x118
-#define IOH_LCFGBUS 0x11c
-
-static void __devinit pci_root_bus_res(struct pci_dev *dev)
-{
- u16 word;
- u32 dword;
- struct pci_root_info *info;
- u16 io_base, io_end;
- u32 mmiol_base, mmiol_end;
- u64 mmioh_base, mmioh_end;
- int bus_base, bus_end;
-
- /* some sys doesn't get mmconf enabled */
- if (dev->cfg_size < 0x120)
- return;
-
- if (pci_root_num >= PCI_ROOT_NR) {
- printk(KERN_DEBUG "intel_bus.c: PCI_ROOT_NR is too small\n");
- return;
- }
-
- info = &pci_root_info[pci_root_num];
- pci_root_num++;
-
- pci_read_config_word(dev, IOH_LCFGBUS, &word);
- bus_base = (word & 0xff);
- bus_end = (word & 0xff00) >> 8;
- sprintf(info->name, "PCI Bus #%02x", bus_base);
- info->bus_min = bus_base;
- info->bus_max = bus_end;
-
- pci_read_config_word(dev, IOH_LIO, &word);
- io_base = (word & 0xf0) << (12 - 4);
- io_end = (word & 0xf000) | 0xfff;
- update_res(info, io_base, io_end, IORESOURCE_IO, 0);
-
- pci_read_config_dword(dev, IOH_LMMIOL, &dword);
- mmiol_base = (dword & 0xff00) << (24 - 8);
- mmiol_end = (dword & 0xff000000) | 0xffffff;
- update_res(info, mmiol_base, mmiol_end, IORESOURCE_MEM, 0);
-
- pci_read_config_dword(dev, IOH_LMMIOH, &dword);
- mmioh_base = ((u64)(dword & 0xfc00)) << (26 - 10);
- mmioh_end = ((u64)(dword & 0xfc000000) | 0x3ffffff);
- pci_read_config_dword(dev, IOH_LMMIOH_BASEU, &dword);
- mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
- pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
- mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
- update_res(info, mmioh_base, mmioh_end, IORESOURCE_MEM, 0);
-
- print_ioh_resources(info);
-}
-
-/* intel IOH */
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, pci_root_bus_res);

Bjorn Helgaas

Feb 3, 2010, 6:50:03 PM

The main benefit of using ACPI host bridge window information is that
we can do better resource allocation in systems with multiple host bridges.
Most of these systems are new, so this patch turns on "pci=use_crs"
only on machines with a BIOS date of 2010 or newer. In addition, it
whitelists a few older machines that are known to benefit.
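
The resulting policy can be modeled outside the kernel roughly as follows.
This is only a sketch: decide_use_crs() and its parameters are hypothetical
stand-ins for the DMI lookups and the "pci=" option bits, and the caller is
assumed to pass a bios_year of 0 when no BIOS date is available.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the default-on policy:
 *   - off by default
 *   - on for a BIOS date of 2010 or newer
 *   - on for whitelisted machines
 *   - an explicit "pci=nocrs" or "pci=use_crs" overrides the above,
 *     with "nocrs" winning if both are given
 */
bool decide_use_crs(int bios_year, bool whitelisted,
		    bool opt_nocrs, bool opt_use_crs)
{
	bool use_crs = false;	/* default is off */

	if (bios_year >= 2010)
		use_crs = true;
	if (whitelisted)
		use_crs = true;

	/* Explicit command-line options take precedence. */
	if (opt_nocrs)
		use_crs = false;
	else if (opt_use_crs)
		use_crs = true;

	return use_crs;
}
```

For example, a 2008 machine that is not on the whitelist stays off unless it
is booted with "pci=use_crs", and a 2011 machine can still be forced off with
"pci=nocrs".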

We previously turned on "pci=use_crs" by default on *all* machines
(9e9f46c44e48), but had to revert that because of problems such
as Larry Finger's: http://lkml.org/lkml/2009/6/23/715.

I think the problem Larry saw was caused by overflowing the pci_bus
resource table. This patch should not affect Larry's machine directly
because it is older than 2010, but the table overflow should be fixed
by the previous patch in this series.

http://bugzilla.kernel.org/show_bug.cgi?id=14183

Signed-off-by: Bjorn Helgaas <bjorn....@hp.com>
---

Documentation/kernel-parameters.txt | 8 +++-
arch/ia64/include/asm/acpi.h | 1
arch/x86/include/asm/pci_x86.h | 1
arch/x86/pci/acpi.c | 76 ++++++++++++++++++++++++++++++++---
arch/x86/pci/common.c | 3 +
drivers/acpi/pci_root.c | 1
include/acpi/acpi_drivers.h | 1
7 files changed, 83 insertions(+), 8 deletions(-)


diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 01e2a98..ca99c53 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1939,8 +1939,12 @@ and is between 256 and 4096 characters. It is defined in the file
IRQ routing is enabled.
noacpi [X86] Do not use ACPI for IRQ routing
or for PCI scanning.
- use_crs [X86] Use _CRS for PCI resource
- allocation.
+ use_crs [X86] Use PCI host bridge window information
+ from ACPI. On BIOSes from 2010 or later, this
+ is enabled by default. If you need to use this,
+ please report a bug.
+ nocrs [X86] Ignore PCI host bridge windows from ACPI.
+ If you need to use this, please report a bug.
routeirq Do IRQ routing for all PCI devices.
This is normally done in pci_enable_device(),
so this option is a temporary workaround
diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
index 7ae5889..7f2d7f2 100644
--- a/arch/ia64/include/asm/acpi.h
+++ b/arch/ia64/include/asm/acpi.h
@@ -97,6 +97,7 @@ ia64_acpi_release_global_lock (unsigned int *lock)
#endif
#define acpi_processor_cstate_check(x) (x) /* no idle limits on IA64 :) */
static inline void disable_acpi(void) { }
+static inline void pci_acpi_crs_quirks(void) { }

const char *acpi_get_sysname (void);
int acpi_request_vector (u32 int_type);
diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index b4bf9a9..05b58cc 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -29,6 +29,7 @@
#define PCI_CHECK_ENABLE_AMD_MMCONF 0x20000
#define PCI_HAS_IO_ECS 0x40000
#define PCI_NOASSIGN_ROMS 0x80000
+#define PCI_ROOT_NO_CRS 0x100000

extern unsigned int pci_probe;
extern unsigned long pirq_table_addr;
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index a2f8cdb..fb044e3 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -15,6 +15,74 @@ struct pci_root_info {
int busnum;
};

+static int pci_use_crs; /* default is off */
+
+static int __init set_use_crs(const struct dmi_system_id *id)
+{
+	pci_use_crs = 1;
+	return 0;
+}
+
+static const struct dmi_system_id pci_use_crs_table[] __initconst = {
+	{
+		.callback = set_use_crs,
+		.ident = "IBM System x3800",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "IBM"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "x3800"),
+		},
+	},
+	{
+		.callback = set_use_crs,
+		.ident = "IBM System x3850",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "IBM"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "x3850"),
+		},
+	},
+	{
+		.callback = set_use_crs,
+		.ident = "IBM System x3950",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "IBM"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "x3950"),
+		},
+	},
+	{
+		.callback = set_use_crs,
+		.ident = "Toshiba Satellite A355",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "TOSHIBA"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Satellite A355"),
+		},
+	},
+	{}
+};
+
+void __init pci_acpi_crs_quirks(void)
+{
+	int year;
+
+	if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year >= 2010)
+		pci_use_crs = 1;
+
+	dmi_check_system(pci_use_crs_table);
+
+	/*
+	 * If the user specifies "pci=use_crs" or "pci=nocrs" explicitly, that
+	 * takes precedence over anything we figured out above.
+	 */
+	if (pci_probe & PCI_ROOT_NO_CRS)
+		pci_use_crs = 0;
+	else if (pci_probe & PCI_USE__CRS)
+		pci_use_crs = 1;
+
+	printk(KERN_INFO "PCI: %s host bridge windows from ACPI; "
+	       "if necessary, use \"pci=%s\" and report a bug\n",
+	       pci_use_crs ? "Using" : "Ignoring",
+	       pci_use_crs ? "nocrs" : "use_crs");
+}
+
static acpi_status
resource_to_addr(struct acpi_resource *resource,
struct acpi_resource_address64 *addr)
@@ -106,7 +174,7 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
res->child = NULL;
align_resource(info->bridge, res);

- if (!(pci_probe & PCI_USE__CRS)) {
+ if (!pci_use_crs) {
dev_printk(KERN_DEBUG, &info->bridge->dev,
"host bridge window %pR (ignored)\n", res);
return AE_OK;
@@ -137,12 +205,8 @@ get_current_resources(struct acpi_device *device, int busnum,
struct pci_root_info info;
size_t size;

- if (pci_probe & PCI_USE__CRS)
+ if (pci_use_crs)
pci_bus_remove_resources(bus);
- else
- dev_info(&device->dev,
- "ignoring host bridge windows from ACPI; "
- "boot with \"pci=use_crs\" to use them\n");

info.bridge = device;
info.bus = bus;
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index d2552c6..3736176 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -520,6 +520,9 @@ char * __devinit pcibios_setup(char *str)
} else if (!strcmp(str, "use_crs")) {
pci_probe |= PCI_USE__CRS;
return NULL;
+ } else if (!strcmp(str, "nocrs")) {
+ pci_probe |= PCI_ROOT_NO_CRS;
+ return NULL;
} else if (!strcmp(str, "earlydump")) {
pci_early_dump_regs = 1;
return NULL;
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 3810057..d127443 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -566,6 +566,7 @@ static int __init acpi_pci_root_init(void)
if (acpi_pci_disabled)
return 0;

+ pci_acpi_crs_quirks();
if (acpi_bus_register_driver(&acpi_pci_root_driver) < 0)
return -ENODEV;

diff --git a/include/acpi/acpi_drivers.h b/include/acpi/acpi_drivers.h
index f4906f6..3a4767c 100644
--- a/include/acpi/acpi_drivers.h
+++ b/include/acpi/acpi_drivers.h
@@ -104,6 +104,7 @@ int acpi_pci_bind_root(struct acpi_device *device);

struct pci_bus *pci_acpi_scan_root(struct acpi_device *device, int domain,
int bus);
+void pci_acpi_crs_quirks(void);

/* --------------------------------------------------------------------------
Processor

Linus Torvalds

Feb 3, 2010, 7:10:01 PM

On Wed, 3 Feb 2010, Bjorn Helgaas wrote:
>
> These apply on 0148b041be4e7, which is the current head of the linux-next
> branch of Jesse's pci-2.6 git tree. The first patch is just Jeff Garrett's
> patch to remove intel_bus.c, so that is only here for people who want to test
> the rest of the patches. I expect Jesse will pick up Jeff's patch via Linus'
> tree.

All patches look sane to me. Let's get them merged early in the next merge
window, and hope for the best.

Linus

Larry Finger

Feb 3, 2010, 11:40:02 PM

On my system, "git describe" returns v2.6.33-rc6-146-gc80d292. Patch 1 does not
apply and can be reverted. That is not a problem, but beginning with patch 5,
these do not apply.

In addition to the above, my system now boots with "pci=use_crs", unlike when I
filed the Bugzilla.

What kernel should I be running to test these patches?

Larry

Bjorn Helgaas

Feb 4, 2010, 1:00:02 PM

On Wednesday 03 February 2010 09:37:43 pm Larry Finger wrote:
> On 02/03/2010 05:38 PM, Bjorn Helgaas wrote:
> > Larry, you reported the problem the last time I tried to turn on "pci=use_crs"
> > by default. This series shouldn't affect your machine because it's not in the
> > whitelist, but I expect that if you boot the current kernel with "pci=use_crs",
> > it should still fail, and if you boot with these patches and "pci=use_crs", it
> > *should* work. I know it's a lot to ask, but it'd be great if you had a chance
> > to try that.
>
> On my system, "git describe" returns v2.6.33-rc6-146-gc80d292. Patch 1 does not
> apply and can be reverted. That is not a problem, but beginning with patch 5,
> these do not apply.

Looks like you're using Linus' tree. My patches go on top of Jesse's
PCI linux-next tree. Here's how you can do this (assuming you have
stgit as well as git):

Save all the patches in files "/tmp/use-crs.1" through "/tmp/use-crs.7".
These can be plain email; you don't have to remove headers or
anything.

$ cd <git repo>
$ git branch
$ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6.git linux-next
$ stg branch -c use-crs 0148b041be4e7
$ for F in `seq 7`; do stg import -m /tmp/use-crs.$F; done

Now you should have a tree with all the patches applied.

After you're done testing, to return to where you were before, use
"git checkout <branch>" where <branch> is the name marked with a "*"
from the very first "git branch" command.

> In addition to the above, my system now boots with "pci=use_crs", unlike when I
> filed the Bugzilla.

Huh. From http://lkml.org/lkml/2009/6/24/11, I had assumed the main
problem was that we overflowed the 16-entry bus resource table, but
there must be more to it.

If you can build and boot the linux-next branch with my patches and
collect the dmesg log, maybe it will have a clue. You can boot without
"pci=use_crs"; I don't think that will make any difference on your box.

Bjorn

Larry Finger

Feb 4, 2010, 5:40:03 PM

On 02/04/2010 11:55 AM, Bjorn Helgaas wrote:
> On Wednesday 03 February 2010 09:37:43 pm Larry Finger wrote:
>> On 02/03/2010 05:38 PM, Bjorn Helgaas wrote:
>>> Larry, you reported the problem the last time I tried to turn on "pci=use_crs"
>>> by default. This series shouldn't affect your machine because it's not in the
>>> whitelist, but I expect that if you boot the current kernel with "pci=use_crs",
>>> it should still fail, and if you boot with these patches and "pci=use_crs", it
>>> *should* work. I know it's a lot to ask, but it'd be great if you had a chance
>>> to try that.
>>
>> On my system, "git describe" returns v2.6.33-rc6-146-gc80d292. Patch 1 does not
>> apply and can be reverted. That is not a problem, but beginning with patch 5,
>> these do not apply.
>
> Looks like you're using Linus' tree. My patches go on top of Jesse's
> PCI linux-next tree. Here's how you can do this (assuming you have
> stgit as well as git):
>
> Save all the patches in files "/tmp/use-crs.1" through "/tmp/use-crs.7".
> These can be plain email; you don't have to remove headers or
> anything.
>
> $ cd <git repo>
> $ git branch
> $ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6.git linux-next
> $ stg branch -c use-crs 0148b041be4e7
> $ for F in `seq 7`; do stg import -m /tmp/use-crs.$F; done
>
> Now you should have a tree with all the patches applied.

That worked. I actually used quilt to apply the patches as I am more familiar
with it. Patch #2 was already applied, but the rest applied cleanly.

The patched version of the linux-next kernel booted fine. I put the dmesg output
as "Attachment #24914 to bug 14183".

Thanks for the help,

Larry

Bjorn Helgaas

Feb 4, 2010, 6:00:02 PM

On Thursday 04 February 2010 10:55:57 am Bjorn Helgaas wrote:
> On Wednesday 03 February 2010 09:37:43 pm Larry Finger wrote:

> > In addition to the above, my system now boots with "pci=use_crs", unlike when I
> > filed the Bugzilla.
>
> Huh. From http://lkml.org/lkml/2009/6/24/11, I had assumed the main
> problem was that we overflowed the 16-entry bus resource table, but
> there must be more to it.
>
> If you can build and boot the linux-next branch with my patches and
> collect the dmesg log, maybe it will have a clue. You can boot without
> "pci=use_crs"; I don't think that will make any difference on your box.

Thanks for testing these and collecting the dmesg log
(http://bugzilla.kernel.org/attachment.cgi?id=24914). That
log is from the PCI linux-next branch plus my patches, without
using "pci=use_crs".

On the current upstream (e.g., the c80d292 kernel you started with),
we have PCI_BUS_NUM_RESOURCES == 16. Your _CRS returns 17 windows
(the "pci_root PNP0A08:00 host bridge window" lines), so when you
boot with "pci=use_crs", we should be discarding the last window,
which is an important one:

pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xfebfffff]

Can you collect the dmesg log from the c80d292 kernel with "pci=use_crs"?
I'm sorry to trouble you for this, but it still looks to me like that
should fail, so I'd really like to understand why it's working.
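
To make the suspected failure mode concrete, here is a small userspace sketch
of a fixed 16-slot table receiving 17 windows. It is only a model of the
hypothesis above, not the kernel's actual data structure; the type and
function names are hypothetical, and BUS_NUM_RESOURCES simply mirrors the
PCI_BUS_NUM_RESOURCES == 16 value mentioned earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_NUM_RESOURCES 16	/* mirrors PCI_BUS_NUM_RESOURCES == 16 */

/* Hypothetical model of one host bridge window. */
struct bridge_window {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Hypothetical fixed-size per-bus table of windows. */
struct bus_table {
	struct bridge_window slot[BUS_NUM_RESOURCES];
	size_t used;
};

/* Add one window; return false if the table is full and it was dropped. */
bool bus_table_add(struct bus_table *t, struct bridge_window w)
{
	if (t->used >= BUS_NUM_RESOURCES)
		return false;
	t->slot[t->used++] = w;
	return true;
}

/* Add n windows in order; return how many had to be dropped. */
size_t bus_table_add_all(struct bus_table *t,
			 const struct bridge_window *w, size_t n)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++)
		if (!bus_table_add(t, w[i]))
			dropped++;
	return dropped;
}
```

Feeding 17 windows through bus_table_add_all() drops exactly one, and because
windows are added in order it is the last one. In the log above that would be
the [mem 0xc0000000-0xfebfffff] window.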
