[PATCH 0/4] PCI: Add support and tests for FIXUP quirks in modules

0 views
Skip to first unread message

Brian Norris

unread,
Sep 12, 2025, 7:09:54 PM (12 days ago) Sep 12
to Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Brian Norris
This series primarily adds support for DECLARE_PCI_FIXUP_*() in modules.
There are a few drivers that already use this, and so they are
presumably broken when built as modules.

While at it, I wrote some unit tests that emulate a fake PCI device, and
let the PCI framework match/not-match its vendor/device IDs. This test
can be built into the kernel or built as a module.

I also include some infrastructure changes (patch 3 and 4), so that
ARCH=um (the default for kunit.py), ARCH=arm, and ARCH=arm64 will run
these tests by default. These patches have different maintainers and are
independent, so they can probably be picked up separately. I included
them because otherwise the tests in patch 2 aren't so easy to run.


Brian Norris (4):
PCI: Support FIXUP quirks in modules
PCI: Add KUnit tests for FIXUP quirks
um: Select PCI_DOMAINS_GENERIC
kunit: qemu_configs: Add PCI to arm, arm64

arch/um/Kconfig | 1 +
drivers/pci/Kconfig | 11 ++
drivers/pci/Makefile | 1 +
drivers/pci/fixup-test.c | 197 ++++++++++++++++++++++
drivers/pci/quirks.c | 62 +++++++
include/linux/module.h | 18 ++
kernel/module/main.c | 26 +++
tools/testing/kunit/qemu_configs/arm.py | 1 +
tools/testing/kunit/qemu_configs/arm64.py | 1 +
9 files changed, 318 insertions(+)
create mode 100644 drivers/pci/fixup-test.c

--
2.51.0.384.g4c02a37b29-goog

Brian Norris

unread,
Sep 12, 2025, 7:09:55 PM (12 days ago) Sep 12
to Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Brian Norris
The PCI framework supports "quirks" for PCI devices via several
DECLARE_PCI_FIXUP_*() macros. These macros allow arch or driver code to
match device IDs to provide customizations or workarounds for broken
devices.

This mechanism is generally used in code that can only be built into the
kernel, but there are a few occasions where this mechanism is used in
drivers that can be modules. For example, see commit 574f29036fce ("PCI:
iproc: Apply quirk_paxc_bridge() for module as well as built-in").

The PCI fixup mechanism only works for built-in code, however, because
pci_fixup_device() only scans the ".pci_fixup_*" linker sections found
in the main kernel; it never touches modules.

Extend the fixup approach to modules.

I don't attempt to be clever here; the algorithm here scales with the
number of modules in the system.

Signed-off-by: Brian Norris <brian...@chromium.org>
---

drivers/pci/quirks.c | 62 ++++++++++++++++++++++++++++++++++++++++++
include/linux/module.h | 18 ++++++++++++
kernel/module/main.c | 26 ++++++++++++++++++
3 files changed, 106 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d97335a40193..db5e0ac82ed7 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -207,6 +207,62 @@ extern struct pci_fixup __end_pci_fixups_suspend_late[];

static bool pci_apply_fixup_final_quirks;

+struct pci_fixup_arg {
+ struct pci_dev *dev;
+ enum pci_fixup_pass pass;
+};
+
+static int pci_module_fixup(struct module *mod, void *parm)
+{
+ struct pci_fixup_arg *arg = parm;
+ void *start;
+ unsigned int size;
+
+ switch (arg->pass) {
+ case pci_fixup_early:
+ start = mod->pci_fixup_early;
+ size = mod->pci_fixup_early_size;
+ break;
+ case pci_fixup_header:
+ start = mod->pci_fixup_header;
+ size = mod->pci_fixup_header_size;
+ break;
+ case pci_fixup_final:
+ start = mod->pci_fixup_final;
+ size = mod->pci_fixup_final_size;
+ break;
+ case pci_fixup_enable:
+ start = mod->pci_fixup_enable;
+ size = mod->pci_fixup_enable_size;
+ break;
+ case pci_fixup_resume:
+ start = mod->pci_fixup_resume;
+ size = mod->pci_fixup_resume_size;
+ break;
+ case pci_fixup_suspend:
+ start = mod->pci_fixup_suspend;
+ size = mod->pci_fixup_suspend_size;
+ break;
+ case pci_fixup_resume_early:
+ start = mod->pci_fixup_resume_early;
+ size = mod->pci_fixup_resume_early_size;
+ break;
+ case pci_fixup_suspend_late:
+ start = mod->pci_fixup_suspend_late;
+ size = mod->pci_fixup_suspend_late_size;
+ break;
+ default:
+ return 0;
+ }
+
+ if (!size)
+ return 0;
+
+ pci_do_fixups(arg->dev, start, start + size);
+
+ return 0;
+}
+
void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
{
struct pci_fixup *start, *end;
@@ -259,6 +315,12 @@ void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
return;
}
pci_do_fixups(dev, start, end);
+
+ struct pci_fixup_arg arg = {
+ .dev = dev,
+ .pass = pass,
+ };
+ module_for_each_mod(pci_module_fixup, &arg);
}
EXPORT_SYMBOL(pci_fixup_device);

diff --git a/include/linux/module.h b/include/linux/module.h
index 3319a5269d28..7faa8987b9eb 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -539,6 +539,24 @@ struct module {
int num_kunit_suites;
struct kunit_suite **kunit_suites;
#endif
+#ifdef CONFIG_PCI_QUIRKS
+ void *pci_fixup_early;
+ unsigned int pci_fixup_early_size;
+ void *pci_fixup_header;
+ unsigned int pci_fixup_header_size;
+ void *pci_fixup_final;
+ unsigned int pci_fixup_final_size;
+ void *pci_fixup_enable;
+ unsigned int pci_fixup_enable_size;
+ void *pci_fixup_resume;
+ unsigned int pci_fixup_resume_size;
+ void *pci_fixup_suspend;
+ unsigned int pci_fixup_suspend_size;
+ void *pci_fixup_resume_early;
+ unsigned int pci_fixup_resume_early_size;
+ void *pci_fixup_suspend_late;
+ unsigned int pci_fixup_suspend_late_size;
+#endif


#ifdef CONFIG_LIVEPATCH
diff --git a/kernel/module/main.c b/kernel/module/main.c
index c66b26184936..50a80c875adc 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2702,6 +2702,32 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(*mod->kunit_init_suites),
&mod->num_kunit_init_suites);
#endif
+#ifdef CONFIG_PCI_QUIRKS
+ mod->pci_fixup_early = section_objs(info, ".pci_fixup_early",
+ sizeof(*mod->pci_fixup_early),
+ &mod->pci_fixup_early_size);
+ mod->pci_fixup_header = section_objs(info, ".pci_fixup_header",
+ sizeof(*mod->pci_fixup_header),
+ &mod->pci_fixup_header_size);
+ mod->pci_fixup_final = section_objs(info, ".pci_fixup_final",
+ sizeof(*mod->pci_fixup_final),
+ &mod->pci_fixup_final_size);
+ mod->pci_fixup_enable = section_objs(info, ".pci_fixup_enable",
+ sizeof(*mod->pci_fixup_enable),
+ &mod->pci_fixup_enable_size);
+ mod->pci_fixup_resume = section_objs(info, ".pci_fixup_resume",
+ sizeof(*mod->pci_fixup_resume),
+ &mod->pci_fixup_resume_size);
+ mod->pci_fixup_suspend = section_objs(info, ".pci_fixup_suspend",
+ sizeof(*mod->pci_fixup_suspend),
+ &mod->pci_fixup_suspend_size);
+ mod->pci_fixup_resume_early = section_objs(info, ".pci_fixup_resume_early",
+ sizeof(*mod->pci_fixup_resume_early),
+ &mod->pci_fixup_resume_early_size);
+ mod->pci_fixup_suspend_late = section_objs(info, ".pci_fixup_suspend_late",
+ sizeof(*mod->pci_fixup_suspend_late),
+ &mod->pci_fixup_suspend_late_size);
+#endif

mod->extable = section_objs(info, "__ex_table",
sizeof(*mod->extable), &mod->num_exentries);
--
2.51.0.384.g4c02a37b29-goog

Brian Norris

unread,
Sep 12, 2025, 7:09:57 PM (12 days ago) Sep 12
to Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Brian Norris
The PCI framework supports device quirks via a series of macros and
linker sections. This support previously did not work when used in
modules. Add a basic set of tests for matching/non-matching devices.

Example run:

$ ./tools/testing/kunit/kunit.py run 'pci_fixup*'
[...]
[15:31:30] ============ pci_fixup_test_cases (2 subtests) =============
[15:31:30] [PASSED] pci_fixup_match_test
[15:31:30] [PASSED] pci_fixup_nomatch_test
[15:31:30] ============== [PASSED] pci_fixup_test_cases ===============
[15:31:30] ============================================================
[15:31:30] Testing complete. Ran 2 tests: passed: 2
[15:31:30] Elapsed time: 11.197s total, 0.001s configuring, 9.870s building, 1.299s running

Signed-off-by: Brian Norris <brian...@chromium.org>
---

drivers/pci/Kconfig | 11 +++
drivers/pci/Makefile | 1 +
drivers/pci/fixup-test.c | 197 +++++++++++++++++++++++++++++++++++++++
3 files changed, 209 insertions(+)
create mode 100644 drivers/pci/fixup-test.c

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 9a249c65aedc..a4fa9be797e7 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -68,6 +68,17 @@ config PCI_QUIRKS
Disable this only if your target machine is unaffected by PCI
quirks.

+config PCI_FIXUP_KUNIT_TEST
+ tristate "KUnit tests for PCI fixup code" if !KUNIT_ALL_TESTS
+ depends on KUNIT
+ depends on PCI_DOMAINS_GENERIC
+ default KUNIT_ALL_TESTS
+ help
+ This builds unit tests for the PCI quirk/fixup framework. Recommended
+ only for kernel developers.
+
+ When in doubt, say N.
+
config PCI_DEBUG
bool "PCI Debugging"
depends on DEBUG_KERNEL
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 67647f1880fb..ade400250ceb 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -20,6 +20,7 @@ endif

obj-$(CONFIG_OF) += of.o
obj-$(CONFIG_PCI_QUIRKS) += quirks.o
+obj-$(CONFIG_PCI_FIXUP_KUNIT_TEST) += fixup-test.o
obj-$(CONFIG_HOTPLUG_PCI) += hotplug/
obj-$(CONFIG_PCI_ATS) += ats.o
obj-$(CONFIG_PCI_IOV) += iov.o
diff --git a/drivers/pci/fixup-test.c b/drivers/pci/fixup-test.c
new file mode 100644
index 000000000000..54b895fc8f3e
--- /dev/null
+++ b/drivers/pci/fixup-test.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2025 Google, Inc.
+ */
+
+#include <kunit/device.h>
+#include <kunit/test.h>
+#include <linux/cleanup.h>
+#include <linux/pci.h>
+
+#define DEVICE_NAME "pci_fixup_test_device"
+#define TEST_VENDOR_ID 0xdead
+#define TEST_DEVICE_ID 0xbeef
+
+#define TEST_CONF_SIZE 4096
+static u8 *test_conf_space;
+
+#define test_readb(addr) (*(u8 *)(addr))
+#define test_readw(addr) le16_to_cpu(*((__force __le16 *)(addr)))
+#define test_readl(addr) le32_to_cpu(*((__force __le32 *)(addr)))
+#define test_writeb(addr, v) (*(u8 *)(addr) = (v))
+#define test_writew(addr, v) (*((__force __le16 *)(addr)) = cpu_to_le16(v))
+#define test_writel(addr, v) (*((__force __le32 *)(addr)) = cpu_to_le32(v))
+
+static int test_config_read(struct pci_bus *bus, unsigned int devfn, int where,
+ int size, u32 *val)
+{
+ if (PCI_SLOT(devfn) > 0)
+ return PCIBIOS_DEVICE_NOT_FOUND;
+
+ if (where + size > TEST_CONF_SIZE)
+ return PCIBIOS_BUFFER_TOO_SMALL;
+
+ if (size == 1)
+ *val = test_readb(test_conf_space + where);
+ else if (size == 2)
+ *val = test_readw(test_conf_space + where);
+ else if (size == 4)
+ *val = test_readl(test_conf_space + where);
+
+ return PCIBIOS_SUCCESSFUL;
+}
+
+static int test_config_write(struct pci_bus *bus, unsigned int devfn, int where,
+ int size, u32 val)
+{
+ if (PCI_SLOT(devfn) > 0)
+ return PCIBIOS_DEVICE_NOT_FOUND;
+
+ if (where + size > TEST_CONF_SIZE)
+ return PCIBIOS_BUFFER_TOO_SMALL;
+
+ if (size == 1)
+ test_writeb(test_conf_space + where, val);
+ else if (size == 2)
+ test_writew(test_conf_space + where, val);
+ else if (size == 4)
+ test_writel(test_conf_space + where, val);
+
+ return PCIBIOS_SUCCESSFUL;
+}
+
+static struct pci_ops test_ops = {
+ .read = test_config_read,
+ .write = test_config_write,
+};
+
+static struct pci_dev *hook_device_early;
+static struct pci_dev *hook_device_header;
+static struct pci_dev *hook_device_final;
+static struct pci_dev *hook_device_enable;
+
+static void pci_fixup_early_hook(struct pci_dev *pdev)
+{
+ hook_device_early = pdev;
+}
+DECLARE_PCI_FIXUP_EARLY(TEST_VENDOR_ID, TEST_DEVICE_ID, pci_fixup_early_hook);
+
+static void pci_fixup_header_hook(struct pci_dev *pdev)
+{
+ hook_device_header = pdev;
+}
+DECLARE_PCI_FIXUP_HEADER(TEST_VENDOR_ID, TEST_DEVICE_ID, pci_fixup_header_hook);
+
+static void pci_fixup_final_hook(struct pci_dev *pdev)
+{
+ hook_device_final = pdev;
+}
+DECLARE_PCI_FIXUP_FINAL(TEST_VENDOR_ID, TEST_DEVICE_ID, pci_fixup_final_hook);
+
+static void pci_fixup_enable_hook(struct pci_dev *pdev)
+{
+ hook_device_enable = pdev;
+}
+DECLARE_PCI_FIXUP_ENABLE(TEST_VENDOR_ID, TEST_DEVICE_ID, pci_fixup_enable_hook);
+
+static int pci_fixup_test_init(struct kunit *test)
+{
+ hook_device_early = NULL;
+ hook_device_header = NULL;
+ hook_device_final = NULL;
+ hook_device_enable = NULL;
+
+ return 0;
+}
+
+static void pci_fixup_match_test(struct kunit *test)
+{
+ struct device *dev = kunit_device_register(test, DEVICE_NAME);
+
+ KUNIT_ASSERT_PTR_NE(test, NULL, dev);
+
+ test_conf_space = kunit_kzalloc(test, TEST_CONF_SIZE, GFP_KERNEL);
+ KUNIT_ASSERT_PTR_NE(test, NULL, test_conf_space);
+
+ /* Minimal configuration space: a stub vendor and device ID */
+ test_writew(test_conf_space + PCI_VENDOR_ID, TEST_VENDOR_ID);
+ test_writew(test_conf_space + PCI_DEVICE_ID, TEST_DEVICE_ID);
+
+ struct pci_host_bridge *bridge = devm_pci_alloc_host_bridge(dev, 0);
+
+ KUNIT_ASSERT_PTR_NE(test, NULL, bridge);
+ bridge->ops = &test_ops;
+
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_early);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_header);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_final);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
+
+ KUNIT_EXPECT_EQ(test, 0, pci_host_probe(bridge));
+
+ struct pci_dev *pdev __free(pci_dev_put) =
+ pci_get_slot(bridge->bus, PCI_DEVFN(0, 0));
+ KUNIT_ASSERT_PTR_NE(test, NULL, pdev);
+
+ KUNIT_EXPECT_PTR_EQ(test, pdev, hook_device_early);
+ KUNIT_EXPECT_PTR_EQ(test, pdev, hook_device_header);
+ KUNIT_EXPECT_PTR_EQ(test, pdev, hook_device_final);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
+
+ KUNIT_EXPECT_EQ(test, 0, pci_enable_device(pdev));
+ KUNIT_EXPECT_PTR_EQ(test, pdev, hook_device_enable);
+}
+
+static void pci_fixup_nomatch_test(struct kunit *test)
+{
+ struct device *dev = kunit_device_register(test, DEVICE_NAME);
+
+ KUNIT_ASSERT_PTR_NE(test, NULL, dev);
+
+ test_conf_space = kunit_kzalloc(test, TEST_CONF_SIZE, GFP_KERNEL);
+ KUNIT_ASSERT_PTR_NE(test, NULL, test_conf_space);
+
+ /* Minimal configuration space: a stub vendor and device ID */
+ test_writew(test_conf_space + PCI_VENDOR_ID, TEST_VENDOR_ID + 1);
+ test_writew(test_conf_space + PCI_DEVICE_ID, TEST_DEVICE_ID);
+
+ struct pci_host_bridge *bridge = devm_pci_alloc_host_bridge(dev, 0);
+
+ KUNIT_ASSERT_PTR_NE(test, NULL, bridge);
+ bridge->ops = &test_ops;
+
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_early);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_header);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_final);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
+
+ KUNIT_EXPECT_EQ(test, 0, pci_host_probe(bridge));
+
+ struct pci_dev *pdev __free(pci_dev_put) =
+ pci_get_slot(bridge->bus, PCI_DEVFN(0, 0));
+ KUNIT_ASSERT_PTR_NE(test, NULL, pdev);
+
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_early);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_header);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_final);
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
+
+ KUNIT_EXPECT_EQ(test, 0, pci_enable_device(pdev));
+ KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
+}
+
+static struct kunit_case pci_fixup_test_cases[] = {
+ KUNIT_CASE(pci_fixup_match_test),
+ KUNIT_CASE(pci_fixup_nomatch_test),
+ {}
+};
+
+static struct kunit_suite pci_fixup_test_suite = {
+ .name = "pci_fixup_test_cases",
+ .test_cases = pci_fixup_test_cases,
+ .init = pci_fixup_test_init,
+};
+
+kunit_test_suite(pci_fixup_test_suite);
+MODULE_DESCRIPTION("PCI fixups unit test suite");
+MODULE_LICENSE("GPL");
--
2.51.0.384.g4c02a37b29-goog

Brian Norris

unread,
Sep 12, 2025, 7:09:59 PM (12 days ago) Sep 12
to Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Brian Norris
This is useful especially for KUnit tests, where we may want to
dynamically add/remove PCI domains.

Signed-off-by: Brian Norris <brian...@chromium.org>
---

arch/um/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 9083bfdb7735..7fccd63c3229 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -38,6 +38,7 @@ config UML
select HAVE_ARCH_TRACEHOOK
select HAVE_SYSCALL_TRACEPOINTS
select THREAD_INFO_IN_TASK
+ select PCI_DOMAINS_GENERIC if PCI

config MMU
bool
--
2.51.0.384.g4c02a37b29-goog

Brian Norris

unread,
Sep 12, 2025, 7:10:01 PM (12 days ago) Sep 12
to Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Brian Norris
To get some more test coverage on PCI tests.

Signed-off-by: Brian Norris <brian...@chromium.org>
---

tools/testing/kunit/qemu_configs/arm.py | 1 +
tools/testing/kunit/qemu_configs/arm64.py | 1 +
2 files changed, 2 insertions(+)

diff --git a/tools/testing/kunit/qemu_configs/arm.py b/tools/testing/kunit/qemu_configs/arm.py
index db2160200566..101d67e5157c 100644
--- a/tools/testing/kunit/qemu_configs/arm.py
+++ b/tools/testing/kunit/qemu_configs/arm.py
@@ -3,6 +3,7 @@ from ..qemu_config import QemuArchParams
QEMU_ARCH = QemuArchParams(linux_arch='arm',
kconfig='''
CONFIG_ARCH_VIRT=y
+CONFIG_PCI=y
CONFIG_SERIAL_AMBA_PL010=y
CONFIG_SERIAL_AMBA_PL010_CONSOLE=y
CONFIG_SERIAL_AMBA_PL011=y
diff --git a/tools/testing/kunit/qemu_configs/arm64.py b/tools/testing/kunit/qemu_configs/arm64.py
index 5c44d3a87e6d..ba2b4e660ba7 100644
--- a/tools/testing/kunit/qemu_configs/arm64.py
+++ b/tools/testing/kunit/qemu_configs/arm64.py
@@ -2,6 +2,7 @@ from ..qemu_config import QemuArchParams

QEMU_ARCH = QemuArchParams(linux_arch='arm64',
kconfig='''
+CONFIG_PCI=y
CONFIG_SERIAL_AMBA_PL010=y
CONFIG_SERIAL_AMBA_PL010_CONSOLE=y
CONFIG_SERIAL_AMBA_PL011=y
--
2.51.0.384.g4c02a37b29-goog

Johannes Berg

unread,
Sep 15, 2025, 2:33:21 AM (10 days ago) Sep 15
to Brian Norris, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Fri, 2025-09-12 at 15:59 -0700, Brian Norris wrote:
> The PCI framework supports "quirks" for PCI devices via several
> DECLARE_PCI_FIXUP_*() macros. These macros allow arch or driver code to
> match device IDs to provide customizations or workarounds for broken
> devices.
>
> This mechanism is generally used in code that can only be built into the
> kernel, but there are a few occasions where this mechanism is used in
> drivers that can be modules. For example, see commit 574f29036fce ("PCI:
> iproc: Apply quirk_paxc_bridge() for module as well as built-in").
>
> The PCI fixup mechanism only works for built-in code, however, because
> pci_fixup_device() only scans the ".pci_fixup_*" linker sections found
> in the main kernel; it never touches modules.
>
> Extend the fixup approach to modules.

This _feels_ a bit odd to me - what if you reload a module, should the
fixup be done twice? Right now this was not possible in a module, which
is a bit of a gotcha, but at least that's only one for developers, not
for users (unless someone messes up and puts it into modular code, as in
the example you gave.)

Although, come to think of it, you don't even apply the fixup when the
module is loaded, so what I just wrote isn't really true. That almost
seems like an oversight though, now the module has to be loaded before
the PCI device is enumerated, which is unlikely to happen in practice?
But then we get the next gotcha - the device is already enumerated, so
the fixups cannot be applied at the various enumeration stages, and
you're back to having to load the module before PCI enumeration, which
could be tricky, or somehow forcing re-enumeration of a given device
from userspace, but then you're firmly in "gotcha for the user"
territory again ...

I don't really have any skin in this game, but overall I'd probably
argue it's better to occasionally have to fix things such as in the
commit you point out but have a predictable system, than apply things
from modules.

Perhaps it'd be better to extend the section checking infrastructure to
catch and error out on these sections in modules instead, so we catch it
at build time, rather than finding things missing at runtime?

And yeah, now I've totally ignored the kunit angle, but ... not sure how
to combine the two requirements if they are, as I think, conflicting.

johannes

Tzung-Bi Shih

unread,
Sep 15, 2025, 4:06:40 AM (10 days ago) Sep 15
to Brian Norris, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Fri, Sep 12, 2025 at 03:59:33PM -0700, Brian Norris wrote:
> +static int test_config_read(struct pci_bus *bus, unsigned int devfn, int where,
> + int size, u32 *val)
> +{
> + if (PCI_SLOT(devfn) > 0)
> + return PCIBIOS_DEVICE_NOT_FOUND;
> +
> + if (where + size > TEST_CONF_SIZE)
> + return PCIBIOS_BUFFER_TOO_SMALL;
> +
> + if (size == 1)
> + *val = test_readb(test_conf_space + where);
> + else if (size == 2)
> + *val = test_readw(test_conf_space + where);
> + else if (size == 4)
> + *val = test_readl(test_conf_space + where);
> +
> + return PCIBIOS_SUCCESSFUL;

To handle cases where size might be a value other than {1, 2, 4}, would a
switch statement with a default case be more robust here?

> +static int test_config_write(struct pci_bus *bus, unsigned int devfn, int where,
> + int size, u32 val)
> +{
> + if (PCI_SLOT(devfn) > 0)
> + return PCIBIOS_DEVICE_NOT_FOUND;
> +
> + if (where + size > TEST_CONF_SIZE)
> + return PCIBIOS_BUFFER_TOO_SMALL;
> +
> + if (size == 1)
> + test_writeb(test_conf_space + where, val);
> + else if (size == 2)
> + test_writew(test_conf_space + where, val);
> + else if (size == 4)
> + test_writel(test_conf_space + where, val);
> +
> + return PCIBIOS_SUCCESSFUL;

Same here.

> +static struct pci_dev *hook_device_early;
> +static struct pci_dev *hook_device_header;
> +static struct pci_dev *hook_device_final;
> +static struct pci_dev *hook_device_enable;
> +
> +static void pci_fixup_early_hook(struct pci_dev *pdev)
> +{
> + hook_device_early = pdev;
> +}
> +DECLARE_PCI_FIXUP_EARLY(TEST_VENDOR_ID, TEST_DEVICE_ID, pci_fixup_early_hook);
> [...]
> +static int pci_fixup_test_init(struct kunit *test)
> +{
> + hook_device_early = NULL;
> + hook_device_header = NULL;
> + hook_device_final = NULL;
> + hook_device_enable = NULL;
> +
> + return 0;
> +}

FWIW: if the probe is synchronous and the thread is the same task_struct,
the module level variables can be eliminated by using:

test->priv = kunit_kzalloc(...);
KUNIT_ASSERT_PTR_NE(...);

And in the hooks, kunit_get_current_test() returns the struct kunit *.

> +static void pci_fixup_match_test(struct kunit *test)
> +{
> + struct device *dev = kunit_device_register(test, DEVICE_NAME);
> +
> + KUNIT_ASSERT_PTR_NE(test, NULL, dev);
> +
> + test_conf_space = kunit_kzalloc(test, TEST_CONF_SIZE, GFP_KERNEL);
> + KUNIT_ASSERT_PTR_NE(test, NULL, test_conf_space);

The common initialization code can be moved to pci_fixup_test_init().

> + struct pci_host_bridge *bridge = devm_pci_alloc_host_bridge(dev, 0);
> +
> + KUNIT_ASSERT_PTR_NE(test, NULL, bridge);
> + bridge->ops = &test_ops;

The `bridge` allocation can be moved to .init() too.

> + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_early);
> + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_header);
> + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_final);
> + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);

Does it really need to check them? They are just initialized by .init().

Christoph Hellwig

unread,
Sep 15, 2025, 9:48:28 AM (10 days ago) Sep 15
to Brian Norris, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Fri, Sep 12, 2025 at 03:59:31PM -0700, Brian Norris wrote:
> This series primarily adds support for DECLARE_PCI_FIXUP_*() in modules.
> There are a few drivers that already use this, and so they are
> presumably broken when built as modules.

That's a reall bad idea, because it allows random code to insert quirks
not even bound to the hardware they support.

So no, modules should not allow quirks, but the kernel should probably
be nice enough to fail compilation when someone is attemping that
instead of silently ignoring the quirks.

Brian Norris

unread,
Sep 15, 2025, 2:34:15 PM (10 days ago) Sep 15
to Johannes Berg, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org, Manivannan Sadhasivam
Hi Johannes,

On Mon, Sep 15, 2025 at 08:33:08AM +0200, Johannes Berg wrote:
> On Fri, 2025-09-12 at 15:59 -0700, Brian Norris wrote:
> > The PCI framework supports "quirks" for PCI devices via several
> > DECLARE_PCI_FIXUP_*() macros. These macros allow arch or driver code to
> > match device IDs to provide customizations or workarounds for broken
> > devices.
> >
> > This mechanism is generally used in code that can only be built into the
> > kernel, but there are a few occasions where this mechanism is used in
> > drivers that can be modules. For example, see commit 574f29036fce ("PCI:
> > iproc: Apply quirk_paxc_bridge() for module as well as built-in").
> >
> > The PCI fixup mechanism only works for built-in code, however, because
> > pci_fixup_device() only scans the ".pci_fixup_*" linker sections found
> > in the main kernel; it never touches modules.
> >
> > Extend the fixup approach to modules.
>
> This _feels_ a bit odd to me - what if you reload a module, should the
> fixup be done twice? Right now this was not possible in a module, which
> is a bit of a gotcha, but at least that's only one for developers, not
> for users (unless someone messes up and puts it into modular code, as in
> the example you gave.)

My assumption was that FIXUPs in modules are only legitimate if they
apply to a dependency chain that involves the module they are built
into. So for example, the fixup could apply to a bridge that is
supported only by the module (driver) in question; or it could apply
only to devices that sit under the controller in question [1].

Everything I see that could potentially be in a module works like this
AFAICT.

To answer your question: no, the fixup should not be done twice, unless
the device is removed and recreated. More below.

[1] The quirks in drivers/pci/controller/dwc/pci-keystone.c look like
this. (Side note: pci-keystone.c cannot be built as a module today.)

> Although, come to think of it, you don't even apply the fixup when the
> module is loaded, so what I just wrote isn't really true. That almost
> seems like an oversight though, now the module has to be loaded before
> the PCI device is enumerated, which is unlikely to happen in practice?
> But then we get the next gotcha - the device is already enumerated, so
> the fixups cannot be applied at the various enumeration stages, and
> you're back to having to load the module before PCI enumeration, which
> could be tricky, or somehow forcing re-enumeration of a given device
> from userspace, but then you're firmly in "gotcha for the user"
> territory again ...

With my assumption above, none of this would really be needed. The
relevant device(s) will only exist after the module is loaded, and they
will go away when the module is gone.

Or am I misreading your problem statements?

> I don't really have any skin in this game, but overall I'd probably
> argue it's better to occasionally have to fix things such as in the
> commit you point out but have a predictable system, than apply things
> from modules.

FWIW, I believe some folks are working on making *more* controller
drivers modular. So this problem will bite more people. (Specifically, I
believe Manivannan was working on
drivers/pci/controller/dwc/pcie-qcom.c, and it has plenty of FIXUPs.)

I also don't think it makes things much less predictable, as long as
developers abide by my above assumption. I think that's a perfectly
reasonable assumption (it's not so different than, say,
MODULE_DEVICE_TABLE), but I could perhaps be convinced otherwise.

> Perhaps it'd be better to extend the section checking infrastructure to
> catch and error out on these sections in modules instead, so we catch it
> at build time, rather than finding things missing at runtime?

Maybe I'm missing something here, but it seems like it'd be pretty easy
to do something like:

#ifdef MODULE
#define DECLARE_PCI_FIXUP_SECTION...) BUILD_BUG()
#else
... real definitions ...
#endif

I'd prefer not doing this though, if we can help it, since I believe
(a) FIXUPs are useful in perfectly reasonable ways for controller
drivers and
(b) controller drivers can potentially be modules (yes, there are some
pitfalls besides $subject).

> And yeah, now I've totally ignored the kunit angle, but ... not sure how
> to combine the two requirements if they are, as I think, conflicting.

Right, either we support FIXUPs in modules, or we should outlaw them.

For kunit: we could still add tests, but just force them to be built-in.
It wouldn't be the first kernel subsystem to need that.

Brian

Brian Norris

unread,
Sep 15, 2025, 2:41:41 PM (10 days ago) Sep 15
to Christoph Hellwig, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
Hi Christoph,

On Mon, Sep 15, 2025 at 06:48:22AM -0700, Christoph Hellwig wrote:
> On Fri, Sep 12, 2025 at 03:59:31PM -0700, Brian Norris wrote:
> > This series primarily adds support for DECLARE_PCI_FIXUP_*() in modules.
> > There are a few drivers that already use this, and so they are
> > presumably broken when built as modules.
>
> That's a reall bad idea, because it allows random code to insert quirks
> not even bound to the hardware they support.

I see fixups in controller drivers here:

drivers/pci/controller/dwc/pci-imx6.c
drivers/pci/controller/dwc/pci-keystone.c
drivers/pci/controller/dwc/pcie-qcom.c
drivers/pci/controller/pci-loongson.c
drivers/pci/controller/pci-tegra.c
drivers/pci/controller/pcie-iproc-bcma.c
drivers/pci/controller/pcie-iproc.c

Are any of those somehow wrong?

And if they are not wrong, then is this a good reason to disallow making
these drivers modular? (Yes, few of them are currently modular; but I
don't see why that *must* be the case.)

I agree, as with many kernel features, there are plenty of ways to use
them incorrectly. But I'm just trying to patch over one rough edge about
how to use them incorrectly, and I don't really see why it's such a bad
idea.

> So no, modules should not allow quirks, but the kernel should probably
> be nice enough to fail compilation when someone is attemping that
> instead of silently ignoring the quirks.

Sure, if consensus says we should not support this, I'd definitely like
to make this failure mode more obvious -- likely a build error.

Thanks for your thoughts,
Brian

Brian Norris

unread,
Sep 15, 2025, 4:25:15 PM (10 days ago) Sep 15
to Tzung-Bi Shih, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
Hi,

On Mon, Sep 15, 2025 at 08:06:33AM +0000, Tzung-Bi Shih wrote:
> On Fri, Sep 12, 2025 at 03:59:33PM -0700, Brian Norris wrote:
> > +static int test_config_read(struct pci_bus *bus, unsigned int devfn, int where,
> > + int size, u32 *val)
> > +{
> > + if (PCI_SLOT(devfn) > 0)
> > + return PCIBIOS_DEVICE_NOT_FOUND;
> > +
> > + if (where + size > TEST_CONF_SIZE)
> > + return PCIBIOS_BUFFER_TOO_SMALL;
> > +
> > + if (size == 1)
> > + *val = test_readb(test_conf_space + where);
> > + else if (size == 2)
> > + *val = test_readw(test_conf_space + where);
> > + else if (size == 4)
> > + *val = test_readl(test_conf_space + where);
> > +
> > + return PCIBIOS_SUCCESSFUL;
>
> To handle cases where size might be a value other than {1, 2, 4}, would a
> switch statement with a default case be more robust here?

I was patterning based on pci_generic_config_read() and friends, but I
see that those use an 'else' for the last block, where I used an 'else
if'.

I suppose I could switch to 'else'. I'm not sure 'switch/case' is much
better.
Ah, good suggestion, will give that a shot.

> > +static void pci_fixup_match_test(struct kunit *test)
> > +{
> > + struct device *dev = kunit_device_register(test, DEVICE_NAME);
> > +
> > + KUNIT_ASSERT_PTR_NE(test, NULL, dev);
> > +
> > + test_conf_space = kunit_kzalloc(test, TEST_CONF_SIZE, GFP_KERNEL);
> > + KUNIT_ASSERT_PTR_NE(test, NULL, test_conf_space);
>
> The common initialization code can be moved to pci_fixup_test_init().
>
> > + struct pci_host_bridge *bridge = devm_pci_alloc_host_bridge(dev, 0);
> > +
> > + KUNIT_ASSERT_PTR_NE(test, NULL, bridge);
> > + bridge->ops = &test_ops;
>
> The `bridge` allocation can be moved to .init() too.
>
> > + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_early);
> > + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_header);
> > + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_final);
> > + KUNIT_EXPECT_PTR_EQ(test, NULL, hook_device_enable);
>
> Does it really need to check them? They are just initialized by .init().

Probably not. I wrote these before there were multiple test cases and an
.init() function, and I didn't reconsider them afterward. And they'll be
especially pointless once these move into a kzalloc'd private structure.

Thanks for the suggestions.

Brian

Christoph Hellwig

unread,
Sep 22, 2025, 2:13:45 PM (3 days ago) Sep 22
to Brian Norris, Christoph Hellwig, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Mon, Sep 15, 2025 at 11:41:37AM -0700, Brian Norris wrote:
> I see fixups in controller drivers here:
>
> drivers/pci/controller/dwc/pci-imx6.c
> drivers/pci/controller/dwc/pci-keystone.c
> drivers/pci/controller/dwc/pcie-qcom.c
> drivers/pci/controller/pci-loongson.c
> drivers/pci/controller/pci-tegra.c
> drivers/pci/controller/pcie-iproc-bcma.c
> drivers/pci/controller/pcie-iproc.c
>
> Are any of those somehow wrong?

Controller drivers are a special case I guess, but I'd rather still
not open it up to any random driver. When did we allow modular
controller drivers anyway? That feels like a somewhat bad idea, too.

Brian Norris

unread,
Sep 22, 2025, 2:48:42 PM (3 days ago) Sep 22
to Christoph Hellwig, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Mon, Sep 22, 2025 at 11:13:39AM -0700, Christoph Hellwig wrote:
> Controller drivers are a special case I guess, but I'd rather still
> not open it up to any random driver.

I don't really see why this particular thing should develop restrictions
beyond "can it work in modules?", but if you have an idea for how to do
that reasonably, my ears are open.

> When did we allow modular
> controller drivers anyway?

An approximate count:

$ git grep tristate ./drivers/pci/controller/ | wc -l
39

There's been a steady trickle of module-related changes over the years.
And several modular controller drivers predate the
drivers/pci/controller/ creation in 2018 at commit 6e0832fa432e ("PCI:
Collect all native drivers under drivers/pci/controller/").

> That feels like a somewhat bad idea, too.

Any particular reason behind that feeling? Most other bus frameworks I'm
familiar with support modular drivers.

Brian

Petr Pavlu

unread,
Sep 23, 2025, 8:55:39 AM (2 days ago) Sep 23
to Brian Norris, Bjorn Helgaas, Luis Chamberlain, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
The function module_for_each_mod() walks not only modules that are LIVE,
but also those in the COMING and GOING states. This means that this code
can potentially execute a PCI fixup from a module before its init
function is invoked, and similarly, a fixup can be executed after the
exit function has already run. Is this intentional?
Nit: I suggest writing the object_size argument passed to section_objs()
here directly as "1" instead of using sizeof(*mod->pci_fixup_...) =
sizeof(void). This makes the style consistent with the other code in
find_module_sections().

--
Thanks,
Petr

Manivannan Sadhasivam

unread,
Sep 23, 2025, 12:21:07 PM (2 days ago) Sep 23
to Christoph Hellwig, Brian Norris, Bjorn Helgaas, Luis Chamberlain, Petr Pavlu, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
On Mon, Sep 22, 2025 at 11:13:39AM -0700, Christoph Hellwig wrote:
> On Mon, Sep 15, 2025 at 11:41:37AM -0700, Brian Norris wrote:
> > I see fixups in controller drivers here:
> >
> > drivers/pci/controller/dwc/pci-imx6.c
> > drivers/pci/controller/dwc/pci-keystone.c
> > drivers/pci/controller/dwc/pcie-qcom.c
> > drivers/pci/controller/pci-loongson.c
> > drivers/pci/controller/pci-tegra.c
> > drivers/pci/controller/pcie-iproc-bcma.c
> > drivers/pci/controller/pcie-iproc.c
> >
> > Are any of those somehow wrong?
>
> When did we allow modular
> controller drivers anyway? That feels like a somewhat bad idea, too.
>

Why not? We currently only restrict the controller drivers implementing the
irqchip controller from being *removed* because of the IRQ disposal concern.
Other than that, I don't see why kernel should restrict building them as
modules.

- Mani

--
மணிவண்ணன் சதாசிவம்

Brian Norris

unread,
Sep 23, 2025, 1:42:18 PM (2 days ago) Sep 23
to Petr Pavlu, Bjorn Helgaas, Luis Chamberlain, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
Hi Petr,

On Tue, Sep 23, 2025 at 02:55:34PM +0200, Petr Pavlu wrote:
> On 9/13/25 12:59 AM, Brian Norris wrote:
> > @@ -259,6 +315,12 @@ void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
> > return;
> > }
> > pci_do_fixups(dev, start, end);
> > +
> > + struct pci_fixup_arg arg = {
> > + .dev = dev,
> > + .pass = pass,
> > + };
> > + module_for_each_mod(pci_module_fixup, &arg);
>
> The function module_for_each_mod() walks not only modules that are LIVE,
> but also those in the COMING and GOING states. This means that this code
> can potentially execute a PCI fixup from a module before its init
> function is invoked, and similarly, a fixup can be executed after the
> exit function has already run. Is this intentional?

Thanks for the callout. I didn't really give this part much thought
previously.

Per the comments, COMING means "Full formed, running module_init". I
believe that is a good thing, actually; specifically for controller
drivers, module_init() might be probing the controller and enumerating
child PCI devices to which we should apply these FIXUPs. That is a key
case to support.

GOING is not clearly defined in the header comments, but it seems like
it's a relatively narrow window between determining there are no module
refcounts (and transition to GOING) and starting to really tear it down
(transitioning to UNFORMED before any significant teardown).
module_exit() runs in the GOING phase.

I think it does not make sense to execute FIXUPs on a GOING module; I'll
make that change.

Re-quoting one piece:
> This means that this code
> can potentially execute a PCI fixup from a module before its init
> function is invoked,

IIUC, this part is not true? A module is put into COMING state before
its init function is invoked.
Ack.

Thanks,
Brian

Petr Pavlu

unread,
Sep 24, 2025, 3:48:52 AM (yesterday) Sep 24
to Brian Norris, Bjorn Helgaas, Luis Chamberlain, Daniel Gomez, linu...@vger.kernel.org, David Gow, Rae Moar, linux-k...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Johannes Berg, Sami Tolvanen, Richard Weinberger, Wei Liu, Brendan Higgins, kuni...@googlegroups.com, Anton Ivanov, linu...@lists.infradead.org
Note that when walking the modules list using module_for_each_mod(),
the delete_module() operation can concurrently transition a module to
MODULE_STATE_GOING. If you are thinking about simply having
pci_module_fixup() check that mod->state isn't MODULE_STATE_GOING,
I believe this won't quite work.

>
> Re-quoting one piece:
>> This means that this code
>> can potentially execute a PCI fixup from a module before its init
>> function is invoked,
>
> IIUC, this part is not true? A module is put into COMING state before
> its init function is invoked.

When loading a module, the load_module() function calls
complete_formation(), which puts the module into the COMING state. At
this point, the new code in pci_fixup_device() can see the new module
and potentially attempt to invoke its PCI fixups. However, such a module
has still a bit of way to go before its init function is called from
do_init_module(). The module hasn't yet had its arguments parsed, is not
linked in sysfs, isn't fully registered with codetag support, and hasn't
invoked its constructors (needed for gcov/kasan support).

I don't know enough about PCI fixups and what is allowable in them, but
I suspect it would be better to ensure that no fixup can be invoked from
the module during this period.

If the above makes sense, I think using module_for_each_mod() might not
be the right approach. Alternative options include registering a module
notifier or having modules explicitly register their PCI fixups in their
init function.

--
Cheers,
Petr
Reply all
Reply to author
Forward
0 new messages