Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH] pci: fix unavailable irq number 255 reported by BIOS

426 views
Skip to first unread message

Chen Fan

unread,
Jan 18, 2016, 8:50:41 PM1/18/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com
In our environment, when enable Secure boot, we found an abnormal
phenomenon as following call trace shows. after investigation, we
found the firmware assigned an irq number 255 which means unknown
or no connection in PCI local spec for i801_smbus, meanwhile the
ACPI didn't configure the pci irq routing. and the 255 irq number
was assigned for megasa msix without IRQF_SHARED. then in this case
during i801_smbus probe, the i801_smbus driver would request irq with
bad irq number 255. but the 255 irq number was assigned for memgasa
with MSIX enable. which will cause request_irq fails, and call trace
shows, actually, we should expose the error early, rather than in request
irq, here we simply fix the problem by return err when find the irq is
255.

See the call trace:

[ 32.459195] ipmi device interface
[ 32.612907] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 32.800459] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k-rh
[ 32.818319] ixgbe: Copyright (c) 1999-2014 Intel Corporation.
[ 32.844009] lpc_ich 0001:80:1f.0: I/O space for ACPI uninitialized
[ 32.850093] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
[ 32.851134] i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
[ 32.851136] i801_smbus 0000:00:1f.3: PCI INT C: no GSI
[ 32.851164] genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa
[ 32.851168] CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
[ 32.851170] Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
[ 32.851178] Workqueue: events work_for_cpu_fn
[ 32.851208] ffff88086c330b00 00000000e233a9df ffff88086d57bca0 ffffffff81603f36
[ 32.851227] ffff88086d57bcf8 ffffffff8110d23a ffff88686fe02000 0000000000000246
[ 32.851246] ffff88086a9a8c00 00000000e233a9df ffffffffa00ad220 0000000000000080
[ 32.851247] Call Trace:
[ 32.851261] [<ffffffff81603f36>] dump_stack+0x19/0x1b
[ 32.851271] [<ffffffff8110d23a>] __setup_irq+0x54a/0x570
[ 32.851282] [<ffffffffa00ad220>] ? i801_check_pre.isra.5+0xe0/0xe0 [i2c_i801]
[ 32.851289] [<ffffffff8110d3bc>] request_threaded_irq+0xcc/0x170
[ 32.851298] [<ffffffffa00ae87f>] i801_probe+0x32f/0x508 [i2c_i801]
[ 32.851308] [<ffffffff81308385>] local_pci_probe+0x45/0xa0
[ 32.851315] [<ffffffff8108bfd4>] work_for_cpu_fn+0x14/0x20
[ 32.851323] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470
[ 32.851330] [<ffffffff81090003>] worker_thread+0x293/0x400
[ 32.851338] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400
[ 32.851346] [<ffffffff8109726f>] kthread+0xcf/0xe0
[ 32.851353] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 32.851362] [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
[ 32.851369] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 32.851373] i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
[ 32.851435] i801_smbus: probe of 0000:00:1f.3 failed with error -16
[ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0
[ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000

Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
---
drivers/acpi/pci_irq.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index d30184c..d2f47f8 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -439,9 +439,17 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
if (acpi_isa_register_gsi(dev))
dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
pin_name(pin));
+ rc = 0;
+ /*
+ * Excluding the BIOS report the value 255, which meaning
+ * "unknown" or "no connection" in PCI Local Bus Specification
+ * Revision 3.0 February 3, 2004, P223.
+ */
+ if (dev->irq == 0xFF)
+ rc = -EINVAL;

kfree(entry);
- return 0;
+ return rc;
}

rc = acpi_register_gsi(&dev->dev, gsi, triggering, polarity);
--
1.9.3



Rafael J. Wysocki

unread,
Jan 19, 2016, 8:43:21 AM1/19/16
to Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
You mean the footnote on page 223 talking about the Interrupt Line values, right?

> + */
> + if (dev->irq == 0xFF)
> + rc = -EINVAL;
>
> kfree(entry);
> - return 0;
> + return rc;
> }
>
> rc = acpi_register_gsi(&dev->dev, gsi, triggering, polarity);
>

Well, if you look at acpi_isa_register_gsi(), you'll see that it
actually does the check you're adding, so maybe the following should
be done instead?

---
drivers/acpi/pci_irq.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/acpi/pci_irq.c
===================================================================
--- linux-pm.orig/drivers/acpi/pci_irq.c
+++ linux-pm/drivers/acpi/pci_irq.c
@@ -436,12 +436,13 @@ int acpi_pci_irq_enable(struct pci_dev *
* driver reported one, then use it. Exit in any case.
*/
if (gsi < 0) {
- if (acpi_isa_register_gsi(dev))
+ rc = acpi_isa_register_gsi(dev);
+ if (rc)
dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
pin_name(pin));

Rafael J. Wysocki

unread,
Jan 19, 2016, 9:20:30 AM1/19/16
to Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
> --

Alternatively, to be entirely on the safe side and avoid possible regressions
from returning errors when they were not returned previously, we can
just special case the 0xff value as you did, but in a slightly cleaner way:

---
drivers/acpi/pci_irq.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/acpi/pci_irq.c
===================================================================
--- linux-pm.orig/drivers/acpi/pci_irq.c
+++ linux-pm/drivers/acpi/pci_irq.c
@@ -436,12 +436,18 @@ int acpi_pci_irq_enable(struct pci_dev *
* driver reported one, then use it. Exit in any case.
*/
if (gsi < 0) {
- if (acpi_isa_register_gsi(dev))
+ /*
+ * The Interrupt Line value of 0xff is defined to mean "unknown"
+ * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
+ * 223), so return an error in that case.
+ */
+ rc = dev->irq == 0xff ? -EINVAL : 0;
+ if (rc || acpi_isa_register_gsi(dev))

Sinan Kaya

unread,
Jan 19, 2016, 9:49:30 AM1/19/16
to Rafael J. Wysocki, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
"Unknown" does not necessarily mean invalid. I have a platform that is using 255 as a valid legacy
interrupt on PCI Express.
Is the issue limited to ISA? If yes, can we limit the change with ISA/PCI adapters only?

Is there a way to know if the system is PCIe vs. PCI?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majo...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Rafael J. Wysocki

unread,
Jan 19, 2016, 10:39:31 AM1/19/16
to Sinan Kaya, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
So first off this is about the Interrupt Line value not about an
interrupt vector.

Second, the footnote in question is talking about x86 PCs, so if your
platform is not one of them, there is no connection here.

Which means that the change should be limited to x86 probably.

Thanks,
Rafael

Sinan Kaya

unread,
Jan 19, 2016, 10:52:11 AM1/19/16
to Rafael J. Wysocki, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
Got it. Just to be clear, I assume this is not the value that code reads from the ACPI table.

+ rc = dev->irq == 0xff ? -EINVAL : 0;

I was nervous to see this check in common code.

>
> Second, the footnote in question is talking about x86 PCs, so if your
> platform is not one of them, there is no connection here.
>
> Which means that the change should be limited to x86 probably.
>
> Thanks,
> Rafael

Rafael J. Wysocki

unread,
Jan 19, 2016, 11:02:22 AM1/19/16
to Sinan Kaya, Rafael J. Wysocki, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
No, this value is read from the PCI register, but the interpretation
of it is arch-specific according to the spec.

Thanks,
Rafael

Bjorn Helgaas

unread,
Jan 19, 2016, 7:24:50 PM1/19/16
to Rafael J. Wysocki, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
[+cc Jiang]

Hi Chen,

On Tue, Jan 19, 2016 at 02:43:30PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 19, 2016 09:45:13 AM Chen Fan wrote:
> > In our environment, when enable Secure boot, we found an abnormal

This has more information than necessary. I don't think Secure Boot is
really relevant, and nor are the timestamps and stack addresses below.
Since the Interrupt Line register is writable and might contain any
value, it would be nice if Linux could at least tolerate anything
firmware might leave there without a backtrace, even if we end up not
being able to use the device.

Your patch changes the acpi_pci_irq_enable() return value from 0 to
-EINVAL for this case. You're running v3.10, and this change probably
makes pci_enable_device() fail. I suppose the user-visible effect is
that with your patch,

- there's no backtrace,
- i801_smbus fails with "Failed to enable SMBus PCI device" instead
of with "Failed to allocate irq 255", and
- i801_smbus fails even if no other device is using IRQ 255, instead
of "succeeding" and using an IRQ 255 that probably doesn't work
(this seems like maybe the most important difference)

Jiang has changed this path with 890e4847587f ("PCI: Add
pcibios_alloc_irq() and pcibios_free_irq()"), so I think on newer
kernels, we'll never even call the i801_smbus probe function.

What behavior are you looking for from i801_smbus? Decline to claim
the device? Try to use the device without interrupts? Try to figure
out an interrupt in some other way?

I'm not 100% sure that 890e4847587f does the right thing by preventing
a driver from claiming a device where we can't set up an IRQ. It's
conceivable that a driver could still operate a device even without an
IRQ.

Chen Fan

unread,
Jan 19, 2016, 11:26:21 PM1/19/16
to Bjorn Helgaas, Rafael J. Wysocki, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu

On 01/20/2016 08:24 AM, Bjorn Helgaas wrote:
> [+cc Jiang]
>
> Hi Chen,
>
> On Tue, Jan 19, 2016 at 02:43:30PM +0100, Rafael J. Wysocki wrote:
>> On Tuesday, January 19, 2016 09:45:13 AM Chen Fan wrote:
>>> In our environment, when enable Secure boot, we found an abnormal
> This has more information than necessary. I don't think Secure Boot is
> really relevant, and nor are the timestamps and stack addresses below.
I just think enable the Secure Boot, probably the firmware assigned
a 0xff interrupt to the device which unauthenticated.
no, on newer kernels, this phenomenon also probably appearance,
with this patch 890e4847587f change, it didn't change the
acpi_pci_irq_enable() return value, with the problem it also return 0,
and then still call __pci_device_probe() to do i801_smbus probe
function in pci_device_probe() function.

>
> What behavior are you looking for from i801_smbus? Decline to claim
> the device? Try to use the device without interrupts? Try to figure
> out an interrupt in some other way?
I think if BIOS assigned 0xff interrupt line to device, and kernel can't
look
up a valid interrupt for the device, we should not allow to use the device.
>
> I'm not 100% sure that 890e4847587f does the right thing by preventing
> a driver from claiming a device where we can't set up an IRQ. It's
> conceivable that a driver could still operate a device even without an
> IRQ.
I don't understanding, does without IRQ for device still work?

Thanks,
Chen
> .
>



Chen Fan

unread,
Jan 20, 2016, 12:01:34 AM1/20/16
to Rafael J. Wysocki, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
no, the function acpi_isa_register_gsi() which check
the dev->irq > 0 && (dev->irq <= 0xF), so if the dev->irq is 0xff,
it should return 0 directly. right?

>
> ---
> drivers/acpi/pci_irq.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-pm/drivers/acpi/pci_irq.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/pci_irq.c
> +++ linux-pm/drivers/acpi/pci_irq.c
> @@ -436,12 +436,13 @@ int acpi_pci_irq_enable(struct pci_dev *
> * driver reported one, then use it. Exit in any case.
> */
> if (gsi < 0) {
> - if (acpi_isa_register_gsi(dev))
> + rc = acpi_isa_register_gsi(dev);
> + if (rc)
> dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
> pin_name(pin));
>
> kfree(entry);
> - return 0;
> + return rc;
> }
>
> rc = acpi_register_gsi(&dev->dev, gsi, triggering, polarity);
>
>
>
> .
>



Chen Fan

unread,
Jan 20, 2016, 12:01:47 AM1/20/16
to Rafael J. Wysocki, Sinan Kaya, Rafael J. Wysocki, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI
That's right, we should wrap this code in arch x86.

Thanks,
Chen

>
> Thanks,
> Rafael
>
>
> .
>



Bjorn Helgaas

unread,
Jan 20, 2016, 12:13:08 PM1/20/16
to Chen Fan, Rafael J. Wysocki, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
On Wed, Jan 20, 2016 at 12:21:24PM +0800, Chen Fan wrote:
>
> On 01/20/2016 08:24 AM, Bjorn Helgaas wrote:
> >[+cc Jiang]
> >
> >Hi Chen,
> >
> >On Tue, Jan 19, 2016 at 02:43:30PM +0100, Rafael J. Wysocki wrote:
> >>On Tuesday, January 19, 2016 09:45:13 AM Chen Fan wrote:
> >>>In our environment, when enable Secure boot, we found an abnormal
> >This has more information than necessary. I don't think Secure Boot is
> >really relevant, and nor are the timestamps and stack addresses below.
> I just think enable the Secure Boot, probably the firmware assigned
> a 0xff interrupt to the device which unauthenticated.

The important thing is that you're changing the way we handle
Interrupt Line being 0xff. That affects more than just Secure Boot
users. It's fine to mention Secure Boot later, as one example of an
affected scenario.

I don't know anything about Secure Boot, but setting Interrupt Line to
0xff would obviously not be a robust way of hiding an unauthenticated
device. But it sounds like you're just speculating about that anyway.
I meant that *with your patch*, newer kernels won't call the
i801_smbus probe function.

> >What behavior are you looking for from i801_smbus? Decline to claim
> >the device? Try to use the device without interrupts? Try to figure
> >out an interrupt in some other way?
>
> I think if BIOS assigned 0xff interrupt line to device, and kernel
> can't look
> up a valid interrupt for the device, we should not allow to use the device.
> >
> >I'm not 100% sure that 890e4847587f does the right thing by preventing
> >a driver from claiming a device where we can't set up an IRQ. It's
> >conceivable that a driver could still operate a device even without an
> >IRQ.
>
> I don't understanding, does without IRQ for device still work?

Polling drivers do not need IRQs. The PCI core has no idea whether a
driver is interrupt-driven or polling, so we can't assume that a
device with no IRQ is useless.

Bjorn

Chen Fan

unread,
Jan 21, 2016, 3:07:16 AM1/21/16
to Bjorn Helgaas, Rafael J. Wysocki, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
Got it, I observed the smbus driver has changed to polling when request_irq
failed on new kernel.
can we use a broken_irq flag in pci_dev to mark the device irq if invalid ?
then if a device broken_irq set, we don't need to call request_irq and
directly return failure. of course, maybe we need to check this for all
devices.

BTW, can we skip the 0xff irq number when allocating irq in x86 arch ?

Thanks,
Chen


>
> Bjorn
>
>
> .
>



Cao jin

unread,
Jan 21, 2016, 9:37:53 AM1/21/16
to Bjorn Helgaas, Rafael J. Wysocki, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
Hi,

IMHO, I think maybe modification on i801_smbus driver is easier.

Because when i801_smbus request_irq using pci_dev->irq, this
pci_dev->irq seems still holds the value read from register(
pci_setup_device->pci_read_irq), if the value is 255, it is invalid in
register, but when request_irq, 255 is a valid value(with
CONFIG_X86_IO_APIC & CONFIG_PCI_MSI defined). so I guess maybe we can do
a judgement in i801_smbus, like:

if (pci_dev->irq == 255) {

conversion to another irq # which maybe still not be used
}


On 01/20/2016 08:24 AM, Bjorn Helgaas wrote:
> .
>

--
Yours Sincerely,

Cao jin


Rafael J. Wysocki

unread,
Jan 21, 2016, 5:59:01 PM1/21/16
to Cao jin, Bjorn Helgaas, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
On Thu, Jan 21, 2016 at 3:41 PM, Cao jin <caoj...@cn.fujitsu.com> wrote:
> Hi,
>
> IMHO, I think maybe modification on i801_smbus driver is easier.
>
> Because when i801_smbus request_irq using pci_dev->irq, this
> pci_dev->irq seems still holds the value read from register(
> pci_setup_device->pci_read_irq), if the value is 255, it is invalid in
> register,

Right.

Which is why the PCI core should not leak it into the driver's ->probe callback.

Thanks,
Rafael

Bjorn Helgaas

unread,
Jan 22, 2016, 12:53:44 PM1/22/16
to Rafael J. Wysocki, Cao jin, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
Is there a reserved IRQ value we could use to mean "invalid"?

I guess we have NR_IRQS as a ceiling, so the range of valid IRQs would be
[0 .. NR_IRQS - 1]. It looks like irq_desc() and a few drivers already
rely on NR_IRQS being the bound:

lpc32xx_kscan_probe
lpc32xx_nand_probe
pcmcia_setup_isa_irq
lpc32xx_rtc_probe
apbuart_verify_port
ar933x_uart_verify_port
lqasc_verify_port

So I guess we could use ~0 as "invalid IRQ", and maybe the PCI core could
set dev->irq to ~0 in these cases, and drivers like i801_smbus could check
for that. Maybe a wrapper like irq_valid() would be useful.

Bjorn

David Daney

unread,
Jan 22, 2016, 2:23:51 PM1/22/16
to Bjorn Helgaas, Rafael J. Wysocki, Cao jin, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
On 01/22/2016 09:53 AM, Bjorn Helgaas wrote:
> On Thu, Jan 21, 2016 at 11:58:26PM +0100, Rafael J. Wysocki wrote:
>> On Thu, Jan 21, 2016 at 3:41 PM, Cao jin <caoj...@cn.fujitsu.com> wrote:
>>> Hi,
>>>
>>> IMHO, I think maybe modification on i801_smbus driver is easier.
>>>
>>> Because when i801_smbus request_irq using pci_dev->irq, this
>>> pci_dev->irq seems still holds the value read from register(
>>> pci_setup_device->pci_read_irq), if the value is 255, it is invalid in
>>> register,
>>
>> Right.
>>
>> Which is why the PCI core should not leak it into the driver's ->probe callback.
>
> Is there a reserved IRQ value we could use to mean "invalid"?

In many (most) cases, zero indicates no irq.

Rafael J. Wysocki

unread,
Jan 22, 2016, 9:01:49 PM1/22/16
to David Daney, Bjorn Helgaas, Rafael J. Wysocki, Cao jin, Rafael J. Wysocki, Chen Fan, ACPI Devel Maling List, Linux Kernel Mailing List, Len Brown, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, Bjorn Helgaas, Linux PCI, Jiang Liu
On Fri, Jan 22, 2016 at 8:23 PM, David Daney <ddane...@gmail.com> wrote:
> On 01/22/2016 09:53 AM, Bjorn Helgaas wrote:
>>
>> On Thu, Jan 21, 2016 at 11:58:26PM +0100, Rafael J. Wysocki wrote:
>>>
>>> On Thu, Jan 21, 2016 at 3:41 PM, Cao jin <caoj...@cn.fujitsu.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> IMHO, I think maybe modification on i801_smbus driver is easier.
>>>>
>>>> Because when i801_smbus request_irq using pci_dev->irq, this
>>>> pci_dev->irq seems still holds the value read from register(
>>>> pci_setup_device->pci_read_irq), if the value is 255, it is invalid in
>>>> register,
>>>
>>>
>>> Right.
>>>
>>> Which is why the PCI core should not leak it into the driver's ->probe
>>> callback.
>>
>>
>> Is there a reserved IRQ value we could use to mean "invalid"?
>
>
> In many (most) cases, zero indicates no irq.

Zero is a valid timer IRQ on x86, though, so it's better not to give
any special meaning to it in general.

Using ~0 as suggested by Bjorn should work as it would cause
request_irq() to return -EINVAL if passed to it AFAICS.

Thanks,
Rafael

Chen Fan

unread,
Jan 25, 2016, 2:04:42 AM1/25/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, chen.f...@cn.fujitsu.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
In our environment, when enable Secure boot, we found an abnormal
phenomenon as following call trace shows. after investigation, we
found the firmware assigned an irq number 255 which means unknown
or no connection in PCI local spec for i801_smbus, meanwhile the
ACPI didn't configure the pci irq routing. and the 255 irq number
was assigned for megasa msix without IRQF_SHARED. then in this case
during i801_smbus probe, the i801_smbus driver would request irq with
bad irq number 255. but the 255 irq number was assigned for memgasa
with MSIX enable. which will cause request_irq fails as call trace
shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified
by BIOS.
[ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0
[ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000

Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
---
arch/x86/include/asm/irq_vectors.h | 2 ++
drivers/acpi/pci_irq.c | 11 ++++++++++-
include/linux/interrupt.h | 9 +++++++++
3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 6ca9fd6..b616d69 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -146,4 +146,6 @@
#define NR_IRQS NR_IRQS_LEGACY
#endif

+#define IRQ_INVALID (~0U)
+
#endif /* _ASM_X86_IRQ_VECTORS_H */
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index d30184c..819eb23 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -33,6 +33,7 @@
#include <linux/pci.h>
#include <linux/acpi.h>
#include <linux/slab.h>
+#include <linux/interrupt.h>

#define PREFIX "ACPI: "

@@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
* driver reported one, then use it. Exit in any case.
*/
if (gsi < 0) {
- if (acpi_isa_register_gsi(dev))
+#ifdef CONFIG_X86
+ /*
+ * The Interrupt Line value of 0xff is defined to mean "unknown"
+ * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
+ * 223), using ~0U as invalid IRQ.
+ */
+ dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq;
+#endif
+ if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev))
dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
pin_name(pin));

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index cb30edb..2f0d46b 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -198,6 +198,15 @@ extern void enable_percpu_irq(unsigned int irq, unsigned int type);
extern bool irq_percpu_is_enabled(unsigned int irq);
extern void irq_wake_thread(unsigned int irq, void *dev_id);

+static inline bool irq_is_valid(unsigned int irq)
+{
+#ifdef CONFIG_X86
+ if (irq == IRQ_INVALID)
+ return false;
+#endif
+ return true;
+}
+
/* The following three functions are for the core kernel use only. */
extern void suspend_device_irqs(void);
extern void resume_device_irqs(void);
--
1.9.3



Bjorn Helgaas

unread,
Jan 25, 2016, 3:58:19 PM1/25/16
to Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner
[+cc Thomas]

On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote:
> In our environment, when enable Secure boot, we found an abnormal
> phenomenon as following call trace shows. after investigation, we
> found the firmware assigned an irq number 255 which means unknown
> or no connection in PCI local spec for i801_smbus, meanwhile the
> ACPI didn't configure the pci irq routing. and the 255 irq number
> was assigned for megasa msix without IRQF_SHARED. then in this case
> during i801_smbus probe, the i801_smbus driver would request irq with
> bad irq number 255. but the 255 irq number was assigned for memgasa
> with MSIX enable. which will cause request_irq fails as call trace
> shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified
> by BIOS.
>
> See the call trace:

Maybe you missed my suggestion that the timestamps aren't useful;
here's my suggestion again in more detail:

Changelogs are written once, but read dozens or hundreds of time, so
stripping out irrelevant details shows consideration for the readers.

> [ 32.459195] ipmi device interface
> [ 32.612907] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> [ 32.800459] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k-rh
> [ 32.818319] ixgbe: Copyright (c) 1999-2014 Intel Corporation.
> [ 32.844009] lpc_ich 0001:80:1f.0: I/O space for ACPI uninitialized

I think the lines above are completely irrelevant.

> [ 32.850093] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
> [ 32.851134] i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
> [ 32.851136] i801_smbus 0000:00:1f.3: PCI INT C: no GSI
> [ 32.851164] genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa

These are useful, but the timestamps ("[ 32.850093]") are not.

> [ 32.851168] CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
> [ 32.851170] Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5

These are probably useful; it's nice to know what kernel and hardware
is involved.

> [ 32.851178] Workqueue: events work_for_cpu_fn
> [ 32.851208] ffff88086c330b00 00000000e233a9df ffff88086d57bca0 ffffffff81603f36
> [ 32.851227] ffff88086d57bcf8 ffffffff8110d23a ffff88686fe02000 0000000000000246
> [ 32.851246] ffff88086a9a8c00 00000000e233a9df ffffffffa00ad220 0000000000000080

I doubt these are useful.

> [ 32.851247] Call Trace:
> [ 32.851261] [<ffffffff81603f36>] dump_stack+0x19/0x1b
> [ 32.851271] [<ffffffff8110d23a>] __setup_irq+0x54a/0x570
> [ 32.851282] [<ffffffffa00ad220>] ? i801_check_pre.isra.5+0xe0/0xe0 [i2c_i801]
> [ 32.851289] [<ffffffff8110d3bc>] request_threaded_irq+0xcc/0x170
> [ 32.851298] [<ffffffffa00ae87f>] i801_probe+0x32f/0x508 [i2c_i801]
> [ 32.851308] [<ffffffff81308385>] local_pci_probe+0x45/0xa0

The above might be useful, but the addresses ("[<ffffffff81603f36>]")
are not, and you should go through them manually and strip out the
lines that are junk from the stack. For example, I don't think
request_threaded_irq() really calls i801_check_pre().

> [ 32.851315] [<ffffffff8108bfd4>] work_for_cpu_fn+0x14/0x20
> [ 32.851323] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470
> [ 32.851330] [<ffffffff81090003>] worker_thread+0x293/0x400
> [ 32.851338] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400
> [ 32.851346] [<ffffffff8109726f>] kthread+0xcf/0xe0
> [ 32.851353] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
> [ 32.851362] [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
> [ 32.851369] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140

The lines above are completely useless.

> [ 32.851373] i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
> [ 32.851435] i801_smbus: probe of 0000:00:1f.3 failed with error -16

> [ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0
> [ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000

These ixgbe entries are irrelevant.

> Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
> ---
> arch/x86/include/asm/irq_vectors.h | 2 ++
> drivers/acpi/pci_irq.c | 11 ++++++++++-
> include/linux/interrupt.h | 9 +++++++++
> 3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
> index 6ca9fd6..b616d69 100644
> --- a/arch/x86/include/asm/irq_vectors.h
> +++ b/arch/x86/include/asm/irq_vectors.h
> @@ -146,4 +146,6 @@
> #define NR_IRQS NR_IRQS_LEGACY
> #endif
>
> +#define IRQ_INVALID (~0U)

If this is a good idea (I cc'd Thomas, the IRQ maintainer, for his
thoughts), I'd like to see this in a more generic place so it isn't
x86-specific.

> #endif /* _ASM_X86_IRQ_VECTORS_H */
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index d30184c..819eb23 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -33,6 +33,7 @@
> #include <linux/pci.h>
> #include <linux/acpi.h>
> #include <linux/slab.h>
> +#include <linux/interrupt.h>
>
> #define PREFIX "ACPI: "
>
> @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
> * driver reported one, then use it. Exit in any case.
> */
> if (gsi < 0) {
> - if (acpi_isa_register_gsi(dev))
> +#ifdef CONFIG_X86
> + /*
> + * The Interrupt Line value of 0xff is defined to mean "unknown"
> + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
> + * 223), using ~0U as invalid IRQ.
> + */
> + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq;

It's much simpler and clearer to write:

if (dev->irq == 0xff)
dev->irq = IRQ_INVALID;

> +#endif
> + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev))
> dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
> pin_name(pin));
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index cb30edb..2f0d46b 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -198,6 +198,15 @@ extern void enable_percpu_irq(unsigned int irq, unsigned int type);
> extern bool irq_percpu_is_enabled(unsigned int irq);
> extern void irq_wake_thread(unsigned int irq, void *dev_id);
>
> +static inline bool irq_is_valid(unsigned int irq)
> +{
> +#ifdef CONFIG_X86
> + if (irq == IRQ_INVALID)
> + return false;
> +#endif
> + return true;
> +}

I don't like the x86 ifdef. I'd prefer:

static inline bool irq_valid(unsigned int irq)
{
if (irq < NR_IRQS)
return true;
return false;
}

This could be used in many of the places that currently use NR_IRQS.

My suggestion:

- patch 1: Add IRQ_INVALID and irq_valid() as generic things
- patch 2: Use irq_valid() in all the places where it can obviously
replace NR_IRQS
- patch 3: Add the acpi_pci_irq_enable() check. This is now a
trivial patch, basically just this:

+ #ifdef CONFIG_X86
+ if (dev->irq == 0xff)
+ dev->irq = IRQ_INVALID;
+ #endif
+ if (!irq_valid(dev->irq) ...

Bjorn

Chen Fan

unread,
Jan 25, 2016, 8:45:44 PM1/25/16
to Bjorn Helgaas, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner

On 01/26/2016 04:58 AM, Bjorn Helgaas wrote:
> [+cc Thomas]
>
> On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote:
>> In our environment, when enable Secure boot, we found an abnormal
>> phenomenon as following call trace shows. after investigation, we
>> found the firmware assigned an irq number 255 which means unknown
>> or no connection in PCI local spec for i801_smbus, meanwhile the
>> ACPI didn't configure the pci irq routing. and the 255 irq number
>> was assigned for megasa msix without IRQF_SHARED. then in this case
>> during i801_smbus probe, the i801_smbus driver would request irq with
>> bad irq number 255. but the 255 irq number was assigned for memgasa
>> with MSIX enable. which will cause request_irq fails as call trace
>> shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified
>> by BIOS.
>>
>> See the call trace:
> Maybe you missed my suggestion that the timestamps aren't useful;
> here's my suggestion again in more detail:
>
> Changelogs are written once, but read dozens or hundreds of time, so
> stripping out irrelevant details shows consideration for the readers.
Got it, thanks for your correction, I will remake it as you suggest.
this will be more useful.

Thanks,
Chen

>
> Bjorn
>
>
> .
>



Guenter Roeck

unread,
Jan 25, 2016, 11:05:40 PM1/25/16
to Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.int
On Mon, Jan 25, 2016 at 02:58:04PM -0600, Bjorn Helgaas wrote:
> [+cc Thomas]
>
[ ... ]

>
> I don't like the x86 ifdef. I'd prefer:
>
> static inline bool irq_valid(unsigned int irq)
> {
> if (irq < NR_IRQS)
> return true;
> return false;
> }
>

Or:

static inline bool irq_valid(unsigned int irq)
{
return irq < NR_IRQS;
}

Guenter

Thomas Gleixner

unread,
Jan 26, 2016, 3:27:50 AM1/26/16
to Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Mon, 25 Jan 2016, Bjorn Helgaas wrote:
> On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote:

> > i801_smbus 0000:00:1f.3: PCI INT C: no GSI
> > i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
> > i801_smbus: probe of 0000:00:1f.3 failed with error -16

The current code does not not fail when the interrupt request fails. It
reports it and clears the IRQ feature flag.

> > @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
> > * driver reported one, then use it. Exit in any case.
> > */
> > if (gsi < 0) {
> > - if (acpi_isa_register_gsi(dev))
> > +#ifdef CONFIG_X86
> > + /*
> > + * The Interrupt Line value of 0xff is defined to mean "unknown"
> > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
> > + * 223), using ~0U as invalid IRQ.
> > + */

And why would this be x86 specific? PCI3.0 is architecture independent, right?

> > + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq;
>
> It's much simpler and clearer to write:
>
> if (dev->irq == 0xff)
> dev->irq = IRQ_INVALID;

I do not understand that IRQ_INVALID business at all.

> > +#endif
> > + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev))
> > dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
> > pin_name(pin));
> >

The existing code already drops into this place because
acpi_isa_register_gsi() fails.

> > i801_smbus 0000:00:1f.3: PCI INT C: no GSI

What extra value does that !irq_is_valid() provide?

And how does setting dev->irq to ~0U prevent that request_irq() is called in
the i801 device driver? Not at all, AFAICT. It will just fail with a different
error.

So the whole 'fix' relies on the fact that irq ~0U does not exist (at least
not today) and therefor the false sharing with some other driver using irq 255
will not happen.

Relying on undocumented behaviour is not a fix, that's voodoo programming.

The proper solution here is to flag that this device does not have an
interrupt connected and act accordingly in the device driver, i.e. do not call
request_irq() in the first place.

> > +static inline bool irq_is_valid(unsigned int irq)
> > +{
> > +#ifdef CONFIG_X86
> > + if (irq == IRQ_INVALID)
> > + return false;
> > +#endif
> > + return true;
> > +}
>
> I don't like the x86 ifdef. I'd prefer:
>
> static inline bool irq_valid(unsigned int irq)
> {
> if (irq < NR_IRQS)
> return true;
> return false;
> }
>
> This could be used in many of the places that currently use NR_IRQS.

No. NR_IRQS cannot be used at all if sparse irqs are enabled. Nothing in any
generic code is supposed to rely on NR_IRQS.

Thanks,

tglx

Chen Fan

unread,
Jan 26, 2016, 4:50:44 AM1/26/16
to Thomas Gleixner, Bjorn Helgaas, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org

On 01/26/2016 04:26 PM, Thomas Gleixner wrote:
> On Mon, 25 Jan 2016, Bjorn Helgaas wrote:
>> On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote:
>>> i801_smbus 0000:00:1f.3: PCI INT C: no GSI
>>> i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
>>> i801_smbus: probe of 0000:00:1f.3 failed with error -16
> The current code does not not fail when the interrupt request fails. It
> reports it and clears the IRQ feature flag.
>
>>> @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
>>> * driver reported one, then use it. Exit in any case.
>>> */
>>> if (gsi < 0) {
>>> - if (acpi_isa_register_gsi(dev))
>>> +#ifdef CONFIG_X86
>>> + /*
>>> + * The Interrupt Line value of 0xff is defined to mean "unknown"
>>> + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
>>> + * 223), using ~0U as invalid IRQ.
>>> + */
> And why would this be x86 specific? PCI3.0 is architecture independent, right?
quoting the spec document:
"For x86 based PCs, the values in this register correspond to IRQ
numbers (0-15) of the standard dual
8259 configuration. The value 255 is defined as meaning "unknown" or "no
connection" to the interrupt
controller. Values between 15 and 254 are reserved."

>
>>> + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq;
>> It's much simpler and clearer to write:
>>
>> if (dev->irq == 0xff)
>> dev->irq = IRQ_INVALID;
> I do not understand that IRQ_INVALID business at all.
>
>>> +#endif
>>> + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev))
>>> dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
>>> pin_name(pin));
>>>
> The existing code already drops into this place because
> acpi_isa_register_gsi() fails.
>
>>> i801_smbus 0000:00:1f.3: PCI INT C: no GSI
> What extra value does that !irq_is_valid() provide?
>
> And how does setting dev->irq to ~0U prevent that request_irq() is called in
> the i801 device driver? Not at all, AFAICT. It will just fail with a different
> error.
>
> So the whole 'fix' relies on the fact that irq ~0U does not exist (at least
> not today) and therefor the false sharing with some other driver using irq 255
> will not happen.
>
> Relying on undocumented behaviour is not a fix, that's voodoo programming.
>
> The proper solution here is to flag that this device does not have an
> interrupt connected and act accordingly in the device driver, i.e. do not call
> request_irq() in the first place.
yes, this is what I thought in previous email, I has asked that
whether we can use a broken_irq flag in pci_dev to mark the device irq
if invalid.
and then if the device broken_irq set, we could directly skip call the
request_irq.
maybe we can set the broken_irq in pci_read_irq if the irq is 0xff.

Thanks,
Chen

>
>>> +static inline bool irq_is_valid(unsigned int irq)
>>> +{
>>> +#ifdef CONFIG_X86
>>> + if (irq == IRQ_INVALID)
>>> + return false;
>>> +#endif
>>> + return true;
>>> +}
>> I don't like the x86 ifdef. I'd prefer:
>>
>> static inline bool irq_valid(unsigned int irq)
>> {
>> if (irq < NR_IRQS)
>> return true;
>> return false;
>> }
>>
>> This could be used in many of the places that currently use NR_IRQS.
> No. NR_IRQS cannot be used at all if sparse irqs are enabled. Nothing in any
> generic code is supposed to rely on NR_IRQS.
>
> Thanks,
>
> tglx
>
>
>
> .
>



Thomas Gleixner

unread,
Jan 26, 2016, 4:53:17 AM1/26/16
to Chen Fan, Bjorn Helgaas, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Tue, 26 Jan 2016, Chen Fan wrote:
> On 01/26/2016 04:26 PM, Thomas Gleixner wrote:
> > > > if (gsi < 0) {
> > > > - if (acpi_isa_register_gsi(dev))
> > > > +#ifdef CONFIG_X86
> > > > + /*
> > > > + * The Interrupt Line value of 0xff is defined to mean
> > > > "unknown"
> > > > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on
> > > > page
> > > > + * 223), using ~0U as invalid IRQ.
> > > > + */
> > And why would this be x86 specific? PCI3.0 is architecture independent,
> > right?
> quoting the spec document:
> "For x86 based PCs, the values in this register correspond to IRQ numbers
> (0-15) of the standard dual
> 8259 configuration. The value 255 is defined as meaning "unknown" or "no
> connection" to the interrupt
> controller. Values between 15 and 254 are reserved."

So if that is x86 specific then it needs to be documented proper. The fact
that the comment is inside the #ifdef x86 does not tell.

Thanks,

tglx


Bjorn Helgaas

unread,
Jan 26, 2016, 10:22:45 AM1/26/16
to Thomas Gleixner, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Tue, Jan 26, 2016 at 09:26:29AM +0100, Thomas Gleixner wrote:
> On Mon, 25 Jan 2016, Bjorn Helgaas wrote:
> > On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote:
>
> > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI
> > > i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
> > > i801_smbus: probe of 0000:00:1f.3 failed with error -16
>
> The current code does not not fail when the interrupt request fails. It
> reports it and clears the IRQ feature flag.
>
> > > @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
> > > * driver reported one, then use it. Exit in any case.
> > > */
> > > if (gsi < 0) {
> > > - if (acpi_isa_register_gsi(dev))
> > > +#ifdef CONFIG_X86
> > > + /*
> > > + * The Interrupt Line value of 0xff is defined to mean "unknown"
> > > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page
> > > + * 223), using ~0U as invalid IRQ.
> > > + */
>
> And why would this be x86 specific? PCI3.0 is architecture independent, right?

Yes, PCI is mostly arch-independent, but Interrupt Line is
arch-specific, and the corner case of it being 255 is only mentioned
in an x86-specific footnote. We can improve the comment.

> > > + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq;
> >
> > It's much simpler and clearer to write:
> >
> > if (dev->irq == 0xff)
> > dev->irq = IRQ_INVALID;
>
> I do not understand that IRQ_INVALID business at all.
>
> > > +#endif
> > > + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev))
> > > dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
> > > pin_name(pin));
> > >
>
> The existing code already drops into this place because
> acpi_isa_register_gsi() fails.
>
> > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI
>
> What extra value does that !irq_is_valid() provide?
>
> And how does setting dev->irq to ~0U prevent that request_irq() is called in
> the i801 device driver? Not at all, AFAICT. It will just fail with a different
> error.
>
> So the whole 'fix' relies on the fact that irq ~0U does not exist (at least
> not today) and therefor the false sharing with some other driver using irq 255
> will not happen.
>
> Relying on undocumented behaviour is not a fix, that's voodoo programming.
>
> The proper solution here is to flag that this device does not have an
> interrupt connected and act accordingly in the device driver, i.e. do not call
> request_irq() in the first place.

This is the crux of the problem. As far as I know, PCI doesn't have
a flag to indicate that dev->irq is a wire that's not connected, so
there's no generic way for a driver to know whether it should call
request_irq().

We could add one, of course, but that only helps in the drivers we
update. It'd be nice if we could figure out a way to fix this
without having to touch all the drivers.

I think any driver that uses line-based interrupts can potentially
fail if the platform uses Interrupt Line == 255 to indicate that the
line is not connected. If another driver happens to be using IRQ 255,
request_irq() may fail as it does here. Otherwise, I suspect
request_irq() will return success, but the driver won't get any
interrupts.

> > I don't like the x86 ifdef. I'd prefer:
> >
> > static inline bool irq_valid(unsigned int irq)
> > {
> > if (irq < NR_IRQS)
> > return true;
> > return false;
> > }
> >
> > This could be used in many of the places that currently use NR_IRQS.
>
> No. NR_IRQS cannot be used at all if sparse irqs are enabled.

Ah, thanks, this is a critical piece I missed. There *are* a few
places where we do define or use NR_IRQS when CONFIG_SPARSE_IRQ=y. Do
these need updates?

include/asm-generic/irq.h defines NR_IRQS if not defined yet
arch/arm/include/asm/irq.h defines NR_IRQS to NR_IRQS_LEGACY
arch/arm/kernel/irq.c uses NR_IRQS
arch/sh/include/asm/irq.h defines NR_IRQS to 8
kernel/irq/internals.h uses NR_IRQS to define IRQ_BITMAP_BITS

> Nothing in any
> generic code is supposed to rely on NR_IRQS.

I guess that means these uses are suspect:

$ git grep -w NR_IRQS | grep -v "^arch" | grep -v "^kernel" | grep -v "^drivers/irqchip" | grep -v "^include"
drivers/input/keyboard/lpc32xx-keys.c: if (irq < 0 || irq >= NR_IRQS) {
drivers/mtd/nand/lpc32xx_mlc.c: if ((host->irq < 0) || (host->irq >= NR_IRQS)) {
drivers/net/hamradio/scc.c:static struct irqflags { unsigned char used : 1; } Ivec[NR_IRQS];
drivers/pcmcia/pcmcia_resource.c: if (irq > NR_IRQS)
drivers/tty/serial/apbuart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
drivers/tty/serial/ar933x_uart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
drivers/tty/serial/lantiq.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
drivers/tty/serial/m32r_sio.c:static struct irq_info irq_lists[NR_IRQS];

Bjorn

Thomas Gleixner

unread,
Jan 26, 2016, 10:49:49 AM1/26/16
to Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Tue, 26 Jan 2016, Bjorn Helgaas wrote:
> On Tue, Jan 26, 2016 at 09:26:29AM +0100, Thomas Gleixner wrote:
> > And why would this be x86 specific? PCI3.0 is architecture independent, right?
>
> Yes, PCI is mostly arch-independent, but Interrupt Line is
> arch-specific, and the corner case of it being 255 is only mentioned
> in an x86-specific footnote. We can improve the comment.

Yes please :)

> > The proper solution here is to flag that this device does not have an
> > interrupt connected and act accordingly in the device driver, i.e. do not call
> > request_irq() in the first place.
>
> This is the crux of the problem. As far as I know, PCI doesn't have
> a flag to indicate that dev->irq is a wire that's not connected, so
> there's no generic way for a driver to know whether it should call
> request_irq().

Ok.

> We could add one, of course, but that only helps in the drivers we
> update. It'd be nice if we could figure out a way to fix this
> without having to touch all the drivers.

Hmm.

> I think any driver that uses line-based interrupts can potentially
> fail if the platform uses Interrupt Line == 255 to indicate that the
> line is not connected. If another driver happens to be using IRQ 255,
> request_irq() may fail as it does here. Otherwise, I suspect
> request_irq() will return success, but the driver won't get any
> interrupts.

Right. So we could certainly do something like this INVALID_IRQ thingy, but
that looks a bit weird. What would request_irq() return?

If it returns success, then drivers might make the wrong decision. If it
returns an error code, then the i801 one works, but we might have to fix
others anyway.

I think it's better to have a software flag in pci_dev to indicate that there
is no irq line and fix up the (probably few) affected drivers so they avoid
calling request_irq() and take the right action.

> > No. NR_IRQS cannot be used at all if sparse irqs are enabled.
>
> Ah, thanks, this is a critical piece I missed. There *are* a few
> places where we do define or use NR_IRQS when CONFIG_SPARSE_IRQ=y. Do
> these need updates?
>
> include/asm-generic/irq.h defines NR_IRQS if not defined yet
> arch/arm/include/asm/irq.h defines NR_IRQS to NR_IRQS_LEGACY
> arch/arm/kernel/irq.c uses NR_IRQS
> arch/sh/include/asm/irq.h defines NR_IRQS to 8

These defines are used for preallocating interrupt descriptors in early
boot. That was invented to help migrating to sparse irq.

> kernel/irq/internals.h uses NR_IRQS to define IRQ_BITMAP_BITS

Right, that's probably pointless, but harmless.

> > Nothing in any generic code is supposed to rely on NR_IRQS.
>
> I guess that means these uses are suspect:
>
> $ git grep -w NR_IRQS | grep -v "^arch" | grep -v "^kernel" | grep -v "^drivers/irqchip" | grep -v "^include"
> drivers/input/keyboard/lpc32xx-keys.c: if (irq < 0 || irq >= NR_IRQS) {
> drivers/mtd/nand/lpc32xx_mlc.c: if ((host->irq < 0) || (host->irq >= NR_IRQS)) {
> drivers/net/hamradio/scc.c:static struct irqflags { unsigned char used : 1; } Ivec[NR_IRQS];
> drivers/pcmcia/pcmcia_resource.c: if (irq > NR_IRQS)
> drivers/tty/serial/apbuart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
> drivers/tty/serial/ar933x_uart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
> drivers/tty/serial/lantiq.c: if (ser->irq < 0 || ser->irq >= NR_IRQS)
> drivers/tty/serial/m32r_sio.c:static struct irq_info irq_lists[NR_IRQS];

Indeed.

Thanks,

tglx

Bjorn Helgaas

unread,
Jan 26, 2016, 7:25:21 PM1/26/16
to Thomas Gleixner, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
I was thinking request_irq() could return -EINVAL if the caller passed
INVALID_IRQ. That should tell drivers that this interrupt won't work.

We'd be making request_irq() return -EINVAL in some cases where it
currently returns success. But even though it returns success today,
I don't think the driver is getting interrupts, because the wire isn't
connected.

> I think it's better to have a software flag in pci_dev to indicate that there
> is no irq line and fix up the (probably few) affected drivers so they avoid
> calling request_irq() and take the right action.

We could add an "irq_valid" flag in struct pci_dev and make a new
rule that drivers should check dev->irq_valid before using dev->irq.
But realistically, i801 is the only place that will check irq_valid
because that's the only driver where we know about a problem, so that
seems like sort of a point solution.

Bjorn

Cao jin

unread,
Jan 27, 2016, 12:25:01 AM1/27/16
to Bjorn Helgaas, Thomas Gleixner, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
How about using IRQ_BITMAP_BITS as that "irq_valid" flag? because it is
the ceiling of struct irq_desc irq_desc[], and request_irq() will return
-EINVAL in case of the ceiling.

#ifdef CONFIG_SPARSE_IRQ
# define IRQ_BITMAP_BITS (NR_IRQS + 8196)
#else
# define IRQ_BITMAP_BITS NR_IRQS
#endif

Chen Fan

unread,
Jan 27, 2016, 1:13:43 AM1/27/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, chen.f...@cn.fujitsu.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, tg...@linutronix.de, li...@roeck-us.net
see each patch comments.

Chen Fan (2):
pci: add an irq_invalid flag to specify the IRQ is not connected
i801_smbus: fix unavailable irq number 255 reported by BIOS

drivers/acpi/pci_irq.c | 10 ++++++++++
drivers/i2c/busses/i2c-i801.c | 5 +++++
include/linux/pci.h | 1 +
3 files changed, 16 insertions(+)

--
1.9.3



Chen Fan

unread,
Jan 27, 2016, 1:13:46 AM1/27/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, chen.f...@cn.fujitsu.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, tg...@linutronix.de, li...@roeck-us.net
Quoting the PCI3.0 SPEC on page 223:
"For x86 based PCs, the values in this register correspond
to IRQ numbers (0-15) of the standard dual 8259 configuration.
The value 255 is defined as meaning "unknown" or "no connection"
to the interrupt controller. Values between 15 and 254 are reserved."

So we add an irq_invalid flag to tell driver that the device wire
is not connected. then the driver could know whether it should call
request_irq().

Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
---
drivers/acpi/pci_irq.c | 10 ++++++++++
include/linux/pci.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index d30184c..bb78376 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -436,6 +436,16 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
* driver reported one, then use it. Exit in any case.
*/
if (gsi < 0) {
+#ifdef CONFIG_X86
+ /*
+ * For X86 architecture, The Interrupt Line value of 0xff is
+ * defined to mean "unknown" or "no connection" (PCI 3.0,
+ * Section 6.2.4, footnote on page 223). we flag the IRQ
+ * as INVALID.
+ */
+ if (dev->irq == 0xff)
+ dev->irq_invalid = 1;
+#endif
if (acpi_isa_register_gsi(dev))
dev_warn(&dev->dev, "PCI INT %c: no GSI\n",
pin_name(pin));
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d86378c..06f137e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -359,6 +359,7 @@ struct pci_dev {
unsigned int io_window_1k:1; /* Intel P2P bridge 1K I/O windows */
unsigned int irq_managed:1;
unsigned int has_secondary_link:1;
+ unsigned int irq_invalid:1;
pci_dev_flags_t dev_flags;
atomic_t enable_cnt; /* pci_enable_device has been called */

--
1.9.3



Chen Fan

unread,
Jan 27, 2016, 1:13:54 AM1/27/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, chen.f...@cn.fujitsu.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, tg...@linutronix.de, li...@roeck-us.net
In our environment, when enable Secure boot, we found an abnormal
phenomenon as following call trace shows. after investigation, we
found the firmware assigned an irq number 255 which means unknown
or no connection in PCI local spec for i801_smbus, meanwhile the
ACPI didn't configure the pci irq routing. and the 255 irq number
was assigned for megasa msix without IRQF_SHARED. then in this case
during i801_smbus probe, the i801_smbus driver would request irq with
bad irq number 255. but the 255 irq number was assigned for memgasa
with MSIX enable. which will cause request_irq fails and result in
the call trace below, here we add check the irq_invalid flag to identify
the smbus whether support IRQ-driven.

i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5

Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16

Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
---
drivers/i2c/busses/i2c-i801.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index f62d697..22f9aea 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -1380,6 +1380,11 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id)
}
}

+ if ((priv->features & FEATURE_IRQ) && dev->irq_invalid) {
+ dev_warn(&dev->dev, "Interrupt line is invalid!\n");
+ priv->features &= ~FEATURE_IRQ;
+ }
+
if (priv->features & FEATURE_IRQ) {
init_waitqueue_head(&priv->waitq);

--
1.9.3



Thomas Gleixner

unread,
Jan 27, 2016, 3:36:44 AM1/27/16
to Cao jin, Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Wed, 27 Jan 2016, Cao jin wrote:
> How about using IRQ_BITMAP_BITS as that "irq_valid" flag? because it is the
> ceiling of struct irq_desc irq_desc[], and request_irq() will return -EINVAL
> in case of the ceiling.
>
> #ifdef CONFIG_SPARSE_IRQ
> # define IRQ_BITMAP_BITS (NR_IRQS + 8196)
> #else
> # define IRQ_BITMAP_BITS NR_IRQS
> #endif

No. This is a core internal implementation detail.

Thanks,

tglx

Thomas Gleixner

unread,
Jan 27, 2016, 4:15:03 AM1/27/16
to Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Tue, 26 Jan 2016, Bjorn Helgaas wrote:
> On Tue, Jan 26, 2016 at 04:48:25PM +0100, Thomas Gleixner wrote:
> > Right. So we could certainly do something like this INVALID_IRQ thingy, but
> > that looks a bit weird. What would request_irq() return?
> >
> > If it returns success, then drivers might make the wrong decision. If it
> > returns an error code, then the i801 one works, but we might have to fix
> > others anyway.
>
> I was thinking request_irq() could return -EINVAL if the caller passed
> INVALID_IRQ. That should tell drivers that this interrupt won't work.
>
> We'd be making request_irq() return -EINVAL in some cases where it
> currently returns success. But even though it returns success today,
> I don't think the driver is getting interrupts, because the wire isn't
> connected.

Correct. What I meant is that the i801 driver can handle the -EINVAL return
from request_irq() today, but other drivers don't. I agree that we need to fix
them anyway and a failure to request the interrupt is better than a silent 'no
interrupts delivered' issue.

Though instead of returning -EINVAL I prefer an explicit error code for this
case. That way a driver can distinguish between the 'not connected' case and
other failure modes. Something like the patch below should work.

Thanks,

tglx

8<------------------
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -387,6 +387,22 @@ static inline int acpi_isa_register_gsi(
}
#endif

+static inline bool acpi_pci_irq_valid(struc pci_dev *dev)
+{
+#ifdef CONFIG_X86
+ /*
+ * On x86 irq line 0xff means "unknown" or "no connection" (PCI 3.0,
+ * Section 6.2.4, footnote on page 223).
+ */
+ if (dev->irq == 0xff) {
+ dev->irq = IRQ_NOTCONNECTED;
+ dev_warn(&dev->dev, "PCI INT not connected\n");
+ return false;
+ }
+#endif
+ return true;
+}
+
int acpi_pci_irq_enable(struct pci_dev *dev)
{
struct acpi_prt_entry *entry;
@@ -409,6 +425,9 @@ int acpi_pci_irq_enable(struct pci_dev *
if (pci_has_managed_irq(dev))
return 0;

+ if (!acpi_pci_irq_valid(dev))
+ return 0;
+
entry = acpi_pci_irq_lookup(dev, pin);
if (!entry) {
/*
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -125,6 +125,16 @@ struct irqaction {

extern irqreturn_t no_action(int cpl, void *dev_id);

+/*
+ * If a (PCI) device interrupt is not connected we set dev->irq to
+ * IRQ_NOTCONNECTED. This causes request_irq() to fail with -ENOTCONN, so we
+ * can distingiush that case from other error returns.
+ *
+ * 0x80000000 is guaranteed to be outside the available range of interrupts
+ * and easy to distinguish from other possible incorrect values.
+ */
+#define IRQ_NOTCONNECTED (1U << 31)
+
extern int __must_check
request_threaded_irq(unsigned int irq, irq_handler_t handler,
irq_handler_t thread_fn,
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1609,6 +1609,9 @@ int request_threaded_irq(unsigned int ir
struct irq_desc *desc;
int retval;

+ if (irq == IRQ_NOTCONNECTED)
+ return -ENOTCONN;
+
/*
* Sanity-check: shared interrupts must pass in a real dev-ID,
* otherwise we'll have trouble later trying to figure out
@@ -1699,9 +1702,13 @@ EXPORT_SYMBOL(request_threaded_irq);
int request_any_context_irq(unsigned int irq, irq_handler_t handler,
unsigned long flags, const char *name, void *dev_id)
{
- struct irq_desc *desc = irq_to_desc(irq);
+ struct irq_desc *desc;
int ret;

+ if (irq == IRQ_NOTCONNECTED)
+ return -ENOTCONN;
+
+ desc = irq_to_desc(irq);
if (!desc)
return -EINVAL;




Bjorn Helgaas

unread,
Jan 27, 2016, 5:33:04 PM1/27/16
to Thomas Gleixner, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
On Wed, Jan 27, 2016 at 10:13:36AM +0100, Thomas Gleixner wrote:
> On Tue, 26 Jan 2016, Bjorn Helgaas wrote:
> > On Tue, Jan 26, 2016 at 04:48:25PM +0100, Thomas Gleixner wrote:
> > > Right. So we could certainly do something like this INVALID_IRQ thingy, but
> > > that looks a bit weird. What would request_irq() return?
> > >
> > > If it returns success, then drivers might make the wrong decision. If it
> > > returns an error code, then the i801 one works, but we might have to fix
> > > others anyway.
> >
> > I was thinking request_irq() could return -EINVAL if the caller passed
> > INVALID_IRQ. That should tell drivers that this interrupt won't work.
> >
> > We'd be making request_irq() return -EINVAL in some cases where it
> > currently returns success. But even though it returns success today,
> > I don't think the driver is getting interrupts, because the wire isn't
> > connected.
>
> Correct. What I meant is that the i801 driver can handle the -EINVAL return
> from request_irq() today, but other drivers don't. I agree that we need to fix
> them anyway and a failure to request the interrupt is better than a silent 'no
> interrupts delivered' issue.
>
> Though instead of returning -EINVAL I prefer an explicit error code for this
> case. That way a driver can distinguish between the 'not connected' case and
> other failure modes. Something like the patch below should work.

This patch looks great to me, thanks for all your help!

Chen, do you want to put all this together as formal patches with
changelogs and post to the mailing lists?

Bjorn
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in

Chen Fan

unread,
Jan 27, 2016, 8:05:21 PM1/27/16
to Bjorn Helgaas, Thomas Gleixner, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
With pleasure. I will test it on my environment first and send it out.
thank all of you for this problem.

Chen
> .
>



Chen Fan

unread,
Jan 27, 2016, 8:41:00 PM1/27/16
to linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, chen.f...@cn.fujitsu.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner
In our X86 environment, when enable Secure boot, we found an abnormal
phenomenon as following call trace shows. after investigation, we
found the firmware assigned an irq number 255 which means unknown
or no connection in PCI local spec for i801_smbus, meanwhile the
ACPI didn't configure the pci irq routing. and the 255 irq number
was assigned for megasa msix without IRQF_SHARED. then in this case
during i801_smbus probe, the i801_smbus driver would request irq with
bad irq number 255. but the 255 irq number was assigned for memgasa
with MSIX enable. which will cause request_irq fails and result in
the call trace below, here we introduce an IRQ_NOTCONNECTED to identify
the device interrupt is not connected.

i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5

Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16

Signed-off-by: Chen Fan <chen.f...@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Cc: Bjorn Helgaas <hel...@kernel.org>
---
drivers/acpi/pci_irq.c | 20 ++++++++++++++++++++
include/linux/interrupt.h | 10 ++++++++++
kernel/irq/manage.c | 9 ++++++++-
3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index d30184c..dda6d25 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -33,6 +33,7 @@
#include <linux/pci.h>
#include <linux/acpi.h>
#include <linux/slab.h>
+#include <linux/interrupt.h>

#define PREFIX "ACPI: "

@@ -387,6 +388,22 @@ static inline int acpi_isa_register_gsi(struct pci_dev *dev)
}
#endif

+static inline bool acpi_pci_irq_valid(struct pci_dev *dev)
+{
+#ifdef CONFIG_X86
+ /*
+ * On x86 irq line 0xff means "unknown" or "no connection" (PCI 3.0,
+ * Section 6.2.4, footnote on page 223).
+ */
+ if (dev->irq == 0xff) {
+ dev->irq = IRQ_NOTCONNECTED;
+ dev_warn(&dev->dev, "PCI INT not connected\n");
+ return false;
+ }
+#endif
+ return true;
+}
+
int acpi_pci_irq_enable(struct pci_dev *dev)
{
struct acpi_prt_entry *entry;
@@ -409,6 +426,9 @@ int acpi_pci_irq_enable(struct pci_dev *dev)
if (pci_has_managed_irq(dev))
return 0;

+ if (!acpi_pci_irq_valid(dev))
+ return 0;
+
entry = acpi_pci_irq_lookup(dev, pin);
if (!entry) {
/*
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index cb30edb..12f7da4 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -125,6 +125,16 @@ struct irqaction {

extern irqreturn_t no_action(int cpl, void *dev_id);

+/*
+ * If a (PCI) device interrupt is not connected we set dev->irq to
+ * IRQ_NOTCONNECTED. This causes request_irq() to fail with -ENOTCONN, so we
+ * can distingiush that case from other error returns.
+ *
+ * 0x80000000 is guaranteed to be outside the available range of interrupts
+ * and easy to distinguish from other possible incorrect values.
+ */
+#define IRQ_NOTCONNECTED (1U << 31)
+
extern int __must_check
request_threaded_irq(unsigned int irq, irq_handler_t handler,
irq_handler_t thread_fn,
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 8411872..e79e60f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1609,6 +1609,9 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
struct irq_desc *desc;
int retval;

+ if (irq == IRQ_NOTCONNECTED)
+ return -ENOTCONN;
+
/*
* Sanity-check: shared interrupts must pass in a real dev-ID,
* otherwise we'll have trouble later trying to figure out
@@ -1699,9 +1702,13 @@ EXPORT_SYMBOL(request_threaded_irq);
int request_any_context_irq(unsigned int irq, irq_handler_t handler,
unsigned long flags, const char *name, void *dev_id)
{
- struct irq_desc *desc = irq_to_desc(irq);
+ struct irq_desc *desc;
int ret;

+ if (irq == IRQ_NOTCONNECTED)
+ return -ENOTCONN;
+
+ desc = irq_to_desc(irq);
if (!desc)
return -EINVAL;

--
1.9.3



Thomas Gleixner

unread,
Jan 28, 2016, 3:42:32 AM1/28/16
to Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, hel...@kernel.org, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org
Bjorn,

feel free to ship that with s/Signed-off-by tglx/Acked-by tglx/

Thanks,

tglx

Bjorn Helgaas

unread,
Jan 28, 2016, 3:57:51 PM1/28/16
to Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, r...@rjwysocki.net, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner
Hi Chen,

Thanks a lot for persevering and working this all out!
Acked-by: Bjorn Helgaas <bhel...@google.com>

Rafael, I assume you'll take this if you think it's ready.

This is a subtle problem and, if I understand correctly, can manifest
intermittently depending on the machine configuration. For example,
if you got rid of the "megasa" driver, I suspect i801_smbus would not
complain, but it wouldn't work.

I think we might want to consider doing something for non-x86 arches
as well, but we can do that later. I propose a changelog like the
following. Please correct anything I got wrong. I suspect we will be
revisiting this issue eventually, so I'd like to have a good
description.


x86/PCI: Recognize that Interrupt Line 255 means "not connected"

Per the x86-specific footnote to PCI spec r3.0, sec 6.2.4, the value 255 in
the Interrupt Line register means "unknown" or "no connection."
Previously, when we couldn't derive an IRQ from the _PRT, we fell back to
using the value from Interrupt Line as an IRQ. It's questionable whether
we should do that at all, but the spec clearly suggests we shouldn't do it
for the value 255 on x86.

Calling request_irq() with IRQ 255 may succeed, but the driver won't
receive any interrupts. Or, if IRQ 255 is shared with another device, it
may succeed, and the driver's ISR will be called at random times when the
*other* device interrupts. Or it may fail if another device is using IRQ
255 with incompatible flags. What we *want* is for request_irq() to fail
predictably so the driver can fall back to polling.

On x86, assume 255 in the Interrupt Line means the INTx line is not
connected. In that case, set dev->irq to IRQ_NOTCONNECTED so request_irq()
will fail gracefully with -ENOTCONN.

We found this problem on a system where Secure Boot firmware assigned
Interrupt Line 255 to an i801_smbus device and another device was already
using MSI-X IRQ 255. This was in v3.10, where i801_probe() fails if
request_irq() fails:

i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
i801_smbus 0000:00:1f.3: PCI INT C: no GSI
genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
Call Trace:
dump_stack+0x19/0x1b
__setup_irq+0x54a/0x570
request_threaded_irq+0xcc/0x170
i801_probe+0x32f/0x508 [i2c_i801]
local_pci_probe+0x45/0xa0
i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
i801_smbus: probe of 0000:00:1f.3 failed with error -16

After aeb8a3d16ae0 ("i2c: i801: Check if interrupts are disabled"),
i801_probe() will fall back to polling if request_irq() fails. But we
still need this patch because request_irq() may succeed or fail depending
on other devices in the system. If request_irq() fails, i801_smbus will
work by falling back to polling, but if it succeeds, i801_smbus won't work
because it expects interrupts that it may not receive.

Rafael J. Wysocki

unread,
Jan 28, 2016, 10:36:32 PM1/28/16
to Bjorn Helgaas, Chen Fan, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner
I can do that.
I like this. :-)

Chen, can you please add the changelog as suggested by Bjorn to the patch
and resend?

Thanks,
Rafael

Chen Fan

unread,
Jan 28, 2016, 10:40:02 PM1/28/16
to Rafael J. Wysocki, Bjorn Helgaas, linux...@vger.kernel.org, linux-...@vger.kernel.org, le...@kernel.org, izumi...@jp.fujitsu.com, we...@cn.fujitsu.com, caoj...@cn.fujitsu.com, ddane...@gmail.com, ok...@codeaurora.org, bhel...@google.com, jian...@linux.intel.com, linu...@vger.kernel.org, Thomas Gleixner
Sure, Thank all of you.

Chen


>
> Thanks,
> Rafael
>
>
>
> .
>



0 new messages