
[PATCH] drivers/perf: arm-pmu: Handle per-interrupt affinity mask


Marc Zyngier

Jul 1, 2016, 9:30:09 AM
On a big-little system, PMUs can be wired to CPUs using per-CPU
interrupts (PPIs). In this case, it is important to make sure that
the enable/disable operations happen on the right set of CPUs.

So instead of relying on the interrupt-affinity property, we can
use the actual percpu affinity that DT exposes as part of the
interrupt specifier. The DT binding is also updated to reflect
the fact that the interrupt-affinity property shouldn't be used
in that case.

Signed-off-by: Marc Zyngier <marc.z...@arm.com>
---
Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-
drivers/perf/arm_pmu.c | 22 +++++++++++++++++-----
2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 74d5417..61c8b46 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -39,7 +39,9 @@ Optional properties:
When using a PPI, specifies a list of phandles to CPU
nodes corresponding to the set of CPUs which have
a PMU of this type signalling the PPI listed in the
- interrupts property.
+ interrupts property, unless this is already specified
+ by the PPI interrupt specifier itself (in which case
+ the interrupt-affinity property shouldn't be present).

This property should be present when there is more than
a single SPI.
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 140436a..065ccec 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -603,7 +603,8 @@ static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)

 	irq = platform_get_irq(pmu_device, 0);
 	if (irq >= 0 && irq_is_percpu(irq)) {
-		on_each_cpu(cpu_pmu_disable_percpu_irq, &irq, 1);
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_disable_percpu_irq, &irq, 1);
 		free_percpu_irq(irq, &hw_events->percpu_pmu);
 	} else {
 		for (i = 0; i < irqs; ++i) {
@@ -645,7 +646,9 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
 				irq);
 			return err;
 		}
-		on_each_cpu(cpu_pmu_enable_percpu_irq, &irq, 1);
+
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_enable_percpu_irq, &irq, 1);
 	} else {
 		for (i = 0; i < irqs; ++i) {
 			int cpu = i;
@@ -961,9 +964,18 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)
 		i++;
 	} while (1);

-	/* If we didn't manage to parse anything, claim to support all CPUs */
-	if (cpumask_weight(&pmu->supported_cpus) == 0)
-		cpumask_setall(&pmu->supported_cpus);
+	/* If we didn't manage to parse anything, try the interrupt affinity */
+	if (cpumask_weight(&pmu->supported_cpus) == 0) {
+		if (!using_spi) {
+			/* If using PPIs, check the affinity of the partition */
+			int irq = platform_get_irq(pdev, 0);
+			irq_get_percpu_devid_partition(irq,
+						       &pmu->supported_cpus);
+		} else {
+			/* Otherwise default to all CPUs */
+			cpumask_setall(&pmu->supported_cpus);
+		}
+	}

 	/* If we matched up the IRQ affinities, use them to route the SPIs */
 	if (using_spi && i == pdev->num_resources)
--
2.1.4
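
Editorial sketch (not part of the patch): the first two hunks above replace on_each_cpu() with on_each_cpu_mask(), so the per-CPU enable/disable callback only runs on the CPUs in supported_cpus. The userspace model below illustrates that difference under stated assumptions. The names cpu_mask_t, run_on_each_cpu, run_on_cpu_mask and enable_percpu_irq_model are hypothetical stand-ins, not kernel APIs, and the six-CPU layout is an assumption chosen for the example.

```c
#include <stdint.h>

#define NR_CPUS_MODEL 6            /* assumed layout: CPUs 0-3 little, 4-5 big */

typedef uint32_t cpu_mask_t;       /* bit n set => CPU n is in the mask */
typedef void (*cpu_fn_t)(int cpu, void *arg);

/* Model of on_each_cpu(): invoke fn once for every CPU. */
static void run_on_each_cpu(cpu_fn_t fn, void *arg)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		fn(cpu, arg);
}

/* Model of on_each_cpu_mask(): invoke fn only for CPUs set in mask. */
static void run_on_cpu_mask(cpu_mask_t mask, cpu_fn_t fn, void *arg)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (mask & (1u << cpu))
			fn(cpu, arg);
}

/* Stand-in for the per-CPU enable callback: records which CPUs ran it. */
static void enable_percpu_irq_model(int cpu, void *arg)
{
	cpu_mask_t *enabled = arg;

	*enabled |= 1u << cpu;
}
```

With a mask of 0x30 (CPUs 4 and 5 in this model), run_on_cpu_mask() touches only those two CPUs, whereas run_on_each_cpu() would also run the callback on CPUs whose PMU is not wired to that PPI.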

Caesar Wang

Jul 1, 2016, 11:10:08 AM
Hi Marc,

On 2016-07-01 21:21, Marc Zyngier wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.
>
> Signed-off-by: Marc Zyngier <marc.z...@arm.com>

Tested-by: Caesar Wang <w...@rock-chips.com>

I picked this up on my local branch:
8bc671a FROMLIST: drivers/perf: arm-pmu: Handle per-interrupt affinity mask
3d723f4 FROMLIST: arm64: dts: rockchip: support the pmu node for rk3399
1359b92 FIXUP: FROMLIST: arm64: dts: rockchip: change all interrupts cells for 4 on rk3399 SoCs
...

Tested on rk3399 board.
localhost / # perf list

List of pre-defined events (to be used in -e):
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
cache-references [Hardware event]
cache-misses [Hardware event]
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
...

Run perf stat --cpu 0/1/2/3... to monitor a given CPU,
e.g. cpu0:

localhost / # perf stat --cpu 0
^C
Performance counter stats for 'CPU(s) 0':

3374.917571 task-clock (msec) # 1.001 CPUs utilized [100.00%]
20 context-switches # 0.006 K/sec [100.00%]
2 cpu-migrations # 0.001 K/sec [100.00%]
55 page-faults # 0.016 K/sec
7151843 cycles # 0.002 GHz [100.00%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
4272536 instructions # 0.60 insns per cycle [100.00%]
568406 branches # 0.168 M/sec [100.00%]
65652 branch-misses # 11.55% of all branches

Also used 'perf top' to monitor the PMU interrupts from the CPUs.
---
caesar wang | software engineer | w...@rock-chip.com

Rob Herring

Jul 5, 2016, 10:30:06 AM
On Fri, Jul 01, 2016 at 02:21:31PM +0100, Marc Zyngier wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.
>
> Signed-off-by: Marc Zyngier <marc.z...@arm.com>
> ---
> Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-

Acked-by: Rob Herring <ro...@kernel.org>

Will Deacon

Jul 6, 2016, 6:50:06 AM
On Fri, Jul 01, 2016 at 02:21:31PM +0100, Marc Zyngier wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.

[...]

> - /* If we didn't manage to parse anything, claim to support all CPUs */
> - if (cpumask_weight(&pmu->supported_cpus) == 0)
> - cpumask_setall(&pmu->supported_cpus);
> + /* If we didn't manage to parse anything, try the interrupt affinity */
> + if (cpumask_weight(&pmu->supported_cpus) == 0) {
> + if (!using_spi) {
> + /* If using PPIs, check the affinity of the partition */
> + int irq = platform_get_irq(pdev, 0);
> + irq_get_percpu_devid_partition(irq,
> + &pmu->supported_cpus);

Should we not at least propagate the failure if this returns -EINVAL?

Will

Marc Zyngier

Jul 6, 2016, 10:20:07 AM
Good point. I'll fix that and resend it.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Marc Zyngier

Jul 6, 2016, 10:40:09 AM
On a big-little system, PMUs can be wired to CPUs using per-CPU
interrupts (PPIs). In this case, it is important to make sure that
the enable/disable operations happen on the right set of CPUs.

So instead of relying on the interrupt-affinity property, we can
use the actual percpu affinity that DT exposes as part of the
interrupt specifier. The DT binding is also updated to reflect
the fact that the interrupt-affinity property shouldn't be used
in that case.

Signed-off-by: Marc Zyngier <marc.z...@arm.com>
---
* From v1:
- propagate the error if irq_get_percpu_devid_partition fails

Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-
drivers/perf/arm_pmu.c | 27 ++++++++++++++++++++++-----
2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 74d5417..61c8b46 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -39,7 +39,9 @@ Optional properties:
When using a PPI, specifies a list of phandles to CPU
nodes corresponding to the set of CPUs which have
a PMU of this type signalling the PPI listed in the
- interrupts property.
+ interrupts property, unless this is already specified
+ by the PPI interrupt specifier itself (in which case
+ the interrupt-affinity property shouldn't be present).

This property should be present when there is more than
a single SPI.
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 140436a..8e4d7f5 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -603,7 +603,8 @@ static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)

 	irq = platform_get_irq(pmu_device, 0);
 	if (irq >= 0 && irq_is_percpu(irq)) {
-		on_each_cpu(cpu_pmu_disable_percpu_irq, &irq, 1);
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_disable_percpu_irq, &irq, 1);
 		free_percpu_irq(irq, &hw_events->percpu_pmu);
 	} else {
 		for (i = 0; i < irqs; ++i) {
@@ -645,7 +646,9 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
 				irq);
 			return err;
 		}
-		on_each_cpu(cpu_pmu_enable_percpu_irq, &irq, 1);
+
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_enable_percpu_irq, &irq, 1);
 	} else {
 		for (i = 0; i < irqs; ++i) {
 			int cpu = i;
@@ -961,9 +964,23 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)
 		i++;
 	} while (1);

-	/* If we didn't manage to parse anything, claim to support all CPUs */
-	if (cpumask_weight(&pmu->supported_cpus) == 0)
-		cpumask_setall(&pmu->supported_cpus);
+	/* If we didn't manage to parse anything, try the interrupt affinity */
+	if (cpumask_weight(&pmu->supported_cpus) == 0) {
+		if (!using_spi) {
+			/* If using PPIs, check the affinity of the partition */
+			int ret, irq;
+
+			irq = platform_get_irq(pdev, 0);
+			ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
+			if (ret) {
+				kfree(irqs);
+				return ret;
+			}
+		} else {
+			/* Otherwise default to all CPUs */
+			cpumask_setall(&pmu->supported_cpus);
+		}
+	}

 	/* If we matched up the IRQ affinities, use them to route the SPIs */
 	if (using_spi && i == pdev->num_resources)
--
2.1.4
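
Editorial sketch (not part of the patch): the v2 change to of_pmu_irq_cfg() above frees the irqs array and returns the error when the partition lookup fails, instead of ignoring the return value as v1 did. The userspace model below shows that control flow under stated assumptions. get_partition_model and irq_cfg_model are hypothetical names, the IRQ number 7 and the mask 0x30 are made up for the example, and calloc()/free() stand in for the kernel's allocation of irqs.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t cpu_mask_t;       /* bit n set => CPU n is in the mask */

/*
 * Hypothetical stand-in for irq_get_percpu_devid_partition(): fills *mask
 * for the one partitioned PPI known to this model, -EINVAL otherwise.
 */
static int get_partition_model(int irq, cpu_mask_t *mask)
{
	if (irq != 7)              /* assumption: IRQ 7 is the partitioned PPI */
		return -EINVAL;
	*mask = 0x30u;             /* assumption: the partition is CPUs 4-5 */
	return 0;
}

/* Model of the v2 fallback taken when no affinity could be parsed. */
static int irq_cfg_model(int irq, int using_spi, cpu_mask_t *supported)
{
	int *irqs = calloc(4, sizeof(*irqs));  /* stands in for the irqs array */

	if (!irqs)
		return -ENOMEM;

	if (*supported == 0) {
		if (!using_spi) {
			int ret = get_partition_model(irq, supported);

			if (ret) {
				free(irqs);    /* release before propagating */
				return ret;
			}
		} else {
			*supported = ~0u;  /* default to all CPUs */
		}
	}

	free(irqs);
	return 0;
}
```

In this model, irq_cfg_model(9, 0, &mask) returns -EINVAL and leaves mask untouched, which is the failure a caller can now see and act on.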

Will Deacon

Jul 6, 2016, 12:10:06 PM
On Wed, Jul 06, 2016 at 03:33:47PM +0100, Marc Zyngier wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.
>
> Signed-off-by: Marc Zyngier <marc.z...@arm.com>
> ---
> * From v1:
> - propagate the error if irq_get_percpu_devid_partition fails

Thanks, I'll queue this.

Will

Rob Herring

Jul 11, 2016, 10:40:07 AM
On Wed, Jul 06, 2016 at 03:33:47PM +0100, Marc Zyngier wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.
>
> Signed-off-by: Marc Zyngier <marc.z...@arm.com>
> ---
> * From v1:
> - propagate the error if irq_get_percpu_devid_partition fails
>
> Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-

I acked v1, please add acks.

Rob

Will Deacon

Jul 11, 2016, 10:50:38 AM
On Mon, Jul 11, 2016 at 09:37:16AM -0500, Rob Herring wrote:
> On Wed, Jul 06, 2016 at 03:33:47PM +0100, Marc Zyngier wrote:
> > On a big-little system, PMUs can be wired to CPUs using per-CPU
> > interrupts (PPIs). In this case, it is important to make sure that
> > the enable/disable operations happen on the right set of CPUs.
> >
> > So instead of relying on the interrupt-affinity property, we can
> > use the actual percpu affinity that DT exposes as part of the
> > interrupt specifier. The DT binding is also updated to reflect
> > the fact that the interrupt-affinity property shouldn't be used
> > in that case.
> >
> > Signed-off-by: Marc Zyngier <marc.z...@arm.com>
> > ---
> > * From v1:
> > - propagate the error if irq_get_percpu_devid_partition fails
> >
> > Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-
>
> I acked v1, please add acks.

This is queued in arm64/for-next/core, but I spotted your Ack and added
it when I sent the patch to Catalin.

Will

Geert Uytterhoeven

Jul 19, 2016, 9:30:09 AM
Hi Marc, Catalin, Will,

On Wed, Jul 6, 2016 at 4:33 PM, Marc Zyngier <marc.z...@arm.com> wrote:
> On a big-little system, PMUs can be wired to CPUs using per-CPU
> interrupts (PPIs). In this case, it is important to make sure that
> the enable/disable operations happen on the right set of CPUs.
>
> So instead of relying on the interrupt-affinity property, we can
> use the actual percpu affinity that DT exposes as part of the
> interrupt specifier. The DT binding is also updated to reflect
> the fact that the interrupt-affinity property shouldn't be used
> in that case.
>
> Signed-off-by: Marc Zyngier <marc.z...@arm.com>
> ---
> * From v1:
> - propagate the error if irq_get_percpu_devid_partition fails

This patch, which is commit 19a469a58720ea96 in arm64/for-next/core, broke
the PMU on r8a7740/armadillo800eva:

-hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
+hw perfevents: /pmu: failed to probe PMU!
+hw perfevents: /pmu: failed to register PMU devices!
+armv7-pmu: probe of pmu failed with error -22

This is a single-core Cortex A9.

> Documentation/devicetree/bindings/arm/pmu.txt | 4 +++-
> drivers/perf/arm_pmu.c | 27 ++++++++++++++++++++++-----
> 2 files changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
> index 74d5417..61c8b46 100644
> --- a/Documentation/devicetree/bindings/arm/pmu.txt
> +++ b/Documentation/devicetree/bindings/arm/pmu.txt
> @@ -39,7 +39,9 @@ Optional properties:
> When using a PPI, specifies a list of phandles to CPU
> nodes corresponding to the set of CPUs which have
> a PMU of this type signalling the PPI listed in the
> - interrupts property.
> + interrupts property, unless this is already specified
> + by the PPI interrupt specifier itself (in which case
> + the interrupt-affinity property shouldn't be present).
>
> This property should be present when there is more than
> a single SPI.

On a single core, there's only a single SPI, hence there's no need for an
"interrupt-affinity" property.

> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 140436a..8e4d7f5 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -961,9 +964,23 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)
> i++;
> } while (1);
>
> - /* If we didn't manage to parse anything, claim to support all CPUs */
> - if (cpumask_weight(&pmu->supported_cpus) == 0)
> - cpumask_setall(&pmu->supported_cpus);
> + /* If we didn't manage to parse anything, try the interrupt affinity */
> + if (cpumask_weight(&pmu->supported_cpus) == 0) {
> + if (!using_spi) {

However, using_spi is never set to true in the absence of that property,
causing the wrong branch to be taken...

> + /* If using PPIs, check the affinity of the partition */
> + int ret, irq;
> +
> + irq = platform_get_irq(pdev, 0);
> + ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);

... and ret to become -22 here.

> + if (ret) {
> + kfree(irqs);
> + return ret;
> + }
> + } else {
> + /* Otherwise default to all CPUs */
> + cpumask_setall(&pmu->supported_cpus);
> + }
> + }
>
> /* If we matched up the IRQ affinities, use them to route the SPIs */
> if (using_spi && i == pdev->num_resources)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Marc Zyngier

Jul 19, 2016, 9:50:06 AM
Hi Geert,
Thanks for the thorough analysis. Could you please give the following
patchlet a go:

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2513365..9275e08 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -958,11 +958,12 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)

 	/* If we didn't manage to parse anything, try the interrupt affinity */
 	if (cpumask_weight(&pmu->supported_cpus) == 0) {
-		if (!using_spi) {
+		int irq = platform_get_irq(pdev, 0);
+
+		if (irq_is_percpu(irq)) {
 			/* If using PPIs, check the affinity of the partition */
-			int ret, irq;
+			int ret;

-			irq = platform_get_irq(pdev, 0);
 			ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
 			if (ret) {
 				kfree(irqs);


and let me know if that helps?
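
Editorial sketch (not part of the thread): the regression reported above and the patchlet's fix can be modelled in userspace as follows. With a single SPI and no interrupt-affinity property, using_spi is never set, so the v2 logic attempts the partition lookup on an interrupt that is not per-CPU and fails with -EINVAL. The patchlet instead picks the branch from the interrupt type. struct irq_model, get_partition_model, fallback_v2 and fallback_fixed are hypothetical names, and the model assumes the lookup fails only for interrupts that are not per-CPU.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t cpu_mask_t;       /* bit n set => CPU n is in the mask */

struct irq_model {
	bool percpu;               /* true for a PPI, false for an SPI */
	cpu_mask_t partition;      /* CPUs served by the PPI partition */
};

/* Stand-in for irq_get_percpu_devid_partition() in this model. */
static int get_partition_model(const struct irq_model *irq, cpu_mask_t *mask)
{
	if (!irq->percpu)
		return -EINVAL;
	*mask = irq->partition;
	return 0;
}

/* v2 logic: the branch is chosen from using_spi. */
static int fallback_v2(const struct irq_model *irq, bool using_spi,
		       cpu_mask_t *supported)
{
	if (!using_spi)
		return get_partition_model(irq, supported);
	*supported = ~0u;          /* default to all CPUs */
	return 0;
}

/* Patchlet logic: the branch is chosen from the interrupt type. */
static int fallback_fixed(const struct irq_model *irq, cpu_mask_t *supported)
{
	if (irq->percpu)
		return get_partition_model(irq, supported);
	*supported = ~0u;          /* default to all CPUs */
	return 0;
}
```

For an SPI with using_spi left false, fallback_v2() returns -EINVAL (the "error -22" in the probe log), while fallback_fixed() falls through to the all-CPUs default.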

Geert Uytterhoeven

Jul 19, 2016, 10:30:06 AM
Hi Marc,

On Tue, Jul 19, 2016 at 3:46 PM, Marc Zyngier <marc.z...@arm.com> wrote:
> On 19/07/16 14:25, Geert Uytterhoeven wrote:
>> On Wed, Jul 6, 2016 at 4:33 PM, Marc Zyngier <marc.z...@arm.com> wrote:
>>> On a big-little system, PMUs can be wired to CPUs using per-CPU
>>> interrupts (PPIs). In this case, it is important to make sure that
>>> the enable/disable operations happen on the right set of CPUs.
>>>
>>> So instead of relying on the interrupt-affinity property, we can
>>> use the actual percpu affinity that DT exposes as part of the
>>> interrupt specifier. The DT binding is also updated to reflect
>>> the fact that the interrupt-affinity property shouldn't be used
>>> in that case.
>>>
>>> Signed-off-by: Marc Zyngier <marc.z...@arm.com>
>>> ---
>>> * From v1:
>>> - propagate the error if irq_get_percpu_devid_partition fails
>>
>> This patch, which is commit 19a469a58720ea96 in arm64/for-next/core, broke
>> the PMU on r8a7740/armadillo800eva:
>>
>> -hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
>> +hw perfevents: /pmu: failed to probe PMU!
>> +hw perfevents: /pmu: failed to register PMU devices!
>> +armv7-pmu: probe of pmu failed with error -22

> Thanks for the thorough analysis. Could you please give the following
> patchlet a go:
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 2513365..9275e08 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -958,11 +958,12 @@ static int of_pmu_irq_cfg(struct arm_pmu *pmu)
>
> /* If we didn't manage to parse anything, try the interrupt affinity */
> if (cpumask_weight(&pmu->supported_cpus) == 0) {
> - if (!using_spi) {
> + int irq = platform_get_irq(pdev, 0);
> +
> + if (irq_is_percpu(irq)) {
> /* If using PPIs, check the affinity of the partition */
> - int ret, irq;
> + int ret;
>
> - irq = platform_get_irq(pdev, 0);
> ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
> if (ret) {
> kfree(irqs);
>
>
> and let me know if that helps?

Thanks, that fixes it, as expected.