
[PATCH] perf, x86: Disable perf if the BIOS got its grubby paws on the PMU


Peter Zijlstra

Sep 3, 2010, 5:20:02 AM
Disable all of perf if we find any active PMCs on boot.

It has been reported that some BIOSes access the PMU, for obscure things
better done by the OS. Come down hard on this practice and fully disable
the PMU.

[ Boot tested on a westmere system with a sane BIOS -- I don't actually
have an affected system to test this on. ]

Signed-off-by: Peter Zijlstra <a.p.zi...@chello.nl>
---

Robert, do you know if any AMD system BIOSes carry similar
Feat^H^H^H^HFailureAdd?

arch/x86/kernel/cpu/perf_event_intel.c | 35 ++++++++++++++++++++++++++++++++
1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index ee05c90..6849653 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -923,6 +923,19 @@ static void intel_clovertown_quirks(void)
 	x86_pmu.pebs_constraints = NULL;
 }

+static void print_BIOS_fail(void)
+{
+	printk(KERN_ERR "\n");
+	printk(KERN_ERR "=============================================\n");
+	printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
+	printk(KERN_ERR "this avoids Linux from using it, please de- \n");
+	printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
+	printk(KERN_ERR "update from your vendor. \n");
+	printk(KERN_ERR "=============================================\n");
+
+	memset(&x86_pmu, 0, sizeof(x86_pmu));
+}
+
 static __init int intel_pmu_init(void)
 {
 	union cpuid10_edx edx;
@@ -930,6 +943,8 @@ static __init int intel_pmu_init(void)
 	unsigned int unused;
 	unsigned int ebx;
 	int version;
+	u64 val;
+	int i;

 	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
 		switch (boot_cpu_data.x86) {
@@ -968,6 +983,26 @@ static __init int intel_pmu_init(void)
 	x86_pmu.num_counters_fixed = max((int)edx.split.num_counters_fixed, 3);

 	/*
+	 * Check to see if the BIOS enabled any of the counters, if so
+	 * complain and bail.
+	 */
+	for (i = 0; i < x86_pmu.num_counters; i++) {
+		rdmsrl(x86_pmu.eventsel + i, val);
+		if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
+			print_BIOS_fail();
+			return -EBUSY;
+		}
+	}
+
+	for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
+		rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR_CTRL, val);
+		if (val & (0x03 << i*4)) {
+			print_BIOS_fail();
+			return -EBUSY;
+		}
+	}
+
+	/*
 	 * v2 and above have a perf capabilities MSR
 	 */
 	if (version > 1) {
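
[ Editor's illustration, not part of the patch: the check added above can
be tried outside the kernel with a small userspace sketch. It takes the
register values as plain arguments instead of reading MSRs, so it only
models the bit tests. The names pmu_in_use() and EVENTSEL_ENABLE are
hypothetical; the bit positions (enable at bit 22 of an event select
register, a 4-bit control field per fixed counter whose low two bits
enable it) are assumed from the architectural perfmon layout. ]

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed to match ARCH_PERFMON_EVENTSEL_ENABLE: bit 22 of an event select MSR. */
#define EVENTSEL_ENABLE (1ULL << 22)

/*
 * Hypothetical helper modelling the patch's two loops.
 * Returns 1 if any general-purpose or fixed counter looks enabled, else 0.
 */
static int pmu_in_use(const uint64_t *eventsel, int num_counters,
                      uint64_t fixed_ctrl, int num_fixed)
{
    int i;

    /* General-purpose counters: one event select value per counter. */
    for (i = 0; i < num_counters; i++) {
        if (eventsel[i] & EVENTSEL_ENABLE)
            return 1;
    }

    /* Fixed counters: one shared control value, 4 bits per counter. */
    for (i = 0; i < num_fixed; i++) {
        if (fixed_ctrl & (0x3ULL << (i * 4)))
            return 1;
    }

    return 0;
}
```

Calling pmu_in_use() with all-zero event selects and a zero control value
reports an idle PMU; setting EVENTSEL_ENABLE in any event select, or the
low two bits of any fixed counter's field, reports it as in use. Unlike
the patch, the sketch takes the fixed control value once rather than
re-reading it on every loop iteration; the result is the same.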


Arjan van de Ven

Sep 3, 2010, 10:00:02 AM
On 9/3/2010 2:13 AM, Peter Zijlstra wrote:
>
> +static void print_BIOS_fail(void)
> +{
> + printk(KERN_ERR "\n");
> + printk(KERN_ERR "=============================================\n");
> + printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
> + printk(KERN_ERR "this avoids Linux from using it, please de- \n");
> + printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
> + printk(KERN_ERR "update from your vendor. \n");
> + printk(KERN_ERR "=============================================\n");
> +
> + memset(&x86_pmu, 0, sizeof(x86_pmu));
> +}
> +
>


tell us how you really feel :-)

do you want to add a phone number of the support line of the vendor
(based on DMI data of course) ?

Matt Domsch

Sep 3, 2010, 10:40:02 AM
On Fri, Sep 03, 2010 at 06:53:59AM -0700, Arjan van de Ven wrote:
> On 9/3/2010 2:13 AM, Peter Zijlstra wrote:
> >
> >+static void print_BIOS_fail(void)
> >+{
> >+ printk(KERN_ERR "\n");
> >+ printk(KERN_ERR "=============================================\n");
> >+ printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
> >+ printk(KERN_ERR "this avoids Linux from using it, please de- \n");
> >+ printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
> >+ printk(KERN_ERR "update from your vendor. \n");
> >+ printk(KERN_ERR "=============================================\n");
> >+
> >+ memset(&x86_pmu, 0, sizeof(x86_pmu));
> >+}
> >+
> >
>
>
> tell us how you really feel :-)
>
> do you want to add a phone number of the support line of the vendor
> (based on DMI data of course) ?

I must object to messages that repeatedly (at least on every boot)
tell system administrators to contact their hardware vendor's support
lines, when it's not clear what the BIOS is doing is incorrect. There
are plenty of valid reasons why BIOS itself would use PMU counters.
Dell PowerEdge server power management, handled by the BIOS, certainly
does use one.

My understanding is that there is a mechanism for the OS to request
BIOS to release use of PMU counters. Are we doing that? If BIOS does
not release the counters when asked, ok, that's something to
(potentially) warn about. But blanket "BIOS is using a CPU feature!
Bad BIOS! No treat for you!" - that's not helpful to anyone.

Thanks,
Matt

--
Matt Domsch
Technology Strategist
Dell | Office of the CTO

Peter Zijlstra

Sep 3, 2010, 10:50:01 AM
On Fri, 2010-09-03 at 09:32 -0500, Matt Domsch wrote:
> On Fri, Sep 03, 2010 at 06:53:59AM -0700, Arjan van de Ven wrote:
> > On 9/3/2010 2:13 AM, Peter Zijlstra wrote:
> > >
> > >+static void print_BIOS_fail(void)
> > >+{
> > >+ printk(KERN_ERR "\n");
> > >+ printk(KERN_ERR "=============================================\n");
> > >+ printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
> > >+ printk(KERN_ERR "this avoids Linux from using it, please de- \n");
> > >+ printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
> > >+ printk(KERN_ERR "update from your vendor. \n");
> > >+ printk(KERN_ERR "=============================================\n");
> > >+
> > >+ memset(&x86_pmu, 0, sizeof(x86_pmu));
> > >+}

> I must object to messages that repeatedly (at least on every boot)
> tell system administrators to contact their hardware vendor's support
> lines, when it's not clear what the BIOS is doing is incorrect.

It is using the PMU, it should not _ever_ do that.

> There
> are plenty of valid reasons why BIOS itself would use PMU counters.
> Dell PowerEdge server power management, handled by the BIOS, certainly
> does use one.

Then make them stop doing that, or at least provide a BIOS option to
disable this Feat^WFailure-add.

And no, doing power-management from the BIOS is most certainly not a
valid reason. That's not what BIOSes are for, a BIOS should bring up the
system and then sod off.

> My understanding is that there is a mechanism for the OS to request
> BIOS to release use of PMU counters. Are we doing that?

I'm not aware of any such thing. The Intel Arch docs most certainly
don't specify anything about that.

> If BIOS does
> not release the counters when asked, ok, that's something to
> (potentially) warn about. But blanket "BIOS is using a CPU feature!
> Bad BIOS! No treat for you!" - that's not helpful to anyone.

Again, I'm not aware there is a spec on how to ask anything of the BIOS,
let alone a part pertaining to PMU functionality.

Robert Richter

Sep 7, 2010, 11:20:03 AM
> > > On 9/3/2010 2:13 AM, Peter Zijlstra wrote:
> > > >+static void print_BIOS_fail(void)
> > > >+{
> > > >+ printk(KERN_ERR "\n");
> > > >+ printk(KERN_ERR "=============================================\n");
> > > >+ printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
> > > >+ printk(KERN_ERR "this avoids Linux from using it, please de- \n");
> > > >+ printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
> > > >+ printk(KERN_ERR "update from your vendor. \n");
> > > >+ printk(KERN_ERR "=============================================\n");

I would rather prefer this:

BIOS bug, cpu 1, invalid <register=value> ...

... which puts much better information on one line, explains the bug,
and is also easier to parse. I intend to implement messages like
this. So maybe we could find consensus on this or something similar.

A simple grep of dmesg will then give a list of BIOS bugs.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center

Robert Richter

Sep 7, 2010, 1:00:02 PM
On 07.09.10 17:15:41, Robert Richter wrote:
> > > > On 9/3/2010 2:13 AM, Peter Zijlstra wrote:
> > > > >+static void print_BIOS_fail(void)
> > > > >+{
> > > > >+ printk(KERN_ERR "\n");
> > > > >+ printk(KERN_ERR "=============================================\n");
> > > > >+ printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
> > > > >+ printk(KERN_ERR "this avoids Linux from using it, please de- \n");
> > > > >+ printk(KERN_ERR "activate this BIOS feature or request a BIOS \n");
> > > > >+ printk(KERN_ERR "update from your vendor. \n");
> > > > >+ printk(KERN_ERR "=============================================\n");
>
> I would rather prefer this:
>
> BIOS bug, cpu 1, invalid <register=value> ...

I was told there is already a macro for this in kernel.h:

#define FW_BUG "[Firmware Bug]: "

So, we would have something like:

[Firmware Bug]: cpu 1, invalid <register=value> ...

in dmesg.

-Robert
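
[ Editor's illustration of the one-line format proposed above, as a
userspace sketch rather than kernel code. The FW_BUG text is the macro
quoted in the message; the helper name format_fw_bug() and the exact
"MSR <register>=<value>" layout are assumptions, since the thread only
gives the placeholder <register=value>. ]

```c
#include <stdint.h>
#include <stdio.h>

/* Same text as the kernel macro quoted above. */
#define FW_BUG "[Firmware Bug]: "

/*
 * Hypothetical helper: format a single-line firmware bug report into buf.
 * Returns whatever snprintf() returns (the length that was needed).
 */
static int format_fw_bug(char *buf, size_t len, int cpu,
                         uint32_t msr, uint64_t val)
{
    return snprintf(buf, len,
                    FW_BUG "cpu %d, invalid MSR %#x=%#llx\n",
                    cpu, (unsigned int)msr, (unsigned long long)val);
}
```

With cpu 1, register 0x186 and value 0x40003c (illustrative numbers only),
the buffer holds "[Firmware Bug]: cpu 1, invalid MSR 0x186=0x40003c". A
fixed prefix like this is what makes the "simple grep of dmesg" mentioned
earlier work, for example by matching on the literal string
"[Firmware Bug]".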
