It has been reported that some BIOSes access the PMU, for obscure things
better done by the OS. Come down hard on this practice and fully disable
the PMU.
[ Boot tested on a westmere system with a sane BIOS -- I don't actually
have an affected system to test this on. ]
Signed-off-by: Peter Zijlstra <a.p.zi...@chello.nl>
---
Robert, do you know if any AMD system BIOSes carry similar
Feat^H^H^H^HFailureAdd?
arch/x86/kernel/cpu/perf_event_intel.c | 35 ++++++++++++++++++++++++++++++++
1 files changed, 35 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index ee05c90..6849653 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -923,6 +923,19 @@ static void intel_clovertown_quirks(void)
x86_pmu.pebs_constraints = NULL;
}
+static void print_BIOS_fail(void)
+{
+ printk(KERN_ERR "\n");
+ printk(KERN_ERR "=============================================\n");
+ printk(KERN_ERR "It appears the BIOS is actively using the PMU\n");
+ printk(KERN_ERR "this prevents Linux from using it; please    \n");
+ printk(KERN_ERR "deactivate this BIOS feature or request a    \n");
+ printk(KERN_ERR "BIOS update from your vendor.                \n");
+ printk(KERN_ERR "=============================================\n");
+
+ memset(&x86_pmu, 0, sizeof(x86_pmu));
+}
+
static __init int intel_pmu_init(void)
{
union cpuid10_edx edx;
@@ -930,6 +943,8 @@ static __init int intel_pmu_init(void)
unsigned int unused;
unsigned int ebx;
int version;
+ u64 val;
+ int i;
if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
@@ -968,6 +983,26 @@ static __init int intel_pmu_init(void)
x86_pmu.num_counters_fixed = max((int)edx.split.num_counters_fixed, 3);
/*
+ * Check to see if the BIOS enabled any of the counters, if so
+ * complain and bail.
+ */
+ for (i = 0; i < x86_pmu.num_counters; i++) {
+ rdmsrl(x86_pmu.eventsel + i, val);
+ if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
+ print_BIOS_fail();
+ return -EBUSY;
+ }
+ }
+
+ for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
+ rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR_CTRL, val);
+ if (val & (0x03 << i*4)) {
+ print_BIOS_fail();
+ return -EBUSY;
+ }
+ }
+
+ /*
* v2 and above have a perf capabilities MSR
*/
if (version > 1) {
Tell us how you really feel :-)
Do you want to add the phone number of the vendor's support line
(based on DMI data, of course)?
I must object to messages that repeatedly (at least on every boot)
tell system administrators to contact their hardware vendor's support
lines, when it's not clear that what the BIOS is doing is incorrect. There
are plenty of valid reasons why BIOS itself would use PMU counters.
Dell PowerEdge server power management, handled by the BIOS, certainly
does use one.
My understanding is that there is a mechanism for the OS to request
BIOS to release use of PMU counters. Are we doing that? If BIOS does
not release the counters when asked, ok, that's something to
(potentially) warn about. But blanket "BIOS is using a CPU feature!
Bad BIOS! No treat for you!" - that's not helpful to anyone.
Thanks,
Matt
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
> I must object to messages that repeatedly (at least on every boot)
> tell system administrators to contact their hardware vendor's support
> lines, when it's not clear that what the BIOS is doing is incorrect.
It is using the PMU, it should not _ever_ do that.
> There
> are plenty of valid reasons why BIOS itself would use PMU counters.
> Dell PowerEdge server power management, handled by the BIOS, certainly
> does use one.
Then make them stop doing that, or at least provide a BIOS option to
disable this Feat^WFailure-add.
And no, doing power-management from the BIOS is most certainly not a
valid reason. That's not what BIOSes are for, a BIOS should bring up the
system and then sod off.
> My understanding is that there is a mechanism for the OS to request
> BIOS to release use of PMU counters. Are we doing that?
I'm not aware of any such thing. The Intel Arch docs most certainly
don't specify anything about that.
> If BIOS does
> not release the counters when asked, ok, that's something to
> (potentially) warn about. But blanket "BIOS is using a CPU feature!
> Bad BIOS! No treat for you!" - that's not helpful to anyone.
Again, I'm not aware there is a spec on how to ask anything of the BIOS,
let alone a part pertaining to PMU functionality.
I would prefer this:
BIOS bug, cpu 1, invalid <register=value> ...
... which puts much better information on one line, explains the bug,
and is easier to parse. I intend to implement messages like
this, so maybe we could find consensus on this or something similar.
A simple grep of dmesg will then give a list of BIOS bugs.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
I was told there is already a macro for this in kernel.h:
#define FW_BUG "[Firmware Bug]: "
So, we would have something like:
[Firmware Bug]: cpu 1, invalid <register=value> ...
in dmesg.
-Robert