Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH 0/6] dell-smm-hwmon fixes

319 views
Skip to first unread message

Pali Rohár

unread,
Jun 17, 2016, 7:00:04 PM6/17/16
to
I'm sending all my dell-smm-hwmon patches in one series, because due to
changes in code other patches depends on previous.

First two patches fixes problem with old /proc/i8k file. Second is security
fix and should be backported to all stable kernels (that problem was there
always). I tested i8kctl tool (from i8kutils package) that it still works
with these patches. Without root access for those security operations just
showes '?' or '-1'.

Third and fourth patches try to fix problem on machines with broken
SMM/BIOS when calling function fan_type().

Fifth is new feature and last sixth useful for debugging.

Pali Rohár (6):
hwmon: (dell-smm) Fail in ioctl I8K_BIOS_VERSION when bios version is
not a number
hwmon: (dell-smm) Restrict fan control and serial number to
CAP_SYS_ADMIN by default
hwmon: (dell-smm) Disallow fan_type() calls on broken machines
hwmon: (dell-smm) Cache fan_type() calls and change fan detection
hwmon: (dell-smm) Detect fan with index=2
hwmon: (dell-smm) In debug mode log duration of SMM calls

drivers/hwmon/dell-smm-hwmon.c | 122 ++++++++++++++++++++++++++++++++--------
1 file changed, 99 insertions(+), 23 deletions(-)

--
1.7.9.5

Pali Rohár

unread,
Jun 17, 2016, 7:00:05 PM6/17/16
to
On more Dell machines (e.g. Dell Precision M3800) fan_type() call is too
expensive (CPU is too long in SMM mode) and cause kernel to hang. This is
bug in Dell SMM or BIOS.

This patch caches type for each fan (as it should not change) and changes
the way how fan presense is detected. First it try function fan_status()
as was before commit f989e55452c7 ("i8k: Add support for fan labels"). And
if that fails fallback to fan_type(). *_status() functions can fail in case
fan is not currently accessible (e.g. present on GPU which is currently
turned off).

Reported-by: Tolga Cakir <ceve...@gmail.com>
Signed-off-by: Pali Rohár <pali....@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=112021
Cc: sta...@vger.kernel.org # v4.0+, will need backport
---
drivers/hwmon/dell-smm-hwmon.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index 4bbc587..2ac87d5 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -238,7 +238,7 @@ static int i8k_get_fan_speed(int fan)
/*
* Read the fan type.
*/
-static int i8k_get_fan_type(int fan)
+static int _i8k_get_fan_type(int fan)
{
struct smm_regs regs = { .eax = I8K_SMM_GET_FAN_TYPE, };

@@ -249,6 +249,17 @@ static int i8k_get_fan_type(int fan)
return i8k_smm(&regs) ? : regs.eax & 0xff;
}

+static int i8k_get_fan_type(int fan)
+{
+ /* I8K_SMM_GET_FAN_TYPE SMM call is expensive, so cache values */
+ static int types[2] = { INT_MIN, INT_MIN };
+
+ if (types[fan] == INT_MIN)
+ types[fan] = _i8k_get_fan_type(fan);
+
+ return types[fan];
+}
+
/*
* Read the fan nominal rpm for specific fan speed.
*/
@@ -782,13 +793,17 @@ static int __init i8k_init_hwmon(void)
if (err >= 0)
i8k_hwmon_flags |= I8K_HWMON_HAVE_TEMP4;

- /* First fan attributes, if fan type is OK */
- err = i8k_get_fan_type(0);
+ /* First fan attributes, if fan status or type is OK */
+ err = i8k_get_fan_status(0);
+ if (err < 0)
+ err = i8k_get_fan_type(0);
if (err >= 0)
i8k_hwmon_flags |= I8K_HWMON_HAVE_FAN1;

- /* Second fan attributes, if fan type is OK */
- err = i8k_get_fan_type(1);
+ /* Second fan attributes, if fan status or type is OK */
+ err = i8k_get_fan_status(1);
+ if (err < 0)
+ err = i8k_get_fan_type(1);
if (err >= 0)
i8k_hwmon_flags |= I8K_HWMON_HAVE_FAN2;

--
1.7.9.5

Pali Rohár

unread,
Jun 17, 2016, 7:00:05 PM6/17/16
to
Some Dell machines have especially broken SMM or BIOS which cause that once
fan_type() is called then CPU fan speed going randomly up and down. And for
fixing this behaviour reboot is required.

So this patch creates fan_type blacklist of affected Dell machines and
disallow fan_type() call on them to prevent that erratic behaviour.

Old blacklist which disabled loading driver on some machines added in
commits a4b45b25f18d ("hwmon: (dell-smm) Blacklist Dell Studio XPS 8100")
and 6220f4ebd7b4 ("hwmon: (dell-smm) Blacklist Dell Studio XPS 8000") were
moved to FAN_TYPE blacklist.

Reported-by: Jan C Peters <jcpet...@gmail.com>
Signed-off-by: Pali Rohár <pali....@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=100121
Cc: sta...@vger.kernel.org # v4.0+, will need backport
---
drivers/hwmon/dell-smm-hwmon.c | 36 +++++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index c8bd3fdd..4bbc587 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -73,6 +73,7 @@ static u32 i8k_hwmon_flags;
static uint i8k_fan_mult = I8K_FAN_MULT;
static uint i8k_pwm_mult;
static uint i8k_fan_max = I8K_FAN_HIGH;
+static bool disallow_fan_type_call;

#define I8K_HWMON_HAVE_TEMP1 (1 << 0)
#define I8K_HWMON_HAVE_TEMP2 (1 << 1)
@@ -241,6 +242,9 @@ static int i8k_get_fan_type(int fan)
{
struct smm_regs regs = { .eax = I8K_SMM_GET_FAN_TYPE, };

+ if (disallow_fan_type_call)
+ return -EINVAL;
+
regs.ebx = fan & 0xff;
return i8k_smm(&regs) ? : regs.eax & 0xff;
}
@@ -726,6 +730,9 @@ static struct attribute *i8k_attrs[] = {
static umode_t i8k_is_visible(struct kobject *kobj, struct attribute *attr,
int index)
{
+ if (disallow_fan_type_call &&
+ (index == 9 || index == 12))
+ return 0;
if (index >= 0 && index <= 1 &&
!(i8k_hwmon_flags & I8K_HWMON_HAVE_TEMP1))
return 0;
@@ -937,12 +944,14 @@ static struct dmi_system_id i8k_dmi_table[] __initdata = {

MODULE_DEVICE_TABLE(dmi, i8k_dmi_table);

-static struct dmi_system_id i8k_blacklist_dmi_table[] __initdata = {
+/*
+ * On some machines once I8K_SMM_GET_FAN_TYPE is issued then CPU fan speed
+ * randomly going up and down due to bug in Dell SMM or BIOS. Here is blacklist
+ * of affected Dell machines for which we disallow I8K_SMM_GET_FAN_TYPE call.
+ * See bug: https://bugzilla.kernel.org/show_bug.cgi?id=100121
+ */
+static struct dmi_system_id i8k_blacklist_fan_type_dmi_table[] __initdata = {
{
- /*
- * CPU fan speed going up and down on Dell Studio XPS 8000
- * for unknown reasons.
- */
.ident = "Dell Studio XPS 8000",
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
@@ -950,16 +959,19 @@ static struct dmi_system_id i8k_blacklist_dmi_table[] __initdata = {
},
},
{
- /*
- * CPU fan speed going up and down on Dell Studio XPS 8100
- * for unknown reasons.
- */
.ident = "Dell Studio XPS 8100",
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Studio XPS 8100"),
},
},
+ {
+ .ident = "Dell Inspiron 580",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Inspiron 580 "),
+ },
+ },
{ }
};

@@ -974,8 +986,7 @@ static int __init i8k_probe(void)
/*
* Get DMI information
*/
- if (!dmi_check_system(i8k_dmi_table) ||
- dmi_check_system(i8k_blacklist_dmi_table)) {
+ if (!dmi_check_system(i8k_dmi_table)) {
if (!ignore_dmi && !force)
return -ENODEV;

@@ -986,6 +997,9 @@ static int __init i8k_probe(void)
i8k_get_dmi_data(DMI_BIOS_VERSION));
}

+ if (dmi_check_system(i8k_blacklist_fan_type_dmi_table))
+ disallow_fan_type_call = true;
+
strlcpy(bios_version, i8k_get_dmi_data(DMI_BIOS_VERSION),
sizeof(bios_version));
strlcpy(bios_machineid, i8k_get_dmi_data(DMI_PRODUCT_SERIAL),
--
1.7.9.5

Pali Rohár

unread,
Jun 18, 2016, 8:30:06 AM6/18/16
to
Hi, can you test this patch series on affected Dell Studio XPS and
Inspiron 580 machines? Also on affected Precision machines?

I believe it finally fix those freeze and cpu fan speed problems, but
needs testing...

--
Pali Rohár
pali....@gmail.com

Guenter Roeck

unread,
Jun 18, 2016, 11:20:05 AM6/18/16
to
Pali,

You asked for additional testing, so I am not sure what you expect me to do.

Which of the patches can/should I apply now ?

Guenter

Pali Rohár

unread,
Jun 18, 2016, 11:30:06 AM6/18/16
to
Test 3/6 and 4/6 patches on affected Dell machines. I CCed all people
who tried to debug those bugs, so need confirmation from them that after
applying 3/6 and 4/6 patches, erratic fan behaviour is not there...

But because those two patches depends on previous, it is needed to test
whole series...

--
Pali Rohár
pali....@gmail.com
signature.asc

Guenter Roeck

unread,
Jun 18, 2016, 1:00:09 PM6/18/16
to
This doesn't tell me which patches to apply now. The first two ?

Guenter

Guenter Roeck

unread,
Jun 18, 2016, 4:10:05 PM6/18/16
to
On 06/17/2016 03:54 PM, Pali Rohár wrote:
> Some Dell machines have especially broken SMM or BIOS which cause that once
> fan_type() is called then CPU fan speed going randomly up and down. And for
> fixing this behaviour reboot is required.
>
> So this patch creates fan_type blacklist of affected Dell machines and
> disallow fan_type() call on them to prevent that erratic behaviour.
>
> Old blacklist which disabled loading driver on some machines added in
> commits a4b45b25f18d ("hwmon: (dell-smm) Blacklist Dell Studio XPS 8100")
> and 6220f4ebd7b4 ("hwmon: (dell-smm) Blacklist Dell Studio XPS 8000") were
> moved to FAN_TYPE blacklist.
>
> Reported-by: Jan C Peters <jcpet...@gmail.com>
> Signed-off-by: Pali Rohár <pali....@gmail.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=100121
> Cc: sta...@vger.kernel.org # v4.0+, will need backport

Applied.

It seemed worthwhile to apply this patch now, without waiting for further test
feedback, since it at least reduces the impact of the observed problems.
If follow-up changes are necessary, those will have to be applied in separate
patches.

Guenter

Pali Rohár

unread,
Jun 18, 2016, 6:40:04 PM6/18/16
to
Yes, 1/6 and 2/6 are OK.

--
Pali Rohár
pali....@gmail.com
signature.asc

Pali Rohár

unread,
Jun 18, 2016, 6:50:05 PM6/18/16
to
Thanks for testing! It took too long, but bugs in vendor SMM code are
hard to detect and probably impossible to fix. So I would call this
patch just as "workaround" and not proper bug fix...

On Saturday 18 June 2016 23:58:19 Leon Yu wrote:
> Just installed on "Inspiron 580", appears to have fixed the problem.
--
Pali Rohár
pali....@gmail.com
signature.asc

Pali Rohár

unread,
Jun 18, 2016, 6:50:06 PM6/18/16
to
On Saturday 18 June 2016 22:08:17 Guenter Roeck wrote:
> On 06/17/2016 03:54 PM, Pali Rohár wrote:
> > Some Dell machines have especially broken SMM or BIOS which cause
> > that once fan_type() is called then CPU fan speed going randomly
> > up and down. And for fixing this behaviour reboot is required.
> >
> > So this patch creates fan_type blacklist of affected Dell machines
> > and disallow fan_type() call on them to prevent that erratic
> > behaviour.
> >
> > Old blacklist which disabled loading driver on some machines added
> > in commits a4b45b25f18d ("hwmon: (dell-smm) Blacklist Dell Studio
> > XPS 8100") and 6220f4ebd7b4 ("hwmon: (dell-smm) Blacklist Dell
> > Studio XPS 8000") were moved to FAN_TYPE blacklist.
> >
> > Reported-by: Jan C Peters <jcpet...@gmail.com>
> > Signed-off-by: Pali Rohár <pali....@gmail.com>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=100121
> > Cc: sta...@vger.kernel.org # v4.0+, will need backport
>
> Applied.
>
> It seemed worthwhile to apply this patch now, without waiting for
> further test feedback, since it at least reduces the impact of the
> observed problems. If follow-up changes are necessary, those will
> have to be applied in separate patches.
>
> Guenter

Ok. Meanwhile patch was tested on Inspiron 580, so maybe it could be
appended into commit message.

--
Pali Rohár
pali....@gmail.com
signature.asc

Tolga Cakir

unread,
Jun 18, 2016, 8:10:05 PM6/18/16
to
Tested on a Dell Precision M3800. After applying this patch, the
machine doesn't freeze for a couple of seconds on module load and on
each sensors (lm_sensors) call anymore. It now just freezes once, when
running sensors for the first time, and that's it. It was not possible
to read correct RPM values using sensors before, because the fans were
running at maximum speed, when the freeze occured. I can now read RPM
values correctly and the machine doesn't annoy me with freezing up on
boot anymore.

Tested-by: Tolga Cakir <ceve...@gmail.com>

Pali Rohár

unread,
Jun 20, 2016, 5:20:05 AM6/20/16
to
Guenter, now you can apply whole series + add all tested-by lines from
email threads. Looks like it is OK now.

--
Pali Rohár
pali....@gmail.com

Guenter Roeck

unread,
Jun 20, 2016, 9:30:06 AM6/20/16
to
First three patches applied and sent to Linus last night.

Remaining three patches applied to -next.

Thanks,
Guenter

Michał Kępień

unread,
Jun 22, 2016, 4:10:05 AM6/22/16
to
On a Vostro V131 (BIOS A04):

Tested-by: Michał Kępień <ker...@kempniu.pl>

--
Best regards,
Michał Kępień

Pali Rohár

unread,
Jun 23, 2016, 8:20:05 AM6/23/16
to
Ok, are you going to send at least patch 4/6 to Linus too as it fixes
next bug which should go to -stable too?

--
Pali Rohár
pali....@gmail.com

Guenter Roeck

unread,
Jun 23, 2016, 10:00:07 AM6/23/16
to
Now queued.

Guenter
0 new messages