[ 202.460967] PM: Syncing filesystems ... done.
[ 202.464818] PM: Preparing system for mem sleep
[ 202.485968] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 202.497079] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 202.508067] PM: Entering mem sleep
[ 202.508086] Suspending console(s) (use no_console_suspend to debug)
[ 202.508451] sd 3:0:0:0: [sdb] Synchronizing SCSI cache
[ 202.508562] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[ 202.508616] sd 3:0:0:0: [sdb] Stopping disk
[ 202.511956] parport_pc 00:0b: disabled
[ 202.512127] serial 00:09: disabled
[ 202.512134] serial 00:09: wake-up capability disabled by ACPI
[ 202.536058] legacy_suspend(): pnp_bus_suspend+0x0/0x82 returns 38
[ 202.536061] PM: Device 00:02 failed to suspend: error 38
[ 202.997517] sd 2:0:0:0: [sda] Stopping disk
[ 202.997806] PM: Some devices failed to suspend
[ 202.998085] sd 2:0:0:0: [sda] Starting disk
[ 202.998144] sd 3:0:0:0: [sdb] Starting disk
[ 202.998614] serial 00:09: activated
[ 202.999158] parport_pc 00:0b: activated
[ 204.543094] PM: resume of devices complete after 1545.282 msecs
[ 204.543268] PM: Finishing wakeup.
[ 204.543270] Restarting tasks ... done.
...error 38 is ENOSYS, and the 00:02 is this:
# cat /sys/bus/pnp/devices/00\:02/id
IFX0102
PNP0c31
That appears to be an Infineon TPM chip:
# modinfo tpm_infineon
filename: /lib/modules/2.6.38.2-8.fc15.x86_64/kernel/drivers/char/tpm/tpm_infineon.ko
license: GPL
version: 1.9.2
description: Driver for Infineon TPM SLD 9630 TT 1.1 / SLB 9635 TT 1.2
author: Marcel Selhorst <m.sel...@sirrix.com>
srcversion: 01A807F04E1D1EC617254C4
alias: acpi*:IFX0102:*
alias: pnp:dIFX0102*
alias: acpi*:IFX0101:*
alias: pnp:dIFX0101*
depends:
vermagic: 2.6.38.2-8.fc15.x86_64 SMP mod_unload
Perhaps it's not being reset correctly on the initial wakeup? I've seen
some other emails about similar problems that were fixed a few releases
ago, but I can reproduce the above behavior with kernels as late as
2.6.38.2. Specifically, the above log is from the most recent kernel I
could find in Fedora koji:
2.6.38.2-8.fc15.x86_64
Let me know if you need other info or need me to test patches.
Thanks,
--
Jeff Layton <jla...@poochiereds.net>
Try the following before and after a suspend/resume:
cd /sys
find . | grep caps$ | xargs cat
It should display manufacturer data.
Stefan
It's using tpm_tis:
lrwxrwxrwx. 1 root root 0 Mar 28 13:40 /sys/bus/pnp/devices/00:02/driver -> ../../../bus/pnp/drivers/tpm_tis
FWIW, the fedora kernels have this:
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS=y
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
When I boot, tpm_infineon is also plugged in, but I can remove that
module and nothing seems to change (not sure what's plugging it in).
I can try using tpm_infineon, but I'm not sure how to disable tpm_tis
with it compiled in like this -- is that possible?
> Try the following before and after a suspend/resume:
>
> cd /sys
> find . | grep caps$ | xargs cat
>
> It should display manufacturer data.
>
There's only one "caps" file. Here's the before (after a fresh reboot):
# cat ./devices/pnp0/00:02/caps
Manufacturer: 0x49465800
TCG version: 1.2
Firmware version: 1.0
...after a successful suspend/resume cycle:
# cat ./devices/pnp0/00:02/caps
...it gives no output at all. Guess that lends some weight to the
theory of it not being reset properly on resume?
Thanks for the help so far...
--
Jeff Layton <jla...@poochiereds.net>
FWIW, I turned up dynamic debugging on the tpm files and got this in
the ring buffer when I tried to read from "caps":
[ 6880.495071] tpm_tis 00:02: A TPM error (38) occurred attempting to determine the manufacturer
I don't see any obvious places that return ENOSYS in the tpm code, so
I'm not clear on where that's coming from...
From drivers/char/tpm/tpm.c,
static ssize_t transmit_cmd(struct tpm_chip *chip, struct tpm_cmd_t *cmd,
			    int len, const char *desc)
{
	int err;

	len = tpm_transmit(chip, (u8 *) cmd, len);
	if (len < 0)
		return len;
	if (len == TPM_ERROR_SIZE) {
		err = be32_to_cpu(cmd->header.out.return_code);
		dev_dbg(chip->dev, "A TPM error (%d) occurred %s\n", err, desc);
		return err;
	}
	return 0;
}
where desc comes from the call
rc = tpm_getcap(dev, TPM_CAP_PROP_MANUFACTURER,
		&cap, "attempting to determine the manufacturer");
TPM_ERROR_SIZE is 10, so it looks like the response satisfies that condition.
        sk
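To make the point about TPM_ERROR_SIZE concrete: a 10-byte response is a bare header (2-byte tag, 4-byte length, 4-byte return code, all big-endian), so the "38" is whatever the chip put in the return-code field. The following is only a user-space sketch of that decoding; the struct and function names are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TPM_HEADER_SIZE 10  /* tag (2) + length (4) + return code (4) */

/* Hypothetical container for the decoded header fields. */
struct tpm_rsp_header {
    uint16_t tag;
    uint32_t length;
    uint32_t return_code;
};

/* Read a big-endian 32-bit value starting at p. */
static uint32_t get_be32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8  | (uint32_t)p[3];
}

/* Decode the response header; returns 0 on success, -1 if buf is too short. */
static int tpm_parse_header(const uint8_t *buf, size_t n,
                            struct tpm_rsp_header *h)
{
    assert(buf != NULL && h != NULL);
    if (n < TPM_HEADER_SIZE)
        return -1;
    h->tag = (uint16_t)((unsigned)buf[0] << 8 | buf[1]);
    h->length = get_be32(buf + 2);
    h->return_code = get_be32(buf + 6);
    return 0;
}
```

Fed the bytes 00 c4 00 00 00 0a 00 00 00 26, this yields tag 0x00c4, length 10 and return code 38, which is the shape of response that takes the dev_dbg() branch above.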
Ahh yeah, I misread the code...
I guess then this error comes from the chip itself? Interesting that it
uses posix errors. Still though, it does seem like it's coming back
from resume in a bad state...
--
Jeff Layton <jla...@poochiereds.net>
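A note on the "posix errors" remark: the match with ENOSYS (38 on Linux) looks like a numeric coincidence. As far as I recall, return code 38 in the TPM 1.2 table is TPM_INVALID_POSTINIT, i.e. a command arrived out of sequence relative to TPM_Startup; treat that value as an assumption and check it against the spec. A small sketch showing the two readings of the same number (the helper names are illustrative only):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed value from the TPM 1.2 return-code table; verify against the spec. */
#define TPM_INVALID_POSTINIT 38u

/* Illustrative helper: name the few TPM return codes this thread cares about. */
static const char *tpm_rc_name(uint32_t rc)
{
    switch (rc) {
    case 0:
        return "TPM_SUCCESS";
    case TPM_INVALID_POSTINIT:
        return "TPM_INVALID_POSTINIT";
    default:
        return "(not listed here)";
    }
}

/* Format both interpretations of rc: as a TPM return code and as an errno. */
static void describe_rc(uint32_t rc, char *out, size_t n)
{
    assert(out != NULL && n > 0);
    snprintf(out, n, "as TPM code: %s; as errno %u: %s",
             tpm_rc_name(rc), (unsigned)rc, strerror((int)rc));
}
```

The errno text comes from the C library, so the second half of the message depends on the platform; only the TPM reading matters for the suspend failure.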
Yup, it looks like the chip is at fault. The chip isn't supplying an
appropriate value for the capability TPM_CAP_PROP_MANUFACTURER (See
page 2 of [1], and compare with drivers/char/tpm/tpm.c). What is not clear
is whether the fault in the chip's response is due to the driver
code (maybe due to a firmware update etc.) or due to actual hardware
corruption.
sk
When the machine suspends (S3), the OS needs to send a
TPM_SaveState() to the TPM. This is done by the Linux driver. Once the
machine resumes, the BIOS is supposed to send a TPM_Startup(ST_STATE) to the TPM.
Now the fun starts when a BIOS isn't doing that (even though the spec
says it's supposed to), which could very well be what is happening here
(don't know what broken BIOSes are out there... Did it ever work before
with the TPM driver in the kernel?). I could try to send you a small
tool that you would have to run from user space upon resume so that we
can see that this error goes away. If that's verified we could
subsequently write a patch for the TPM driver to also send the
TPM_Startup(ST_STATE) to the TPM, which then in the case of most BIOSes
would be the 2nd time that the TPM receives such a command. I think TPMs
should be able to digest this 2nd TPM_Startup() well, but I'd have to
check -- but really we would ill-fix it just because of one (possibly)
buggy BIOS.
The failure of the 2nd suspend then likely stems from the TPM not
accepting the TPM_SaveState() anymore since it hasn't seen the
TPM_Startup(ST_STATE) that we expected the BIOS to send.
Another possibility would be for you to check for BIOS updates from the
laptop manufacturer...
Stefan
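#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* TPM_Startup(ST_STATE) request; all fields are big-endian */
    const uint8_t startup_st_state[] = {
        0x00, 0xc1,             /* tag: request command */
        0x00, 0x00, 0x00, 0x0c, /* total length: 12 bytes */
        0x00, 0x00, 0x00, 0x99, /* ordinal: TPM_Startup */
        0x00, 0x02              /* startup type: ST_STATE */
    };
    uint8_t buf[10];            /* a response without payload is 10 bytes */
    int fd = open("/dev/tpm0", O_RDWR);
    int len;
    int ret = 1;                /* stays nonzero unless the TPM reports success */
    uint32_t err;

    if (fd < 0) {
        printf("Could not open /dev/tpm0\n");
        return 1;
    }

    len = write(fd, startup_st_state, sizeof(startup_st_state));
    if (len != (int)sizeof(startup_st_state)) {
        printf("Write failed.\n");
        goto err_exit;
    }

    len = read(fd, buf, sizeof(buf));
    if (len != (int)sizeof(buf)) {
        printf("Expected %d bytes but got %d\n", (int)sizeof(buf), len);
        goto err_exit;
    }

    /* low byte of the response tag (0x00c4) */
    if (buf[1] != 0xc4) {
        printf("Response tag is bad.\n");
        goto err_exit;
    }

    /* low byte of the length field must match the 10 bytes read */
    if (buf[5] != sizeof(buf)) {
        printf("Response length is bad: %d\n", buf[5]);
        goto err_exit;
    }

    /* assemble the return code; the casts avoid shifting into the sign bit */
    err = (uint32_t)buf[6] << 24 | (uint32_t)buf[7] << 16 |
          (uint32_t)buf[8] << 8 | (uint32_t)buf[9];
    if (err) {
        printf("Got an error code in response: %u\n", err);
    } else {
        printf("Success!\n");
        ret = 0;
    }

err_exit:
    close(fd);
    return ret;
}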
gcc startup.c -o startup
Run it as 'root' after a resume and if that works do the 'cat ...' again.
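For anyone reading the byte array in the tool above: it follows the TPM 1.2 request layout of a 2-byte tag, a 4-byte total length and a 4-byte ordinal, followed by parameters. Below is a rough sketch of a builder for that layout. The Startup ordinal (0x99) and ST_STATE (0x0002) are taken from the tool; the SaveState ordinal (0x98) is my assumption from the TPM 1.2 ordinal table, and the function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TPM_TAG_RQU_COMMAND 0x00c1u
#define TPM_ORD_SAVESTATE   0x98u   /* assumed; not shown anywhere in this thread */
#define TPM_ORD_STARTUP     0x99u   /* as used by the tool above */
#define TPM_REQ_HEADER_SIZE 10u

/* Store v at p in big-endian byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v & 0xff);
}

/*
 * Build a request into out. Returns the total length in bytes,
 * or 0 if the request does not fit into cap bytes.
 */
static size_t tpm_build_cmd(uint8_t *out, size_t cap, uint32_t ordinal,
                            const uint8_t *params, size_t params_len)
{
    size_t total = TPM_REQ_HEADER_SIZE + params_len;

    assert(out != NULL);
    assert(params != NULL || params_len == 0);
    if (total > cap)
        return 0;
    put_be16(out, TPM_TAG_RQU_COMMAND);
    put_be32(out + 2, (uint32_t)total);
    put_be32(out + 6, ordinal);
    if (params_len > 0)
        memcpy(out + TPM_REQ_HEADER_SIZE, params, params_len);
    return total;
}
```

Calling tpm_build_cmd() with TPM_ORD_STARTUP and the two parameter bytes 00 02 reproduces the 12-byte array from the tool; TPM_ORD_SAVESTATE with no parameters gives a 10-byte header-only request.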
Yep. That program fixed the problem. When I run it after a resume, I
can then cat the caps file and get output from it, and the machine will
successfully suspend again.
> Another possibility would be for you to check for BIOS updates from the
> laptop manufacturer...
>
This is actually a desktop machine and the BIOS for the motherboard is
at the latest version, though it is quite old -- 2007/09/01. For the
record this is a:
Foxconn 6150BK8MC
I'm actually not using the TPM in this thing at all. I'd be just as
happy if there were some way to disable it. Unfortunately, the option
in the BIOS to do this doesn't seem to actually work. When I set "TPM
Control" in the BIOS to "Disable" it always ends up reset back to "No
Change". I'd report both problems to the mfr, but this thing is long
out of warranty and I'm pretty sure they won't care.
Is there some way short of recompiling with CONFIG_TCG_* turned off
to disable the TPM driver at boot time?
--
Jeff Layton <jla...@poochiereds.net>
>> Another possibility would be for you to check for BIOS updates from the
>> laptop manufacturer...
>>
> This is actually a desktop machine and the BIOS for the motherboard is
> at the latest version, though it is quite old -- 2007/09/01. For the
> record this is a:
>
> Foxconn 6150BK8MC
>
> I'm actually not using the TPM in this thing at all. I'd be just as
> happy if there were some way to disable it. Unfortunately, the option
> in the BIOS to do this doesn't seem to actually work. When I set "TPM
> Control" in the BIOS to "Disable" it always ends up reset back to "No
> Change". I'd report both problems to the mfr, but this thing is long
> out of warranty and I'm pretty sure they won't care.
>
> Is there some way short of recompiling with CONFIG_TCG_* turned off
> to disable the TPM driver at boot time?
>
As far as I know, 'no'. I'd defer to the maintainers as to how they
would want to solve your particular problem... either by using the above
work-around, which would be more transparent, or by actively having to turn
the driver off with a command line parameter.
Stefan
I'm fine with leaving it enabled as long as it doesn't get in the way
of suspend working. The scheme you mention above -- test the chip and
conditionally do a TPM_Startup() -- seems reasonable to me. Let me know if
you need me to test a patch...
Thanks,
--
Jeff Layton <jla...@poochiereds.net>
I'm handling a patch from Stefan that solves this, so for now
I'd recommend using Stefan's tool.
Thanks,
Rajiv
Well, at least none of the patches I submitted in the series solves this
particular problem.
I am not sure whether this problem should be fixed since it's hopefully
rare. If it were to be fixed, the question is how. Here are a couple
of options:
- declare it a lost case due to broken out-of-spec BIOS -- don't fix it;
machine won't suspend a 2nd time
- send a command to the TPM upon resume and if TPM response returns
with error code 38 set a flag and don't send TPM_SaveState() upon the
next suspend; log this case; the TPM becomes unusable; machine will
suspend a 2nd time
- send a command to the TPM upon resume and if it returns with error
code 38 send TPM_Startup(ST_STATE) -> this masks the BIOS problem; log
this case; TPM stays usable; machine will suspend a 2nd time; a
colleague tells me it may not be 'safe'
Options 2 and 3 would now also run for all the rest of the machines with
a good BIOS...
Stefan
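Options 2 and 3 above differ only in what happens once the probe comes back with error 38. A rough user-space sketch of that decision follows; it is not driver code. The ops struct, the policy enum and the fake TPM are all hypothetical stand-ins, and 38 is assumed to mean "TPM_Startup was never received".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TPM_ERR_NOT_STARTED 38u  /* the "error 38" seen in this thread */

/* Hypothetical transport hooks; each returns the TPM's return code. */
struct tpm_ops {
    uint32_t (*probe)(void *ctx);          /* any harmless command */
    uint32_t (*startup_state)(void *ctx);  /* TPM_Startup(ST_STATE) */
    void *ctx;
};

enum resume_policy {
    POLICY_SKIP_SAVESTATE,  /* option 2: give up on the TPM, keep suspend working */
    POLICY_SEND_STARTUP     /* option 3: do what the BIOS should have done */
};

struct tpm_state {
    bool skip_savestate;    /* checked before sending TPM_SaveState() on suspend */
};

/* Run after resume. Returns 0 if the TPM is usable, -1 otherwise. */
static int tpm_resume_check(const struct tpm_ops *ops,
                            enum resume_policy policy, struct tpm_state *st)
{
    uint32_t rc;

    assert(ops != NULL && st != NULL);
    rc = ops->probe(ops->ctx);
    if (rc != TPM_ERR_NOT_STARTED)
        return rc == 0 ? 0 : -1;

    /* both options log the case */
    fprintf(stderr, "TPM was not started on resume (error %u)\n", (unsigned)rc);
    if (policy == POLICY_SKIP_SAVESTATE) {
        st->skip_savestate = true;
        return -1;
    }
    return ops->startup_state(ops->ctx) == 0 ? 0 : -1;
}

/* Fake TPM for trying the logic out: fails with 38 until it is started. */
struct fake_tpm {
    bool started;
};

static uint32_t fake_probe(void *ctx)
{
    struct fake_tpm *t = ctx;
    return t->started ? 0 : TPM_ERR_NOT_STARTED;
}

static uint32_t fake_startup(void *ctx)
{
    struct fake_tpm *t = ctx;
    t->started = true;
    return 0;
}
```

With the fake TPM, POLICY_SKIP_SAVESTATE leaves the chip unusable but sets the flag so the next suspend can proceed, while POLICY_SEND_STARTUP brings it back. On a machine with a good BIOS the probe returns 0 and neither branch runs, which is the extra cost mentioned for options 2 and 3.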
> - send a command to the TPM upon resume and if TPM response returns
> with error code 38 set a flag and don't send TPM_SaveState() upon the
> next suspend; log this case; the TPM becomes unusable; machine will
> suspend a 2nd time
I assume there is no reasonable way to detect this particular piece of
broken BIOS/hardware? If not, I think this sounds like the best option
to me. If the TPM doesn't work after resume, there's not much we can do other
than accept that it doesn't work and move along...