Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[patch 0/2] suspend/resume regression fixes

8 views
Skip to first unread message

Thomas Gleixner

unread,
Sep 22, 2007, 6:30:11 PM9/22/07
to
Sorry, it took me quite a while to realize the real root cause of the
VAIO - and probably many other machines - suspend/resume regressions,
which were unearthed by the dyntick / clockevents patches.

We disable a lot of ACPI/BIOS functionality during suspend, but we
keep the lower idle C-states functionality active across
suspend/resume. It seems that this causes trouble with certain BIOSes,
but I assume that the problem is more wide spread and just not
surfacing due to the various scenarios in which a machine goes into
suspend/resume. I spent some quality time to figure out a set of debug
mechanisms, which did not influence the problem. So it is quite likely
that a lot of machines might be affected by this, but due to the
configuration, interrupt scenarios, .... the problem just does
not show up.

My final enlightment was, when I removed the ACPI processor module,
which controls the lower idle C-states, right before resume; this
worked fine all the time even without all the workaround hacks.

I really hope that this two patches finally set an end to the "jinxed
VAIO heisenbug series", which started when we removed the periodic
tick with the clockevents/dyntick patches.

Venki, can you please add the analogous fix to the cpuidle patch set ?

Thanks,

tglx
--

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Thomas Gleixner

unread,
Sep 22, 2007, 6:40:08 PM9/22/07
to
clockevents-remove-the-wrong-fix.patch

Thomas Gleixner

unread,
Sep 22, 2007, 6:40:08 PM9/22/07
to
acpi-ignore-cstates-during-suspend.patch

Linus Torvalds

unread,
Sep 22, 2007, 7:10:09 PM9/22/07
to

On Sat, 22 Sep 2007, Thomas Gleixner wrote:
>
> My final enlightment was, when I removed the ACPI processor module,
> which controls the lower idle C-states, right before resume; this
> worked fine all the time even without all the workaround hacks.
>
> I really hope that this two patches finally set an end to the "jinxed
> VAIO heisenbug series", which started when we removed the periodic
> tick with the clockevents/dyntick patches.

Ok, so the patches look fine, but I somehow have this slight feeling that
you gave up a bit too soon on the "*why* does this happen?" question.

I realize that the answer is easily "because ACPI screwed up", but I'm
wondering if there's something we do to trigger that screw-up.

In particular, I also suspect that this may not really fix the problem -
maybe it just makes the window sufficiently small that it no longer
triggers. Because we don't necessarily understand what the real background
for the problem is, I'm not sure we can say that it is solved.

The reason I say this is that I have a suspicion on what triggers it.

I suspect that the problem is that we do

pm_ops->prepare();
disable_nonboot_cpus()
suspend_enter();
enable_nonboot_cpus()
pm_finish()

and here the big thing to notice is that "pm_ops->prepare()" call, which
sets the wakup vector etc etc.

So maybe the real problem here is that once we've done the "->prepare()"
call and ACPI has set up various stuff, we MUST NOT do any calls to any
ACPI routines to set low-power states, because the stupid firmware isn't
expecting it.

Now, if this is the cause, then I think your patch should indeed fix it,
since you get called by the early-suspend code (which happens *before* the
"->prepare()" call), but at the same time, I wonder if maybe it would be
slightly "more correct" to instead of using the suspend/resume callbacks,
simply do this in the "acpi_pm_prepare()" stage, since that is likely the
thing that triggers it?

But hey, I think I'll apply the patches as-is. I'd just feel even better
if we actually understood *why* doing the CPU Cx states is not something
we can do around the suspend code!

Linus

Thomas Gleixner

unread,
Sep 22, 2007, 7:40:09 PM9/22/07
to
Linus,

On Sat, 2007-09-22 at 15:59 -0700, Linus Torvalds wrote:
> > My final enlightment was, when I removed the ACPI processor module,
> > which controls the lower idle C-states, right before resume; this
> > worked fine all the time even without all the workaround hacks.
> >
> > I really hope that this two patches finally set an end to the "jinxed
> > VAIO heisenbug series", which started when we removed the periodic
> > tick with the clockevents/dyntick patches.
>
> Ok, so the patches look fine, but I somehow have this slight feeling that
> you gave up a bit too soon on the "*why* does this happen?" question.

Yeah, I gave up at the point where I was not longer able to dig
deeper :)

> I realize that the answer is easily "because ACPI screwed up", but I'm
> wondering if there's something we do to trigger that screw-up.

Fair enough.

> In particular, I also suspect that this may not really fix the problem -
> maybe it just makes the window sufficiently small that it no longer
> triggers. Because we don't necessarily understand what the real background
> for the problem is, I'm not sure we can say that it is solved.
>
> The reason I say this is that I have a suspicion on what triggers it.
>
> I suspect that the problem is that we do
>
> pm_ops->prepare();
> disable_nonboot_cpus()
> suspend_enter();
> enable_nonboot_cpus()
> pm_finish()
>
> and here the big thing to notice is that "pm_ops->prepare()" call, which
> sets the wakup vector etc etc.
>
> So maybe the real problem here is that once we've done the "->prepare()"
> call and ACPI has set up various stuff, we MUST NOT do any calls to any
> ACPI routines to set low-power states, because the stupid firmware isn't
> expecting it.

That's what I suspect and deduced from the various experiments including
a force the cpu into a lower c-state one, which triggered the problem
fully reproducible. Note that in case of the "force a lower c-state" I
verified, that the PIT was activated to avoid the local apic stops in c3
issue. But I never got an PIT interrupt. Either the box was completely
stuck or I was able to recover by hitting a key, which is as well one of
the unexplained phenomenons.

> Now, if this is the cause, then I think your patch should indeed fix it,
> since you get called by the early-suspend code (which happens *before* the
> "->prepare()" call), but at the same time, I wonder if maybe it would be
> slightly "more correct" to instead of using the suspend/resume callbacks,
> simply do this in the "acpi_pm_prepare()" stage, since that is likely the
> thing that triggers it?

Yeah, probably that's the correct point, but I leave this to the ACPI
wizards.

> But hey, I think I'll apply the patches as-is. I'd just feel even better
> if we actually understood *why* doing the CPU Cx states is not something
> we can do around the suspend code!

That needs some explanation of the folks who can actually look beyond
the ACPI/BIOS internals.

tglx

Oleg Verych

unread,
Sep 22, 2007, 9:10:07 PM9/22/07
to
* Sat, 22 Sep 2007 15:59:25 -0700 (PDT)

As i never had any suspend working ever, let me point this LKML post

http://mid.gmane.org/200709221746.14...@gmail.com

from Mihal, who just managed to do some other magic in sligtly
different context (maybe yet another anti "ACPI screwed up").

Mihai, you can find whole thread by above URL, requesting message-ids
from in-reply-to or references headers of this message.
_____

Linus Torvalds

unread,
Sep 22, 2007, 11:20:08 PM9/22/07
to

On Sun, 23 Sep 2007, Oleg Verych wrote:
>
> As i never had any suspend working ever, let me point this LKML post
>
> http://mid.gmane.org/200709221746.14...@gmail.com
>
> from Mihal, who just managed to do some other magic in sligtly
> different context (maybe yet another anti "ACPI screwed up").
>
> Mihai, you can find whole thread by above URL, requesting message-ids
> from in-reply-to or references headers of this message.

From a "future behaviour" standpoint it would probably be interesting to
hear whether Mihai can make his machine with not with the old IDE layer
(which distributions are migrating away from) but with the ATA layer
(libata) instead. It too should hopefully know about using ACPI to restore
any ATA controller quirks.

Linus

Mihai Donțu

unread,
Sep 23, 2007, 1:30:11 AM9/23/07
to
On Sunday 23 September 2007, Linus Torvalds wrote:
> From a "future behaviour" standpoint it would probably be interesting to
> hear whether Mihai can make his machine with not with the old IDE layer
> (which distributions are migrating away from) but with the ATA layer
> (libata) instead. It too should hopefully know about using ACPI to restore
> any ATA controller quirks.

I switched to libata, but it behaves like the old IDE without ACPI. I
did not manage to get a full dmesg (apparently all volumes are mounted
r/o right after a power up from a s2ram) but I did make a picture, from
which I quote (if I may say so):

"
ata1.00: configured for PIO0
sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 0:0:0:0: [sda] Sense key : 0xb [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
07 65 35 25
sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sda, sector 124073509
ata1: EH complete
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd c5/00:10:75:36:65/00:00:00:00:00/e7 tag 0 cdb 0x0 data 8192 out
res 51/04:10:75:36:65/00:00:00:00:00/e7 Emask 0x1 (device error)
ata1.00: configured for PIO0
ata1: EH complete
"

The last six lines repeat six times after which the whole things goes
from the beginning:
"
ata1.00: configured for PIO0
...
"

It all gets crazy the moment I (or a process) try to access the root
(or any other drive), until then, everything is nice and quiet.

Mmm... in the excerpt above it says: "Write Protect is off" but when I did
$ mount -o remount,rw /
I got something like: "the device is write protected".

I tried to save the dmesg on a mmc, but after powering up it said:
"out of disk space"

These are about all symptoms that I noticed... oh, and 'scsi_eh_0/1'
enters disk-sleep often.

I attached the dmesg pre s2ram.

Thanks,

--
Mihai Donțu

dmesg-pre-s2ram.txt

Rafael J. Wysocki

unread,
Sep 23, 2007, 6:20:09 AM9/23/07
to

I think that this is the case.

> Now, if this is the cause, then I think your patch should indeed fix it,
> since you get called by the early-suspend code (which happens *before* the
> "->prepare()" call), but at the same time, I wonder if maybe it would be
> slightly "more correct" to instead of using the suspend/resume callbacks,
> simply do this in the "acpi_pm_prepare()" stage, since that is likely the
> thing that triggers it?
>
> But hey, I think I'll apply the patches as-is. I'd just feel even better
> if we actually understood *why* doing the CPU Cx states is not something
> we can do around the suspend code!

Greetings,
Rafael

Alan Cox

unread,
Sep 23, 2007, 8:30:22 AM9/23/07
to
> I switched to libata, but it behaves like the old IDE without ACPI. I
> did not manage to get a full dmesg (apparently all volumes are mounted
> r/o right after a power up from a s2ram) but I did make a picture, from
> which I quote (if I may say so):

Device errors.

Libata currently (wrongly IMHO) defaults to avoiding the use of ACPI
suspend/resume methods

So you also need to boot with "libata.noacpi=0" and if that works beat
up Jeff a bit ..

Mihai Donțu

unread,
Sep 23, 2007, 9:10:11 AM9/23/07
to
On Sunday 23 September 2007, Alan Cox wrote:
> > I switched to libata, but it behaves like the old IDE without ACPI. I
> > did not manage to get a full dmesg (apparently all volumes are mounted
> > r/o right after a power up from a s2ram) but I did make a picture, from
> > which I quote (if I may say so):
>
> Device errors.
>
> Libata currently (wrongly IMHO) defaults to avoiding the use of ACPI
> suspend/resume methods
>
> So you also need to boot with "libata.noacpi=0" and if that works beat
> up Jeff a bit ..

You were right, 'libata.noacpi=0' does the trick :)

It's interesting my mmc is unmounted before s2ram and then mounted back
on resume.

So, to sum up, I have a working suspend-to-ram with libata. Not bad, not
bad at all...

Thanks,

--
Mihai Donțu

Matthew Garrett

unread,
Sep 23, 2007, 10:10:11 AM9/23/07
to
On Sat, Sep 22, 2007 at 08:11:06PM -0700, Linus Torvalds wrote:

> From a "future behaviour" standpoint it would probably be interesting to
> hear whether Mihai can make his machine with not with the old IDE layer
> (which distributions are migrating away from) but with the ATA layer
> (libata) instead. It too should hopefully know about using ACPI to restore
> any ATA controller quirks.

It does, but it's disabled by default. libata needs to be loaded with
noacpi=0.

--
Matthew Garrett | mj...@srcf.ucam.org

Mark Lord

unread,
Sep 28, 2007, 4:30:13 PM9/28/07
to
Linus Torvalds wrote:
>
> On Sat, 22 Sep 2007, Thomas Gleixner wrote:
>> My final enlightment was, when I removed the ACPI processor module,
>> which controls the lower idle C-states, right before resume; this
>> worked fine all the time even without all the workaround hacks.
>>
>> I really hope that this two patches finally set an end to the "jinxed
>> VAIO heisenbug series", which started when we removed the periodic
>> tick with the clockevents/dyntick patches.
>
> Ok, so the patches look fine, but I somehow have this slight feeling that
> you gave up a bit too soon on the "*why* does this happen?" question.

On a closely related note: I just now submitted a patch to fix SMP-poweroff,
by having it do disable_nonboot_cpus before doing poweroff.

Which has led me to thinking..
..are similar precautions perhaps necessary for *all* ACPI BIOS calls?

Because one never knows what the other CPUs are doing at the same time,
and what the side effects may be on the ACPI BIOS functions.

And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that already?

-ml

Thomas Gleixner

unread,
Sep 28, 2007, 4:40:07 PM9/28/07
to
On Fri, 2007-09-28 at 16:27 -0400, Mark Lord wrote:
> Linus Torvalds wrote:
> >
> > On Sat, 22 Sep 2007, Thomas Gleixner wrote:
> >> My final enlightment was, when I removed the ACPI processor module,
> >> which controls the lower idle C-states, right before resume; this
> >> worked fine all the time even without all the workaround hacks.
> >>
> >> I really hope that this two patches finally set an end to the "jinxed
> >> VAIO heisenbug series", which started when we removed the periodic
> >> tick with the clockevents/dyntick patches.
> >
> > Ok, so the patches look fine, but I somehow have this slight feeling that
> > you gave up a bit too soon on the "*why* does this happen?" question.
>
> On a closely related note: I just now submitted a patch to fix SMP-poweroff,
> by having it do disable_nonboot_cpus before doing poweroff.
>
> Which has led me to thinking..
> ..are similar precautions perhaps necessary for *all* ACPI BIOS calls?
>
> Because one never knows what the other CPUs are doing at the same time,
> and what the side effects may be on the ACPI BIOS functions.
>
> And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
> only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that already?

The ACPI calls are serialized in the kernel, AFAICT. But the fragile
situations (suspend, resume, shutdown, reboot) are probably those, where
some BIOS implementation expect that certain things are not called or
not active.

tglx

Alan Cox

unread,
Sep 28, 2007, 5:10:09 PM9/28/07
to
> And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
> only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that already?

The real question that matters is "does windows" which possibly someone
who touches Windows can definitively answer. For APM we lock to CPU 0 and
have to because its what MS did, not because of the spec.

Mark Lord

unread,
Sep 28, 2007, 5:20:07 PM9/28/07
to
Thomas Gleixner wrote:
> On Fri, 2007-09-28 at 16:27 -0400, Mark Lord wrote:
..

>> On a closely related note: I just now submitted a patch to fix SMP-poweroff,
>> by having it do disable_nonboot_cpus before doing poweroff.
>>
>> Which has led me to thinking..
>> ..are similar precautions perhaps necessary for *all* ACPI BIOS calls?
>>
>> Because one never knows what the other CPUs are doing at the same time,
>> and what the side effects may be on the ACPI BIOS functions.
>>
>> And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
>> only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that already?
>
> The ACPI calls are serialized in the kernel, AFAICT. But the fragile
> situations (suspend, resume, shutdown, reboot) are probably those, where
> some BIOS implementation expect that certain things are not called or
> not active.

Mmm.. *do* we actually do this for reboot? I don't see it there.
And how about for kexec?

I'm probably just missing seeing it. Right?

Cheers

Rafael J. Wysocki

unread,
Sep 28, 2007, 5:30:11 PM9/28/07
to
On Friday, 28 September 2007 23:17, Mark Lord wrote:
> Thomas Gleixner wrote:
> > On Fri, 2007-09-28 at 16:27 -0400, Mark Lord wrote:
> ..
> >> On a closely related note: I just now submitted a patch to fix SMP-poweroff,
> >> by having it do disable_nonboot_cpus before doing poweroff.
> >>
> >> Which has led me to thinking..
> >> ..are similar precautions perhaps necessary for *all* ACPI BIOS calls?
> >>
> >> Because one never knows what the other CPUs are doing at the same time,
> >> and what the side effects may be on the ACPI BIOS functions.
> >>
> >> And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS calls
> >> only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that already?
> >
> > The ACPI calls are serialized in the kernel, AFAICT. But the fragile
> > situations (suspend, resume, shutdown, reboot) are probably those, where
> > some BIOS implementation expect that certain things are not called or
> > not active.
>
> Mmm.. *do* we actually do this for reboot? I don't see it there.
> And how about for kexec?
>
> I'm probably just missing seeing it. Right?

Nope.

Till now, only hibernation and suspend disabled the nonboot CPUs before
invoking the platform firmware.

Greetings,
Rafael

Bill Davidsen

unread,
Sep 29, 2007, 1:10:08 PM9/29/07
to
Mark Lord wrote:
> Linus Torvalds wrote:
>>
>> On Sat, 22 Sep 2007, Thomas Gleixner wrote:
>>> My final enlightment was, when I removed the ACPI processor module,
>>> which controls the lower idle C-states, right before resume; this
>>> worked fine all the time even without all the workaround hacks.
>>>
>>> I really hope that this two patches finally set an end to the "jinxed
>>> VAIO heisenbug series", which started when we removed the periodic
>>> tick with the clockevents/dyntick patches.
>>
>> Ok, so the patches look fine, but I somehow have this slight feeling
>> that you gave up a bit too soon on the "*why* does this happen?"
>> question.
>
> On a closely related note: I just now submitted a patch to fix
> SMP-poweroff,
> by having it do disable_nonboot_cpus before doing poweroff.
>
> Which has led me to thinking..
> ..are similar precautions perhaps necessary for *all* ACPI BIOS calls?
>
> Because one never knows what the other CPUs are doing at the same time,
> and what the side effects may be on the ACPI BIOS functions.
>
> And also, I wonder if at a minimum we should be guaranteeing ACPI BIOS
> calls
> only ever happen from CPU#0 (or the "boot" CPU)? Or do we do that
> already?
>
Boot CPU, and AFAIK suspend is the only place which does it.

--
Bill Davidsen <davi...@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

Andi Kleen

unread,
Oct 1, 2007, 6:20:09 AM10/1/07
to
Thomas Gleixner <tg...@linutronix.de> writes:
>
> +/*
> + * Suspend / resume control
> + */
> +static int acpi_idle_suspend;
> +
> +int acpi_processor_suspend(struct acpi_device * device, pm_message_t state)
> +{
> + acpi_idle_suspend = 1;
> + return 0;
> +}
> +
> +int acpi_processor_resume(struct acpi_device * device)
> +{
> + acpi_idle_suspend = 0;
> + return 0;
> +}

Instead of the ugly global variable you could you could just change
pm_idle back to NULL (= default_idle) and then later restore it.

-Andi

0 new messages