Signed-off-by: Wu Zhangjin <wuzha...@gmail.com>
---
drivers/pci/pci.c | 12 +++++++++++-
1 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 440f4fb..3d9a4e2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -497,8 +497,18 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
/* Mandatory power management transition delays */
/* see PCI PM 1.1 5.6.1 table 18 */
- if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
+
+ /* we can not call msleep() when resume:
+ *
+ * if the current_state is PCI_D3hot, means we are in the procedure
+ * of resuming, in this procedure, we can not re-schedule, otherwise,
+ * there will be a deadlock.
+ */
+
+ if (state == PCI_D3hot)
msleep(pci_pm_d3_delay);
+ else if (dev->current_state == PCI_D3hot)
+ udelay(pci_pm_d3_delay);
else if (state == PCI_D2 || dev->current_state == PCI_D2)
udelay(PCI_PM_D2_DELAY);
--
1.6.2.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
> we can not call msleep() when resuming from STR/Standby: if the
> current_state of the pci device is PCI_D3hot, means we are in the
> procedure of resuming, in this procedure, we can not re-schedule,
> otherwise, there will be a deadlock.
I don't understand.
First of all, why does current_state == PCI_D3hot mean the system is
resuming from sleep? Isn't it possible that the PCI device is going
through a runtime resume?
Secondly, why will scheduling during a resume cause a deadlock?
Alan Stern
traced the dev_name(dev) via serial port, and found it with lspci, it
is:
00:09.0 USB Controller: NEC Corporation USB (rev 44)
and the state here is D0, the current_sate is PCI_D3hot.
> error = resume_device_noirq(dev, state);
> if (error)
> pm_dev_err(dev, state, " early", error);
> }
>
> I tried to add prom_putchar() at <<DEBUG>> to print something, and This
> will make it resume from standby mode successfully. seems,
> prom_putchar() have influenced the power.status, and make some devices
> not enter into the condition statement, and make dpm_power_up return
> directly. (this is very weird, not sure why?)
>
> so, I removed the prom_putchar() from <<DEBUG>>, and it stopped at
> resume_device_noirq, here is the following tracing path:
>
> resume_device_noirq:
> --> pm_noirq_op
> --> ops->resume_noirq (dev) <--> pci_pm_resume_noirq:
> --> pci_pm_default_resume_noirq
> --> pci_restore_standard_config
> --> pci_set_power_state
> --> pci_raw_set_power_state
> --> msleep <-----------------------[ stop here]
>
> msleep:
> --> schedule_timeout_uninterruptible
> --> schedule_timeout
> --> ...
> --> __mod_timer
> --> ...
> --> schedule ---> a new scheduling happen and never return
>
> and then I tried to trace schedule(), and even added a prom_putchar() to
> the end of the schedule() function, it output something successfully,
> but never return to schedule_timeout(dead? no keyboard response), seems
> very weird! this is reproductive, perhaps I have missed something here.
>
> so, to avoid this 'weird' situation, I think it's better not to
> re-schedule in the resuming procedure from standby. and here, I can not
> find another condition to judge the resuming procedure from standby, so,
> I use "current_state == PCI_D3hot"(so, my pre-expression is really wrong
> for it maybe not resume from standby as you indicated). and is there
> another condition to judge we are resuming from standby? perhaps this is
> better:
>
> ((current_state == PCI_D3hot) && (state == PCI_D0))
>
> but seems this also can not indicate we are resuming from standby.
>
> Regards,
> Wu Zhangjin
On Thu, 2009-07-16 at 13:02 -0400, Alan Stern wrote:
> On Thu, 16 Jul 2009, Wu Zhangjin wrote:
>
> > we can not call msleep() when resuming from STR/Standby: if the
> > current_state of the pci device is PCI_D3hot, means we are in the
> > procedure of resuming, in this procedure, we can not re-schedule,
> > otherwise, there will be a deadlock.
>
> I don't understand.
>
> First of all, why does current_state == PCI_D3hot mean the system is
> resuming from sleep? Isn't it possible that the PCI device is going
> through a runtime resume?
>
> Secondly, why will scheduling during a resume cause a deadlock?
Sorry, I'm stupid to make conclusion before describing the problem
dev->power.status = DPM_OFF;
Regards,
Wu Zhangjin
--
'standby' is a system sleep state, while 'current_state' and 'state' refer to
device power states.
Anyway, the fact that schedule() did not return in your testing setup indicates
that there were no timer interrupts delivered to the CPU during resume, which
should not have happened.
Perhaps it's necessary to annotate your timer interrupts appropriately so that
they are not disabled during suspend?
Best,
Rafael
Yeap, when changing from D3 to D0, it is a procedure of resuming from a
sleep state. resuming from 'standby' is such a procedure, but resuming
from a sleep state maybe include the other situations, so, this
judgement is really wrong here.
> Anyway, the fact that schedule() did not return in your testing setup indicates
> that there were no timer interrupts delivered to the CPU during resume, which
> should not have happened.
>
> Perhaps it's necessary to annotate your timer interrupts appropriately so that
> they are not disabled during suspend?
You are right:
I'm using an external timer(cs5536 mfgpt timer), and because this works
well in linux-2.6.27, so, I never doubted with it in 2.6.30.1, I just
using the MIPS timer instead of it, works well. so, It must be the
problem of the mfgpt timer, it maybe disabled or not enabled before
device_power_up(PMSG_RESUME) in 2.6.30.1, I will check it later.
Thanks very much!
Regards,
Wu Zhangjin
Thanks very much, The IRQF_TIMER flag works for me, so the replacement
of msleep by udelay is not needed.
[This source code is not ready to upstream, herein, just a prompt to the
other guys who will meet the same problem.]
[PATCH][loongson] Mark external timer interrupt IRQF_TIMER to prevent
resume failure
This PATCH refers to the commit fee803b2f0c28c78984fc319bd4b88ad47117368
and
the feedback from "Rafael J. Wysocki" <r...@sisk.pl> of my patch:
"[suspend]
pci_raw_set_power_state: replace msleep by udelay in resuming procedure"
"timer interrupts are excluded from being disabled during suspend. The
clock
events code manages the disabling of clock events on its own because the
timer
interrupt needs to be functional before the resume code reenables the
device
interrupts."
The external mfgpt timers request their interrupt without setting the
IRQF_TIMER
flag so suspend_device_irqs() disables them as well which results in a
fatal
resume failure on the boot CPU.
Adding IRQF_TIMER to the interupt flags when requesting the mfgpt timer
interrupts solves the problem.
Signed-off-by: Wu Zhangjin <wuzha...@gmail.com>
---
arch/mips/loongson/common/cs5536/cs5536_mfgpt.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
b/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
index 1a518ec..5b8fa9f 100644
--- a/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
+++ b/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
@@ -133,7 +133,7 @@ static irqreturn_t timer_interrupt(int irq, void
*dev_id)
static struct irqaction irq5 = {
.handler = timer_interrupt,
- .flags = IRQF_DISABLED | IRQF_NOBALANCING,
+ .flags = IRQF_TIMER | IRQF_DISABLED | IRQF_NOBALANCING,
.name = "timer"
};
--
1.6.2.1
Unlike the other patch, this one actually looks ok.
> Adding IRQF_TIMER to the interupt flags when requesting the mfgpt timer
> interrupts solves the problem.
>
> Signed-off-by: Wu Zhangjin <wuzha...@gmail.com>
> ---
> arch/mips/loongson/common/cs5536/cs5536_mfgpt.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
> b/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
> index 1a518ec..5b8fa9f 100644
> --- a/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
> +++ b/arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
> @@ -133,7 +133,7 @@ static irqreturn_t timer_interrupt(int irq, void
> *dev_id)
>
> static struct irqaction irq5 = {
> .handler = timer_interrupt,
> - .flags = IRQF_DISABLED | IRQF_NOBALANCING,
> + .flags = IRQF_TIMER | IRQF_DISABLED | IRQF_NOBALANCING,
> .name = "timer"
> };
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html