Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Patch] jbd commit code deadloop when installing Linux

2 views
Skip to first unread message

Zou Nan hai

unread,
Jun 28, 2006, 2:34:52 AM6/28/06
to LKML, Ingo Molnar, Andrew Morton
We see system hang in ext3 jbd code
when Linux install program anaconda copying
packages.

That is because anaconda is invoked from linuxrc
in initrd when system_state is still SYSTEM_BOOTING.

Thus the cond_resched checks in journal_commit_transaction
will always return 1 without actually schedule,
then the system fall into deadloop.

The patch will fix the issue.

Zou Nan hai

Signed-off-by: Zou Nan hai <nanha...@intel.com>


--- linux-2.6.17/fs/jbd/commit.c 2006-06-18 09:49:35.000000000 +0800
+++ b/fs/jbd/commit.c 2006-06-28 14:15:41.000000000 +0800
@@ -625,8 +625,17 @@ wait_for_iobuf:
wait_on_buffer(bh);
goto wait_for_iobuf;
}
- if (cond_resched())
- goto wait_for_iobuf;
+ if (cond_resched()) {
+ if (commit_transaction->t_iobuf_list != NULL) {
+ jh = commit_transaction->t_iobuf_list->b_tprev;
+ bh = jh2bh(jh);
+ if (buffer_locked(bh)) {
+ wait_on_buffer(bh);
+ goto wait_for_iobuf;
+ }
+ } else
+ break;
+ }

if (unlikely(!buffer_uptodate(bh)))
err = -EIO;
@@ -681,8 +690,17 @@ wait_for_iobuf:
wait_on_buffer(bh);
goto wait_for_ctlbuf;
}
- if (cond_resched())
- goto wait_for_ctlbuf;
+ if (cond_resched()) {
+ if(commit_transaction->t_log_list != NULL) {
+ jh = commit_transaction->t_log_list->b_tprev;
+ bh = jh2bh(jh);
+ if (buffer_locked(bh)) {
+ wait_on_buffer(bh);
+ goto wait_for_ctlbuf;
+ }
+ }else
+ break;
+ }

if (unlikely(!buffer_uptodate(bh)))
err = -EIO;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Andrew Morton

unread,
Jun 28, 2006, 2:40:38 AM6/28/06
to Zou Nan hai, linux-...@vger.kernel.org, mi...@elte.hu
On 28 Jun 2006 12:48:43 +0800

Zou Nan hai <nanha...@intel.com> wrote:

> We see system hang in ext3 jbd code
> when Linux install program anaconda copying
> packages.
>
> That is because anaconda is invoked from linuxrc
> in initrd when system_state is still SYSTEM_BOOTING.
>
> Thus the cond_resched checks in journal_commit_transaction
> will always return 1 without actually schedule,
> then the system fall into deadloop.

That's a bug in cond_resched().

Something like this..


--- a/kernel/sched.c~cond_resched-fix
+++ a/kernel/sched.c
@@ -4454,7 +4454,7 @@ asmlinkage long sys_sched_yield(void)
return 0;
}

-static inline void __cond_resched(void)
+static inline int __cond_resched(void)
{
#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
__might_sleep(__FILE__, __LINE__);
@@ -4465,22 +4465,21 @@ static inline void __cond_resched(void)
* cond_resched() call.
*/
if (unlikely(preempt_count()))
- return;
+ return 0;
if (unlikely(system_state != SYSTEM_RUNNING))
- return;
+ return 0;
do {
add_preempt_count(PREEMPT_ACTIVE);
schedule();
sub_preempt_count(PREEMPT_ACTIVE);
} while (need_resched());
+ return 1;
}

int __sched cond_resched(void)
{
- if (need_resched()) {
- __cond_resched();
- return 1;
- }
+ if (need_resched())
+ return __cond_resched();
return 0;
}
EXPORT_SYMBOL(cond_resched);
_

Ingo Molnar

unread,
Jun 28, 2006, 2:44:14 AM6/28/06
to Andrew Morton, Zou Nan hai, linux-...@vger.kernel.org

* Andrew Morton <ak...@osdl.org> wrote:

> > We see system hang in ext3 jbd code
> > when Linux install program anaconda copying
> > packages.
> >
> > That is because anaconda is invoked from linuxrc
> > in initrd when system_state is still SYSTEM_BOOTING.

[ argh ...! ]

> > Thus the cond_resched checks in journal_commit_transaction
> > will always return 1 without actually schedule,
> > then the system fall into deadloop.
>
> That's a bug in cond_resched().
>
> Something like this..

Acked-by: Ingo Molnar <mi...@elte.hu>

Ingo

Andrew Morton

unread,
Jun 28, 2006, 2:55:59 AM6/28/06
to Ingo Molnar, nanha...@intel.com, linux-...@vger.kernel.org
On Wed, 28 Jun 2006 08:38:59 +0200
Ingo Molnar <mi...@elte.hu> wrote:

>
> * Andrew Morton <ak...@osdl.org> wrote:
>
> > > We see system hang in ext3 jbd code
> > > when Linux install program anaconda copying
> > > packages.
> > >
> > > That is because anaconda is invoked from linuxrc
> > > in initrd when system_state is still SYSTEM_BOOTING.
>
> [ argh ...! ]

That's what I thought ;)

> > > Thus the cond_resched checks in journal_commit_transaction
> > > will always return 1 without actually schedule,
> > > then the system fall into deadloop.
> >
> > That's a bug in cond_resched().
> >
> > Something like this..
>
> Acked-by: Ingo Molnar <mi...@elte.hu>
>

Thanks. Zou, it'd be great if you could test this in your setup, please.
I've tagged it as 2.6.17.x material.

Zou Nan hai

unread,
Jun 28, 2006, 3:32:41 AM6/28/06
to Andrew Morton, Ingo Molnar, LKML
On Wed, 2006-06-28 at 14:55, Andrew Morton wrote:
> On Wed, 28 Jun 2006 08:38:59 +0200
> Ingo Molnar <mi...@elte.hu> wrote:
>
> >
> > * Andrew Morton <ak...@osdl.org> wrote:
> >
> > > > We see system hang in ext3 jbd code
> > > > when Linux install program anaconda copying
> > > > packages.
> > > >
> > > > That is because anaconda is invoked from linuxrc
> > > > in initrd when system_state is still SYSTEM_BOOTING.
> >
> > [ argh ...! ]
>
> That's what I thought ;)
>
> > > > Thus the cond_resched checks in journal_commit_transaction
> > > > will always return 1 without actually schedule,
> > > > then the system fall into deadloop.
> > >
> > > That's a bug in cond_resched().
> > >
> > > Something like this..
> >
> > Acked-by: Ingo Molnar <mi...@elte.hu>
> >
>
> Thanks. Zou, it'd be great if you could test this in your setup, please.
> I've tagged it as 2.6.17.x material.

Andrew,
I am building the env to test.
The patch was my original idea, but I was afraid of breaking any code
that rely on the OLD wrong cond_sched semantic. However later I did a
grep found that there is very few code that checks the return value of
cond_resched. So the patch should be safe.

However I think cond_resched_lock and cond_resched_softirq also need fix
to make the semantic consistent.

Please check the following patch.

Zou Nan hai

Signed-off-by: Zou Nan hai <nanha...@intel.com>

--- linux-2.6.17/kernel/sched.c 2006-06-18 09:49:35.000000000 +0800
+++ linux-2.6.17-fix/kernel/sched.c 2006-06-28 13:34:39.000000000 +0800
@@ -4044,7 +4044,7 @@ asmlinkage long sys_sched_yield(void)


return 0;
}

-static inline void __cond_resched(void)
+static inline int __cond_resched(void)
{

/*
* The BKS might be reacquired before we have dropped
@@ -4052,22 +4052,21 @@ static inline void __cond_resched(void)


* cond_resched() call.
*/
if (unlikely(preempt_count()))
- return;
+ return 0;
if (unlikely(system_state != SYSTEM_RUNNING))
- return;
+ return 0;
do {
add_preempt_count(PREEMPT_ACTIVE);
schedule();
sub_preempt_count(PREEMPT_ACTIVE);
} while (need_resched());
+ return 1;
}

int __sched cond_resched(void)
{
- if (need_resched()) {
- __cond_resched();
- return 1;
- }
+ if (need_resched())
+ return __cond_resched();
return 0;
}

@@ -4094,8 +4093,7 @@ int cond_resched_lock(spinlock_t *lock)
if (need_resched()) {
_raw_spin_unlock(lock);
preempt_enable_no_resched();
- __cond_resched();
- ret = 1;
+ ret |= __cond_resched();
spin_lock(lock);
}
return ret;
@@ -4106,14 +4104,13 @@ EXPORT_SYMBOL(cond_resched_lock);
int __sched cond_resched_softirq(void)
{
BUG_ON(!in_softirq());
-
+ int ret = 0;
if (need_resched()) {
__local_bh_enable();
- __cond_resched();
+ ret = __cond_resched();
local_bh_disable();
- return 1;
}
- return 0;
+ return ret;
}

EXPORT_SYMBOL(cond_resched_softirq);

Andrew Morton

unread,
Jun 28, 2006, 3:40:57 AM6/28/06
to Zou Nan hai, mi...@elte.hu, linux-...@vger.kernel.org
On 28 Jun 2006 13:46:22 +0800

Zou Nan hai <nanha...@intel.com> wrote:

> On Wed, 2006-06-28 at 14:55, Andrew Morton wrote:
> > On Wed, 28 Jun 2006 08:38:59 +0200
> > Ingo Molnar <mi...@elte.hu> wrote:
> >
> > >
> > > * Andrew Morton <ak...@osdl.org> wrote:
> > >
> > > > > We see system hang in ext3 jbd code
> > > > > when Linux install program anaconda copying
> > > > > packages.
> > > > >
> > > > > That is because anaconda is invoked from linuxrc
> > > > > in initrd when system_state is still SYSTEM_BOOTING.
> > >
> > > [ argh ...! ]
> >
> > That's what I thought ;)
> >
> > > > > Thus the cond_resched checks in journal_commit_transaction
> > > > > will always return 1 without actually schedule,
> > > > > then the system fall into deadloop.
> > > >
> > > > That's a bug in cond_resched().
> > > >
> > > > Something like this..
> > >
> > > Acked-by: Ingo Molnar <mi...@elte.hu>
> > >
> >
> > Thanks. Zou, it'd be great if you could test this in your setup, please.
> > I've tagged it as 2.6.17.x material.
>
> Andrew,
> I am building the env to test.
> The patch was my original idea, but I was afraid of breaking any code
> that rely on the OLD wrong cond_sched semantic.

We prefer the "right" fix, however painful or risky that might be.

> However later I did a
> grep found that there is very few code that checks the return value of
> cond_resched. So the patch should be safe.

Hope so.

> However I think cond_resched_lock and cond_resched_softirq also need fix
> to make the semantic consistent.
>
> Please check the following patch.
>

Ah. I think the return value from these functions should mean "something
disruptive happened", if you like.

See, the callers of cond_resched_lock() aren't interested in whether
cond_resched_lock() actually called schedule(). They want to know whether
cond_resched_lock() dropped the lock. Because if the lock was dropped, the
caller needs to take some special action, regardless of whether schedule()
was finally called.

So I think the patch I queued is OK, agree?

Ingo Molnar

unread,
Jun 28, 2006, 3:44:33 AM6/28/06
to Zou Nan hai, Andrew Morton, LKML

* Zou Nan hai <nanha...@intel.com> wrote:

> @@ -4094,8 +4093,7 @@ int cond_resched_lock(spinlock_t *lock)

> - __cond_resched();


> - ret = 1;
> + ret |= __cond_resched();

> @@ -4106,14 +4104,13 @@ EXPORT_SYMBOL(cond_resched_lock);

> - __cond_resched();
> + ret = __cond_resched();

Acked-by: Ingo Molnar <mi...@elte.hu>

Ingo

Zou Nan hai

unread,
Jun 28, 2006, 3:49:17 AM6/28/06
to Andrew Morton, mi...@elte.hu, LKML

I am afraid the code like cond_resched_lock check in
fs/jbd/checkpoint.c log_do_checkpoint may fall into endless retry in
some condition, will it?
Though I have not encountered that.

Zou Nan hai

Ingo Molnar

unread,
Jun 28, 2006, 4:00:23 AM6/28/06
to Andrew Morton, Zou Nan hai, linux-...@vger.kernel.org

* Andrew Morton <ak...@osdl.org> wrote:

> > However I think cond_resched_lock and cond_resched_softirq also need fix
> > to make the semantic consistent.
> >
> > Please check the following patch.
> >
>
> Ah. I think the return value from these functions should mean
> "something disruptive happened", if you like.
>
> See, the callers of cond_resched_lock() aren't interested in whether
> cond_resched_lock() actually called schedule(). They want to know
> whether cond_resched_lock() dropped the lock. Because if the lock was
> dropped, the caller needs to take some special action, regardless of
> whether schedule() was finally called.

indeed ...!

> So I think the patch I queued is OK, agree?

yeah.

i think the really-right-fix would be to get rid of that SYSTEM_BOOTING
ugliness though ... I'm quite a bit uneasy about us doing different
things for an initrd app than for fully booted apps.

Ingo

Andrew Morton

unread,
Jun 28, 2006, 4:05:31 AM6/28/06
to Zou Nan hai, mi...@elte.hu, linux-...@vger.kernel.org
On 28 Jun 2006 14:02:57 +0800

Zou Nan hai <nanha...@intel.com> wrote:

> > > However I think cond_resched_lock and cond_resched_softirq also need fix
> > > to make the semantic consistent.
> > >
> > > Please check the following patch.
> > >
> >
> > Ah. I think the return value from these functions should mean "something
> > disruptive happened", if you like.
> >
> > See, the callers of cond_resched_lock() aren't interested in whether
> > cond_resched_lock() actually called schedule(). They want to know whether
> > cond_resched_lock() dropped the lock. Because if the lock was dropped, the
> > caller needs to take some special action, regardless of whether schedule()
> > was finally called.
> >
> > So I think the patch I queued is OK, agree?
>
> I am afraid the code like cond_resched_lock check in
> fs/jbd/checkpoint.c log_do_checkpoint may fall into endless retry in
> some condition, will it?

Oh crap, yes. If need_resched() and system_state==SYSTEM_BOOTING then
cond_resched_lock() will drop the lock but won't schedule. So it'll return
true but won't clear need_resched() and the caller will lock up.

So if cond_resched_foo() ends up dropping the lock it _must_ call
schedule() to clear need_resched().

So, how about this (it needs some code comments!)


diff -puN kernel/sched.c~cond_resched-fix kernel/sched.c
--- a/kernel/sched.c~cond_resched-fix
+++ a/kernel/sched.c
@@ -4386,6 +4386,15 @@ asmlinkage long sys_sched_yield(void)
return 0;
}

+static inline int __resched_legal(void)
+{
+ if (unlikely(preempt_count()))
+ return 0;
+ if (unlikely(system_state != SYSTEM_RUNNING))
+ return 0;
+ return 1;
+}
+
static inline void __cond_resched(void)
{
#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4396,10 +4405,6 @@ static inline void __cond_resched(void)
* PREEMPT_ACTIVE, which could trigger a second
* cond_resched() call.
*/
- if (unlikely(preempt_count()))
- return;
- if (unlikely(system_state != SYSTEM_RUNNING))
- return;
do {
add_preempt_count(PREEMPT_ACTIVE);
schedule();
@@ -4409,13 +4414,12 @@ static inline void __cond_resched(void)



int __sched cond_resched(void)
{
- if (need_resched()) {

+ if (need_resched() && __resched_legal()) {
__cond_resched();
return 1;
}
return 0;
}
-
EXPORT_SYMBOL(cond_resched);

/*
@@ -4436,7 +4440,7 @@ int cond_resched_lock(spinlock_t *lock)
ret = 1;
spin_lock(lock);
}
- if (need_resched()) {
+ if (need_resched() && __resched_legal()) {
_raw_spin_unlock(lock);
preempt_enable_no_resched();
__cond_resched();
@@ -4445,14 +4449,13 @@ int cond_resched_lock(spinlock_t *lock)
}
return ret;
}
-


EXPORT_SYMBOL(cond_resched_lock);

int __sched cond_resched_softirq(void)
{
BUG_ON(!in_softirq());

- if (need_resched()) {
+ if (need_resched() && __resched_legal()) {
__local_bh_enable();
__cond_resched();
local_bh_disable();
@@ -4460,10 +4463,8 @@ int __sched cond_resched_softirq(void)
}
return 0;
}
-
EXPORT_SYMBOL(cond_resched_softirq);

-
/**
* yield - yield the current processor to other threads.
*
_

Zou Nan hai

unread,
Jun 28, 2006, 4:36:49 AM6/28/06
to Andrew Morton, mi...@elte.hu, LKML
On Wed, 2006-06-28 at 16:04, Andrew Morton wrote:
> On 28 Jun 2006 14:02:57 +0800
> Zou Nan hai <nanha...@intel.com> wrote:
>
> > > > However I think cond_resched_lock and cond_resched_softirq also need fix
> > > > to make the semantic consistent.
> > > >
> > > > Please check the following patch.
> > > >
> > >
> > > Ah. I think the return value from these functions should mean "something
> > > disruptive happened", if you like.
> > >
> > > See, the callers of cond_resched_lock() aren't interested in whether
> > > cond_resched_lock() actually called schedule(). They want to know whether
> > > cond_resched_lock() dropped the lock. Because if the lock was dropped, the
> > > caller needs to take some special action, regardless of whether schedule()
> > > was finally called.
> > >
> > > So I think the patch I queued is OK, agree?
> >
> > I am afraid the code like cond_resched_lock check in
> > fs/jbd/checkpoint.c log_do_checkpoint may fall into endless retry in
> > some condition, will it?
>
> Oh crap, yes. If need_resched() and system_state==SYSTEM_BOOTING then
> cond_resched_lock() will drop the lock but won't schedule. So it'll return
> true but won't clear need_resched() and the caller will lock up.
>
> So if cond_resched_foo() ends up dropping the lock it _must_ call
> schedule() to clear need_resched().
>
> So, how about this (it needs some code comments!)
>
>

The patch works for the install test env.
However I still have some concern on cond_resched_lock(), on an UP
kernel it will return 1 if schedule happen, but actually it does not
drop any lock, that semantic seems to be different to SMP kernel.

Though the only code I found that checks the return value of
cond_resched_lock is in checkpoint.c...

Zou Nan hai

Andrew Morton

unread,
Jun 28, 2006, 4:46:18 AM6/28/06
to Zou Nan hai, mi...@elte.hu, linux-...@vger.kernel.org
On 28 Jun 2006 14:50:29 +0800

Thanks.

> However I still have some concern on cond_resched_lock(), on an UP
> kernel it will return 1 if schedule happen, but actually it does not
> drop any lock, that semantic seems to be different to SMP kernel.

That's OK (I think - I don't have a good track record in this thread).

If the kernel is non-preemptible and UP, we want to return true from
cond_resched_foo() if we called schedule(). Because schedule() might allow
a different thread into the kernel which might modify the locked data.

And if the kernel is preemptible and UP, we want to return true from
cond_resched_foo() if we dropped the lock, because that internally does a
preempt_enable().

And the patch (hopefully) satisfies those requirements. Does that all
sound solid?

Zou Nan hai

unread,
Jun 28, 2006, 5:01:26 AM6/28/06
to Andrew Morton, mi...@elte.hu, LKML

Ah yes, I think the logic is solid.

cond_sched_xxx will return 1 only if any thing disruptive really

happen, either dropping a lock or enabling preempt or bh or schedule.

The patch satisfied those requirements.

Thanks
Zou Nan hai

Arjan van de Ven

unread,
Jun 28, 2006, 5:11:31 AM6/28/06
to Andrew Morton, Zou Nan hai, linux-...@vger.kernel.org, mi...@elte.hu
On Tue, 2006-06-27 at 23:40 -0700, Andrew Morton wrote:
> On 28 Jun 2006 12:48:43 +0800
> Zou Nan hai <nanha...@intel.com> wrote:
>
> > We see system hang in ext3 jbd code
> > when Linux install program anaconda copying
> > packages.
> >
> > That is because anaconda is invoked from linuxrc
> > in initrd when system_state is still SYSTEM_BOOTING.
> >
> > Thus the cond_resched checks in journal_commit_transaction
> > will always return 1 without actually schedule,
> > then the system fall into deadloop.
>
> That's a bug in cond_resched().

hummm while this fix is needed... something makes me want to think that
something as complex as an OS installer really shouldn't be running with
SYSTEM_BOOTING state anymore....
shouldn't we start considering running the initrd (esp now with klibc)
as SYSTEM_RUNNING ?

Zou Nan hai

unread,
Jun 28, 2006, 5:19:12 AM6/28/06
to Arjan van de Ven, Andrew Morton, LKML, mi...@elte.hu
On Wed, 2006-06-28 at 17:10, Arjan van de Ven wrote:
> On Tue, 2006-06-27 at 23:40 -0700, Andrew Morton wrote:
> > On 28 Jun 2006 12:48:43 +0800
> > Zou Nan hai <nanha...@intel.com> wrote:
> >
> > > We see system hang in ext3 jbd code
> > > when Linux install program anaconda copying
> > > packages.
> > >
> > > That is because anaconda is invoked from linuxrc
> > > in initrd when system_state is still SYSTEM_BOOTING.
> > >
> > > Thus the cond_resched checks in journal_commit_transaction
> > > will always return 1 without actually schedule,
> > > then the system fall into deadloop.
> >
> > That's a bug in cond_resched().
>
> hummm while this fix is needed... something makes me want to think that
> something as complex as an OS installer really shouldn't be running with
> SYSTEM_BOOTING state anymore....
> shouldn't we start considering running the initrd (esp now with klibc)
> as SYSTEM_RUNNING ?

Is there any historical reason that we put system_state of app runs
from initrd into SYSTEM_BOOTING instead of SYSTEM_RUNNING?


Zou Nan hai

Ingo Molnar

unread,
Jun 28, 2006, 5:34:33 AM6/28/06
to Andrew Morton, Zou Nan hai, linux-...@vger.kernel.org

* Andrew Morton <ak...@osdl.org> wrote:

> > However I still have some concern on cond_resched_lock(), on an UP
> > kernel it will return 1 if schedule happen, but actually it does not
> > drop any lock, that semantic seems to be different to SMP kernel.
>
> That's OK (I think - I don't have a good track record in this thread).
>
> If the kernel is non-preemptible and UP, we want to return true from
> cond_resched_foo() if we called schedule(). Because schedule() might
> allow a different thread into the kernel which might modify the locked
> data.
>
> And if the kernel is preemptible and UP, we want to return true from
> cond_resched_foo() if we dropped the lock, because that internally
> does a preempt_enable().
>
> And the patch (hopefully) satisfies those requirements. Does that all
> sound solid?

it looks solid to me ... but this is subtle stuff and i dont seem to
have a good track record for these details today. I guess if it's
inadequate Zou ought to see them in practice?

Ingo

0 new messages