
[PATCH] sched: make sure sched_child_runs_first WORK


marywangran

Apr 28, 2009, 7:21:52 AM
to linux-...@vger.kernel.org
The CFS scheduler has been the main scheduler since 2.6.23. Everything
is fair: no starvation, no complexity. A new task is no longer simply
queued at the head to quickly preempt current. According to the code
of kernel 2.6.28, if you clear the START_DEBIT bit with sysctl -w
kernel.sched_features=orig_value&~START_DEBIT_bit, the child task does
not always preempt its parent, and the problem is easier to reproduce
if the parent task has a lower nice value. My test file is:

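/*******child_first.c**********/
#define _GNU_SOURCE	/* for CPU_ZERO(), CPU_SET() and sched_setaffinity() */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	cpu_set_t mask;
	int v, i = 90000;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <nice-increment>\n", argv[0]);
		return 1;
	}

	/* Pin to CPU 0 so that parent and child share one runqueue. */
	CPU_ZERO(&mask);
	CPU_SET(0, &mask);
	sched_setaffinity(0, sizeof(mask), &mask);

	v = atoi(argv[1]);
	nice(v);

	/* Burn some CPU time before forking. */
	while (i-- > 0)
		v++;

	if (fork() == 0) {
		printf("sub\n");
		exit(0);
	}
	printf("main,%d\n", v);
	return 0;
}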
Just compile it to child_first and run the following:
[root@zhaoya ~]#sysctl -w kernel.sched_features=0
[root@zhaoya ~]#./child_first -20
[root@zhaoya ~]#./child_first -xx
..
[root@zhaoya ~]#./child_first 10000...
After all this, believe your eyes.
This happens because the code that decides whether the child should
preempt the parent is very LOOSE. If the nice value of the parent is very
low and nr_running is very small, cfs_rq->min_vruntime is always equal
to the vruntime of the parent, so {curr->vruntime <= se->vruntime} and
the strict curr->vruntime < se->vruntime test can fail. If the nice value
is high, cfs_rq->min_vruntime is always less than the parent's vruntime,
so {cfs_rq->min_vruntime <= curr->vruntime}.
Signed-off-by: Ya Zhao <maryw...@gmail.com>
---
--- linux-2.6.28.1/kernel/sched_fair.c.orig 2009-04-28 22:26:00.000000000 +0800
+++ linux-2.6.28.1/kernel/sched_fair.c 2009-04-28 22:34:49.000000000 +0800
@@ -1628,12 +1628,13 @@ static void task_new_fair(struct rq *rq,

/* 'curr' will be NULL if the child belongs to a different group */
if (sysctl_sched_child_runs_first && this_cpu == task_cpu(p) &&
- curr && curr->vruntime < se->vruntime) {
+ 1
/*
* Upon rescheduling, sched_class::put_prev_task() will place
* 'current' within the tree based on its new key value.
*/
- swap(curr->vruntime, se->vruntime);
+ if( curr->vruntime < se->vruntime )
+ swap(curr->vruntime, se->vruntime);
resched_task(rq->curr);
}
--

marywangran

Apr 28, 2009, 7:26:24 AM
to linux-...@vger.kernel.org
This is the new patch; it is simple and it works all right.

+ curr){

Peter Zijlstra

Apr 28, 2009, 7:53:56 AM
to marywangran, linux-...@vger.kernel.org, Ingo Molnar
On Tue, 2009-04-28 at 19:26 +0800, marywangran wrote:


> Signed-off-by: Ya Zhao <maryw...@gmail.com>
> ---
> --- linux-2.6.28.1/kernel/sched_fair.c.orig 2009-04-28
> 22:26:00.000000000 +0800
> +++ linux-2.6.28.1/kernel/sched_fair.c 2009-04-28 22:34:49.000000000 +0800
> @@ -1628,12 +1628,13 @@ static void task_new_fair(struct rq *rq,
>
> /* 'curr' will be NULL if the child belongs to a different group */
> if (sysctl_sched_child_runs_first && this_cpu == task_cpu(p) &&
> - curr && curr->vruntime < se->vruntime) {
> + curr){
> /*
> * Upon rescheduling, sched_class::put_prev_task() will place
> * 'current' within the tree based on its new key value.
> */
> - swap(curr->vruntime, se->vruntime);
> + if( curr->vruntime < se->vruntime )
> + swap(curr->vruntime, se->vruntime);
> resched_task(rq->curr);
> }

Aside from the style issue the patch seems sensible enough.

Thing is, do we really care about child runs first?

Peter Zijlstra

Apr 28, 2009, 10:01:38 AM
to marywangran, linux-...@vger.kernel.org, Ingo Molnar
On Tue, 2009-04-28 at 21:55 +0800, marywangran wrote:
>
>
> 2009/4/28 Peter Zijlstra <pet...@infradead.org>

> On Tue, 2009-04-28 at 19:26 +0800, marywangran wrote:
>
>
> > Signed-off-by: Ya Zhao <maryw...@gmail.com>
> > ---
> > --- linux-2.6.28.1/kernel/sched_fair.c.orig 2009-04-28
> > 22:26:00.000000000 +0800
> > +++ linux-2.6.28.1/kernel/sched_fair.c 2009-04-28 22:34:49.000000000 +0800
> > @@ -1628,12 +1628,13 @@ static void task_new_fair(struct rq *rq,
> >
> > /* 'curr' will be NULL if the child belongs to a different group */
> > if (sysctl_sched_child_runs_first && this_cpu == task_cpu(p) &&
> > - curr && curr->vruntime < se->vruntime) {
> > + curr){
> > /*
> > * Upon rescheduling, sched_class::put_prev_task() will place
> > * 'current' within the tree based on its new key value.
> > */
> > - swap(curr->vruntime, se->vruntime);
> > + if( curr->vruntime < se->vruntime )
> > + swap(curr->vruntime, se->vruntime);
> > resched_task(rq->curr);
> > }
>
>
> Aside from the style issue the patch seems sensible enough.
>
> Thing is, do we really care about child runs first?
>
> But if the child runs last, there may be more copy-on-write. The user
> can disable child-runs-first if he can confirm that the child will not
> do an exec or the like. Now that the kernel provides the policy, why
> implement it halfway?

Sure, I just wanted to raise the issue, child-runs-first doesn't really
work reliably on SMP, and since even embedded is moving to SMP the value
of keeping it around seems to be less each day.

But as long as we do have it, I agree that your patch is wanted.
