Signed-off-by: Ripduman Sohan <ripduma...@cl.cam.ac.uk>
---
kernel/workqueue.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 25fb1b0..0a4e785 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2031,6 +2031,7 @@ static int rescuer_thread(void *__wq)
struct list_head *scheduled = &rescuer->scheduled;
bool is_unbound = wq->flags & WQ_UNBOUND;
unsigned int cpu;
+ cpumask_t allowed_cpus = current->cpus_allowed;
set_user_nice(current, RESCUER_NICE_LEVEL);
repeat:
@@ -2078,6 +2079,8 @@ repeat:
spin_unlock_irq(&gcwq->lock);
}
+ set_cpus_allowed_ptr(current, &allowed_cpus);
+
schedule();
goto repeat;
}
--
1.7.1
Except you cannot just allocate a cpumask_t like that on the stack;
those things can be massive.
> set_user_nice(current, RESCUER_NICE_LEVEL);
> repeat:
> @@ -2078,6 +2079,8 @@ repeat:
> spin_unlock_irq(&gcwq->lock);
> }
>
> + set_cpus_allowed_ptr(current, &allowed_cpus);
> +
> schedule();
> goto repeat;
> }
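(For reference, the usual way to avoid a large on-stack cpumask is
cpumask_var_t with alloc_cpumask_var(); with CONFIG_CPUMASK_OFFSTACK the
mask lives on the heap. A sketch only, not part of either patch — the
placement and error handling are assumed:)

	/*
	 * Sketch only: cpumask_var_t is a pointer when
	 * CONFIG_CPUMASK_OFFSTACK is set, so the mask is heap-allocated
	 * and nothing large ends up on the stack; otherwise it falls
	 * back to a plain array.
	 */
	cpumask_var_t saved_mask;

	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
		return -ENOMEM;

	cpumask_copy(saved_mask, &current->cpus_allowed);
	/* ... run bound to particular CPUs ... */
	set_cpus_allowed_ptr(current, saved_mask);
	free_cpumask_var(saved_mask);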
--
v2: Heeded Peter Zijlstra's comments; don't allocate a cpumask_t on the
stack, manipulate the task's cpus_allowed mask directly instead.
Signed-off-by: Ripduman Sohan <ripduma...@cl.cam.ac.uk>
---
kernel/workqueue.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 25fb1b0..29d2ddf 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2078,6 +2078,12 @@ repeat:
spin_unlock_irq(&gcwq->lock);
}
+ if (is_unbound)
+ cpumask_setall(&current->cpus_allowed);
+ else
+ for_each_cwq_cpu(cpu, wq)
+ cpu_set(cpu, current->cpus_allowed);
+
schedule();
goto repeat;
}
--
1.7.1
On Thu, Sep 01, 2011 at 02:36:33PM +0100, Ripduman Sohan wrote:
> Rescuer threads may be migrated (and are bound) to particular CPUs when
> active. However, the allowed_cpus mask is not restored when they return
> to sleep, leaving the reported set of CPUs the process may run on
> inconsistent with the actual set. This patch fixes this oversight by
> recording the allowed_cpus mask for rescuer threads when they enter the
> rescuer_thread() main loop and restoring it every time the thread sleeps.
Hmmm... so, currently, rescuer is left bound to the last cpu it worked
on. Why is this a problem?
Thanks.
--
tejun
On Thu, Sep 15, 2011 at 05:14:30PM +0100, Ripduman Sohan wrote:
> The rescuer being left bound to the last CPU it was active on is not a
> problem. As I pointed out in the commit log, the issue is that the
> allowed_cpus mask is not restored when rescuers return to sleep,
> leaving the reported set of CPUs the process may run on inconsistent
> with the actual set.
>
> Perhaps an explanation is in order. I am working on a system where we
> constantly sample process run-state (including the process
> Cpus_Allowed field in /proc/<pid>/status) to build a forward plan of
> where the process _may_ run in the future. In situations of high
> memory pressure (common on our setup), where the rescuers ran often,
> the plan began to deviate significantly from the calculated schedule
> because rescuer threads were marked as runnable only on a single CPU
> when in reality they would bounce across CPUs.
But cpus_allowed doesn't mean where the task *may* run in the future.
It indicates on which cpus the task is allowed to run *now* and it's
allowed to change.
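(For context, the sampling described above can be as simple as reading
the Cpus_allowed_list field of /proc/<pid>/status from userspace — a
hypothetical sketch, not the actual tooling referred to:)

#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Print the task's currently allowed CPUs.  The value is only a
 * snapshot -- as noted above, the kernel may change it at any time.
 */
static int print_cpus_allowed(pid_t pid)
{
	char path[64], line[256];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "Cpus_allowed_list:", 18))
			fputs(line, stdout);

	fclose(f);
	return 0;
}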
> I've currently put in a special-case exception in our code to account
> for the fact that rescuer threads may run on _any_ CPU regardless of
> the current cpus_allowed mask but I thought it would be useful to
> correct it. I'm happy to continue with my current approach if you
> deem the patch irrelevant.
I'm not necessarily against the patch if it helps a valid use case but
let's do that when and if the use case becomes relevant enough, which
I don't think it is yet. Please feel free to raise the issue again
when the situation changes.
Thank you.
--
tejun
On Sun, Sep 25, 2011 at 09:02:25AM +0300, Gilad Ben-Yossef wrote:
> Thank you for taking the time to answer my query :-)
>
> I agree there is no real problem with having an additional task
> bound to an "isolated" CPU so long as it does not run, and of
> course, if a task on an isolated CPU initiated activity that
> resulted in requiring the services of a rescuer workqueue thread,
> it most certainly needs to run there and that is fine.
>
> I guess my question is: apart from running on the isolated CPU,
> can the fact that the rescuer thread is bound there cause
> activity on that CPU originating from a foreign CPU, such as,
> for example, running an IPI handler in order to migrate it there?
Hmmm... indeed. This can cause an unnecessary wakeup / migration on
an isolated CPU when another CPU asks for the rescuer, so yeah it
makes sense to change the behavior. BTW, why didn't the original
patch simply use set_cpus_allowed_ptr(cpu_all_mask)?
Thanks.
--
tejun
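(For reference, the alternative Tejun mentions would amount to roughly
the following at the point where the rescuer goes back to sleep — a
sketch only, placement assumed; cpu_all_mask and set_cpus_allowed_ptr()
are existing kernel interfaces:)

	/*
	 * Sketch only: reset the rescuer's affinity through the
	 * scheduler API before sleeping, instead of writing to
	 * ->cpus_allowed directly.  cpu_all_mask allows the task on
	 * every possible CPU.
	 */
	set_cpus_allowed_ptr(current, cpu_all_mask);

	schedule();
	goto repeat;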