Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().
Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.
Probably we can change do_task_stat() to avod ->siglock too, except
we can't get tty_nr lockless.
Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.
Signed-off-by: Oleg Nesterov <ol...@redhat.com>
---
fs/proc/array.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)
--- 34-rc1/fs/proc/array.c~PROC_3_TASK_SIG_DONT_USE_SIGLOCK 2010-03-22 17:39:42.000000000 +0100
+++ 34-rc1/fs/proc/array.c 2010-03-22 18:36:13.000000000 +0100
@@ -257,30 +257,24 @@ static void collect_sigign_sigcatch(stru
static inline void task_sig(struct seq_file *m, struct task_struct *p)
{
- unsigned long flags;
sigset_t pending, shpending, blocked, ignored, caught;
int num_threads = 0;
unsigned long qsize = 0;
unsigned long qlim = 0;
- sigemptyset(&pending);
- sigemptyset(&shpending);
- sigemptyset(&blocked);
sigemptyset(&ignored);
sigemptyset(&caught);
- if (lock_task_sighand(p, &flags)) {
- pending = p->pending.signal;
- shpending = p->signal->shared_pending.signal;
- blocked = p->blocked;
- collect_sigign_sigcatch(p, &ignored, &caught);
- num_threads = get_nr_threads(p);
- rcu_read_lock(); /* FIXME: is this correct? */
- qsize = atomic_read(&__task_cred(p)->user->sigpending);
- rcu_read_unlock();
- qlim = task_rlimit(p, RLIMIT_SIGPENDING);
- unlock_task_sighand(p, &flags);
- }
+ blocked = p->blocked;
+ pending = p->pending.signal;
+ shpending = p->signal->shared_pending.signal;
+ qlim = task_rlimit(p, RLIMIT_SIGPENDING);
+ num_threads = get_nr_threads(p);
+
+ rcu_read_lock();
+ collect_sigign_sigcatch(p, &ignored, &caught);
+ qsize = atomic_read(&__task_cred(p)->user->sigpending);
+ rcu_read_unlock();
seq_printf(m, "Threads:\t%d\n", num_threads);
seq_printf(m, "SigQ:\t%lu/%lu\n", qsize, qlim);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
> task_sig() doesn't need ->siglock.
Except that the data returned might then be inconsistent because you don't
hold a lock as you read the various bits of it.
David
> Also, remove the "is this correct?" comment. I think it is safe
> to dereference __task_cred(p)->user under rcu lock.
It is.
David
Yes. From the changelog:
Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.
But I don't think the returned data could be "really" inconsistent
from the /bin/ps pov. Yes, it is possible that, say, some signal is
seen as both pending and ignored without ->siglock. Or we can report
user->sigpending != 0 while pending/shpending are empty.
But this looks harmless to me. We never guaranteed /proc/pid/status
can't report the "intermediate" state, and I don't think we can
confuse the user-space.
Do you agree? Or do you think this can make problems ?
Oleg.
> > Except that the data returned might then be inconsistent because you don't
> > hold a lock as you read the various bits of it.
>
> Yes. From the changelog:
>
> Of course, this means we read pending/blocked/etc nonatomically,
> but I hope this is OK for fs/proc.
Ah, yes. I read that as you meant how procfs accessed the actual data
structures, not how the user accessed procfs. It might be worth clarifying
that.
> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
>
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
>
> Do you agree? Or do you think this can make problems ?
I don't know of anything this will affect adversely. In fact, I'm not sure
there was a guarantee that it would be atomic anyway.
So as far as I'm concerned, you can add:
Acked-by: David Howells <dhow...@redhat.com>
> > Probably we can change do_task_stat() to avod ->siglock too, except
> > we can't get tty_nr lockless.
Btw, avoid has an 'i' in it... :-)
OK, agreed.
> Acked-by: David Howells <dhow...@redhat.com>
Thanks,
> > > Probably we can change do_task_stat() to avod ->siglock too, except
>
> Btw, avoid has an 'i' in it... :-)
Another reason to update the changelog ;)
Andrew, please find the updated changelog for proc-make-task_sig-lockless.patch
If this is not convenient, please ignore or tell me what is the "right" way
to fix the changelog when the patch is already in -mm.
------------------------------------------------------------------------------
Now that task->signal can't go away and collect_sigign_sigcatch() is
rcu-safe, task_sig() doesn't need ->siglock.
Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().
Of course, this means we read pending/blocked/etc nonatomically and we
can report this info in some intermediate state. Say, a signal can be
reported as both pending and ignored, or we can report ->sigpending != 0
while pending/shpending are empty, etc. Hopefully this is OK for proc,
we never promised this info should be atomic.
Probably we can change do_task_stat() to avoid ->siglock too, except we
can't get tty_nr lockless.
Also, remove the "is this correct?" comment. I think it is safe to
dereference __task_cred(p)->user under rcu lock. In any case, ->siglock
can't help to protect cred->user.
--
I'm not so sure. Operations like sigprocmask and sigaction really have
always been entirely atomic from the userland perspective before. Now it
becomes possible to read from /proc e.g. a blocked set that never existed
as such (one word updated by sigprocmask but not yet the next word).
Thanks,
Roland
> I'm not so sure. Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before. Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).
If you have a small userspace buffer, that was previously possible too.
David
Yes, /proc/pid/status can report the intermediate state, I even sent
the updated changelog to document this.
But if you are not sure this is OK, I am worried. Do you think we should
drop this patch? If yes, I won't argue.
Oleg.
I'm not dead-set against it, but I am hesitant. My inclination is not to
remove any previous userland atomicity guarantees with regard to observable
signal state in any form. At least, don't do that in part of a whole
cleanup flurry where it is intermixed with lots of changes that really are
pure cleanup with absolutely no userland-observable change. If it really
helps to fragment what was atomic before, then we can consider it. But
let's not be in a hurry.
David mentioned that users who do multiple reads due to using tiny buffers
already don't get atomic sampling. That is certainly true but I don't
think it's relevant. It is completely reliable that you can easily
allocate a buffer big enough to get all the Sig* fields on the first read,
and any user program that might care about the coherence of the data,
by definition, is already doing that.
Thanks,
Roland
proc-make-collect_sigign_sigcatch-rcu-safe.patch
proc-make-task_sig-lockless.patch
On 04/12, Roland McGrath wrote:
>
> > Yes, /proc/pid/status can report the intermediate state, I even sent
> > the updated changelog to document this.
> >
> > But if you are not sure this is OK, I am worried. Do you think we should
> > drop this patch? If yes, I won't argue.
>
> I'm not dead-set against it, but I am hesitant. My inclination is not to
> remove any previous userland atomicity guarantees with regard to observable
> signal state in any form.
OK. Not that I really understand why do we need atomicity, but OK.
I was going to remove ->siglock from /fs/proc/ completely (except
do_io_accounting), but given that nobody replied to do_task_stat patches
this will not happen soon.
> At least, don't do that in part of a whole
> cleanup flurry where it is intermixed with lots of changes that really are
> pure cleanup with absolutely no userland-observable change.
OK. Anyway, these changes are simple, we can reconsider them later.
Oleg.