Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH -mm 3/3] proc: make task_sig() lockless

0 views
Skip to first unread message

Oleg Nesterov

unread,
Mar 22, 2010, 2:50:01 PM3/22/10
to
Now that task->signal can't go away and collect_sigign_sigcatch()
is rcu-safe, task_sig() doesn't need ->siglock.

Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().

Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.

Probably we can change do_task_stat() to avod ->siglock too, except
we can't get tty_nr lockless.

Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.

Signed-off-by: Oleg Nesterov <ol...@redhat.com>
---

fs/proc/array.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)

--- 34-rc1/fs/proc/array.c~PROC_3_TASK_SIG_DONT_USE_SIGLOCK 2010-03-22 17:39:42.000000000 +0100
+++ 34-rc1/fs/proc/array.c 2010-03-22 18:36:13.000000000 +0100
@@ -257,30 +257,24 @@ static void collect_sigign_sigcatch(stru

static inline void task_sig(struct seq_file *m, struct task_struct *p)
{
- unsigned long flags;
sigset_t pending, shpending, blocked, ignored, caught;
int num_threads = 0;
unsigned long qsize = 0;
unsigned long qlim = 0;

- sigemptyset(&pending);
- sigemptyset(&shpending);
- sigemptyset(&blocked);
sigemptyset(&ignored);
sigemptyset(&caught);

- if (lock_task_sighand(p, &flags)) {
- pending = p->pending.signal;
- shpending = p->signal->shared_pending.signal;
- blocked = p->blocked;
- collect_sigign_sigcatch(p, &ignored, &caught);
- num_threads = get_nr_threads(p);
- rcu_read_lock(); /* FIXME: is this correct? */
- qsize = atomic_read(&__task_cred(p)->user->sigpending);
- rcu_read_unlock();
- qlim = task_rlimit(p, RLIMIT_SIGPENDING);
- unlock_task_sighand(p, &flags);
- }
+ blocked = p->blocked;
+ pending = p->pending.signal;
+ shpending = p->signal->shared_pending.signal;
+ qlim = task_rlimit(p, RLIMIT_SIGPENDING);
+ num_threads = get_nr_threads(p);
+
+ rcu_read_lock();
+ collect_sigign_sigcatch(p, &ignored, &caught);
+ qsize = atomic_read(&__task_cred(p)->user->sigpending);
+ rcu_read_unlock();

seq_printf(m, "Threads:\t%d\n", num_threads);
seq_printf(m, "SigQ:\t%lu/%lu\n", qsize, qlim);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

David Howells

unread,
Mar 23, 2010, 4:40:02 AM3/23/10
to
Oleg Nesterov <ol...@redhat.com> wrote:

> task_sig() doesn't need ->siglock.

Except that the data returned might then be inconsistent because you don't
hold a lock as you read the various bits of it.

David

David Howells

unread,
Mar 23, 2010, 4:40:02 AM3/23/10
to
Oleg Nesterov <ol...@redhat.com> wrote:

> Also, remove the "is this correct?" comment. I think it is safe
> to dereference __task_cred(p)->user under rcu lock.

It is.

David

Oleg Nesterov

unread,
Mar 23, 2010, 7:00:02 AM3/23/10
to
On 03/23, David Howells wrote:
>
> Oleg Nesterov <ol...@redhat.com> wrote:
>
> > task_sig() doesn't need ->siglock.
>
> Except that the data returned might then be inconsistent because you don't
> hold a lock as you read the various bits of it.

Yes. From the changelog:

Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.

But I don't think the returned data could be "really" inconsistent
from the /bin/ps pov. Yes, it is possible that, say, some signal is
seen as both pending and ignored without ->siglock. Or we can report
user->sigpending != 0 while pending/shpending are empty.

But this looks harmless to me. We never guaranteed /proc/pid/status
can't report the "intermediate" state, and I don't think we can
confuse the user-space.

Do you agree? Or do you think this can make problems ?

Oleg.

David Howells

unread,
Mar 24, 2010, 4:40:01 AM3/24/10
to
Oleg Nesterov <ol...@redhat.com> wrote:

> > Except that the data returned might then be inconsistent because you don't
> > hold a lock as you read the various bits of it.
>
> Yes. From the changelog:
>
> Of course, this means we read pending/blocked/etc nonatomically,
> but I hope this is OK for fs/proc.

Ah, yes. I read that as you meant how procfs accessed the actual data
structures, not how the user accessed procfs. It might be worth clarifying
that.

> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
>
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
>
> Do you agree? Or do you think this can make problems ?

I don't know of anything this will affect adversely. In fact, I'm not sure
there was a guarantee that it would be atomic anyway.

So as far as I'm concerned, you can add:

Acked-by: David Howells <dhow...@redhat.com>

> > Probably we can change do_task_stat() to avod ->siglock too, except
> > we can't get tty_nr lockless.

Btw, avoid has an 'i' in it... :-)

Oleg Nesterov

unread,
Mar 24, 2010, 11:10:01 AM3/24/10
to
On 03/24, David Howells wrote:
>
> Oleg Nesterov <ol...@redhat.com> wrote:
> >
> > Yes. From the changelog:
> >
> > Of course, this means we read pending/blocked/etc nonatomically,
> > but I hope this is OK for fs/proc.
>
> Ah, yes. I read that as you meant how procfs accessed the actual data
> structures, not how the user accessed procfs. It might be worth clarifying
> that.

OK, agreed.

> Acked-by: David Howells <dhow...@redhat.com>

Thanks,

> > > Probably we can change do_task_stat() to avod ->siglock too, except
>

> Btw, avoid has an 'i' in it... :-)

Another reason to update the changelog ;)


Andrew, please find the updated changelog for proc-make-task_sig-lockless.patch
If this is not convenient, please ignore or tell me what is the "right" way
to fix the changelog when the patch is already in -mm.

------------------------------------------------------------------------------


Now that task->signal can't go away and collect_sigign_sigcatch() is
rcu-safe, task_sig() doesn't need ->siglock.

Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().

Of course, this means we read pending/blocked/etc nonatomically and we
can report this info in some intermediate state. Say, a signal can be
reported as both pending and ignored, or we can report ->sigpending != 0
while pending/shpending are empty, etc. Hopefully this is OK for proc,
we never promised this info should be atomic.

Probably we can change do_task_stat() to avoid ->siglock too, except we


can't get tty_nr lockless.

Also, remove the "is this correct?" comment. I think it is safe to


dereference __task_cred(p)->user under rcu lock. In any case, ->siglock
can't help to protect cred->user.

--

Roland McGrath

unread,
Apr 9, 2010, 4:10:02 PM4/9/10
to
> Yes. From the changelog:
>
> Of course, this means we read pending/blocked/etc nonatomically,
> but I hope this is OK for fs/proc.
>
> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
>
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
>
> Do you agree? Or do you think this can make problems ?

I'm not so sure. Operations like sigprocmask and sigaction really have
always been entirely atomic from the userland perspective before. Now it
becomes possible to read from /proc e.g. a blocked set that never existed
as such (one word updated by sigprocmask but not yet the next word).


Thanks,
Roland

David Howells

unread,
Apr 10, 2010, 4:20:01 AM4/10/10
to
Roland McGrath <rol...@redhat.com> wrote:

> I'm not so sure. Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before. Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).

If you have a small userspace buffer, that was previously possible too.

David

Oleg Nesterov

unread,
Apr 12, 2010, 4:00:02 PM4/12/10
to
On 04/09, Roland McGrath wrote:
>
> > Yes. From the changelog:
> >
> > Of course, this means we read pending/blocked/etc nonatomically,
> > but I hope this is OK for fs/proc.
> >
> > But I don't think the returned data could be "really" inconsistent
> > from the /bin/ps pov. Yes, it is possible that, say, some signal is
> > seen as both pending and ignored without ->siglock. Or we can report
> > user->sigpending != 0 while pending/shpending are empty.
> >
> > But this looks harmless to me. We never guaranteed /proc/pid/status
> > can't report the "intermediate" state, and I don't think we can
> > confuse the user-space.
> >
> > Do you agree? Or do you think this can make problems ?
>
> I'm not so sure. Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before. Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).

Yes, /proc/pid/status can report the intermediate state, I even sent
the updated changelog to document this.

But if you are not sure this is OK, I am worried. Do you think we should
drop this patch? If yes, I won't argue.

Oleg.

Roland McGrath

unread,
Apr 13, 2010, 2:40:01 AM4/13/10
to
> Yes, /proc/pid/status can report the intermediate state, I even sent
> the updated changelog to document this.
>
> But if you are not sure this is OK, I am worried. Do you think we should
> drop this patch? If yes, I won't argue.

I'm not dead-set against it, but I am hesitant. My inclination is not to
remove any previous userland atomicity guarantees with regard to observable
signal state in any form. At least, don't do that in part of a whole
cleanup flurry where it is intermixed with lots of changes that really are
pure cleanup with absolutely no userland-observable change. If it really
helps to fragment what was atomic before, then we can consider it. But
let's not be in a hurry.

David mentioned that users who do multiple reads due to using tiny buffers
already don't get atomic sampling. That is certainly true but I don't
think it's relevant. It is completely reliable that you can easily
allocate a buffer big enough to get all the Sig* fields on the first read,
and any user program that might care about the coherence of the data,
by definition, is already doing that.


Thanks,
Roland

Oleg Nesterov

unread,
Apr 13, 2010, 4:10:02 PM4/13/10
to
OK. Andrew, please drop

proc-make-collect_sigign_sigcatch-rcu-safe.patch
proc-make-task_sig-lockless.patch

On 04/12, Roland McGrath wrote:
>
> > Yes, /proc/pid/status can report the intermediate state, I even sent
> > the updated changelog to document this.
> >
> > But if you are not sure this is OK, I am worried. Do you think we should
> > drop this patch? If yes, I won't argue.
>
> I'm not dead-set against it, but I am hesitant. My inclination is not to
> remove any previous userland atomicity guarantees with regard to observable
> signal state in any form.

OK. Not that I really understand why do we need atomicity, but OK.

I was going to remove ->siglock from /fs/proc/ completely (except
do_io_accounting), but given that nobody replied to do_task_stat patches
this will not happen soon.

> At least, don't do that in part of a whole
> cleanup flurry where it is intermixed with lots of changes that really are
> pure cleanup with absolutely no userland-observable change.

OK. Anyway, these changes are simple, we can reconsider them later.

Oleg.

0 new messages