Account Options

  1. Sign in
Google Groups Home
« Groups Home
eventfd signal race in aio_complete()
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  6 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Jeff Roberson  
View profile  
 More options Mar 7 2008, 10:10 pm
Newsgroups: linux.kernel
From: Jeff Roberson <jrober...@chesapeake.net>
Date: Sat, 08 Mar 2008 04:10:09 +0100
Local: Fri, Mar 7 2008 10:10 pm
Subject: [PATCH] eventfd signal race in aio_complete()

Hello,

I have an application that makes use of eventfd to merge socket and aio
blocking with epoll in one thread.  Under heavy loads the application
sometimes hangs when we receive notification from epoll that the eventfd
has an event ready but reading the aio completions produces no results.
Further investigation revealed that the aiocb was later ready with no
new event and completing it based on a timer resolved the application
hang.

This pointed to the eventfd being signaled prematurely and I verified that
this was indeed the problem.  aio_complete() calls eventfd_signal() before
the event is actually placed on the completion ring.  On a multi-processor
system it is possible to read the event from epoll and return to userspace
before aio_complete() finishes.

The enclosed patch simply moves the signaling to the bottom of the
function.  I'm not 100% familiar with this code and it looks like it may
be possible to have spurious wakeups now but there will be no missed
wakeups.  An application may also race the other way now and receive aio
completion before the signal, thus still leaving it with a signal with no
completion.  signaling while the kioctx is locked would resolve this but I
was hesitant to introduce further nesting of spinlocks that might have
another order elsewhere.

Please keep me in the cc line for any necessary replies.

Thanks,
Jeff

Signed-off-by:  Jeff Roberson <j...@freebsd.org>

  aiorace.diff
1K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Davide Libenzi  
View profile  
 More options Mar 7 2008, 11:40 pm
Newsgroups: linux.kernel
From: Davide Libenzi <davi...@xmailserver.org>
Date: Sat, 08 Mar 2008 05:40:06 +0100
Local: Fri, Mar 7 2008 11:40 pm
Subject: Re: [PATCH] eventfd signal race in aio_complete()
[cc-ing zab]

On Fri, 7 Mar 2008, Jeff Roberson wrote:
> Hello,

Hi!

Your patch access the iocb after the __aio_put_req() call, that can make
the iocb (and the reference to the ki_eventfd) to become invalid. It also
has the spurious wakeup issue (not a biggie, but if it can be avoided).
There're two solutions AFAICS. The first solution/patch get a reference to
the file*, and signal (if the event has really been dropped inside the
ring) and release.
The second solution/patch simply moves the eventfd_signal() call before
the __aio_put_req() call, but after the event has beed "ringed".
We should be clear to go with the shorter/nicer second solution. Those
patches builds, but I'm not even signing them off till I tested them.

- Davide

---
 fs/aio.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

Index: linux-2.6.mod/fs/aio.c
===================================================================
--- linux-2.6.mod.orig/fs/aio.c 2008-03-07 19:33:44.000000000 -0800
+++ linux-2.6.mod/fs/aio.c      2008-03-07 19:45:50.000000000 -0800
@@ -916,7 +916,8 @@
        struct kioctx   *ctx = iocb->ki_ctx;
        struct aio_ring_info    *info;
        struct aio_ring *ring;
-       struct io_event *event;
+       struct io_event *event = NULL;
+       struct file *file = NULL;
        unsigned long   flags;
        unsigned long   tail;
        int             ret;
@@ -937,12 +938,15 @@
        }

        /*
-        * Check if the user asked us to deliver the result through an
-        * eventfd. The eventfd_signal() function is safe to be called
-        * from IRQ context.
+        * Get a reference now, but do not deliver the event until
+        * we're sure we actually dropped it inside the ring. We
+        * need to get a reference before calling __aio_put_req(),
+        * since the ->ki_eventfd may become invalid after such call.
         */
-       if (!IS_ERR(iocb->ki_eventfd))
-               eventfd_signal(iocb->ki_eventfd, 1);
+       if (!IS_ERR(iocb->ki_eventfd)) {
+               file = iocb->ki_eventfd;
+               get_file(file);
+       }

        info = &ctx->ring_info;

@@ -1000,6 +1004,19 @@
                wake_up(&ctx->wait);

        spin_unlock_irqrestore(&ctx->ctx_lock, flags);
+
+       /*
+        * If the user requested us to deliver a completion event to an
+        * eventfd file descriptor *and* we actually delivered the event,
+        * signal it with eventfd_signal(). The eventfd_signal() function
+        * is safe to be called from IRQ context.
+        */
+       if (file) {
+               if (event)
+                       eventfd_signal(file, 1);
+               fput(file);
+       }
+
        return ret;
 }

---
 fs/aio.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux-2.6.mod/fs/aio.c
===================================================================
--- linux-2.6.mod.orig/fs/aio.c 2008-03-07 20:14:55.000000000 -0800
+++ linux-2.6.mod/fs/aio.c      2008-03-07 20:15:24.000000000 -0800
@@ -936,14 +936,6 @@
                return 1;
        }

-       /*
-        * Check if the user asked us to deliver the result through an
-        * eventfd. The eventfd_signal() function is safe to be called
-        * from IRQ context.
-        */
-       if (!IS_ERR(iocb->ki_eventfd))
-               eventfd_signal(iocb->ki_eventfd, 1);
-
        info = &ctx->ring_info;

        /* add a completion event to the ring buffer.
@@ -992,6 +984,15 @@
        kunmap_atomic(ring, KM_IRQ1);

        pr_debug("added to ring %p at [%lu]\n", iocb, tail);
+
+       /*
+        * Check if the user asked us to deliver the result through an
+        * eventfd. The eventfd_signal() function is safe to be called
+        * from IRQ context.
+        */
+       if (!IS_ERR(iocb->ki_eventfd))
+               eventfd_signal(iocb->ki_eventfd, 1);
+
 put_rq:
        /* everything turned out well, dispose of the aiocb. */
        ret = __aio_put_req(ctx, iocb);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rik van Riel  
View profile  
 More options Mar 8 2008, 10:30 am
Newsgroups: linux.kernel
From: Rik van Riel <r...@redhat.com>
Date: Sat, 08 Mar 2008 16:30:15 +0100
Local: Sat, Mar 8 2008 10:30 am
Subject: Re: [PATCH] eventfd signal race in aio_complete()
On Fri, 7 Mar 2008 20:29:20 -0800 (PST)

Davide Libenzi <davi...@xmailserver.org> wrote:
> The second solution/patch simply moves the eventfd_signal() call before
> the __aio_put_req() call, but after the event has beed "ringed".
> We should be clear to go with the shorter/nicer second solution. Those
> patches builds, but I'm not even signing them off till I tested them.

If there are no spinlock ordering issues between &ctx->ctx_lock
and &ctx->wqh.lock (taken inside eventfd_signal), then the second
patch is indeed preferable.

Jeff and I did look at that briefly last night, but were not
familiar enough with the code to decide whether or not that was
safe.

--
All rights reversed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Davide Libenzi  
View profile  
 More options Mar 8 2008, 3:40 pm
Newsgroups: linux.kernel
From: Davide Libenzi <davi...@xmailserver.org>
Date: Sat, 08 Mar 2008 21:40:09 +0100
Local: Sat, Mar 8 2008 3:40 pm
Subject: Re: [PATCH] eventfd signal race in aio_complete()

There's no interlocking between the two, so let's go with #2.
Jeff, would you mind giving patch #2 a spin in your test suite?

- Davide

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jeff Roberson  
View profile  
 More options Mar 8 2008, 4:40 pm
Newsgroups: linux.kernel
From: Jeff Roberson <jrober...@chesapeake.net>
Date: Sat, 08 Mar 2008 22:40:11 +0100
Local: Sat, Mar 8 2008 4:40 pm
Subject: Re: [PATCH] eventfd signal race in aio_complete()

I agree #2 would be best as well.  It may take me a few days due to some
equipment issues but I will get back to you as soon as I can.

Thanks,
Jeff

> - Davide

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Davide Libenzi  
View profile  
 More options Mar 8 2008, 5:00 pm
Newsgroups: linux.kernel
From: Davide Libenzi <davi...@xmailserver.org>
Date: Sat, 08 Mar 2008 23:00:19 +0100
Local: Sat, Mar 8 2008 5:00 pm
Subject: Re: [PATCH] eventfd signal race in aio_complete()

I've just tested it here, and it's fine. My test case is not as complex as
yours though, so I'll wait your feedback before posting. Just remember to
ping me back, otherwise I'll forget to post ;)

- Davide

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »