Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
FileObject and short reads
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  Messages 1 - 25 of 30 - Collapse all  -  Translate all to Translated (View all originals)   Newer >
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Geert Jansen  
View profile  
 More options Jul 21 2012, 6:18 pm
From: Geert Jansen <gee...@gmail.com>
Date: Sat, 21 Jul 2012 15:18:15 -0700 (PDT)
Local: Sat, Jul 21 2012 6:18 pm
Subject: FileObject and short reads

I am using the recently introduced FileObject to do cooperative IO on a
file descriptor. The FileObject.read() method by default will retry reads
until it can satisfy the amount of bytes requested. I think these semantics
are not great for non-blocking IO, at least not for my use case. In my case
i am reading the standard output of a command line process, and i do not
know how many bytes of output will be available.

One workaround would be to read 1 byte every time, which would suck for
performance.

The retry read behavior is actually implemented in Python's _fileobject not
in gevent's FileObject.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Denis Bilenko  
View profile  
 More options Jul 23 2012, 3:15 am
From: Denis Bilenko <denis.bile...@gmail.com>
Date: Mon, 23 Jul 2012 11:15:21 +0400
Local: Mon, Jul 23 2012 3:15 am
Subject: Re: [gevent] FileObject and short reads

On Sun, Jul 22, 2012 at 2:18 AM, Geert Jansen <gee...@gmail.com> wrote:
> I am using the recently introduced FileObject to do cooperative IO on a file
> descriptor. The FileObject.read() method by default will retry reads until
> it can satisfy the amount of bytes requested. I think these semantics are
> not great for non-blocking IO, at least not for my use case.

Well, these are the semantics of a file object in Python.

Take a look at gevent.fileobject.SocketAdapter class which has recv()
method that is like socket's recv.

https://bitbucket.org/denis/gevent/src/8c9722b1fc2a/gevent/fileobject...

Note, that this class is an implementation detail and might be
removed/changed in the future without notice. Nothing stops you from
copying it into your own project though.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 23 2012, 4:51 pm
From: Geert Jansen <gee...@gmail.com>
Date: Mon, 23 Jul 2012 13:51:22 -0700 (PDT)
Local: Mon, Jul 23 2012 4:51 pm
Subject: Re: [gevent] FileObject and short reads

Hi,

On Monday, July 23, 2012 9:15:21 AM UTC+2, Denis Bilenko wrote:

Well, these are the semantics of a file object in Python.

> Take a look at gevent.fileobject.SocketAdapter class which has recv()
> method that is like socket's recv.

> https://bitbucket.org/denis/gevent/src/8c9722b1fc2a/gevent/fileobject...

OK, understand. Actually using FileObject was a workaround for me anyway
since i'm working directly with file descriptors and really don't need/want
a file object

I have attached a patch that adds cooperative versions of os.read() and
os.write() to gevent. The patch also updates monkey.patch_os() to install
these versions. I have tested it on Linux and Mac OSX, both the posix
variants as well as the thread pool variants. I don't have a usable
development system on Windows currently so i could not test that platform.

Any chance this could be added?

Thanks,
Geert

  gevent-fd.patch
5K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Denis Bilenko  
View profile  
 More options Jul 24 2012, 4:23 am
From: Denis Bilenko <denis.bile...@gmail.com>
Date: Tue, 24 Jul 2012 12:23:33 +0400
Local: Tues, Jul 24 2012 4:23 am
Subject: Re: [gevent] FileObject and short reads

Yeah, there is a chance :) I'll test it later. It will be called
gevent.os though.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile   Translate to Translated (View Original)
 More options Jul 25 2012, 4:07 am
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Wed, 25 Jul 2012 01:07:36 -0700 (PDT)
Local: Wed, Jul 25 2012 4:07 am
Subject: Re: [gevent] FileObject and short reads

Wait -- there is a serious problem with the attached posix_read and
posix_write functions when used with things like stdin/stdout file
descriptors. For example, if you attempt to use posix_read() on the
stdin.fileno(), fcntl.fcntl(fd, fcntl.F_SETFL, flags|os.O_NONBLOCK) also
makes stdout non-blocking. This causes writes to stdout to fail
sporadically with something like EWOULDBLOCK. I experienced this on MacOS
(and Linux?) very recently when trying to solve a similar problem. My app
needed to make gevent-friendly reads from stdin, while logging in my app
was configured to go to stdout. So, while my gevent-friendly fd-read
function blocked on the hub waiting for input, other greenlets would
execute and some would make logging calls, which would occasionally result
in an error exception with errno=EWOULDBLOCK because O_NONBLOCK on stdin
also made stdout O_NONBLOCK (stdin and stdout have different file
descriptors (0 and 1), but share fcntl flags, including O_NONBLOCK). My
solution that fixed the problem was to set O_NONBLOCK before os.read() and
restore the original flags in a finally clause immediately after os.read()
inside the loop; this way, the flags will be restored if the non-blocking os_read()
succeeds and posix_read() returns or if there is a greenlet context switch
as the result of hub.wait. Same applies to posix_write().


+def posix_read(fd, n):
+    """Read up to `n` bytes from file descriptor `fd`."""
+    flags = fcntl.fcntl(fd, fcntl.F_GETFL, 0)
+    if not flags & os.O_NONBLOCK:
+        fcntl.fcntl(fd, fcntl.F_SETFL, flags|os.O_NONBLOCK)
+    while True:
+        try:
+            return os_read(fd, n)
+        except OSError:
+            code = sys.exc_info()[1].args[0]
+            if code == EBADF:
+                return ''
+            elif code != EAGAIN:
+                raise
+            sys.exc_clear()
+        hub = get_hub()
+        event = hub.loop.io(fd, 1)
+        try:
+            hub.wait(event)
+        except IOError:
+            code = sys.exc_info()[1].args[0]
+            if code == EBADF:
+                return ''
+            raise

<<


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile   Translate to Translated (View Original)
 More options Jul 25 2012, 4:15 am
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Wed, 25 Jul 2012 01:15:49 -0700 (PDT)
Local: Wed, Jul 25 2012 4:15 am
Subject: Re: [gevent] FileObject and short reads

Here is an example of the solution I described above for the os.write case
(this is inside the while loop):

    # Temporarily set fd in NONBLOCKING mode, so the following write won't
block
    originalFdFlags = _makeFDNonBlocking(fd)
    try:
      numBytesSent += os.write(fd, buffer(string, numBytesSent))
    except socket.error as e:
      if e.args[0] != errno.EWOULDBLOCK:
        raise
    finally:
      # Restore fd flags before we (possibly) block on gevent_select (or in
case
      # of exception), so that other threads/greenlets won't be impacted by
our
      # change to this fd (e.g., it might be stderr or stdout fd)
      _restoreFDFcntlFlags(fd, originalFdFlags)

<<


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 25 2012, 12:07 pm
From: Geert Jansen <gee...@gmail.com>
Date: Wed, 25 Jul 2012 18:07:24 +0200
Local: Wed, Jul 25 2012 12:07 pm
Subject: Re: [gevent] FileObject and short reads

On Wed, Jul 25, 2012 at 10:07 AM, vitaly <vitaly.krugl.nume...@gmail.com> wrote:
>  My solution that fixed the problem was to set
> O_NONBLOCK before os.read() and restore the original flags in a finally
> clause immediately after os.read() inside the loop; this way, the flags will
> be restored if the non-blocking os_read() succeeds and posix_read() returns
> or if there is a greenlet context switch as the result of hub.wait. Same
> applies to posix_write().

Restoring the fd flags before a context switch seem reasonable.
There's an overhead of one extra syscall per read, but that should
probably be fine. We already need at least 2 syscalls anyway (one to
see if the socket is blocking, one for the read), so adding a third
one maybe isn't such a big deal.

Regards,
Geert


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 25 2012, 12:36 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Wed, 25 Jul 2012 09:36:50 -0700 (PDT)
Local: Wed, Jul 25 2012 12:36 pm
Subject: Re: FileObject and short reads
On Jul 25, 9:07 am, Geert Jansen <gee...@gmail.com> wrote:

Right -- it's a necessary overhead since real apps will break without
it. stdin/stdout sharing fcntl flags is but one such example - there
may very well be other pairings like that.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 30 2012, 4:09 am
From: Geert Jansen <gee...@gmail.com>
Date: Mon, 30 Jul 2012 01:09:53 -0700 (PDT)
Local: Mon, Jul 30 2012 4:09 am
Subject: Re: [gevent] FileObject and short reads

Hi,

On Tuesday, July 24, 2012 10:23:33 AM UTC+2, Denis Bilenko wrote:

> Yeah, there is a chance :) I'll test it later. It will be called
> gevent.os though.

Just trying to keep the momentum on this issue... I am trying to create an
updated version of my patch that addresses vitaly's issue and also calls
the module gevent.os like you say above.

However since the "os" module is imported by many gevent submodules, they
would all import the gevent.os instead of the stdlib os. Do you prefer to
expose the entire stdlib's os in gevent.os, or use "from __future__ import
absolute_import" to prevent that?

Regards,
Geert


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Denis Bilenko  
View profile  
 More options Jul 30 2012, 4:28 am
From: Denis Bilenko <denis.bile...@gmail.com>
Date: Mon, 30 Jul 2012 12:28:10 +0400
Local: Mon, Jul 30 2012 4:28 am
Subject: Re: [gevent] FileObject and short reads

On Mon, Jul 30, 2012 at 12:09 PM, Geert Jansen <gee...@gmail.com> wrote:
> Just trying to keep the momentum on this issue... I am trying to create an
> updated version of my patch that addresses vitaly's issue and also calls the
> module gevent.os like you say above.

> However since the "os" module is imported by many gevent submodules, they
> would all import the gevent.os instead of the stdlib os. Do you prefer to
> expose the entire stdlib's os in gevent.os, or use "from __future__ import
> absolute_import" to prevent that?

Even if you expose the entire stdlib's os in gevent.os you'd still
need to access the original os module somehow.

I used to just do __import__('module'). However, now I'd choose absolute_import.

In fact, now that support for python2.4 is dropped, it probably makes
sense to go back and replace all __import__('...') hacks with
absolute_import.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 31 2012, 5:48 am
From: Geert Jansen <gee...@gmail.com>
Date: Tue, 31 Jul 2012 02:48:48 -0700 (PDT)
Local: Tues, Jul 31 2012 5:48 am
Subject: Re: [gevent] FileObject and short reads

Hi,

On Monday, July 30, 2012 10:28:10 AM UTC+2, Denis Bilenko wrote:

Even if you expose the entire stdlib's os in gevent.os you'd still

> need to access the original os module somehow.

> I used to just do __import__('module'). However, now I'd choose
> absolute_import.

> In fact, now that support for python2.4 is dropped, it probably makes
> sense to go back and replace all __import__('...') hacks with
> absolute_import.

Attached is an updated patch. Changes:

 * Implemented the proposed new semantics where the patched os.read() and
os.write() restore file descriptor flags.
 * The patched os.read() and os.write() now only raise an OSError.
Previously they could also raise an IOError which is inconsistent with the
stdlib versions.
 * The module is now called gevent.os instead of gevent.fd. This module
just provides the patched functions, it does not expose the entire stdlib.
 * fork() moved to gevent.os as well. This is a user visible change. Hope
that's still OK given that we're in beta, otherwise a stub can be added to
gevent.hub.
 * Added absolute_import where necessary to prevent other gevent code
importing gevent.os instead of the stdlib os.
 * Cleaned up most __import__ hacks and moved to absolute_import.
 * Updated tests for gevent.os.
 * Make sure gevent.fileobject uses the non-patched os.read/os.write.

I have tested this patch on Linux both with the posix and the threadpool
variants. On my system the test suite runs with 25 errors which all are
there before my patch too.

Regards,
Geert

  gevent-os.patch
15K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 3:49 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 12:49:32 -0700 (PDT)
Local: Tues, Jul 31 2012 3:49 pm
Subject: Re: FileObject and short reads
Hi,

On Jul 31, 2:48 am, Geert Jansen <gee...@gmail.com> wrote:

The following annotated subset of the proposed gevent-os.patch details
serious bugs and os read/write API incompatibilities that I found in
the patch.  I will post the fixed posix_read and posix_write shortly.

Serious Bugs and OS read/write incompatibilities in the proposed patch
are tagged with BUG below; corresponding
fix proposals are tagged with FIX:

+def _wrap_errors(func, *args):
+    """Convert any error to an OSError."""
+    try:
+        return func(*args)
+    except Exception, e:
         BUG1: this breaks OSError, which is defined as (note errno
and strerror
          attributes): "It is raised when a
          function returns a system-related error (not for illegal
argument
          types or other incidental errors). The errno attribute is a
numeric
          error code from errno, and the strerror attribute is the
corresponding
          string, as would be printed by the C function perror(). See
the module
          errno, which contains names for the error codes defined by
the
          underlying operating system."
         BUG2: This also causes loss of the original traceback, making
it
          impossible for user code to determine which lower-level API
call
          actually failed.
         FIX: Don't use _wrap_errors, it is not needed; the actual
exception
          raised by the lower-level API is the acceptable exception.
+        raise OSError(*e.args)
+
+
+def posix_read(fd, n):
+    """Read up to `n` bytes from file descriptor `fd`."""
+    flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+    if not flags & os.O_NONBLOCK:
+        _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags|
os.O_NONBLOCK)
+    try:
+        while True:
+            try:
+                return os_read(fd, n)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: This breaks the definition of os.read, which
is defined
                      as: "Read at most n bytes from file descriptor
fd. Return
                      a string containing the bytes read. If the end
of the file
                      referred to by fd has been reached, an empty
string is
                      returned.".
                     FIX: Remove the "if e[0] == EBADF: return ''"
block. When
                      given a bad file descriptor, os.read raises
                      OSError with errno=EBADF in this case, which is
the
                      correct behavior (try os.read(999999, 10) and
see for yourself);
+                    return ''
+                elif e[0] != EAGAIN:
+                    raise
+                sys.exc_clear()
+            hub = get_hub()
+            event = hub.loop.io(fd, 1)
+            try:
+                _wrap_errors(hub.wait, event)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: same as above EBADF BUG
                     FIX: same as above EBADF FIX; the original
OSError
                      exception with errno=EBADF is the correct
behavior and must
                      NOT be overridden
+                    return ''
+                raise
+    finally:
         BUG: The flags are restored too late; there is a serious
problem with
          these posix_read and posix_write functions when used with
things like
          stdin/stdout file descriptors, or other file descriptors
whose fcntl
          flags are linked in a similar way. For example, if you
attempt to use
          posix_read() on the stdin.fileno(), fcntl.fcntl(fd,
fcntl.F_SETFL, flags|os.O_NONBLOCK)
          also makes stdout non-blocking. This causes writes to stdout
to fail
          sporadically with unexpected errors like EWOULDBLOCK when
attempted
          during gevent context switch while this function is waiting
in hub.wait.
          The converse also fails. I experienced this on MacOS (and
Linux?) very
          recently when trying to solve a similar problem. My app
needed to make
          gevent-friendly reads from stdin, while logging in my app
was configured
          to go to stdout. So, while my gevent-friendly fd-read
function blocked
          on the hub waiting for input, other greenlets would execute
and some
          would make logging calls, which would sporadically and
unexpectedly
          fail  exception with OSError with errno=EWOULDBLOCK because
O_NONBLOCK
          on stdin also made stdout O_NONBLOCK (stdin and stdout have
different
          file descriptors (0 and 1), but share fcntl flags, including
O_NONBLOCK).
         FIX: restore fcntl flags in a finally clause for return
os_write(fd, buf)
          (instead of the outer finally clause, which is not needed)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)
+
+def posix_write(fd, buf):
+    """Write bytes from buffer `buf` to file descriptor `fd`."""
+    flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+    if not flags & os.O_NONBLOCK:
+        _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags|
os.O_NONBLOCK)
+    try:
+        while True:
+            try:
+                return os_write(fd, buf)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: This breaks the definition of os.write,
which is defined
                     as: "Write the string str to file descriptor fd.
Return the
                     number of bytes actually written."
                     FIX: Remove the "if e[0] == EBADF: return ''"
block. When
                      given a bad file descriptor, os.write raises
                      OSError with errno=EBADF in this case, which is
the
                      correct behavior (try os.write(999999, "abc")
and see for yourself);
+                    return ''
+                elif e[0] != EAGAIN:
+                    raise
+                sys.exc_clear()
+            hub = get_hub()
+            event = hub.loop.io(fd, 2)
+            try:
+                _wrap_errors(hub.wait, event)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: same as above EBADF BUG
                     FIX: same as above EBADF FIX; the original
OSError
                      exception with errno=EBADF is the correct
behavior and must
                      NOT be overridden
+                    return ''
+                raise
+    finally:
         BUG: The flags are restored too late; there is a serious
problem with
          these posix_read and posix_write functions when used with
things like
          stdin/stdout file descriptors, or other file descriptors
whose fcntl
          flags are linked in a similar way. For example, if you
attempt to use
          posix_read() on the stdin.fileno(), fcntl.fcntl(fd,
fcntl.F_SETFL, flags|os.O_NONBLOCK)
          also makes stdout non-blocking. This causes writes to stdout
to fail
          sporadically with unexpected errors like EWOULDBLOCK when
attempted
          during gevent context switch while this function is waiting
in hub.wait.
          The converse also fails. I experienced this on MacOS (and
Linux?) very
          recently when trying to solve a similar problem. My app
needed to make
          gevent-friendly reads from stdin, while logging in my app
was configured
          to go to stdout. So, while my gevent-friendly fd-read
function blocked
          on the hub waiting for input, other greenlets would execute
and some
          would make logging calls, which would sporadically and
unexpectedly
          fail  exception with OSError with errno=EWOULDBLOCK because
O_NONBLOCK
          on stdin also made stdout O_NONBLOCK (stdin and stdout have
different
          file descriptors (0 and 1), but share fcntl flags, including
O_NONBLOCK).
         FIX: restore fcntl flags in a finally clause for return
os_write(fd, buf)
          (instead of the outer finally clause, which is not needed)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 3:59 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 12:59:30 -0700 (PDT)
Local: Tues, Jul 31 2012 3:59 pm
Subject: Re: FileObject and short reads

On Tuesday, July 31, 2012 12:49:32 PM UTC-7, vitaly wrote:

> Hi,

> The following annotated subset of the proposed gevent-os.patch details
> serious bugs and os read/write API incompatibilities that I found in
> the patch.  I will post the fixed posix_read and posix_write shortly.

Thanks to Geert Jansen for the original posting of the patch and getting
everyone motivated around this.  I am attaching new versions of  posix_read
and posix_write with fixes for all the bugs and os read/write
incompatibilities that I identified in my previous post.  Please review.

Thank you,
Vitaly


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 4:10 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 13:10:19 -0700 (PDT)
Local: Tues, Jul 31 2012 4:10 pm
Subject: Re: FileObject and short reads

Not sure what happened to the attachment, so trying to attach again...

  posix_read_write.py
1K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Matthias Urlichs  
View profile  
 More options Jul 31 2012, 4:40 pm
From: Matthias Urlichs <matth...@urlichs.de>
Date: Tue, 31 Jul 2012 22:40:28 +0200
Local: Tues, Jul 31 2012 4:40 pm
Subject: Re: [gevent] Re: FileObject and short reads
Hi,

vitaly:

>          BUG1: this breaks OSError, which is defined as (note errno
> and strerror

Could you please re-post this comment WITHOUT word wrapping?
Your comments are mostly unreadable because of it.

--
-- Matthias Urlichs


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 4:59 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 13:59:41 -0700 (PDT)
Local: Tues, Jul 31 2012 4:59 pm
Subject: Re: [gevent] Re: FileObject and short reads

On Tuesday, July 31, 2012 1:40:28 PM UTC-7, smurf wrote:

> Hi,

> vitaly:
> >          BUG1: this breaks OSError, which is defined as (note errno
> > and strerror

> Could you please re-post this comment WITHOUT word wrapping?
> Your comments are mostly unreadable because of it.

> --
> -- Matthias Urlichs

This is the requested re-post (hopefully without ugly word-wrapping this
time) of my earlier post that identifies a number of bugs and os read/write
incompatibilities in Geert Jansen's patch proposal for
posix_read/posix_write.  (I am wrapping it in <PRE> and </PRE> tags this
time.  If this doesn't help with the ugly wrapping, please suggest a more
appropriate technique).  My earlier post also includes an attachment with
fixes for these issues.

<PRE>
Serious Bugs and os read/write incompatibilities in the proposed patch are
tagged with BUG below; corresponding fix proposals are tagged with FIX:

+def _wrap_errors(func, *args):
+    """Convert any error to an OSError."""
+    try:
+        return func(*args)
+    except Exception, e:
         BUG1: this breaks OSError, which is defined as (note errno and
strerror
          attributes): "It is raised when a
          function returns a system-related error (not for illegal argument
          types or other incidental errors). The errno attribute is a
numeric
          error code from errno, and the strerror attribute is the
corresponding
          string, as would be printed by the C function perror(). See the
module
          errno, which contains names for the error codes defined by the
          underlying operating system."
         BUG2: This also causes loss of the original traceback, making it
          impossible for user code to determine which lower-level API call
          actually failed.
         FIX: Don't use _wrap_errors, it is not needed; the actual exception
          raised by the lower-level API is the acceptable exception.
+        raise OSError(*e.args)
+
+
+def posix_read(fd, n):
+    """Read up to `n` bytes from file descriptor `fd`."""
+    flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+    if not flags & os.O_NONBLOCK:
+        _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags|os.O_NONBLOCK)
+    try:
+        while True:
+            try:
+                return os_read(fd, n)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: This breaks the definition of os.read, which is
defined
                      as: "Read at most n bytes from file descriptor fd.
Return
                      a string containing the bytes read. If the end of the
file
                      referred to by fd has been reached, an empty string is
                      returned.".
                     FIX: Remove the "if e[0] == EBADF: return ''" block.
When
                      given a bad file descriptor, os.read raises
                      OSError with errno=EBADF in this case, which is the
                      correct behavior (try os.read(999999, 10) and see for
                      yourself);
+                    return ''
+                elif e[0] != EAGAIN:
+                    raise
+                sys.exc_clear()
+            hub = get_hub()
+            event = hub.loop.io(fd, 1)
+            try:
+                _wrap_errors(hub.wait, event)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: same as above EBADF BUG
                     FIX: same as above EBADF FIX; the original OSError
                      exception with errno=EBADF is the correct behavior
and must
                      NOT be overridden
+                    return ''
+                raise
+    finally:
         BUG: The flags are restored too late; there is a serious problem
with
          these posix_read and posix_write functions when used with things
like
          stdin/stdout file descriptors, or other file descriptors whose
fcntl
          flags are linked in a similar way. For example, if you attempt to
use
          posix_read() on the stdin.fileno(), fcntl.fcntl(fd,
fcntl.F_SETFL, flags|os.O_NONBLOCK)
          also makes stdout non-blocking. This causes writes to stdout to
fail
          sporadically with unexpected errors like EWOULDBLOCK when
attempted
          during gevent context switch while this function is waiting in
hub.wait.
          The converse also fails. I experienced this on MacOS (and Linux?)
very
          recently when trying to solve a similar problem. My app needed to
make
          gevent-friendly reads from stdin, while logging in my app was
configured
          to go to stdout. So, while my gevent-friendly fd-read function
blocked
          on the hub waiting for input, other greenlets would execute and
some
          would make logging calls, which would sporadically and
unexpectedly
          fail  exception with OSError with errno=EWOULDBLOCK because
O_NONBLOCK
          on stdin also made stdout O_NONBLOCK (stdin and stdout have
different
          file descriptors (0 and 1), but share fcntl flags, including
O_NONBLOCK).
         FIX: restore fcntl flags in a finally clause for return
os_write(fd, buf)
          (instead of the outer finally clause, which is not needed)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)
+
+def posix_write(fd, buf):
+    """Write bytes from buffer `buf` to file descriptor `fd`."""
+    flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+    if not flags & os.O_NONBLOCK:
+        _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags|os.O_NONBLOCK)
+    try:
+        while True:
+            try:
+                return os_write(fd, buf)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: This breaks the definition of os.write, which is
defined
                     as: "Write the string str to file descriptor fd.
Return the
                     number of bytes actually written."
                     FIX: Remove the "if e[0] == EBADF: return ''" block.
When
                      given a bad file descriptor, os.write raises
                      OSError with errno=EBADF in this case, which is the
                      correct behavior (try os.write(999999, "abc") and see
for yourself);
+                    return ''
+                elif e[0] != EAGAIN:
+                    raise
+                sys.exc_clear()
+            hub = get_hub()
+            event = hub.loop.io(fd, 2)
+            try:
+                _wrap_errors(hub.wait, event)
+            except OSError, e:
+                if e[0] == EBADF:
                     BUG: same as above EBADF BUG
                     FIX: same as above EBADF FIX; the original OSError
                      exception with errno=EBADF is the correct behavior
and must
                      NOT be overridden
+                    return ''
+                raise
+    finally:
         BUG: The flags are restored too late; there is a serious problem
with
          these posix_read and posix_write functions when used with things
like
          stdin/stdout file descriptors, or other file descriptors whose
fcntl
          flags are linked in a similar way. For example, if you attempt to
use
          posix_read() on the stdin.fileno(), fcntl.fcntl(fd,
fcntl.F_SETFL, flags|os.O_NONBLOCK)
          also makes stdout non-blocking. This causes writes to stdout to
fail
          sporadically with unexpected errors like EWOULDBLOCK when
attempted
          during gevent context switch while this function is waiting in
hub.wait.
          The converse also fails. I experienced this on MacOS (and Linux?)
very
          recently when trying to solve a similar problem. My app needed to
make
          gevent-friendly reads from stdin, while logging in my app was
configured
          to go to stdout. So, while my gevent-friendly fd-read function
blocked
          on the hub waiting for input, other greenlets would execute and
some
          would make logging calls, which would sporadically and
unexpectedly
          fail  exception with OSError with errno=EWOULDBLOCK because
O_NONBLOCK
          on stdin also made stdout O_NONBLOCK (stdin and stdout have
different
          file descriptors (0 and 1), but share fcntl flags, including
O_NONBLOCK).
         FIX: restore fcntl flags in a finally clause for return
os_write(fd, buf)
          (instead of the outer finally clause, which is not needed)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)
</PRE>


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 31 2012, 5:23 pm
From: Geert Jansen <gee...@gmail.com>
Date: Tue, 31 Jul 2012 14:23:52 -0700 (PDT)
Local: Tues, Jul 31 2012 5:23 pm
Subject: Re: FileObject and short reads

Hi Vitaly, thanks for the review.

See a few comments inline below.

The line wrapping in your response makes it hard to read by the way.

On Tuesday, July 31, 2012 9:49:32 PM UTC+2, vitaly wrote:

+def _wrap_errors(func, *args):

> +    """Convert any error to an OSError."""
> +    try:
> +        return func(*args)
> +    except Exception, e:

>          BUG1: this breaks OSError, which is defined as (note errno and
> strerror
>           attributes): [snip...]

         BUG2: This also causes loss of the original traceback, making

> it impossible for user code to determine which lower-level API
> call actually failed.

The stdlib variant of os.read() only returns an OSError, never an IOError.
However fcntl will return an IOError. So we need to translate that IOError
into an OSError. Otherwise code that calls os.read() and only catches
OSError will break with our monkey patched version. This actually happened
to me in the test_ssl.py test suite that failed for me.

I think the proper fix is to catch everything /except/ OSError. OSError
should be re-raised with the original trackback, other errors should be
translated to OSError.

+        while True:

> +            try:
> +                return os_read(fd, n)
> +            except OSError, e:
> +                if e[0] == EBADF:

>                      BUG: This breaks the definition of os.read, which is
> defined
>                       as: "Read at most n bytes from file descriptor fd.
> Return
>                       a string containing the bytes read. If the end of
> the file
>                       referred to by fd has been reached, an empty string
> is
>                       returned.".  

The EBADF test is actually copied from fileobject.py. I think it is not
correct around os_read() but it may be correct around hub.wait(). The first
time the fd was read from (in the os_read() call above), the file
descriptor was OK and it returned EAGAIN. Then you switch to the hub, which
calls select/poll on it, and for some reason the fd has gone bad. I think
the reason that this test was in fileobject.py is that you want to first
communicate an EOF to the user, and then next time you communicate the
EBADF (when os_read() is called again).

You are right about restoring the flags before the context switch to the
hub by the way. You mentioned that before, and it's a logic error on my
side.

I've added a v3 of the patch that i think addresses all points.

Regards,
Geert

  gevent-os-v3.patch
15K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 31 2012, 5:26 pm
From: Geert Jansen <gee...@gmail.com>
Date: Tue, 31 Jul 2012 14:26:53 -0700 (PDT)
Local: Tues, Jul 31 2012 5:26 pm
Subject: Re: FileObject and short reads

On Tuesday, July 31, 2012 11:23:52 PM UTC+2, Geert Jansen wrote:

> I've added a v3 of the patch that i think addresses all points.

Argh.. Now attaching the right patch.

Regards,
Geert

  gevent-os-v3.patch
15K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Jul 31 2012, 5:30 pm
From: Geert Jansen <gee...@gmail.com>
Date: Tue, 31 Jul 2012 14:30:22 -0700 (PDT)
Local: Tues, Jul 31 2012 5:30 pm
Subject: Re: FileObject and short reads

On Tuesday, July 31, 2012 11:26:53 PM UTC+2, Geert Jansen wrote:

> On Tuesday, July 31, 2012 11:23:52 PM UTC+2, Geert Jansen wrote:

>> I've added a v3 of the patch that i think addresses all points.

> Argh.. Now attaching the right patch.

Apologies for spamming everybody's inboxes (and damn Google that doesn't
let me check an attachment after i upload it).

This is the right one.

Regards,
Geert

  gevent-os-v3.patch
15K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 10:38 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 19:38:52 -0700 (PDT)
Local: Tues, Jul 31 2012 10:38 pm
Subject: Re: FileObject and short reads

Hi Geert, here I wanted to comment only about the following statement from
your earlier post, and I will post additional details in a separate post.

So, regarding


The EBADF test is actually copied from fileobject.py. I think it is not
correct around os_read() but it may be correct around hub.wait(). The first
time the fd was read from (in the os_read() call above), the file
descriptor was OK and it returned EAGAIN. Then you switch to the hub, which
calls select/poll on it, and for some reason the fd has gone bad. I think
the reason that this test was in fileobject.py is that you want to first
communicate an EOF to the user, and then next time you communicate the
EBADF (when os_read() is called again).
<<

EBADF usually means that either we were passed an invalid file descriptor
or that another thread/greenlet closed that file descriptor before we were
done using it. Both cases are either application errors or intentional
behavior by the apps and MUST be signaled by raising the intended exception
(OSError/EBADF in this case).

P.S., sorry about the line-wrapping; I'm in learning mode when it comes to
Google Groups Markup.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Jul 31 2012, 10:44 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Tue, 31 Jul 2012 19:44:21 -0700 (PDT)
Local: Tues, Jul 31 2012 10:44 pm
Subject: Re: FileObject and short reads

Hi Geert, I think we're almost there.  Here is my review of your latest
patch, with errors tagged with "BUG".  I am also attaching the version with
my fixes for those bugs. One thing became apparent to me is that when we
make certain non-obvious decisions in the code, we need to document why we
chose that path in order to help maintainers of the code with future
modifications.  So, here is the feedback:

<PRE>

+def _wrap_errors(func, *args):
+    """Convert any error to an OSError."""
+    try:
+        return func(*args)
+    except OSError:
+        raise
+    except Exception, e:
          # BUG: We may not arbitrarily map all Exception-based instances to
          #  OSError because OSError makes certain assumptions about its
args. In
          #  particular, there is the 2-tuple rule for both IOError and
OSError
          #  that stipulates that in case of a 2-tuple arg, the first
sub-arg in
          #  that case must be a valid errno and the second sub-arg must be
the
          #  corresponding strerror string.
          #
          # For example, if we allow something like this to be created:
          #  oserror = OSError("a", "aaa")
          # then oserror.errno will evaluate to "a", which is an invalid
errno!
+        raise OSError(*e.args)

+def posix_read(fd, n):
+    """Read up to `n` bytes from file descriptor `fd`."""
+    while True:
+        flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL,
flags|os.O_NONBLOCK)
+        try:
+            return os_read(fd, n)
+        except OSError, e:
+            if e[0] != EAGAIN:
+                raise
+            sys.exc_clear()
+        finally:
+            if not flags & os.O_NONBLOCK:
+                _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)
+        hub = get_hub()
+        event = hub.loop.io(fd, 1)
+        try:
+            _wrap_errors(hub.wait, event)
+        except OSError, e:
+            if e[0] == EBADF:
                 # BUG: Built-in os.read is documented to "Return a string
                 #  containing the bytes read. If the end of the file
referred
                 #  to by fd has been reached, an empty string is returned."
                 #
                 # EBADF is very different from "end of file", so returning
an
                 #  empty string is a bug in this case. EBADF usually means
                 #  that either we were passed an invalid file descriptor
or that
                 #  another thread/greenlet closed that file descriptor
before we
                 #  were done using it. Both cases are either application
errors
                 #  or intentional behavior by the app and MUST be signaled
by
                 #  raising the intended exception (OSError/EBADF in this
case).
+                return ''
+            raise
+
+def posix_write(fd, buf):
+    """Write bytes from buffer `buf` to file descriptor `fd`."""
+    while True:
+        flags = _wrap_errors(fcntl.fcntl, fd, fcntl.F_GETFL, 0)
+        if not flags & os.O_NONBLOCK:
+            _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL,
flags|os.O_NONBLOCK)
+        try:
+            return os_write(fd, buf)
+        except OSError, e:
+            if e[0] != EAGAIN:
+                raise
+            sys.exc_clear()
+        finally:
+            if not flags & os.O_NONBLOCK:
+                _wrap_errors(fcntl.fcntl, fd, fcntl.F_SETFL, flags)
+        hub = get_hub()
+        event = hub.loop.io(fd, 2)
+        try:
+            _wrap_errors(hub.wait, event)
+        except OSError, e:
+            if e[0] == EBADF:
                 # 2 BUGs: Built-in os.write is documented to "Return the
number of
                 #  bytes actually written."  1. So, it's a bug to return
empty
                 #  string; 2. It's also a bug to suppress OSError/EBADF
for the
                 #  following reason:
                 #
                 # EBADF usually means that either we were passed an invalid
                 #  file descriptor or that another thread/greenlet closed
that
                 #  file descriptor before we were done using it. Both are
                 #  either application errors or intentional behavior and
MUST
                 #  be signaled by raising the intended exception
(OSError/EBADF
                 #  in this case).
+                return ''
+            raise

</PRE>

  posix_read_write_2_with_fixes.py
4K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Aug 1 2012, 5:40 pm
From: Geert Jansen <gee...@gmail.com>
Date: Wed, 1 Aug 2012 14:40:03 -0700 (PDT)
Local: Wed, Aug 1 2012 5:40 pm
Subject: Re: FileObject and short reads

Thanks Vitaly,

On Wednesday, August 1, 2012 4:44:21 AM UTC+2, vitaly wrote:

Hi Geert, I think we're almost there.  Here is my review of your latest

> patch, with errors tagged with "BUG".  I am also attaching the version with
> my fixes for those bugs. One thing became apparent to me is that when we
> make certain non-obvious decisions in the code, we need to document why we
> chose that path in order to help maintainers of the code with future
> modifications.  So, here is the feedback:

Summarizing your last points for conciseness:

 1. Map IOError to IOError but do not map other errors.
 2. Remove the EBADF logic when switching to the hub.
 3. Document some of the reasoning.

Regarding #1 - I am OK with this, but with the remark that that solution
isn't 100% clean either. If at some point hub.wait() starts raising errors
other than IOError/OSError, this will break code that is calling
os.read/write and is only expecting OSError as the possible exception.

Regarding #2 - I am happy to remove it, which i did in the attached patch.
But i copied this from fileobject.py and there must be some kind of reason
for this code. Denis?

Regarding #3 - sure but let's be a little less verbose :)

Updated patch attached. Any more suggestions from your side? If not anybody
else or if not can my patch be added?

Regards,
Geert

  gevent-os-v4.patch
16K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Aug 1 2012, 6:07 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Wed, 1 Aug 2012 15:07:30 -0700 (PDT)
Local: Wed, Aug 1 2012 6:07 pm
Subject: Re: FileObject and short reads

Thanks Geert!  I agree with the reasoning and conclusions in this post.  
Regarding comments, my lengthy ones were meant only for communication with
you, not for direct inclusion in final code -- nice editing!  I also don't
know why fileobject.py does that with EBADF, but I think that would be a
bug for posix_read/write.  I reviewed the source code for posix_read and
posix_write in your attachment, but haven't tried running it.  The code
looks nice/tidy.  I would only change these things:

1. Document the return values of posix_read/posix_write in their
docstrings; this will help with understanding the API and with maintenance;
in my proposals, I used text from python documentation:
posix_read: "Read at most n bytes from file descriptor fd. Return a string
containing the bytes read. If the end of the file referred to by fd has
been reached, an empty string is returned."
posix_write: "Write the string buf to file descriptor fd. Return the number
of bytes actually written."

2. e[0] in posix_read and posix_write is not safe; e.errno (instead of
e[0]) is safer for OSError and IOError; for example:

>>> oe = OSError()
>>> oe[0]

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>> oe.errno
>>> repr(oe.errno)

'None'

Many thanks!

Vitaly


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
vitaly  
View profile  
 More options Aug 3 2012, 10:54 pm
From: vitaly <vitaly.krugl.nume...@gmail.com>
Date: Fri, 3 Aug 2012 19:54:53 -0700 (PDT)
Local: Fri, Aug 3 2012 10:54 pm
Subject: Re: FileObject and short reads

Thank you,
Vitaly

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geert Jansen  
View profile  
 More options Aug 4 2012, 3:41 am
From: Geert Jansen <gee...@gmail.com>
Date: Sat, 4 Aug 2012 00:41:48 -0700 (PDT)
Local: Sat, Aug 4 2012 3:41 am
Subject: Re: FileObject and short reads

On Saturday, August 4, 2012 4:54:53 AM UTC+2, vitaly wrote:

Hi Geert, any update on this?


Updated patch attached (gevent-os-v5).

Regards,
Geert

  gevent-os-v5.patch
16K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Messages 1 - 25 of 30   Newer >
« Back to Discussions « Newer topic     Older topic »