On Sun, 28 Oct 2012 16:52:02 -0700,
Guido van Rossum <gu...@python.org> wrote:
>
> The event list started out as a tuple of (fd, flag, callback, args),
> where flag is 'r' or 'w' (easily extensible); in practice neither the
> fd nor the flag are used, and one of the last things I did was to wrap
> callback and args into a simple object that allows cancelling the
> callback; the add_*() methods return this object. (This could probably
> use a little more abstraction.) Note that poll() doesn't call the
> callbacks -- that's up to the event loop.
I don't understand why the pollster takes callback objects if it never
calls them. The fact that it wraps them into DelayedCalls is even more
mysterious to me. DelayedCalls represent one-time cancellable callbacks
with a given deadline, not callbacks which are called any number of
times on I/O events and that you can't cancel.
> scheduling.py:
> http://code.google.com/p/tulip/source/browse/scheduling.py
>
> This is the scheduler for PEP-380 style coroutines. I started with a
> Scheduler class and operations along the lines of Greg Ewing's design,
> with a Scheduler instance as a global variable, but ended up ripping
> it out in favor of a Task object that represents a single stack of
> generators chained via yield-from. There is a Context object holding
> the event loop and the current task in thread-local storage, so that
> multiple threads can (and must) have independent event loops.
YMMV, but I tend to be wary of implicit thread-local storage. What if
someone runs a function or method depending on that thread-local
storage from inside a thread pool? Weird bugs ensue.
I think explicit context is much less error-prone. Even a single global
instance (like Twisted's reactor) would be better :-)
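A minimal sketch of the failure mode I have in mind. The names `_context`, `get_loop_implicit` and `get_loop_explicit` are made up for illustration and are not taken from scheduling.py; the "loop" is just a string. The worker thread never set up its own thread-local state, so the implicit lookup silently comes back empty, while the explicitly passed value arrives intact:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a thread-local "context" holding the event loop.
_context = threading.local()


def get_loop_implicit():
    # Looks the loop up in thread-local storage; returns None when the
    # calling thread never stored one.
    return getattr(_context, "loop", None)


def get_loop_explicit(loop):
    # The explicit variant: the caller hands the loop over as an argument.
    return loop


# Only the main thread sets up its context.
_context.loop = "main-thread-loop"

with ThreadPoolExecutor(max_workers=1) as pool:
    # Both calls run in a worker thread, not in the main thread.
    seen_implicit = pool.submit(get_loop_implicit).result()
    seen_explicit = pool.submit(get_loop_explicit, _context.loop).result()

print(seen_implicit, seen_explicit)
```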
As for the rest of the scheduling module, I can't say much since I have
a hard time reading and understanding it.
> To invoke a primitive I/O operation, you call the current task's
> block() method and then immediately yield (similar to Greg Ewing's
> approach). There are helpers block_r() and block_w() that arrange for
> a task to block until a file descriptor is ready for reading/writing.
> Examples of their use are in sockets.py.
That's weird and kind of ugly IMHO. Why would you write:
    scheduling.block_w(self.sock.fileno())
    yield
instead of, say:
    yield scheduling.block_w(self.sock.fileno())
?
Also, the fact that each call to SocketTransport.{recv,send} explicitly
registers then removes the fd on the event loop looks wasteful.
By the way, even when a fd is signalled ready, you must still be
prepared for recv() to return EAGAIN (see
http://bugs.python.org/issue9090).
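Roughly what I mean, as a sketch rather than a patch for sockets.py. The helper name `try_recv` is hypothetical. It relies on Python 3.3+, where EAGAIN/EWOULDBLOCK on a non-blocking socket surfaces as BlockingIOError:

```python
import select
import socket


def try_recv(sock, nbytes):
    """Return the received bytes, or None if the socket had nothing to read."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: nothing to read after all. The caller
        # should go back to waiting for readiness instead of failing.
        return None


a, b = socket.socketpair()
a.setblocking(False)

nothing_yet = try_recv(a, 1024)     # nothing has been sent yet

b.sendall(b"ping")
select.select([a], [], [], 1.0)     # wait until the data is readable
data = try_recv(a, 1024)

a.close()
b.close()
```

In a real transport the None case would loop back into block_r() rather than be returned to the caller.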
> In the docstrings I use the prefix "COROUTINE:" to indicate public
> APIs that should be invoked using yield from.
Hmm, should they? Your approach looks a bit weird: you have functions
that should use yield, and others that should use "yield from"? That
sounds confusing to me.
I'd much rather either have all functions use "yield", or have all
functions use "yield from".
(also, I wouldn't be shocked if coroutines had to wear a special
decorator; it's a better marker than having the word COROUTINE in the
docstring, anyway :-))
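For instance, something along these lines would do. This is only a sketch: the decorator name `coroutine`, the `_is_coroutine` attribute and the `is_coroutine` helper are all my own invention here, not an existing API:

```python
import inspect


def coroutine(func):
    """Hypothetical marker decorator: tag a generator function as a coroutine."""
    if not inspect.isgeneratorfunction(func):
        raise TypeError("%s is not a generator function" % func.__name__)
    func._is_coroutine = True   # made-up attribute name, for illustration only
    return func


def is_coroutine(func):
    # Tools (or the scheduler) can check the marker instead of parsing docstrings.
    return getattr(func, "_is_coroutine", False)


@coroutine
def echo_upper(lines):
    for line in lines:
        yield line.upper()


def plain():
    return 1


print(is_coroutine(echo_upper), is_coroutine(plain))
```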
> sockets.py: http://code.google.com/p/tulip/source/browse/sockets.py
>
> This implements some internet primitives using the APIs in
> scheduling.py (including block_r() and block_w()). I call them
> transports but they are different from transports in Twisted; they are
> closer to idealized sockets. SocketTransport wraps a plain socket,
> offering recv() and send() methods that must be invoked using yield
> from. SslTransport wraps an ssl socket (luckily in Python 2.6 and up,
> stdlib ssl sockets have good async support!).
SslTransport.{recv,send} need the same kind of logic as do_handshake():
catch both SSLWantReadError and SSLWantWriteError, and call block_r /
block_w accordingly.
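A sketch of the retry logic, assuming Python 3.3+ (where the ssl module has SSLWantReadError and SSLWantWriteError). The helper `ssl_retry` is hypothetical, and `block_r` / `block_w` are passed in as plain callables standing in for the scheduling helpers, so the example can be driven without a real SSL socket:

```python
import ssl


def ssl_retry(op, block_r, block_w, fd):
    """Generator: keep retrying op() until it stops asking for more I/O."""
    while True:
        try:
            return op()
        except ssl.SSLWantReadError:
            block_r(fd)     # wait until fd is readable, then retry
            yield
        except ssl.SSLWantWriteError:
            block_w(fd)     # wait until fd is writable, then retry
            yield


# Drive the generator with a fake operation that fails twice, then succeeds.
calls = []
attempts = iter([ssl.SSLWantReadError(), ssl.SSLWantWriteError(), b"data"])


def fake_recv():
    item = next(attempts)
    if isinstance(item, Exception):
        raise item
    return item


gen = ssl_retry(fake_recv,
                lambda fd: calls.append(("r", fd)),
                lambda fd: calls.append(("w", fd)),
                7)
try:
    while True:
        next(gen)
except StopIteration as stop:
    result = stop.value
```

Note that a recv() can raise SSLWantWriteError and a send() can raise SSLWantReadError (e.g. during renegotiation), which is why both branches are needed for both operations.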
> Then there is a
> BufferedReader class that implements more traditional read() and
> readline() coroutines (i.e., to be invoked using yield from), the
> latter handy for line-oriented transports.
Well... It would be nice if BufferedReader could re-use the actual
io.BufferedReader and its fast readline(), read(), readinto()
implementations.
Regards
Antoine.
On Monday 29 Oct 2012, Richard Oudkerk wrote:
> Writing (short messages) to a pipe also
> has atomic guarantees that can make having multiple writers perfectly
> reasonable.
Is that actually true? It may be guaranteed on Intel x86 compatibles and Linux
(because of the string operations available in the x86 instruction set), but I
don't think anything other than an IPC message has a "you can write a string
atomically" guarantee. And I may be misremembering that.
Well, using keyword-only arguments for passing flags is a good point.
I can live with *args only. Maybe allowing **kwargs for the call_later
family only would be a good compromise?
I don't really care about add_reader/add_writer; those functions are
intended for library writers.
call_later and call_soon will be used in user code often enough, and
passing keyword arguments there can be convenient.
--
Thanks,
Andrew Svetlov
I found that concept very useful when I used Twisted.
--
Thanks,
Andrew Svetlov
    yield from trans.send(line.upper())
Not only do I not understand why I'm yielding there in the first place (I don't have to wait for anything, I just want to push some data out!), it feels like all of my yields have been replaced with yield froms for no obvious reason (well, there are reasons, I'm just trying to look at this naively).
The IOCP thread pool is managed by Windows, not you.
Regards
Antoine.
Here's another unscientific benchmark: I wrote a stupid "http" server
(stupider than echosvr.py actually) that accepts HTTP requests and
responds with the shortest possible "200 Ok" response. This should
provide an adequate benchmark of how fast the event loop, scheduler,
and transport are at accepting and closing connections (and reading
and writing small amounts). On my Linux box at work, it seems I can
handle 10K requests (sent using 'ab' over localhost) in
1.6 seconds. Is that good or bad? The box has insane amounts of memory
and 12 cores (?) and rates at around 115K pystones.
With sufficiently cheap tasks, there's another way to approach
this: one task is dedicated to accepting connections from the
socket, and it spawns a new task to handle each connection.
--
Greg
The pollster has a very simple API: add_reader(fd, callback, *args),
add_writer(<ditto>), remove_reader(fd), remove_writer(fd), and
poll(timeout) -> list of events. (fd means file descriptor.) There's
also pollable() which just checks if there are any fds registered. My
implementation requires fd to be an int, but that could easily be
extended to support other types of event sources.
I'm not super happy that I have parallel reader/writer APIs, but passing a separate read/write flag didn't come out any more elegant, and I don't foresee other operation types (though I may be wrong).
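To make the shape of that API concrete, here is a rough sketch built on select.select(). It is not the code in polling.py: the class name `SelectPollster` is made up, and poll() here returns bare (callback, args) pairs rather than the cancellable wrapper objects described earlier. As in the real thing, poll() does not call the callbacks; that is left to the caller.

```python
import select
import socket


class SelectPollster:
    """Hypothetical pollster sketch built on select.select()."""

    def __init__(self):
        self.readers = {}   # fd -> (callback, args)
        self.writers = {}   # fd -> (callback, args)

    def add_reader(self, fd, callback, *args):
        self.readers[fd] = (callback, args)

    def add_writer(self, fd, callback, *args):
        self.writers[fd] = (callback, args)

    def remove_reader(self, fd):
        del self.readers[fd]

    def remove_writer(self, fd):
        del self.writers[fd]

    def pollable(self):
        return bool(self.readers or self.writers)

    def poll(self, timeout=None):
        # Return the events for ready fds without calling the callbacks.
        if not self.pollable():
            return []
        r, w, _ = select.select(list(self.readers), list(self.writers),
                                [], timeout)
        return [self.readers[fd] for fd in r] + [self.writers[fd] for fd in w]


# Usage: the caller (normally the event loop) runs the callbacks.
a, b = socket.socketpair()
pollster = SelectPollster()
got = []
pollster.add_reader(a.fileno(), got.append, "readable")

b.sendall(b"x")
for callback, args in pollster.poll(1.0):
    callback(*args)

pollster.remove_reader(a.fileno())
still_pollable = pollster.pollable()
a.close()
b.close()
```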
The event loop has two basic ways to register callbacks:
call_soon(callback, *args) causes callback(*args) to be called the
next time the event loop runs; call_later(delay, callback, *args)
schedules a callback at some time (relative or absolute) in the
future.
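A toy illustration of those two registration calls, not the actual event loop: the class name `TinyLoop` is invented, there is no I/O polling, and it assumes the delay is always relative and given in seconds.

```python
import heapq
import time


class TinyLoop:
    """Hypothetical minimal loop supporting call_soon() and call_later()."""

    def __init__(self):
        self.ready = []        # (callback, args) to run on the next iteration
        self.scheduled = []    # heap of (when, seq, callback, args)
        self._seq = 0          # tie-breaker so callbacks are never compared

    def call_soon(self, callback, *args):
        self.ready.append((callback, args))

    def call_later(self, delay, callback, *args):
        # Assumption: delay is relative, in seconds.
        self._seq += 1
        when = time.monotonic() + delay
        heapq.heappush(self.scheduled, (when, self._seq, callback, args))

    def run(self):
        while self.ready or self.scheduled:
            now = time.monotonic()
            # Move every timer that is due onto the ready list.
            while self.scheduled and self.scheduled[0][0] <= now:
                _, _, callback, args = heapq.heappop(self.scheduled)
                self.ready.append((callback, args))
            if not self.ready:
                # Nothing to run yet: sleep until the next timer is due.
                time.sleep(max(0.0, self.scheduled[0][0] - now))
                continue
            todo, self.ready = self.ready, []
            for callback, args in todo:
                callback(*args)


order = []
loop = TinyLoop()
loop.call_later(0.02, order.append, "later")
loop.call_soon(order.append, "soon")
loop.run()
```

In the real loop the sleep would be replaced by a call to the pollster's poll() with the time until the next deadline as its timeout.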
sockets.py: http://code.google.com/p/tulip/source/browse/sockets.py
This implements some internet primitives using the APIs in
scheduling.py (including block_r() and block_w()). I call them
transports but they are different from transports in Twisted; they are
closer to idealized sockets. SocketTransport wraps a plain socket,
offering recv() and send() methods that must be invoked using yield
from.
SslTransport wraps an ssl socket (luckily in Python 2.6 and up,
stdlib ssl sockets have good async support!).
I don't particularly care about the exact abstractions in this module;
they are convenient and I was surprised how easy it was to add SSL,
but still these mostly serve as somewhat realistic examples of how to
use scheduling.py.
I'm most interested in feedback on the design of polling.py and
scheduling.py, and to a lesser extent on the design of sockets.py;
main.py is just an example of how this style works out in practice.
Despite this intended application, I have tried to approach this design task independently to produce an API that will work for many cases, especially given the narrow focus on sockets. If people decide to get hung up on "the Microsoft way" or similar rubbish then I will feel vindicated for not mentioning it earlier :-) - it has not had any more influence on wattle than any of my other past experience has.