[Python-Dev] Status of PEP 3145 - Asynchronous I/O for subprocess.popen


Antoine Pitrou

Mar 25, 2014, 6:19:47 PM3/25/14
to pytho...@python.org

Hi,

On core-mentorship someone asked about PEP 3145 - Asynchronous I/O for
subprocess.popen. I answered that asyncio now has subprocess support
(including non-blocking I/O on the three standard stream pipes), so
it's not obvious anything else is needed.

Should we change the PEP's status to Rejected or Superseded?

Regards

Antoine.



Nick Coghlan

Mar 25, 2014, 7:14:02 PM3/25/14
to Antoine Pitrou, pytho...@python.org


On 26 Mar 2014 08:22, "Antoine Pitrou" <soli...@pitrou.net> wrote:
>
>
> Hi,
>
> On core-mentorship someone asked about PEP 3145 - Asynchronous I/O for
> subprocess.popen.  I answered that asyncio now has subprocess support
> (including non-blocking I/O on the three standard stream pipes), so
> it's not obvious anything else is needed.
>
> Should we change the PEP's status to Rejected or Superseded?

Yes. I think we'd typically use Rejected in this case, as Superseded normally relates to the evolution of interface definition PEPs.

Cheers,
Nick.


Guido van Rossum

Mar 25, 2014, 7:14:04 PM3/25/14
to Antoine Pitrou, Python-Dev
That would be a rather strong unilateral decision. Why don't you ask the authors? In theory the PEP's proposals could serve in situations where asyncio isn't appropriate, and asyncio's subprocess I/O isn't the smoothest API imaginable. (In practice I'm not sure if the PEP would have been written with asyncio subprocess support in place.)


Antoine Pitrou

Mar 25, 2014, 7:24:56 PM3/25/14
to pytho...@python.org, gu...@python.org
On Tue, 25 Mar 2014 16:14:04 -0700
Guido van Rossum <gu...@python.org> wrote:
> That would be a rather strong unilateral decision. Why don't you ask the
> authors? In theory the PEP's proposals could serve in situations where
> asyncio isn't appropriate, and asyncio's subprocess I/O isn't the smoothest
> API imaginable. (In practice I'm not sure if the PEP would have been
> written with asyncio subprocess support in place.)

That's a good point. I have now e-mailed Eric Pruitt and Josiah Carlson
(I couldn't find an e-mail address for Charles R. McCreary).

Regards

Antoine.

Victor Stinner

Mar 26, 2014, 7:55:47 AM3/26/14
to Antoine Pitrou, Python Dev
Hi,

For your information, asyncio.subprocess.Process is limited. It's not
possible yet to connect pipes between two processes - something like
"cat | wc -l" where cat's stdin comes from Python.

It's possible to enhance the API to implement that, but the timeframe
was too short to implement it before Python 3.4.
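(For comparison, the blocking subprocess module can already express the chaining itself, by passing one process's stdout as the next one's stdin - a minimal sketch:

import subprocess

# "cat | wc -l", with cat's stdin fed from Python (blocking version)
cat = subprocess.Popen(['cat'], stdin=subprocess.PIPE,
                       stdout=subprocess.PIPE)
wc = subprocess.Popen(['wc', '-l'], stdin=cat.stdout,
                      stdout=subprocess.PIPE)
cat.stdout.close()             # so wc sees EOF when cat exits
cat.stdin.write(b'a\nb\nc\n')
cat.stdin.close()
print(wc.communicate()[0])     # b'3\n'

What's missing is doing the same thing with asyncio's non-blocking pipes.)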

Victor

Josiah Carlson

Mar 27, 2014, 5:52:59 PM3/27/14
to Victor Stinner, Antoine Pitrou, Python Dev
Hopping in to give my take on this, which I've expressed to Antoine off-list.

When I first built the functionality about 8.5-9 years ago, I personally just wanted to be able to build something that could replace some of Expect: http://expect.sourceforge.net/ . The original and perhaps current API of the GSoC patch were inspired by my experience with asyncore (hence send() and recv() methods), but I never made an effort to get it practically working with asyncore - primarily because such would be functionally impossible on Windows without a lot of work to pull in a chunk of what was pywin32 libraries (at the time, Windows was a primary target). On the *nix side of things, performing the integration would be arguably trivial as select, poll, epoll, etc., all deal with pipes the same way as any other file handles (on-disk files, domain sockets, network sockets, etc.), with OS X being the exception. A little work would have been necessary to handle the two readable file handles and one writable file handle, but it's not that much different than building a proxy. But I digress.

At this point I still believe that the functionality is useful from a scriptable interaction perspective, regardless of platform. I don't believe that being able to natively support the piping of output from one process to another is necessary, but it would be a convenient future feature. That said, discussions about the quality of the existing GSoC patch and its API basically mean that the existing code to implement async subprocesses within the subprocess module precludes it from an easy or short acceptance process. And without substantial effort from one or more people, the feature request and PEP are doomed to rejection.

As an alternative, easily 95% of what most people would use this for can be written as an example using the asyncio module and included in the docs just after (or replacing) http://docs.python.org/3/library/asyncio-subprocess.html#example . Adding a reference to the subprocess module docs to point off to the asyncio subprocess example docs would get people a copy/paste snippet that they can include and update to their heart's content.

Benefits to updating the docs:
* It can happen at any time and doesn't need to wait for a 3.5 release (it can also happily wait)
* No one likes maintaining code, but everyone loves docs (especially if it documents likely use-cases)
* Because it is example docs, maybe a multi-week bikeshedding discussion about API doesn't need to happen (as long as "read line", "read X bytes", "read what is available", and "write this data" - all with timeouts - are shown, people can build everything else they want/need)
* An example using asyncio is shorter than the modifications to the subprocess module
* I would celebrate the closing of a feature request I opened in 2005

Aside from discarding code (Eric's and my own), not supporting Python-side chained pipes, and potentially angering some purists who *needed* this to be based on the subprocess module, I'm not sure I can think of any drawbacks. And arguably 2/3 of those drawbacks are imagined.


Let me know your thoughts. If it gets an "okay", I'll come up with some example code, update the docs, and post a link to the code review in this thread.

 - Josiah



Victor Stinner

Mar 27, 2014, 7:11:00 PM3/27/14
to Josiah Carlson, Antoine Pitrou, Python Dev
Hi,

2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
> ... but I never made an effort to get it practically working
> with asyncore - primarily because such would be functionally impossible on
> Windows without a lot of work to pull in a chunk of what was pywin32
> libraries (at the time, Windows was a primary target). On the *nix side of
> things, performing the integration would be arguably trivial as select,
> poll, epoll, etc., all deal with pipes the same way as any other file
> handles (on-disk files, domain sockets, network sockets, etc.), with OS X
> being the exception.

You should definitely take a look at asyncio. It handles sockets
*and* pipes on all platforms, and even character devices (PTY) on some
platforms. (The current status is still unclear to me, especially
regarding the "non blocking" flag of the PTY.) On Windows, asyncio
uses IOCP.

asyncio.subprocess also solves an old issue related to polling:
subprocess.wait(timeout) uses polling because it was not possible to
register a handler for the SIGCHLD signal without breaking backward
compatibility. asyncio supports signals as well.

> As an alternative, easily 95% of what most people would use this for can be
> written as an example using the asyncio module and included in the docs just
> after (or replacing)
> http://docs.python.org/3/library/asyncio-subprocess.html#example . Adding a
> reference to the subprocess module docs to point off to the asyncio
> subprocess example docs would get people a copy/paste snippet that they can
> include and update to their heart's content.

Yeah, a link should be added from the subprocess module to the
asyncio.subprocess module (the module, not the example). FYI the
asyncore doc now has this note:

"Note: This module exists for backwards compatibility only. For new
code we recommend using asyncio."

I opened the following issue for the "ls | wc -l" feature request:
http://bugs.python.org/issue21080

Victor

Victor Stinner

Mar 27, 2014, 7:24:34 PM3/27/14
to Josiah Carlson, Antoine Pitrou, Python Dev
2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
> * Because it is example docs, maybe a multi-week bikeshedding discussion
> about API doesn't need to happen (as long as "read line", "read X bytes",
> "read what is available", and "write this data" - all with timeouts - are
> shown, people can build everything else they want/need)

I don't understand this point. Using asyncio, you can read and write a
single byte or a whole line. Using functions like asyncio.wait_for(),
it's easy to add a timeout on such an operation.
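For example (a minimal sketch, assuming proc came from
asyncio.subprocess.create_subprocess_exec() with stdout=PIPE):

import asyncio

@asyncio.coroutine
def readline_with_timeout(proc, timeout=5.0):
    # Raises asyncio.TimeoutError if no complete line arrives in time.
    return (yield from asyncio.wait_for(proc.stdout.readline(), timeout))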

Victor

Josiah Carlson

Mar 27, 2014, 9:16:50 PM3/27/14
to Victor Stinner, Antoine Pitrou, Python Dev
You don't understand the point because you don't understand the feature request or PEP. That is probably my fault for not communicating the intent better in the past. The feature request and PEP were written to offer something like the below (or at least enough that the below could be built with minimal effort):

def do_login(...):
    proc = subprocess.Popen(...)
    current = proc.recv(timeout=5)
    last_line = current.rstrip().rpartition('\n')[-1]
    if last_line.endswith('login:'):
        proc.send(username)
        if proc.readline(timeout=5).rstrip().endswith('password:'):
            proc.send(password)
            if 'welcome' in proc.recv(timeout=5).lower():
                return proc
    proc.kill()

The API above can be very awkward (as shown :P ), but that's okay. From those building blocks a (minimally) enterprising user would add functionality to suit their needs. The existing subprocess module only offers two methods for *any* amount of communication over pipes with the subprocess: check_output() and communicate(), only the latter of which supports sending data (once, limited by system-level pipe buffer lengths). Neither allow for nontrivial interactions from a single subprocess.Popen() invocation. The purpose was to be able to communicate in a bidirectional manner with a subprocess without blocking, or practically speaking, blocking with a timeout. That's where the "async" term comes from. Again, there was never any intent to have the functionality be part of asyncore or any other asynchronous sockets framework, which is why there are no handle_*() methods, readable(), writable(), etc.
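To make that limitation concrete, a sketch of the one-shot pattern that communicate() supports today:

import subprocess, sys

proc = subprocess.Popen([sys.executable, '-i'],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)
out, _ = proc.communicate(b'1+1\n')   # writes once, closes stdin, waits for exit
# There is no way to read some output and then decide what to send next.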

Your next questions will be: But why bother at all? Why not just build the piece you need *inside* asyncio? Why does this need anything more? The answer to those questions are wants and needs. If I'm a user that needs interactive subprocess handling, I want to be able to do something like the code snippet above. The last thing I need is to have to rewrite the way my application/script/whatever handles *everything* just because a new asynchronous IO library has been included in the Python standard library - it's a bit like selling you a $300 bicycle when you need a $20 wheel for your scooter.

That there *now* exists the ability to have async subprocesses as part of asyncio is a fortunate happenstance, as the necessary underlying tools for building the above now exist in the standard library. It's a matter of properly embedding the asyncio-related bits inside a handful of functions to provide something like the above, which is what I was offering to write. But why not keep working on the subprocess module? Yep. Tried that. Coming up on 9 years since I created the feature request and original Activestate recipe. Going that route would take 2-3 times as much work as has already been dedicated just to get somewhere remotely acceptable for inclusion in Python 3.5, and more likely would end in rejection for reasons similar to why it has been in limbo.

But here's the thing: I can build enough using asyncio in 30-40 lines of Python to offer something like the above API. The problem is that it really has no natural home. It uses asyncio, so makes no sense to put in subprocess. It doesn't fit the typical asyncio behavior, so doesn't make sense to put in asyncio. The required functionality isn't big enough to warrant a submodule anywhere. Heck, it's even way too small to toss into an external PyPI module. But in the docs? It would show an atypical, but not wholly unreasonable use of asyncio (the existing example already shows what I would consider to be an atypical use of asyncio). It would provide a good starting point for someone who just wants/needs something like the snippet above. It is *yet another* use-case for asyncio. And it could spawn a larger library for offering a more fleshed-out subprocess-related API, though that is probably more wishful thinking on my part than anything.

 - Josiah

Terry Reedy

Mar 27, 2014, 10:18:10 PM3/27/14
to pytho...@python.org
According to my reading of the doc, one should (in the absence of
deadlocks, and without having timeouts) be able to use proc.stdin.write
and proc.stdout.read. Do those not actually work?
> 2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
>> * Because it is example docs, maybe a multi-week bikeshedding discussion
>> about API doesn't need to happen (as long as "read line", "read X bytes",
>> "read what is available", and "write this data" - all with timeouts - are
>> shown, people can build everything else they want/need)
>
> I don't understand this point. Using asyncio, you can read and write a
> single byte or a whole line. Using functions like asyncio.wait_for(),
> it's easy to add a timeout on such an operation.
>
> Victor


--
Terry Jan Reedy

Josiah Carlson

Mar 28, 2014, 1:09:47 AM3/28/14
to Terry Reedy, Python-Dev
By digging into the internals of a subprocess produced by Popen(), you can write in a blocking manner to the stdin pipe, and read in a blocking manner from the stdout/stderr pipe(s). For scripting most command-line operations, timeouts and the ability to *stop* trying to read are as important as being able to spawn an external process in the first place; lacking them kind-of kills that side of the usefulness of Python as a tool for scripting.

The question is not whether or not a user of Python can dig into the internals, make some calls, then get it to be non-blocking - the existence of two different patches to do so (the most recent of which is from 4 1/2 years ago) shows that it *can* be done. The question is whether or not the desire for the functionality warrants having functions or methods to perform these operations in the standard library.

I and others have claimed that it should go into the standard library. Heck, there was enough of a push that Eric got paid to write his version of the functionality for a GSoC project in 2009. There has even been activity on the bug itself unrelated to deferring discussions as recently as May 2012 (after which activity seems to have paused for reasons I don't know). Some people have raised reasonable questions about the API and implementation, but no one is willing to offer an alternative API that they think would be better, so discussions about implementation of a non-existent API for inclusion are moot.


But honestly, I have approximately zero faith that what I say or do will lead to the inclusion of any changes to the subprocess module. Which is why I'm offering to write a short example that uses asyncio for inclusion in the docs. It's not what I've wanted for almost 9 years, but at least it has a chance of actually happening. I'll take a chance at updating the docs over a 3-to-9-month bikeshedding session that ends in rejection, any day.


So yeah. Someone want to make a decision? Tell me to write the docs, I will. Tell me to go take a long walk off a short pier, I'll thank you for your time and leave you alone.

 - Josiah






Paul Moore

Mar 28, 2014, 4:09:45 AM3/28/14
to Josiah Carlson, Python-Dev, Terry Reedy
On 28 March 2014 05:09, Josiah Carlson <josiah....@gmail.com> wrote:
> So yeah. Someone want to make a decision? Tell me to write the docs, I will.
> Tell me to go take a long walk off a short pier, I'll thank you for your
> time and leave you alone.

I had a need for this a few years ago. It's messy to do on Windows
(ctypes callouts to PeekNamedPipe to check if you can read from the
process without blocking). So I would like to see a recipe for this
(even if it's likely to be another few years before I ever need it
again :-)).
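(Roughly, the check looked something like the following - a from-memory sketch, so treat the details as approximate:

import ctypes, msvcrt

def pipe_data_available(pipe):
    # 'pipe' is e.g. proc.stdout from Popen(stdout=PIPE); returns how
    # many bytes can be read right now without blocking.
    handle = msvcrt.get_osfhandle(pipe.fileno())
    avail = ctypes.c_ulong(0)
    if not ctypes.windll.kernel32.PeekNamedPipe(
            handle, None, 0, None, ctypes.byref(avail), None):
        raise ctypes.WinError()
    return avail.value
)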

Paul

Victor Stinner

Mar 28, 2014, 6:20:14 AM3/28/14
to Josiah Carlson, Antoine Pitrou, Python Dev
2014-03-28 2:16 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
> def do_login(...):
>     proc = subprocess.Popen(...)
>     current = proc.recv(timeout=5)
>     last_line = current.rstrip().rpartition('\n')[-1]
>     if last_line.endswith('login:'):
>         proc.send(username)
>         if proc.readline(timeout=5).rstrip().endswith('password:'):
>             proc.send(password)
>             if 'welcome' in proc.recv(timeout=5).lower():
>                 return proc
>     proc.kill()

I don't understand this example. How is it "asynchronous"? It looks
like blocking calls. In my definition, asynchronous means that you can
call this function twice on two processes, and they will run in
parallel.

Using greenlet/eventlet, you can write code which looks blocking, but
runs asynchronously. But I don't think that you are using greenlet or
eventlet here.

I took a look at the implementation:
http://code.google.com/p/subprocdev/source/browse/subprocess.py

It doesn't look portable. On Windows, WriteFile() is used. This
function is blocking, or I missed something huge :-) It's much better
if a PEP is portable. Adding time.monotonic() only to Linux would have made
PEP 418 much shorter (4 sentences instead of 10 pages? :-))!

The implementation doesn't look reliable:

  def get_conn_maxsize(self, which, maxsize):
    # Not 100% certain if I get how this works yet.
    if maxsize is None:
      maxsize = 1024
    ...

This constant 1024 looks arbitrary. On UNIX, a write into a pipe may
block with fewer bytes (PIPE_BUF can be as small as 512 bytes).

asyncio has a completely different design. On Windows, it uses
overlapped operations with an IOCP event loop. Such operations can be
cancelled, and Windows takes care of the buffering. On UNIX, non-blocking
mode is used with select() (or something faster like epoll) and asyncio
retries writing more data when the pipe (or any file descriptor used
for process stdin/stdout/stderr) becomes ready (for reading/writing).

asyncio's design is more reliable and portable.

I don't see how you can implement asynchronous communication with a
subprocess without the complex machinery of an event loop.

> The API above can be very awkward (as shown :P ), but that's okay. From
> those building blocks a (minimally) enterprising user would add
> functionality to suit their needs. The existing subprocess module only
> offers two methods for *any* amount of communication over pipes with the
> subprocess: check_output() and communicate(), only the latter of which
> supports sending data (once, limited by system-level pipe buffer lengths).

As I wrote, it's complex to handle non-blocking file descriptors. You
have to catch EWOULDBLOCK and retry later when the file descriptor
becomes ready. The main thread has to watch for such events on the file
descriptor, or you need a dedicated thread. By the way,
subprocess.communicate() is currently implemented using threads on
Windows.
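On UNIX the read side of that pattern is short - a sketch, assuming a
pipe from Popen(stdout=PIPE); the hard parts are Windows and buffering
the writes:

import fcntl, os

def set_nonblocking(pipe):
    fd = pipe.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_available(pipe, maxsize=4096):
    # Returns whatever is buffered, or b'' if a read would block right now.
    try:
        return os.read(pipe.fileno(), maxsize)
    except BlockingIOError:   # EAGAIN/EWOULDBLOCK on Python 3.3+
        return b''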

> Neither allow for nontrivial interactions from a single subprocess.Popen()
> invocation. The purpose was to be able to communicate in a bidirectional
> manner with a subprocess without blocking, or practically speaking, blocking
> with a timeout. That's where the "async" term comes from.

I call this "non-blocking functions", not "async functions".

It's quite simple to check whether a read will block or not on UNIX. It's
more complex to implement it on Windows. And it's even more complex to
add a buffer to write().

> Your next questions will be: But why bother at all? Why not just build the
> piece you need *inside* asyncio? Why does this need anything more? The
> answer to those questions are wants and needs. If I'm a user that needs
> interactive subprocess handling, I want to be able to do something like the
> code snippet above. The last thing I need is to have to rewrite the way my
> application/script/whatever handles *everything* just because a new
> asynchronous IO library has been included in the Python standard library -
> it's a bit like selling you a $300 bicycle when you need a $20 wheel for
> your scooter.

You don't have to rewrite your whole application. If you only want to
use the asyncio event loop in a single function, you can use
loop.run_until_complete(do_login), which blocks until the function
completes. The "function" is in fact an asynchronous coroutine.

Full example of asynchronous communication with a subprocess (the
python interactive interpreter) using asyncio high-level API:
---
import asyncio.subprocess
import time
import sys

@asyncio.coroutine
def eval_python_async(command, encoding='ascii', loop=None):
    proc = yield from asyncio.subprocess.create_subprocess_exec(
        sys.executable, "-u", "-i",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
        loop=loop)

    # wait for the prompt
    buffer = bytearray()
    while True:
        data = yield from proc.stdout.read(100)
        buffer.extend(data)
        if buffer.endswith(b'>>> '):
            break

    proc.stdin.write(command.encode(encoding) + b"\n")
    yield from proc.stdin.drain()
    proc.stdin.close()

    output = yield from proc.stdout.read()

    output = output.decode(encoding)
    output = output.rstrip()
    if output.endswith('>>>'):
        output = output[:-3].rstrip()
    return output

def eval_python(command, timeout=None):
    loop = asyncio.get_event_loop()
    task = asyncio.Task(eval_python_async(command, loop=loop), loop=loop)
    return loop.run_until_complete(asyncio.wait_for(task, timeout))

def test_sequential(nproc, command):
    t0 = time.monotonic()
    for index in range(nproc):
        eval_python(command)
    return time.monotonic() - t0

def test_parallel(nproc, command):
    loop = asyncio.get_event_loop()
    tasks = [asyncio.Task(eval_python_async(command, loop=loop), loop=loop)
             for index in range(nproc)]
    t0 = time.monotonic()
    loop.run_until_complete(asyncio.wait(tasks))
    return time.monotonic() - t0

print("1+1 = %r" % eval_python("1+1", timeout=1.0))

slow_code = "import math; print(str(math.factorial(20000)).count('7'))"

dt = test_sequential(10, slow_code)
print("Run 10 tasks in sequence: %.1f sec" % dt)

dt2 = test_parallel(10, slow_code)
print("Run 10 tasks in parallel: %.1f sec (speed=%.1f)" % (dt2, dt/dt2))

# cleanup asyncio
asyncio.get_event_loop().close()
---

Output:
---
1+1 = '2'
Run 10 tasks in sequence: 2.8 sec
Run 10 tasks in parallel: 0.6 sec (speed=4.6)
---

(My CPU has 8 cores, the speed may be lower on other computers with
fewer cores.)

Even though eval_python_async() is asynchronous, the eval_python()
function is blocking, so you can write print("1+1 = %r" % eval_python("1+1"))
without a callback or "yield from".

Running tasks in parallel is faster than running them in sequence
(almost 5 times faster on my PC).

The syntax in eval_python_async() is close to the API you proposed,
except that you have to add "yield from" in front of "blocking"
functions like read() or drain() (the latter flushes the stdin
buffer; I'm not sure that it is needed in this example).

The timeout is on the whole eval_python_async(), but you can just as well
use finer-grained timeouts on each read/write.

> But here's the thing: I can build enough using asyncio in 30-40 lines of
> Python to offer something like the above API. The problem is that it really
> has no natural home.

I agree that writing explicit asynchronous code is more complex than
using eventlet. Asynchronous programming is hard.

> But in the docs? It would show an atypical, but not
> wholly unreasonable use of asyncio (the existing example already shows what
> I would consider to be an atypical use of asyncio).

The asyncio documentation is still a work-in-progress. I tried to
document all APIs, but there are too few examples and the
documentation is still focused on the API rather than oriented toward
users of the API.

Don't hesitate to contribute to the documentation!

We can probably write a simple example showing how to interact with an
interactive program like Python.

Nick Coghlan

Mar 28, 2014, 6:49:54 AM3/28/14
to Victor Stinner, Antoine Pitrou, Python Dev
On 28 March 2014 20:20, Victor Stinner <victor....@gmail.com> wrote:
> 2014-03-28 2:16 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
>> def do_login(...):
>>     proc = subprocess.Popen(...)
>>     current = proc.recv(timeout=5)
>>     last_line = current.rstrip().rpartition('\n')[-1]
>>     if last_line.endswith('login:'):
>>         proc.send(username)
>>         if proc.readline(timeout=5).rstrip().endswith('password:'):
>>             proc.send(password)
>>             if 'welcome' in proc.recv(timeout=5).lower():
>>                 return proc
>>     proc.kill()
>
> I don't understand this example. How is it "asynchronous"? It looks
> like blocking calls. In my definition, asynchronous means that you can
> call this function twice on two processes, and they will run in
> parallel.

Without reading all the references from PEP 3145 again, I now seem to
recall the problem it was aimed at was the current deadlock warnings
in the subprocess docs - if you're not careful to make sure you keep
reading from the stdout and stderr pipes while writing to stdin, you
can fill up the kernel buffers and deadlock while communicating with
the subprocess. So the "asynchronous" part is to be able to happily
write large amounts of data to a stdin pipe without fear of deadlock
with a subprocess that has just written large amounts of data to the
stdout or stderr pipes.
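(That deadlock-prone pattern is easy to reproduce - a minimal sketch,
assuming a child that echoes its input line by line:

import subprocess, sys

child = "import sys; [sys.stdout.write(l) for l in sys.stdin]"  # echo loop
proc = subprocess.Popen([sys.executable, '-c', child],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
proc.stdin.write(b'x\n' * (10 ** 6))   # may block forever: the child is
                                       # stuck writing to its full stdout
                                       # pipe, which we aren't reading yet
data = proc.stdout.read()              # never reached
)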

So, from the perspective of the user, it behaves like a synchronous
blocking operation, but on the backend it needs to use asynchronous
read and write operations to avoid deadlock. I suspect it would likely
be a relatively thin wrapper around run_until_complete().

Also, as far as where such functionality could live in the standard
library goes, it's entirely possible for it to live in its natural
home of "subprocess". To make that work, the core subprocess.Popen
functionality would need to be moved to a _subprocess module, and then
both subprocess and asyncio would depend on that, allowing subprocess
to also depend on asyncio without creating a circular import.

So I'll go back on my original comment - assuming I've now remembered
its intended effects correctly, PEP 3145 remains a valid proposal,
independent of (but potentially relying on) asyncio, as the problem it
is designed to solve is all those notes like "Do not use stdout=PIPE
or stderr=PIPE with this function. As the pipes are not being read in
the current process, the child process may block if it generates
enough output to a pipe to fill up the OS pipe buffer." in the current
subprocess module, by using an asynchronous backend while still
presenting a synchronous API.

And rather than adding a new API, I'd hope it could propose just
getting rid of those warnings by reimplementing the current deadlock
prone APIs on top of "run_until_complete()" and exploring the
potential consequences for backwards compatibility.
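(A hedged sketch of that idea - the names here are illustrative, not a
worked patch - using asyncio's own subprocess support to back a
synchronous, deadlock-free communicate():

import asyncio
import asyncio.subprocess

def communicate_via_asyncio(args, input=None):
    # Synchronous front-end; the event loop interleaves the stdin writes
    # and stdout reads, so full kernel pipe buffers can't deadlock us.
    @asyncio.coroutine
    def _run():
        proc = yield from asyncio.subprocess.create_subprocess_exec(
            *args,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE)
        return (yield from proc.communicate(input))
    return asyncio.get_event_loop().run_until_complete(_run())
)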

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Tres Seaver

Mar 28, 2014, 10:45:01 AM3/28/14
to pytho...@python.org

On 03/27/2014 09:16 PM, Josiah Carlson wrote:
> But here's the thing: I can build enough using asyncio in 30-40 lines
> of Python to offer something like the above API. The problem is that
> it really has no natural home. It uses asyncio, so makes no sense to
> put in subprocess. It doesn't fit the typical asyncio behavior, so
> doesn't make sense to put in asyncio. The required functionality isn't
> big enough to warrant a submodule anywhere. Heck, it's even way too
> small to toss into an external PyPI module.

Seems perfect for the Cheeseshop to me.


Tres.
--
===================================================================
Tres Seaver +1 540-429-0999 tse...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com

R. David Murray

Mar 28, 2014, 11:08:02 AM3/28/14
to pytho...@python.org
On Fri, 28 Mar 2014 10:45:01 -0400, Tres Seaver <tse...@palladion.com> wrote:
> On 03/27/2014 09:16 PM, Josiah Carlson wrote:
> > But here's the thing: I can build enough using asyncio in 30-40 lines
> > of Python to offer something like the above API. The problem is that
> > it really has no natural home. It uses asyncio, so makes no sense to
> > put in subprocess. It doesn't fit the typical asyncio behavior, so
> > doesn't make sense to put in asyncio. The required functionality isn't
> > big enough to warrant a submodule anywhere. Heck, it's even way too
> > small to toss into an external PyPI module.
>
> Seems perfect for the Cheeseshop to me.

Indeed. I heard a rumor[*] that there's at least one package in the
cheeseshop that consists of a one-liner.

On the other hand, we have multiprocessing examples in the docs that are
longer than that, so it sounds like a great asyncio example to me,
especially given that Victor says we don't have enough examples yet.

--David

[*] https://pypi.python.org/pypi/first It's not *actually* a
one-liner, but you could write it as one, and the actual code isn't
much heavier :)

Josiah Carlson

Mar 28, 2014, 12:45:40 PM3/28/14
to Paul Moore, Python-Dev, Terry Reedy
If it makes you feel any better, I spent an hour this morning building a 2-function API for Linux and Windows, both tested, not using ctypes, and not even using any part of asyncio (the Windows bits are in msvcrt and _winapi). It works in Python 3.3+. You can see it here: http://pastebin.com/0LpyQtU5

 - Josiah

Josiah Carlson

Mar 28, 2014, 1:42:03 PM3/28/14
to Victor Stinner, Python Dev
*This* is the type of conversation that I wanted to avoid. But I'll answer your questions because I used to do exactly the same thing.

On Fri, Mar 28, 2014 at 3:20 AM, Victor Stinner <victor....@gmail.com> wrote:
> 2014-03-28 2:16 GMT+01:00 Josiah Carlson <josiah....@gmail.com>:
>> def do_login(...):
>>     proc = subprocess.Popen(...)
>>     current = proc.recv(timeout=5)
>>     last_line = current.rstrip().rpartition('\n')[-1]
>>     if last_line.endswith('login:'):
>>         proc.send(username)
>>         if proc.readline(timeout=5).rstrip().endswith('password:'):
>>             proc.send(password)
>>             if 'welcome' in proc.recv(timeout=5).lower():
>>                 return proc
>>     proc.kill()
>
> I don't understand this example. How is it "asynchronous"? It looks
> like blocking calls. In my definition, asynchronous means that you can
> call this function twice on two processes, and they will run in
> parallel.

In this context, async means not necessarily blocking. If you didn't provide a timeout, it would default to 0, which would return immediately with what was sent and/or received from the subprocess. If you don't believe me, that's fine, but it prevents meaningful discussion.

> Using greenlet/eventlet, you can write code which looks blocking, but
> runs asynchronously. But I don't think that you are using greenlet or
> eventlet here.

You are right. And you are talking about something that is completely out of scope.
 
> I took a look at the implementation:
> http://code.google.com/p/subprocdev/source/browse/subprocess.py
>
> It doesn't look portable. On Windows, WriteFile() is used. This
> function is blocking, or I missed something huge :-) It's much better
> if a PEP is portable. Adding time.monotonic() only to Linux would have made
> PEP 418 much shorter (4 sentences instead of 10 pages? :-))!

Of course it's not portable. Windows does things differently from other platforms. That's one of the reasons why early versions required pywin32. Before you reply to another message, I would encourage you to read the bug, the pep, and perhaps the recipe I just posted: http://pastebin.com/0LpyQtU5

Or you can try to believe that I have done all of those and believe what I say, especially when I say that I don't believe spending a lot of time worrying about the original patch/recipe and the GSoC entry is worthwhile. They would all require a lot of work to make reasonably sane, which is why I wrote the minimal recipe above.

> The implementation doesn't look reliable:
>
>   def get_conn_maxsize(self, which, maxsize):
>     # Not 100% certain if I get how this works yet.
>     if maxsize is None:
>       maxsize = 1024
>     ...
>
> This constant 1024 looks arbitrary. On UNIX, a write into a pipe may
> block with fewer bytes (PIPE_BUF can be as small as 512 bytes).

Testing now, I seem to be able to send non-reading subprocesses somewhat arbitrary amounts of data without leading to a block. But I can't test all Linux installations or verify that I'm correct. Whether or not this makes sense is moot, though, as I don't think it should be merged, and I don't believe anyone thinks it should be merged at this point.

> asyncio has a completely different design. On Windows, it uses
> overlapped operations with an IOCP event loop. Such operations can be
> cancelled, and Windows takes care of the buffering. On UNIX, non-blocking
> mode is used with select() (or something faster like epoll) and asyncio
> retries writing more data when the pipe (or any file descriptor used
> for process stdin/stdout/stderr) becomes ready (for reading/writing).
>
> asyncio's design is more reliable and portable.

More reliable, sure. More portable... only because all of the portability heavy lifting has been done and included in Python core. That's one other thing that you aren't understanding - the purpose of trying to have this in the standard library is so that people can use the functionality (async subprocesses) on multiple platforms without needing to write it themselves (poorly), ask on forums of one kind or another, copy and paste from some recipe posted to the internet, etc. It's a strict increase in the functionality and usefulness of the Python standard library and has literally zero backwards compatibility issues.

This is the absolute minimum functionality necessary to make people who need this functionality happy. No, really. Absolute minimum. Sort of what asyncore was - the minimum functionality necessary to have async sockets in Python. Was it dirty? Sure. Was it difficult to use? Some people had issues. Did it work? It worked well enough that people were making money building applications based on asyncore (myself included 10 years ago).

> I don't see how you can implement asynchronous communication with a
> subprocess without the complex machinery of an event loop.

Words can have multiple meanings. The meaning of "async" in this context is different from what you believe it to mean, which is part of your confusion. I tried to address this in my last message, but either you didn't read that part, didn't understand that part, or don't believe what I wrote. So let me write it again:

In this context, "async subprocesses" means the ability to interactively interrogate a subprocess without necessarily blocking on input or output. Everyone posting questions about this on StackOverflow or other forums understands it this way. It *does not mean* that it needs to participate in an event loop, needs to be usable with asyncore, asyncio, Twisted, greenlets, gevent, or otherwise.

If there is one thing that *I* need for you (and everyone else) to understand and believe in this conversation, it is the above. Do you? Yes? Okay. Now read everything that I've written again. No? Can you explain *why* you don't believe or understand me?


>> The API above can be very awkward (as shown :P ), but that's okay. From
>> those building blocks a (minimally) enterprising user would add
>> functionality to suit their needs. The existing subprocess module only
>> offers two methods for *any* amount of communication over pipes with the
>> subprocess: check_output() and communicate(), only the latter of which
>> supports sending data (once, limited by system-level pipe buffer lengths).
>
> As I wrote, it's complex to handle non-blocking file descriptors. You
> have to catch EWOULDBLOCK and retry later when the file descriptor
> becomes ready. The main thread has to watch for such events on the file
> descriptor, or you need a dedicated thread. By the way,
> subprocess.communicate() is currently implemented using threads on
> Windows.

I know what it takes, I've been writing async sockets for 12 years. I used to maintain asyncore/asynchat and related libraries. Actually, you can thank me for asyncore existing in Python 2.6+ (Giampaolo has done a great job and kept asyncore alive after I stopped participating daily in python-dev about 5 years ago, and I can't thank him enough for that).

But to the point: stop bagging on the old patches. No one likes them. We all agree. The question is where do we go from here.

>> Neither allow for nontrivial interactions from a single subprocess.Popen()
>> invocation. The purpose was to be able to communicate in a bidirectional
>> manner with a subprocess without blocking, or practically speaking, blocking
>> with a timeout. That's where the "async" term comes from.
>
> I call this "non-blocking functions", not "async functions".
>
> It's quite simple to check whether a read will block or not on UNIX. It's
> more complex to implement it on Windows. And it's even more complex to
> add a buffer to write().

Okay, call it non-blocking subprocess reads and writes. Whatever you want to call it. And yes, I know what it takes to read and write on Windows... I've done it 3 times now (the original recipe, the original patch, now the above recipe).

But the other piece is that *this* doesn't necessarily need to be 100% robust - I'm not even advocating it to be in the Python standard library anywhere! I've given up on that. But a short example hanging out in the docs? Someone will use it. Someone will run into issues. They will add robustness. They will add functionality. And it will grow into something worth using before being posted to the cheeseshop.

The status quo is that people don't get answers anywhere in the Python docs or the Python stdlib. Python core is noticeably absent as a source of information about how someone would go about using the subprocess module in a completely reasonable and sane manner.

>> Your next questions will be: But why bother at all? Why not just build the
>> piece you need *inside* asyncio? Why does this need anything more? The
>> answer to those questions are wants and needs. If I'm a user that needs
>> interactive subprocess handling, I want to be able to do something like the
>> code snippet above. The last thing I need is to have to rewrite the way my
>> application/script/whatever handles *everything* just because a new
>> asynchronous IO library has been included in the Python standard library -
>> it's a bit like selling you a $300 bicycle when you need a $20 wheel for
>> your scooter.
>
> You don't have to rewrite your whole application. If you only want to
> use the asyncio event loop in a single function, you can use
> loop.run_until_complete(do_login), which blocks until the function
> completes. The "function" is in fact an asynchronous coroutine.

The point of this conversation is that I was offering to write the handful of wrappers that would make interactions of the form that I showed earlier easy and possible with asyncio. So that a user didn't have to write them themselves.

[snip]

> Even though eval_python_async() is asynchronous, the eval_python()
> function is blocking, so you can write print("1+1 = %r" % eval_python("1+1"))
> without a callback or "yield from".
>
> Running tasks in parallel is faster than running them in sequence
> (almost 5 times faster on my PC).

This is completely unrelated to the conversation.

> The syntax in eval_python_async() is close to the API you proposed,
> except that you have to add "yield from" in front of "blocking"
> functions like read() or drain() (the latter flushes the stdin
> buffer; I'm not sure that it is needed in this example).
>
> The timeout is on the whole eval_python_async(), but you can just as well
> use finer-grained timeouts on each read/write.

>> But here's the thing: I can build enough using asyncio in 30-40 lines of
>> Python to offer something like the above API. The problem is that it really
>> has no natural home.
>
> I agree that writing explicit asynchronous code is more complex than
> using eventlet. Asynchronous programming is hard.

No, it's not hard. It just requires thinking in a different way. It's the thinking in a different way that's difficult. But I've been doing async sockets programming on and off for 13 years now, so I get it. What I'm offering is to help people *not* do that, because some people have difficulty thinking in that way.

>> But in the docs? It would show an atypical, but not
>> wholly unreasonable use of asyncio (the existing example already shows what
>> I would consider to be an atypical use of asyncio).
>
> The asyncio documentation is still a work-in-progress. I tried to
> document all APIs, but there are too few examples and the
> documentation is still focused on the API rather than oriented toward
> users of the API.
>
> Don't hesitate to contribute to the documentation!

So is this the "okay" that I've been waiting with bated breath for?
 
 - Josiah

Guido van Rossum

Mar 28, 2014, 1:46:51 PM3/28/14
to Josiah Carlson, Terry Reedy, Python-Dev
On Fri, Mar 28, 2014 at 9:45 AM, Josiah Carlson <josiah....@gmail.com> wrote:

> If it makes you feel any better, I spent an hour this morning building a 2-function API for Linux and Windows, both tested, not using ctypes, and not even using any part of asyncio (the Windows bits are in msvcrt and _winapi). It works in Python 3.3+. You can see it here: http://pastebin.com/0LpyQtU5

Seeing this makes *me* feel better. I think it's reasonable to add (some variant of) that to the subprocess module in Python 3.5. It also belongs in the Activestate cookbook. And no, the asyncio module hasn't made it obsolete.

Josiah, you sound upset about the whole thing -- to the point of writing unintelligible sentences and passive-aggressive digs at everyone reading this list. I'm sorry that something happened that led you to feel that way (if you indeed feel upset or frustrated), but I'm glad that you wrote that code snippet -- it is completely clear what you want and why you want it, and also what should happen next (a few rounds of code review on the tracker).

But that PEP? It's just a terrible PEP. It doesn't contain a single line of example code. It doesn't specify the proposed interface, it just describes in way too many sentences how it should work, and contains a whole lot of references to various rants from which the reader is apparently meant to become enlightened. I don't know which of the three authors *really* wrote it, and I don't want to know -- I think the PEP is irrelevant to the proposed feature, which is of "put it in the bug tracker and work from there" category -- presumably the PEP was written based on the misunderstanding that having a PEP would make acceptance of the patch easier, or because during an earlier bikeshedding round someone said "please write a PEP" (someone always says that). I propose to scrap the PEP (set the status to Withdrawn) and just work on adding the methods to the subprocess module.

If it were me, I'd define three methods, with longer names to clarify what they do, e.g.

proc.write_nonblocking(data)
data = proc.read_nonblocking()
data = proc.read_stderr_nonblocking()

I.e. add _nonblocking to the method names to clarify that they may return '' when there's nothing available, and have a separate method for reading stderr instead of a flag. And I'd wonder if there should be an unambiguous way to detect EOF or whether the caller should just check for proc.stdout.closed. (And what for stdin? IIRC it actually becomes writable when the other end is closed, and then the write() will fail. But maybe I forget.)
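(To make the shape concrete, a UNIX-only sketch of how those hypothetical methods could hang off Popen - Windows would need PeekNamedPipe or overlapped I/O instead of select():

import os, select, subprocess

class Popen(subprocess.Popen):
    def _read_nb(self, pipe, maxsize=4096):
        if select.select([pipe], [], [], 0)[0]:
            return os.read(pipe.fileno(), maxsize)  # b'' here means EOF
        return b''  # here b'' just means "nothing yet" - the ambiguity above

    def read_nonblocking(self, maxsize=4096):
        return self._read_nb(self.stdout, maxsize)

    def read_stderr_nonblocking(self, maxsize=4096):
        return self._read_nb(self.stderr, maxsize)

    def write_nonblocking(self, data):
        # Returns the number of bytes actually written, possibly 0.
        if select.select([], [self.stdin], [], 0)[1]:
            return os.write(self.stdin.fileno(), data)
        return 0
)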

But that's all bikeshedding and it can happen on the tracker or directly on the list just as easily; I don't see the need for a PEP.

Josiah Carlson

Mar 28, 2014, 2:35:16 PM3/28/14
to Guido van Rossum, Terry Reedy, Python-Dev
On Fri, Mar 28, 2014 at 10:46 AM, Guido van Rossum <gu...@python.org> wrote:
> On Fri, Mar 28, 2014 at 9:45 AM, Josiah Carlson <josiah....@gmail.com> wrote:
>> If it makes you feel any better, I spent an hour this morning building a 2-function API for Linux and Windows, both tested, not using ctypes, and not even using any part of asyncio (the Windows bits are in msvcrt and _winapi). It works in Python 3.3+. You can see it here: http://pastebin.com/0LpyQtU5
>
> Seeing this makes *me* feel better. I think it's reasonable to add (some variant of) that to the subprocess module in Python 3.5. It also belongs in the Activestate cookbook. And no, the asyncio module hasn't made it obsolete.

Cool.

> Josiah, you sound upset about the whole thing -- to the point of writing unintelligible sentences and passive-aggressive digs at everyone reading this list. I'm sorry that something happened that led you to feel that way (if you indeed feel upset or frustrated), but I'm glad that you wrote that code snippet -- it is completely clear what you want and why you want it, and also what should happen next (a few rounds of code review on the tracker).

I'm not always a prat. Something about python-dev brings out parts of me that I thought I had discarded from my personality years ago. Toss in a bit of needing to re-explain ideas that I've been trying to explain for almost 9 years? Frustration + formerly discarded personality traits = uck. That's pretty much why I won't be rejoining the party here regularly, you are all better off without me commenting on 95% of threads like I used to.

Victor, I'm sorry for being a jerk. It's hard for me to not be the guy I was when I spend time on this list. That's *my* issue, not yours. That I spent any time redirecting my frustration towards you is BS, and if I could take back the email I sent just before getting Guido's, I would.

I would advise everyone to write it off as the ramblings of a surprisingly young, angry old man. Or call me an a-hole. Both are pretty accurate. :)

> But that PEP? It's just a terrible PEP. It doesn't contain a single line of example code. It doesn't specify the proposed interface, it just describes in way too many sentences how it should work, and contains a whole lot of references to various rants from which the reader is apparently meant to become enlightened. I don't know which of the three authors *really* wrote it, and I don't want to know -- I think the PEP is irrelevant to the proposed feature, which is of "put it in the bug tracker and work from there" category -- presumably the PEP was written based on the misunderstanding that having a PEP would make acceptance of the patch easier, or because during an earlier bikeshedding round someone said "please write a PEP" (someone always says that). I propose to scrap the PEP (set the status to Withdrawn) and just work on adding the methods to the subprocess module.

I'm not going to argue. The first I read it was 2-3 days ago.

> If it were me, I'd define three methods, with longer names to clarify what they do, e.g.
>
> proc.write_nonblocking(data)
> data = proc.read_nonblocking()
> data = proc.read_stderr_nonblocking()

Easily doable.

> I.e. add _nonblocking to the method names to clarify that they may return '' when there's nothing available, and have a separate method for reading stderr instead of a flag. And I'd wonder if there should be an unambiguous way to detect EOF or whether the caller should just check for proc.stdout.closed. (And what for stdin? IIRC it actually becomes writable when the other end is closed, and then the write() will fail. But maybe I forget.)
>
> But that's all bikeshedding and it can happen on the tracker or directly on the list just as easily; I don't see the need for a PEP.

Sounds good.

 - Josiah
 

Terry Reedy

Mar 28, 2014, 3:42:17 PM3/28/14
to pytho...@python.org
On 3/28/2014 12:45 PM, Josiah Carlson wrote:
> If it makes you feel any better, I spent an hour this morning building a
> 2-function API for Linux and Windows, both tested, not using ctypes, and
> not even using any part of asyncio (the Windows bits are in msvcrt and
> _winapi). It works in Python 3.3+. You can see it here:
> http://pastebin.com/0LpyQtU5

Thank you. The docs gave me the impression that I could simply write
proc.stdin and read proc.stdout. I failed with even a simple echo server
(on Windows) and your code suggests why. So it does not get lost, I
attached your code to

http://bugs.python.org/issue18823

My interest is with Idle. It originally ran user code in the same
process as the Shell and Editor code. Then Guido added an option to
os.spawn a separate process and communicate through a socket connection,
and that option became the default, with same-process execution
(requested by -n on the command line) as a backup option. 3.2 switched
to using subprocess, but still with a socket. The problem is that the
socket connection intermittently fails. Firewalls are, or at least used
to be, one possible cause, but there are others -- unknown. (While it
works, the suggestion to restart with -n is a mystery to people who have
never seen a command line.) This is one of the biggest sources of
complaints about Idle. A pipe connection method that always worked on
Windows, *nix, and Mac would be great in itself and would also allow
code simplification by removing the -n option. (Roger Serwy has
suggested the latter as having two modes makes patching trickier.)

The current socket connection must be non-blocking. Even though the exec
loop part of the Shell window waits for a response after sending a user
statement, everything else is responsive. One can select text in the
window, use the menus, or switch to another window. So Idle definitely
needs non-blocking write and read.

In my ignorance, I have no idea whether the approach in your code or
that in Victor's code is better. Either way, I will appreciate any help
you give, whether by writing, reviewing, or testing, to make
communication with subprocesses easier and more dependable.

--
Terry Jan Reedy

Josiah Carlson

Mar 28, 2014, 4:26:48 PM3/28/14
to Terry Reedy, Python-Dev
On Fri, Mar 28, 2014 at 12:42 PM, Terry Reedy <tjr...@udel.edu> wrote:
> On 3/28/2014 12:45 PM, Josiah Carlson wrote:
>> If it makes you feel any better, I spent an hour this morning building a
>> 2-function API for Linux and Windows, both tested, not using ctypes, and
>> not even using any part of asyncio (the Windows bits are in msvcrt and
>> _winapi). It works in Python 3.3+. You can see it here:
>> http://pastebin.com/0LpyQtU5
>
> Thank you. The docs gave me the impression that I could simply write proc.stdin and read proc.stdout. I failed with even a simple echo server (on Windows) and your code suggests why. So it does not get lost, I attached your code to
>
> http://bugs.python.org/issue18823
>
> My interest is with Idle. It originally ran user code in the same process as the Shell and Editor code. Then Guido added an option to os.spawn a separate process and communicate through a socket connection, and that option became the default, with same-process execution (requested by -n on the command line) as a backup option. 3.2 switched to using subprocess, but still with a socket. The problem is that the socket connection intermittently fails. Firewalls are, or at least used to be, one possible cause, but there are others -- unknown. (While it works, the suggestion to restart with -n is a mystery to people who have never seen a command line.) This is one of the biggest sources of complaints about Idle. A pipe connection method that always worked on Windows, *nix, and Mac would be great in itself and would also allow code simplification by removing the -n option. (Roger Serwy has suggested the latter as having two modes makes patching trickier.)
>
> The current socket connection must be non-blocking. Even though the exec loop part of the Shell window waits for a response after sending a user statement, everything else is responsive. One can select text in the window, use the menus, or switch to another window. So Idle definitely needs non-blocking write and read.
>
> In my ignorance, I have no idea whether the approach in your code or that in Victor's code is better. Either way, I will appreciate any help you give, whether by writing, reviewing, or testing, to make communication with subprocesses easier and more dependable.

One of my other use-cases for this was *my* editor (PyPE), which I wrote (in 2003) because I lost work in Idle. That lost work was due to the same-process interpreter crashing during an interactive session. IIRC, this is partly what pushed Guido to have Idle use os.spawn() + socket. I ended up using wxPython's built-in external process support at the time, but that's obviously not useful in core Python with Idle :P

This is all coming back full circle. :)

 - Josiah

Glenn Linderman

Mar 28, 2014, 4:35:31 PM3/28/14
to pytho...@python.org
On 3/28/2014 11:35 AM, Josiah Carlson wrote:
>> If it were me, I'd define three methods, with longer names to clarify what they do, e.g.
>>
>> proc.write_nonblocking(data)
>> data = proc.read_nonblocking()
>> data = proc.read_stderr_nonblocking()
>
> Easily doable.

I'd appreciate being notified if you do update/test as described.

Terry Reedy

Mar 28, 2014, 4:58:25 PM3/28/14
to pytho...@python.org
On 3/28/2014 6:20 AM, Victor Stinner wrote:

> Full example of asynchronous communication with a subprocess (the
> python interactive interpreter) using asyncio high-level API:

Thank you for writing this. As I explained in response to Josiah, Idle
communicates with a python interpreter subprocess through a socket.
Since making the connection is not dependable, I would like to replace
the socket with the pipes. http://bugs.python.org/issue18823

However, the code below creates a subprocess for one command and one
response, which can apparently be done now with subprocess.communicate.
What I and others need is a continuing (non-blocking) conversation with 1
and only 1 subprocess (see my response to Josiah), and that is much more
difficult. So this code does not do what he claims his will do.

However it is done, I agree with the intent of the PEP to make it much
easier to talk with a subprocess. Victor, if you can rewrite the below
with a run_forever loop that can accept new write-read task pairs and
also make each line read immediately accessible, that would be really
helpful. Post it on the issue above if you prefer.

Another difference between what you wrote below and what Idle does today is
that the shell, defined in idlelib/PyShell.py, does not talk to the
subprocess interpreter directly but to a run supervisor defined in
idlelib/run.py through an rpc protocol ('cmd', 'arg string'). To use the
pipes, the supervisor would grab all input from stdin (instead of the
socket) and exec user code as it does today, or it could be replaced by
a supervisor class with an instance with a name like
_idle_supervisor_3_4_0_ that would be extremely unlikely to clash with
any name created by users.
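
A speculative sketch of such a supervisor (the json framing and the names
are my invention, not current Idle code):

import json
import sys

def supervise():
    # Read ('cmd', 'arg string') messages from stdin and exec user code.
    namespace = {'__name__': '__main__'}
    for line in sys.stdin:
        cmd, arg = json.loads(line)  # e.g. ["exec", "print(1+1)"]
        if cmd == 'exec':
            try:
                exec(arg, namespace)
            except Exception as exc:
                print('error: %s' % exc, file=sys.stderr)
        elif cmd == 'quit':
            break

if __name__ == '__main__':
    supervise()
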
Terry Jan Reedy

Guido van Rossum

unread,
Mar 28, 2014, 5:09:19 PM3/28/14
to Terry Reedy, Python-Dev
To be clear, the proposal for Idle would be to still use the RPC protocol, but run it over a pipe instead of a socket, right?


Antoine Pitrou

unread,
Mar 28, 2014, 5:12:55 PM3/28/14
to pytho...@python.org
On Fri, 28 Mar 2014 16:58:25 -0400
Terry Reedy <tjr...@udel.edu> wrote:
> On 3/28/2014 6:20 AM, Victor Stinner wrote:
>
> > Full example of asynchronous communication with a subprocess (the
> > python interactive interpreter) using asyncio high-level API:
>
> Thank you for writing this. As I explained in response to Josiah, Idle
> communicates with a python interpreter subprocess through a socket.
> Since making the connection is not dependable, I would like to replace
> the socket with the pipes. http://bugs.python.org/issue18823
>
> However, the code below creates a subprocess for one command and one
> response, which can apparently be done now with subprocess.communicate.
> What I and others need is a continuing (non-blocking) conversation with 1
> and only 1 subprocess (see my response to Josiah), and that is much more
> difficult. So this code does not do what Josiah claims his code will do.

Why don't you use multiprocessing or concurrent.futures? They have
everything you need for continuous conversation between processes.

Regards

Antoine.

Victor Stinner

unread,
Mar 28, 2014, 5:24:45 PM3/28/14
to Terry Reedy, pytho...@python.org

On 28 March 2014 21:59, "Terry Reedy" <tjr...@udel.edu> wrote:
>
> On 3/28/2014 6:20 AM, Victor Stinner wrote:
>
>> Full example of asynchronous communication with a subprocess (the
>> python interactive interpreter) using asyncio high-level API:
>

> However, the code below creates a subprocess for one command and one response, which can apparently be done now with subprocess.communicate. What I and others need is a continuing (non-blocking) conversation with 1 and only 1 subprocess (see my response to Josiah), and that is much more difficult. So this code does not do what Josiah claims his code will do.

I tried to write the shortest example showing how to read and send data and how to make the call blocking. It's different from communicate() because the write occurs after the first read.

It should be quite easy to enhance my example to execute more commands.
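
For instance, a rough sketch (untested; an echo child stands in for the
interpreter, and on Windows a ProactorEventLoop would be needed):

import asyncio
import sys

# The child echoes each stdin line back, one reply per command.
CHILD = ('import sys\n'
         'for line in sys.stdin:\n'
         '    sys.stdout.write("echo: " + line)\n'
         '    sys.stdout.flush()\n')

@asyncio.coroutine
def converse(commands):
    proc = yield from asyncio.create_subprocess_exec(
        sys.executable, '-u', '-c', CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE)
    for cmd in commands:
        proc.stdin.write(cmd.encode() + b'\n')
        yield from proc.stdin.drain()
        reply = yield from proc.stdout.readline()  # never blocks the loop
        print(reply.decode().rstrip())
    proc.stdin.close()
    yield from proc.wait()

loop = asyncio.get_event_loop()
loop.run_until_complete(converse(['first command', 'second command']))
loop.close()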

Victor

Richard Oudkerk

unread,
Mar 28, 2014, 6:09:11 PM3/28/14
to pytho...@python.org

On 28/03/2014 06:35 pm, Josiah Carlson wrote:

If it were me, I'd define three methods, with longer names to clarify what they do, e.g.

proc.write_nonblocking(data)
data = proc.read_nonblocking()
data = proc.read_stderr_nonblocking()

Easily doable.
To implement write_nonblocking() on Windows, do you intend to use SetNamedPipeHandleState() with PIPE_NOWAIT?  The documentation discourages using this:

   Note that nonblocking mode is supported for compatibility with
   Microsoft LAN Manager version 2.0 and should not be used to
   achieve asynchronous input and output (I/O) with named pipes.


And I guess you will need to use a poll/sleep loop to simulate blocking or multiplexing.  If you want expect-like behaviour then you need some sort of multiplexing.

-- Richard

Terry Reedy

unread,
Mar 29, 2014, 4:38:18 AM3/29/14
to pytho...@python.org
On 3/28/2014 5:09 PM, Guido van Rossum wrote:
> To be clear, the proposal for Idle would be to still use the RPC
> protocol, but run it over a pipe instead of a socket, right?

That was and is the current proposal, assuming that it is the easiest
thing to do that would work. While responding to Victor, it occurred to
me as a speculative idea that once pipes were working, it *might* be
possible to implement the protocol as class methods, and let user code
go directly to the interpreter. I would ask what you thought before
seriously working on the idea.

--
Terry Jan Reedy



Terry Reedy

unread,
Mar 29, 2014, 4:44:32 AM3/29/14
to pytho...@python.org
On 3/28/2014 5:12 PM, Antoine Pitrou wrote:
> On Fri, 28 Mar 2014 16:58:25 -0400
> Terry Reedy <tjr...@udel.edu> wrote:

>> However, the code below creates a subprocess for one command and one
>> response, which can apparently be done now with subprocess.communicate.
>> What I and others need is a continuing (non-blocking) conversation with 1
>> and only 1 subprocess (see my response to Josiah), and that is much more
>> difficult. So this code does not do what Josiah claims his code will do.
>
> Why don't you use multiprocessing or concurrent.futures? They have
> everything you need for continuous conversation between processes.

I have not used either and no one suggested either before, while Amaury
Forgeot d'Arc and Guido suggested subprocess pipes. I added those two
ideas to the issue.

--
Terry Jan Reedy

Antoine Pitrou

unread,
Mar 29, 2014, 11:30:25 AM3/29/14
to pytho...@python.org
On Sat, 29 Mar 2014 04:44:32 -0400
Terry Reedy <tjr...@udel.edu> wrote:
> On 3/28/2014 5:12 PM, Antoine Pitrou wrote:
> > On Fri, 28 Mar 2014 16:58:25 -0400
> > Terry Reedy <tjr...@udel.edu> wrote:
>
> >> However, the code below creates a subprocess for one command and one
> >> response, which can apparently be done now with subprocess.communicate.
> >> What I and others need is a continuing (non-blocking) conversation with 1
> >> and only 1 subprocess (see my response to Josiah), and that is much more
> >> difficult. So this code does not do what Josiah claims his code will do.
> >
> > Why don't you use multiprocessing or concurrent.futures? They have
> > everything you need for continuous conversation between processes.
>
> I have not used either and no one suggested either before, while Amaury
> Forgeot d'Arc and Guido suggested subprocess pipes. I added those two
> ideas to the issue.

Looking at idlelib/rpc.py, it looks largely like an uncommented
(untested?) reimplementation of multiprocessing pipes, with weird
architecture choices (RPCServer is actually a client?).

multiprocessing should have everything you need: you can run child
processes, communicate with them using Queues, Locks, Conditions, or
you can even automate asynchronous execution with a process Pool. Those
are cross-platform and use the most appropriate platform-specific
primitives (for example, named pipes under Windows). They are also
quite well-tested, and duly maintained by Richard :-)
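
For example, a minimal sketch of a continuous parent/child conversation
over a multiprocessing Pipe (the names are illustrative):

from multiprocessing import Pipe, Process

def worker(conn):
    # Echo requests back until the parent sends None.
    while True:
        request = conn.recv()
        if request is None:
            break
        conn.send('echo: %s' % request)
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    child = Process(target=worker, args=(child_conn,))
    child.start()
    for msg in ['first', 'second']:
        parent_conn.send(msg)
        print(parent_conn.recv())  # blocks only until the reply arrives
    parent_conn.send(None)  # ask the child to exit
    child.join()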

Regards

Antoine.

R. David Murray

unread,
Mar 29, 2014, 12:43:01 PM3/29/14
to pytho...@python.org
On Sat, 29 Mar 2014 16:30:25 +0100, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Sat, 29 Mar 2014 04:44:32 -0400
> Terry Reedy <tjr...@udel.edu> wrote:
> > On 3/28/2014 5:12 PM, Antoine Pitrou wrote:
> > > On Fri, 28 Mar 2014 16:58:25 -0400
> > > Terry Reedy <tjr...@udel.edu> wrote:
> >
> > >> However, the code below creates a subprocess for one command and one
> > >> response, which can apparently be done now with subprocess.communicate.
> > >> What I and others need is a continuing (non-blocking) conversation with 1
> > >> and only 1 subprocess (see my response to Josiah), and that is much more
> > >> difficult. So this code does not do what Josiah claims his code will do.
> > >
> > > Why don't you use multiprocessing or concurrent.futures? They have
> > > everything you need for continuous conversation between processes.
> >
> > I have not used either and no one suggested either before, while Amaury
> > Forgeot d'Arc and Guido suggested subprocess pipes. I added those two
> > ideas to the issue.
>
> Looking at idlelib/rpc.py, it looks largely like an uncommented
> (untested?) reimplementation of multiprocessing pipes, with weird
> architecture choices (RPCServer is actually a client?).

I think instead we might call it a "pre" implementation :)
I'm pretty sure this Idle stuff existed before multiprocessing did.

(In English, 'reimplementation' implies that multiprocessing existed
already, and therefore implies someone looked at it and copied the
concepts in it badly, which is not the case as far as I'm aware.)

--David

Terry Reedy

unread,
Mar 29, 2014, 4:04:11 PM3/29/14
to pytho...@python.org
On 3/29/2014 11:30 AM, Antoine Pitrou wrote:
> On Sat, 29 Mar 2014 04:44:32 -0400
> Terry Reedy <tjr...@udel.edu> wrote:
>> On 3/28/2014 5:12 PM, Antoine Pitrou wrote:
[for Idle]
>>> Why don't you use multiprocessing or concurrent.futures? They have
>>> everything you need for continuous conversation between processes.
>>
>> I have not used either and no one suggested either before, while Amaury
>> Forgeot d'Arc and Guido suggested subprocess pipes. I added those two
>> ideas to the issue.
>
> Looking at idlelib/rpc.py, it looks largely like an uncommented

Some things have comments or docstrings; I am adding some as I can.

> (untested?)

The test of most of Idle is that it works when a person tries to use it,
which it mostly does. I am working on automated tests too.

> reimplementation of multiprocessing pipes, with weird
> architecture choices (RPCServer is actually a client?).

As David said, pre-implementation. It is at least a decade old.

> multiprocessing should have everything you need: you can run child
> processes, communicate with them using Queues, Locks, Conditions, or
> you can even automate asynchronous execution with a process Pool. Those
> are cross-platform and use the most appropriate platform-specific
> primitives (for example, named pipes under Windows). They are also
> quite well-tested, and duly maintained by Richard :-)

This is not the only thing Idle does that is or should be done
elsewhere. If it is done elsewhere in the stdlib (and tested), I am
happy to switch.

Idle originally created calltips from code objects and docstrings. When
inspect.get... did most of the job for functions coded in Python, Idle
switched to using that and some calltips code was removed. Once most C
coded functions work with inspect.signature (and I do hope the ArgClinic
work gets done for 3.5), I will switch and delete more code and some tests.

--
Terry Jan Reedy

Josiah Carlson

unread,
Mar 30, 2014, 2:58:59 AM3/30/14
to Python-Dev
I've got a patch with partial tests and documentation that I'm holding off on uploading because I believe there should be a brief discussion.

Long story short, Windows needs a thread to handle writing in a non-blocking fashion, regardless of the use of asyncio or plain subprocess.
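
Roughly, the shape of it is this (a sketch of the idea, not the actual
patch):

import queue
import threading

class NonBlockingWriter:
    # Feed writes through a queue to a thread that does the blocking work.
    def __init__(self, pipe):
        self.pipe = pipe
        self.queue = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            data = self.queue.get()
            if data is None:  # sentinel: stop the writer
                break
            self.pipe.write(data)  # the blocking write happens off-thread
            self.pipe.flush()

    def write_nonblocking(self, data):
        self.queue.put(data)  # returns immediately

    def close(self):
        self.queue.put(None)
        self.thread.join()
        self.pipe.close()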

If you'd like to continue following this issue and participate in the discussion, I'll see you over on http://bugs.python.org/issue1191964 .

 - Josiah

Josiah Carlson

unread,
May 29, 2014, 8:35:12 PM5/29/14
to Python-Dev
Pinging this thread 2 months later with a progress/status update.

To those who have reviewed, commented, helped, or otherwise pushed this along, including (but not limited to) Richard Oudkerk, eryksun, and Giampaolo Rodola: thank you.


The short version:
As far as I can tell, the patch is ready: http://bugs.python.org/issue1191964

What is available:
There are docs, tests, and obviously the functionality. Some code was moved from asyncio/windows_utils.py (which has a separate issue here: https://code.google.com/p/tulip/issues/detail?id=170). The API was changed slightly from what was proposed by Guido:

sent = Popen.write_nonblocking(input, timeout=0)
data = Popen.read_nonblocking(bufsize=4096)
data = Popen.read_stderr_nonblocking(bufsize=4096)
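
For illustration, a hypothetical call sequence on a Popen instance (these
methods exist only in the patch, and the details may still change):

import subprocess
import sys

proc = subprocess.Popen([sys.executable, '-i', '-q'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
sent = proc.write_nonblocking(b'print(1 + 1)\n', timeout=0)
data = proc.read_nonblocking(bufsize=4096)
err = proc.read_stderr_nonblocking(bufsize=4096)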

As a bonus feature, Windows communicate() calls no longer spawn worker threads, and instead use overlapped I/O.


I'm bringing this back up to python-dev to offer a slightly wider audience for commentary/concerns, and hopefully to get a stamp of approval that it is ready.

Thank you,
 - Josiah


Josiah Carlson

unread,
May 29, 2014, 8:36:25 PM5/29/14
to Python-Dev
And as I was writing the "thank you" to folks, I hit send too early. Also thank you to Victor Stinner, Guido, Terry Reedy, and everyone else on this thread :)

 - Josiah