reading lines from subprocess in a coroutine

Phil Schaf

Jan 23, 2014, 10:04:45 AM
to python...@googlegroups.com

hi, i opened a subprocess using

transport, proto = yield from loop.subprocess_exec(SubprocessProtocol, *mycmd)

now, how do i write something to stdin asynchronously, and then react to lines being written? basically:

stdin = transport.get_pipe_transport(0)
stdin.write(msg)  # this is backgrounded… coroutine version?
stdin.close()  # and/or stdin.write_eof()

reader = SomeKindOfReader(transport.get_pipe_transport(1))

while reader.can_read():  # not EOF or connection closed
    response = yield from reader.readline()  # should return when EOF or conn. closed
    do_sth_with(response)

Guido van Rossum

Jan 23, 2014, 10:48:46 AM
to Phil Schaf, python-tulip
Read the source code of asyncio/streams.py. There are helper classes
that should let you do it. Please post the solution here.
--
--Guido van Rossum (python.org/~guido)

Phil Schaf

Jan 23, 2014, 11:17:49 AM
to python...@googlegroups.com, Phil Schaf, gu...@python.org

On Thursday, January 23, 2014, 16:48:46 UTC+1, Guido van Rossum wrote:

Read the source code of asyncio/streams.py. There are helper classes
that should let you do it. Please post the solution here.
--
--Guido van Rossum (python.org/~guido)

i’ve been deep inside that source for some hours now, but since i’d never done multiple inheritance before, only your comment convinced me that i can indeed marry SubprocessProtocol and a StreamReaderProtocol.

import sys
from functools import partial
from asyncio import coroutine, get_event_loop
from asyncio.protocols import SubprocessProtocol
from asyncio.streams import StreamReader, StreamReaderProtocol

cmd = […]

@coroutine
def do_task(msg):
    loop = get_event_loop()
    reader = StreamReader(float('inf'), loop)

    transport, proto = yield from loop.subprocess_exec(
        partial(StdOutReaderProtocol, reader, loop=loop), *cmd)

    stdin = transport.get_pipe_transport(0)
    stdin.write(msg)
    stdin.write_eof()  # which of those is actually necessary? only eof? only close?
    stdin.close()

    while True:  # would be nice to do “for line in iter(reader.readline, b'')”, but not possible with coroutines
        line = yield from reader.readline()
        if not line:
            break
        do_something_with(line)

class StdOutReaderProtocol(StreamReaderProtocol, SubprocessProtocol):
    """Route the child's stdout into the StreamReader; dump stderr directly."""
    def pipe_data_received(self, fd, data):
        if fd == 1:  # stdout
            self.data_received(data)
        else:  # stderr
            print('stderr from subprocess:', data.decode(), file=sys.stderr, end='')

that was completely strange, though. imho there should be an easier way to do this than figuring it out from the source.

thanks for your encouragement!

– Phil

Gustavo Carneiro

Jan 23, 2014, 11:40:45 AM
to Phil Schaf, python-tulip, Guido van Rossum
It's funny that I've been using subprocesses without even being aware of loop.subprocess_exec().  I just followed the child_process.py example in Tulip, and it works fine.  Why should we use loop.subprocess_xxx instead of plain old subprocess.Popen followed by connecting the pipes to asyncio streams?
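For reference, the shape of that example is roughly this (a condensed sketch from memory; the function name and simplifications are mine):

import asyncio
import subprocess
from asyncio.streams import StreamReader, StreamReaderProtocol

@asyncio.coroutine
def popen_with_streams(loop, *cmd):
    # start the child with plain old subprocess.Popen ...
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    # ... then connect its pipes to the event loop afterwards
    stdin_transport, _ = yield from loop.connect_write_pipe(
        asyncio.Protocol, proc.stdin)
    reader = StreamReader(loop=loop)
    yield from loop.connect_read_pipe(
        lambda: StreamReaderProtocol(reader, loop=loop), proc.stdout)
    return proc, stdin_transport, reader
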
--
Gustavo J. A. M. Carneiro
Gambit Research LLC
"The universe is always one step beyond logic." -- Frank Herbert

Guido van Rossum

Jan 23, 2014, 12:12:37 PM
to Gustavo Carneiro, Phil Schaf, python-tulip
loop.subprocess_xxx() will give your protocol a callback when the
process exits (which may be earlier or later than when the pipes are
closed). It also manages connecting the pipes for you. But you don't
*have* to use it.
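
For illustration, a minimal sketch of such a protocol (the class name and exit_future are made up for this example, not part of tulip):

import asyncio
from asyncio.protocols import SubprocessProtocol

class ExitNotifyingProtocol(SubprocessProtocol):
    def __init__(self, exit_future):
        self.exit_future = exit_future

    def pipe_data_received(self, fd, data):
        pass  # per-pipe handling would go here

    def process_exited(self):
        # called when the child terminates, which may happen before
        # or after its stdout/stderr pipes are closed
        self.exit_future.set_result(None)

# usage sketch:
#   exit_future = asyncio.Future(loop=loop)
#   transport, proto = yield from loop.subprocess_exec(
#       lambda: ExitNotifyingProtocol(exit_future), *cmd)
#   yield from exit_future  # resumes once the process has exited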

Guido van Rossum

Jan 23, 2014, 12:16:05 PM
to Phil Schaf, python-tulip
I would have preferred a solution without multiple inheritance but in
this case it seems pretty benign, since SubprocessProtocol is just an
interface class, while StreamReaderProtocol is an implementation
class. A way to avoid the multiple inheritance would be to instantiate
a StreamReaderProtocol as a member in your SubprocessProtocol
subclass's constructor.

Gustavo Carneiro

Jan 23, 2014, 12:46:45 PM
to Guido van Rossum, Phil Schaf, python-tulip
On 23 January 2014 17:12, Guido van Rossum <gu...@python.org> wrote:
loop.subprocess_xxx() will give your protocol a callback when the
process exits (which may be earlier or later than when the pipes are
closed).

Sure.  But my feeling is that waiting for the pipe to be closed is good enough for most applications.  You care more about stdin/stdout than you do about the actual process, unless you are writing a daemon process supervisor kind of thing (upstart/systemd).

It also manages connecting the pipes for you.

Well, it gives you one thing but expects another in return.  It connects the pipes, but expects a protocol factory.  I don't want to write a protocol factory just to be able to run a subprocess and capture its output, thank you very much.

Also, this confuses me:

    def subprocess_exec(self, protocol_factory, *args, stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        **kwargs):

So you have a single protocol_factory, but potentially 3 pipes.  Not only are you forced to implement your own protocol, but you also have to use the same class to handle stdin, stdout, and stderr.  I would expect to have 3 different protocols, one for each pipe.
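
To sketch what I mean, something like this hypothetical dispatcher (not anything tulip ships) is what I'd end up writing just to get per-pipe protocols back:

from asyncio.protocols import SubprocessProtocol

class PerPipeDispatcher(SubprocessProtocol):
    # fan the single subprocess protocol out to one ordinary
    # protocol per output pipe; fd 1 is stdout, fd 2 is stderr
    def __init__(self, stdout_protocol, stderr_protocol):
        self.protocols = {1: stdout_protocol, 2: stderr_protocol}

    def pipe_data_received(self, fd, data):
        proto = self.protocols.get(fd)
        if proto is not None:
            proto.data_received(data)

    def pipe_connection_lost(self, fd, exc):
        proto = self.protocols.get(fd)
        if proto is not None:
            proto.connection_lost(exc)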

But you don't *have* to use it.

Sure.  I think it's easier for me to write my own asyncio-friendly subprocess.Popen wrapper instead.

Sorry, I didn't mean to criticize these APIs this late in the release process, but I hadn't noticed these methods and none of the tulip examples use them, so they have stayed under my radar.

Guido van Rossum

Jan 23, 2014, 12:48:56 PM
to Gustavo Carneiro, Phil Schaf, python-tulip
Well, that's water under the bridge. I'm sure the current design is
perfectly usable.

Phil Schaf

Jan 23, 2014, 12:50:57 PM
to python...@googlegroups.com, Phil Schaf, gu...@python.org

On Thursday, January 23, 2014, 18:16:05 UTC+1, Guido van Rossum wrote:

I would have preferred a solution without multiple inheritance but in
this case it seems pretty benign, since SubprocessProtocol is just an
interface class, while StreamReaderProtocol is an implementation
class. A way to avoid the multiple inheritance would be to instantiate
a StreamReaderProtocol as a member in your SubprocessProtocol
subclass's constructor.

yeah, that’ll work. i’m still confused about that whole factory and protocol business :)

the only change to my previous solution would then be:

class StdOutReaderProtocol(SubprocessProtocol):
    def __init__(self, reader, loop=None):
        self.reader_prot = StreamReaderProtocol(reader, loop=loop)

    def pipe_data_received(self, fd, data):
        if fd == 1:  # stdout
            self.reader_prot.data_received(data)

    def pipe_connection_lost(self, fd, exc):
        if fd == 1:
            # forward EOF (or the error) so reader.readline() can return
            self.reader_prot.connection_lost(exc)