read stdout/stderr without blocking


Jacek Popławski

Sep 12, 2005, 4:07:13 AM
Popen from the subprocess module gives me access to stdout, so I can read
it. The problem is that I don't know how much data is available. How can
I read it without blocking my program?

example:
--------------------------------------------------------------------
import subprocess
import time

command="ls -l -R /"

p=subprocess.Popen(command,shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE)

while (p.poll()==None):
    print "."
    r=p.stdout.read()
--------------------------------------------------------------------

When you comment out read(), you will notice that the loop keeps running; with
read(), the loop blocks.
Of course I don't need to call read() inside the loop, but if the output is very
long (like from "make") and I don't read from stdout, the command itself will
block! I tried increasing bufsize, but it didn't help.

Is there a way to read only available data from stdout/stderr?
Is there a way to not block Popen command without reading stdout/stderr?

Adriaan Renting

Sep 12, 2005, 4:49:04 AM
to pytho...@python.org
Check out the select module, for an example on how to use it:
pexpect.sourceforge.net




--
http://mail.python.org/mailman/listinfo/python-list

Jacek Popławski

Sep 12, 2005, 7:14:13 AM
Adriaan Renting wrote:
> Check out the select module, for an example on how to use it:
> pexpect.sourceforge.net

Two problems:
- it won't work on Windows (Cygwin)
- how much data should I read after select? 1 character? Can it block if
I read 2 characters?

Adriaan Renting

Sep 12, 2005, 7:57:39 AM
to pytho...@python.org
I was not aware you were using Windows. You might need to find something similar to select and pty that works on Windows, or maybe go through Cygwin, I don't know. I'm on Linux, so the only help I can offer is showing you my working code, which is a mix of Pexpect, subProcess and Parseltongue.
I'm not sure if this is 100% correct; it just happens to work and might help you in solving a similar problem:

---- in spawn()
(self._errorpipe_end, self._errorpipe_front) = os.pipe() ## need to handle stderr separate from stdout
try:
    (self._pid, self._child_fd) = pty.fork()
except OSError, e:
    raise Exception ('fork failed')
if self._pid == 0: ## the new client
    try:
        os.dup2(self._errorpipe_front, 2) ## we hardcoded assume stderr of the pty has fd 2
        os.close(self._errorpipe_end)
        os.close(self._errorpipe_front) ## close what we don't need
        os.execvp(self.task, self.inputs)
    except:
        sys.stderr.write('Process could not be started: ' + self.task)
        os._exit(1)
else: ## the parent
    os.close(self._errorpipe_front) ## close what we don't need
    fcntl.fcntl(self._child_fd, fcntl.F_SETFL, os.O_NONBLOCK)

---- in handle_messages()
tocheck=[]
if not self._fd_eof:
    tocheck.append(self._child_fd)
if not self._pipe_eof:
    tocheck.append(self._errorpipe_end)
ready = select.select(tocheck, [], [], 0.25) ##continues after 0.25s
for file in ready[0]:
    try:
        text = os.read(file, 1024)
    except: ## probably Input/Output error because the child died
        text = ''
    if text:
        for x in self._expect:
            if x[0] in text: ## we need to do something if we see this text
                returntext = x[1](text)
                if returntext:
                    os.write(file, returntext)
        self.handle_text(text)
    else:
        if file == self._child_fd:
            self._fd_eof = 1
        elif file == self._errorpipe_end:
            self._pipe_eof = 1
    return 1
if self._fd_eof or self._pipe_eof: # should be an and not an or, but python 2.3.5 doesn't like it
    return 0
if len(ready[0]) == 0: ## no data in 0.25 second timeout
    return 1
return 0

---- in finish()
(pid, status) = os.waitpid(self._pid, os.WNOHANG) ## clean up the zombie
assert(pid == self._pid)
if os.WIFEXITED(status) or os.WIFSIGNALED(status):
    self._pid = 0
    self.exitstatus = status
assert(self.finished())
del self._pid
os.close(self._child_fd)
os.close(self._errorpipe_end)



Jacek Popławski

Sep 12, 2005, 8:39:42 AM
> ready = select.select(tocheck, [], [], 0.25) ##continues after 0.25s
> for file in ready[0]:
>     try:
>         text = os.read(file, 1024)

How do you know here, that you should read 1024 characters?
What will happen when output is shorter?

Adriaan Renting

Sep 12, 2005, 10:09:30 AM
to pytho...@python.org
The line only means it will read a maximum of 1024 characters; most of the output I try to catch is much shorter. I think that if the output is longer than 1024, it will read the rest after another call to select.select, but I have not yet come across that case and have not tested it.

I set the child's file descriptor to O_NONBLOCK earlier in the code, but I can't remember if that has anything to do with the os.read().

Note: I'm using Python 2.3.4/2.3.5, not tested on 2.4.x yet.


Grant Edwards

Sep 12, 2005, 10:41:19 AM

It will return however much data is available.

--
Grant Edwards grante Yow! I'm a fuschia bowling
at ball somewhere in Brittany
visi.com

Jacek Popławski

Sep 13, 2005, 3:23:21 AM
Grant Edwards wrote:
> On 2005-09-12, Jacek Popławski <jp...@interia.pl> wrote:
>
>>> ready = select.select(tocheck, [], [], 0.25) ##continues after 0.25s
>>> for file in ready[0]:
>>> try:
>>> text = os.read(file, 1024)
>>
>>How do you know here, that you should read 1024 characters?
>>What will happen when output is shorter?
>
>
> It will return however much data is available.

My tests showed, that it will block.

Jacek Popławski

Sep 13, 2005, 3:25:17 AM
The only solution that works for now is to redirect stderr to stdout and
read stdout in a separate thread.
Code without a thread that calls read() or read(n) (with n>1) can block.
Code with select() and read(1) works, but it is very slow.
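
The thread-based workaround described above can be sketched roughly as follows. This is a minimal illustration written for Python 3 (the thread itself uses Python 2.3/2.4), not the poster's actual code; the helper names `start_reader` and `run_merged` and the demonstration child `CHILD` are assumptions made up for the example.

```python
import queue
import subprocess
import sys
import threading


def start_reader(stream, out_queue):
    """Read lines from stream on a background thread and queue them."""
    def pump():
        for line in iter(stream.readline, b""):
            out_queue.put(line)
        out_queue.put(None)  # sentinel: the pipe reached end-of-file

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return thread


def run_merged(argv, poll_interval=0.1):
    """Run argv with stderr folded into stdout and return all output."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    lines = queue.Queue()
    start_reader(proc.stdout, lines)
    collected = []
    while True:
        try:
            item = lines.get(timeout=poll_interval)
        except queue.Empty:
            continue  # nothing yet; the main loop could do other work here
        if item is None:
            break
        collected.append(item)
    proc.wait()
    proc.stdout.close()
    return b"".join(collected)


# Hypothetical child used only for this demonstration.
CHILD = [sys.executable, "-c",
         "import sys; print('out'); sys.stderr.write('err\\n')"]
```

Only the background thread ever blocks in readline(); the main loop wakes up every `poll_interval` seconds whether or not data has arrived.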

Peter Hansen

Sep 13, 2005, 8:07:57 AM

Not if you use non-blocking sockets, as I believe you are expected to
when using select().

-Peter

Grant Edwards

Sep 13, 2005, 10:27:04 AM
On 2005-09-13, Jacek Popławski <jp...@interia.pl> wrote:

>>>> ready = select.select(tocheck, [], [], 0.25) ##continues after 0.25s
>>>> for file in ready[0]:
>>>> try:
>>>> text = os.read(file, 1024)
>>>
>>>How do you know here, that you should read 1024 characters?
>>>What will happen when output is shorter?
>>
>> It will return however much data is available.
>
> My tests showed, that it will block.

You're right. I must have been remembering the behavior of a
network socket. Apparently, you're supposed to read a single
byte and then call select() again. That seems pretty lame.

--
Grant Edwards grante Yow! Psychoanalysis?? I
at thought this was a nude
visi.com rap session!!!

Jacek Popławski

Sep 13, 2005, 10:36:41 AM
Grant Edwards wrote:
> You're right. I must have been remembering the behavior of a
> network socket. Apparently, you're supposed to read a single
> byte and then call select() again. That seems pretty lame.

I created another thread with a single read(); it works as long as I have
only one PIPE (i.e. stderr is redirected into stdout).
I wonder whether this is a Python limitation or a system one (I need a portable solution)?

Grant Edwards

Sep 13, 2005, 10:46:12 AM

Not sure what you mean. Here is my test program that blocks on
the read(1024) call:

#!/usr/bin/python
import os,select

p = os.popen("while sleep 2; do date; done","r")
print p

while 1:
    r,w,e = select.select([p],[],[],1)
    if r:
        d = r[0].read(1024)
        print len(d),repr(d)
    else:
        print "timeout"

It also blocks if the call is changed to read(). This seems
pretty counter-intuitive, since that's not the way read()
usually works on pipes.

Here's the corresponding C program that works as I
expected (read(1024) returns available data):

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/select.h>

unsigned char buffer[1024];
int main(void)
{
    fd_set readfds, writefds, exceptfds;
    struct timeval tv;
    FILE *fp;
    int fd;

    fp = popen("while sleep 2; do date; done","r");
    if (!fp)
    {
        perror("popen");
        exit(1);
    }
    fd = fileno(fp);

    FD_ZERO(&readfds);
    FD_ZERO(&writefds);
    FD_ZERO(&exceptfds);

    while (1)
    {
        int s;
        FD_SET(fd,&readfds);
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        s = select(fd+1,&readfds,&writefds,&exceptfds,&tv);
        if (s==0)
            printf("timeout\n");
        else if (s<0)
        {
            perror("select");
            exit(2);
        }
        else
        {
            if (FD_ISSET(fd,&readfds))
            {
                int n = read(fd,buffer,(sizeof buffer)-1);
                if (n <= 0)   /* EOF or error: stop rather than index buffer[-1] */
                    break;
                buffer[n] = '\0';
                printf("read %d: '%s'\n",n,buffer);
            }
        }
    }
    return 0;
}


--
Grant Edwards grante Yow! Does that mean
at I'm not a well-adjusted
visi.com person??

Adriaan Renting

Sep 14, 2005, 3:59:43 AM
to pytho...@python.org



>>>Jacek Popławski <jp...@interia.pl> 09/13/05 9:23 am >>>
Grant Edwards wrote:
>On 2005-09-12, Jacek Popławski <jp...@interia.pl> wrote:
>
>>> ready = select.select(tocheck, [], [], 0.25) ##continues after 0.25s
>>> for file in ready[0]:
>>> try:
>>> text = os.read(file, 1024)
>>
>>How do you know here, that you should read 1024 characters?
>>What will happen when output is shorter?
>
>
>It will return however much data is available.
|
|My tests showed, that it will block.
|

IIRC it only blocks if there's nothing to read; that's why the select.select is done, which has a 0.25s timeout.
Please also note the fcntl.fcntl(self._child_fd, fcntl.F_SETFL, os.O_NONBLOCK) I do at the beginning of my code.

I basically stole this system from subProcess and Pexpect; both use the same mechanism. I just mashed those two together so I can read stdout and stderr separately in (near) real time, and reply to questions the external process asks me (like password prompts). It works on Linux with Python 2.3.4; the OP seems to use a different platform, so YMMV. You can use poll instead of select I think, but probably also only on Unix. This is from an earlier version of my code:

## self.poll.register(self._child_fd)
## self.poll.register(self._errorpipe_end)
...
## if self._fd_eof and self._pipe_eof:
##     return 0
## ready = self.poll.poll(250)
## for x in ready:
##     text = ''
##     if (x[1] & select.POLLIN) or (x[1] & select.POLLPRI): ## POLLIN, not POLLOUT, signals readable data
##         try:
##             text = os.read(x[0], 1024)
##         except:
##             if x[0] == self._child_fd:
##                 self._fd_eof = 1
##             elif x[0] == self._errorpipe_end:
##                 self._pipe_eof = 1
##     if (x[1] & select.POLLNVAL) or (x[1] & select.POLLHUP) or (x[1] & select.POLLERR) or (text == ''):
##         if x[0] == self._child_fd:
##             self._fd_eof = 1
##         elif x[0] == self._errorpipe_end:
##             self._pipe_eof = 1
##     elif text:
...
## self.poll.unregister(self._child_fd)
## self.poll.unregister(self._errorpipe_end)
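
For comparison, a poll-based reader can be sketched as a self-contained function. This is a rough Python 3 illustration for Unix only (select.poll is not available on Windows), using plain pipes from subprocess rather than the pty setup of the code above; the function name `read_with_poll` and the 250 ms default are assumptions for the example.

```python
import os
import select
import subprocess
import sys


def read_with_poll(argv, timeout_ms=250):
    """Collect a child's stdout and stderr separately using select.poll()."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE) as proc:
        out_fd, err_fd = proc.stdout.fileno(), proc.stderr.fileno()
        chunks = {out_fd: [], err_fd: []}
        poller = select.poll()
        for fd in chunks:
            poller.register(fd, select.POLLIN | select.POLLPRI)
        open_fds = set(chunks)
        while open_fds:
            for fd, event in poller.poll(timeout_ms):
                if event & (select.POLLIN | select.POLLPRI | select.POLLHUP):
                    # os.read returns what is available, up to 1024 bytes.
                    data = os.read(fd, 1024)
                    if data:
                        chunks[fd].append(data)
                        continue
                # Empty read (end-of-file) or an error event: stop watching.
                poller.unregister(fd)
                open_fds.discard(fd)
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])
```

A descriptor is dropped only after an empty read or an error event, so data that arrives together with a hang-up notification is still collected.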



Adriaan Renting

Sep 14, 2005, 4:09:00 AM
to pytho...@python.org
Please note that popen uses pipes, and when stdout is a pipe rather than a terminal, the C stdio library in the child switches from line buffering to block buffering. The writes are therefore done in blocks instead of lines: you can only read something _after_ the application at the other end of the pipe has done a flush or filled its buffer (8192 bytes here).

When reading from a pty like Pexpect does, the child believes it is talking to a terminal and stays line-buffered, so your read does not have to wait until the stdio block buffer is filled.

Maybe using popen is your problem? The FAQ of Pexpect explains the problem very clearly.
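
The difference a pty makes can be observed from the child's side. The sketch below, written for Python 3 on Unix, starts the same probe once through a pipe and once through a pty and reports what the child sees; the names `PROBE`, `probe_through_pipe` and `probe_through_pty` are made up for the illustration, and it only shows the terminal check that buffering decisions are typically based on, not the buffering itself.

```python
import os
import pty
import subprocess
import sys

# Hypothetical child: reports whether its stdout looks like a terminal.
PROBE = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]


def probe_through_pipe():
    """Run the probe with stdout connected to an ordinary pipe."""
    result = subprocess.run(PROBE, stdout=subprocess.PIPE)
    return result.stdout.strip()


def probe_through_pty():
    """Run the probe with stdout connected to the slave end of a pty."""
    master, slave = pty.openpty()
    try:
        proc = subprocess.Popen(PROBE, stdout=slave)
        proc.wait()
        # The parent still holds the slave end open, so the output written
        # by the child is still queued on the master side.
        return os.read(master, 1024).strip()
    finally:
        os.close(slave)
        os.close(master)
```

Through the pipe the child reports that it is not attached to a terminal; through the pty it reports that it is, which is why programs run under Pexpect usually keep flushing line by line.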



Donn Cave

Sep 15, 2005, 12:19:03 PM
In article <ebGdnZ2dnZ1wQWfKnZ2dn...@powergate.ca>,
Peter Hansen <pe...@engcorp.com> wrote:

On the contrary, you need non-blocking sockets only if
you don't use select. select waits until a read [write]
would not block - it's like "if dict.has_key(x):" instead of
"try: val = dict[x] ; except KeyError:". I suppose you
knew that, but have read some obscure line of reasoning
that makes non-blocking out to be necessary anyway.
Who knows, but it certainly isn't in this case.

I don't recall the beginning of this thread, so I'm not sure
if this is the usual wretched exercise of trying to make this
work on both UNIX and Windows, but there are strong signs
of the usual confusion over os.read (a.k.a. posix.read), and
file object read. Let's hopefully forget about Windows for
the moment.

The above program looks fine to me, but it will not work
reliably if file object read() is substituted for os.read().
In this case, C library buffering will read more than 1024
bytes if it can, and then that data will not be visible to
select(), so there's no guarantee it will return in a timely
manner even though the next read() would return right
away. Reading one byte at a time won't resolve this problem,
obviously it will only make it worse. The only reason to
read one byte at a time is for data-terminated read semantics,
specifically readline(), in an unbuffered file. That's what
happens -- at the system call level, where it's expensive --
when you turn off stdio buffering and then call readline().

In the C vs. Python example, read() is os.read(), and file
object read() is fread(); so of course, C read() works
where file object read() doesn't.
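
The distinction between the two kinds of read can be sketched in a few lines. This is a Python 3 illustration for Unix, not code from the thread; `first_chunk` and the demonstration child `CHILD` are hypothetical names, and the child flushes explicitly so that its one line reaches the pipe in a single write.

```python
import os
import select
import subprocess
import sys

# Hypothetical child: writes one short line, flushes, then stays alive briefly.
CHILD = [sys.executable, "-c",
         "import sys, time; sys.stdout.write('tick\\n'); "
         "sys.stdout.flush(); time.sleep(0.5)"]


def first_chunk(argv, timeout=5.0):
    """Return the first data the child writes, without waiting for 1024 bytes."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        fd = proc.stdout.fileno()
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return None  # nothing arrived within the timeout
        # os.read maps to the read(2) system call: it returns whatever is
        # available, up to 1024 bytes. The buffered proc.stdout.read(1024)
        # would instead keep reading until it had 1024 bytes or hit EOF.
        return os.read(fd, 1024)
```

Here os.read hands back the five bytes as soon as select reports the descriptor readable, while the child is still running.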

Use select and os.read (and UNIX), and you can avoid blocking
on a pipe. That's essential if, as I am reading it, there are supposed
to be two separate pipes from the same process, since if one is
allowed to fill up, that process will block, causing a deadlock if
the reading process blocks on the other pipe.
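
A minimal sketch of that recipe for two pipes, written for Python 3 on Unix: select over both descriptors and os.read from whichever is ready. The function name `drain_both`, the 0.25 s timeout and the noisy demonstration child are assumptions for the example; the child writes well over 64 KiB to each stream so that a reader blocking on one pipe alone would be at risk of the deadlock described above.

```python
import os
import select
import subprocess
import sys


def drain_both(argv, timeout=0.25):
    """Collect stdout and stderr separately without blocking on either pipe."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE) as proc:
        out_fd, err_fd = proc.stdout.fileno(), proc.stderr.fileno()
        chunks = {out_fd: [], err_fd: []}
        open_fds = [out_fd, err_fd]
        while open_fds:
            ready, _, _ = select.select(open_fds, [], [], timeout)
            for fd in ready:
                data = os.read(fd, 1024)  # returns what is available
                if data:
                    chunks[fd].append(data)
                else:
                    open_fds.remove(fd)  # empty read means end-of-file
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])


# Hypothetical child that writes 220000 bytes to each stream.
NOISY = [sys.executable, "-c",
         "import sys\n"
         "for i in range(20000):\n"
         "    sys.stdout.write('o' * 10 + '\\n')\n"
         "    sys.stderr.write('e' * 10 + '\\n')\n"]
```

Because neither pipe is left unread for long, the child never stalls on a full pipe.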

Hope I'm not missing anything here. I just follow this group
to answer this question over and over, so after a while it
gets sort of automatic.

Donn Cave, do...@u.washington.edu

Peter Hansen

Sep 15, 2005, 10:10:52 PM
Donn Cave wrote:
> Peter Hansen <pe...@engcorp.com> wrote:
>>Jacek Popławski wrote:
>>>My tests showed, that it will block.
>>
>>Not if you use non-blocking sockets, as I believe you are expected to
>>when using select().
>
> On the contrary, you need non-blocking sockets only if
> you don't use select. select waits until a read [write]
> would not block - it's like "if dict.has_key(x):" instead of
> "try: val = dict[x] ; except KeyError:". I suppose you
> knew that, but have read some obscure line of reasoning
> that makes non-blocking out to be necessary anyway.

No, no, I suspect I was just plain wrong. I think I felt a twinge of
suspicion even as I wrote it, but I went ahead and hit Send anyway
perhaps because I'd had two nights with little sleep (if I need an
excuse to be wrong, that is, which I don't :-) ).

> Who knows, but it certainly isn't in this case.

Thanks for straightening things out for the record.

-Peter

Adriaan Renting

Sep 16, 2005, 3:02:59 AM
to pytho...@python.org, do...@u.washington.edu
Great reply,

I had just mixed Pexpect and subProcess code until I'd got something that worked; you can actually explain my code better than I can myself. I find it quite cumbersome to read stdout/stderr separately, and to be able to write to stdin in reaction to either of them, but at least on Linux you can get it to work. My O_NONBLOCK call might be unnecessary, I'll try without it.

The OP seemed interested in how to do this on Windows, but I've yet to see an answer on that one I think.

Thank you for the reply.

Adriaan Renting


Jacek Popławski

Sep 16, 2005, 5:07:28 AM
Donn Cave wrote:
> I don't recall the beginning of this thread, so I'm not sure
> if this is the usual wretched exercise of trying to make this
> work on both UNIX and Windows,

It is used in "test framework" which runs on Linux, Windows (Cygwin) and
QNX. I can't forget about Windows.
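
Where select on pipes is not an option, one approach that relies only on threads and blocking reads is to give each pipe its own reader thread. The sketch below is a Python 3 illustration under that assumption, not something proposed in the thread; `_pump` and `run_with_threads` are hypothetical names, and it has not been verified on Cygwin or QNX.

```python
import subprocess
import sys
import threading


def _pump(stream, sink):
    """Copy everything from stream into sink until end-of-file."""
    for chunk in iter(lambda: stream.read(4096), b""):
        sink.append(chunk)


def run_with_threads(argv):
    """Run argv and return (exit code, stdout bytes, stderr bytes)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out_chunks, err_chunks = [], []
    readers = [
        threading.Thread(target=_pump, args=(proc.stdout, out_chunks)),
        threading.Thread(target=_pump, args=(proc.stderr, err_chunks)),
    ]
    for reader in readers:
        reader.start()
    # Each pipe is drained by its own thread, so neither can fill up and
    # stall the child while the other is being read.
    for reader in readers:
        reader.join()
    returncode = proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return returncode, b"".join(out_chunks), b"".join(err_chunks)
```

The main thread only joins the readers here; in a test framework it could keep doing other work and pick up the collected chunks later.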
