Piping processes works with 'shell = True' but not otherwise.

Luca Cerone

May 24, 2013, 10:04:36 AM

Hi everybody,
I am new to the group (and relatively new to Python),
so I am sorry if this issue has already been discussed (although I searched the group's topics and couldn't find a solution to my problem).

I am using Python 2.7.3 to analyse the output of two third-party programs that can be launched in a Linux shell as:

program1 | program2

To do this I have written a function that pipes program1 into program2 (using subprocess.Popen) and returns the stdout of the second subprocess, and a function that parses the output:

A basic example:

from subprocess import Popen, STDOUT, PIPE
def run():
    p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
    p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)
    p1.stdout.close()
    return p2.stdout

def parse(out):
    for row in out:
        print row
        #do something else with each line
    out.close()
    return parsed_output

# main block here

pout = run()

parsed = parse(pout)

#--- END OF PROGRAM ----#

I want to parse the output of 'program1 | program2' line by line because the output is very large.
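
For reference, a minimal sketch of the same two-process pipeline that reads the result line by line and then waits for both children. It is written for Python 3 rather than the 2.7.3 used in the post, and the two commands are hypothetical stand-ins (small Python one-liners), not the real program1 and program2:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical stand-ins for program1 and program2.
PROGRAM1 = [sys.executable, "-c", "for i in range(5): print('row %d' % i)"]
PROGRAM2 = [sys.executable, "-c",
            "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]


def run_pipeline(cmd1, cmd2):
    """Yield the output of `cmd1 | cmd2` one line at a time."""
    p1 = Popen(cmd1, stdout=PIPE)
    p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE)
    # Close the parent's copy of the pipe so p1 sees SIGPIPE if p2 exits early.
    p1.stdout.close()
    try:
        for line in p2.stdout:
            yield line.decode().rstrip("\r\n")
    finally:
        p2.stdout.close()
        # Reap both children so they do not linger as zombies.
        p2.wait()
        p1.wait()


rows = list(run_pipeline(PROGRAM1, PROGRAM2))
```

Because run_pipeline is a generator, only one line is held in memory at a time, which is the point when the output is very large.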

When running the code above, an error occasionally occurs (IOError: [Errno 0]). However, this error doesn't occur if I code the run() function as:

def run():
    p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
    return p.stdout

I really can't understand why the first version causes errors, while the second one doesn't.

Can you please help me understand what the difference between the two cases is?

Thanks a lot in advance for the help,
Cheers, Luca

Luca Cerone

May 26, 2013, 6:31:51 AM

> Can you please help me understand what the difference between the two cases is?

Hi guys, does anyone have an idea of what is causing my issue?

Chris Rebert

May 26, 2013, 5:05:19 PM

to Luca Cerone, Python

On May 24, 2013 7:06 AM, "Luca Cerone" <luca....@gmail.com> wrote:
>
> Hi everybody,
> I am new to the group (and relatively new to Python)
> so I am sorry if this issue has already been discussed (although I searched the group's topics and couldn't find a solution to my problem).
>
> I am using Python 2.7.3 to analyse the output of two third-party programs that can be launched in a Linux shell as:
>
> program1 | program2
>
> To do this I have written a function that pipes program1 into program2 (using subprocess.Popen) and returns the stdout of the second subprocess, and a function that parses the output:
>
> A basic example:
>
> from subprocess import Popen, STDOUT, PIPE
> def run():
>     p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
>     p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)

Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).

>     p1.stdout.close()
>     return p2.stdout
>
>
> def parse(out):
>     for row in out:
>         print row
>         #do something else with each line
>     out.close()
>     return parsed_output
>
>
> # main block here
>
> pout = run()
>
> parsed = parse(pout)
>
> #--- END OF PROGRAM ----#
>
> I want to parse the output of 'program1 | program2' line by line because the output is very large.
>
> When running the code above, an error occasionally occurs (IOError: [Errno 0]).

Could you provide the full & complete error message and exception traceback?

> However this error doesn't occur if I code the run() function as:
>
> def run():
>     p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
>     return p.stdout
>
> I really can't understand why the first version causes errors, while the second one doesn't.
>
> Can you please help me understand what the difference between the two cases is?

One obvious difference between the 2 approaches is that the shell doesn't redirect the stderr streams of the programs, whereas you /are/ redirecting the stderrs to stdout in the non-shell version of your code. But this is unlikely to be causing the error you're currently seeing.
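
One way to see this difference is the sketch below. It assumes Python 3 and a POSIX shell, and uses hypothetical stand-ins: "program1" writes one line to each of stdout and stderr, and "program2" upper-cases whatever arrives on its stdin. In the Popen chain, program1's stderr is merged into the pipe and so passes through program2; in the shell pipeline it bypasses program2 and reaches the final output untouched.

```python
import shlex
import sys
from subprocess import Popen, PIPE, STDOUT

# Hypothetical stand-ins for program1 and program2.
SRC1 = "import sys; sys.stdout.write('out\\n'); sys.stderr.write('err\\n')"
SRC2 = "import sys; sys.stdout.write(sys.stdin.read().upper())"
CMD1 = [sys.executable, "-c", SRC1]
CMD2 = [sys.executable, "-c", SRC2]


def via_popen_chain():
    """Non-shell version: p1's stderr is merged into the pipe feeding p2."""
    p1 = Popen(CMD1, stdout=PIPE, stderr=STDOUT)
    p2 = Popen(CMD2, stdin=p1.stdout, stdout=PIPE, stderr=STDOUT)
    p1.stdout.close()
    data = p2.communicate()[0]
    p1.wait()
    return sorted(data.decode().split())


def via_shell():
    """Shell version: only program1's stdout goes through the '|'."""
    line = (" ".join(shlex.quote(a) for a in CMD1) + " | " +
            " ".join(shlex.quote(a) for a in CMD2))
    p = Popen(line, shell=True, stdout=PIPE, stderr=STDOUT)
    data = p.communicate()[0]
    return sorted(data.decode().split())


chain_result = via_popen_chain()
shell_result = via_shell()
```

If program2 is sensitive to unexpected text on its input, the two versions can therefore behave differently even though both end up with stderr in the final stream.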

You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.

Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.

Cheers,
Chris
--
Be patient; it's Memorial Day weekend.

Luca Cerone

May 26, 2013, 7:58:57 PM

> Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).
>

Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because they are tools that are not publicly available.
I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..

Just to be clear I run the process like:

p = subprocess.Popen(['program1','--opt1','val1',...'--optn','valn'], ... the rest)

which I think is the right way to pass arguments (it works fine for other commands)..
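
As an illustration of the tokenization point, shlex.split from the standard library splits a command string the way a POSIX shell would, which a plain str.split does not do once quoting is involved. The command line below is made up for the example; it is not the real invocation:

```python
import shlex

# Hypothetical command line; the tool name and options are invented.
command_line = "program1 --opt1 val1 --name 'two words'"

argv = shlex.split(command_line)   # honours shell-style quoting
naive = command_line.split()       # splits on whitespace only
```

Here argv keeps 'two words' as a single argument, while naive breaks it into two tokens that still carry the quote characters.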

>
> Could you provide the full & complete error message and exception traceback?
>

yes, as soon as I get to my work laptop..

>
> One obvious difference between the 2 approaches is that the shell doesn't redirect the stderr streams of the programs, whereas you /are/ redirecting the stderrs to stdout in the non-shell version of your code. But this is unlikely to be causing the error you're currently seeing.
>
>
> You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.
>

I tried to redirect stdin to /dev/null using the Popen argument:
'stdin = os.path.devnull' (having imported os of course)..
But this seemed to cause even more trouble...

> Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.
>
>

Thanks, I didn't know this library, I'll give it a try.
Though I forgot to mention that I was using the subprocess module because I want the code to be portable (even though for now it is OK if it only works on Unix platforms).

Thanks a lot for your help,
Cheers,
Luca

Carlos Nepomuceno

May 26, 2013, 8:14:24 PM

to pytho...@python.org

Pipes usually consume disk storage in '/tmp'. Are you sure you have enough room on that filesystem? Make sure no other processes are competing for that space. Just my 50c, because I don't know what's causing Errno 0. I don't even know what the possible causes of such an error are. Good luck!

----------------------------------------
> Date: Sun, 26 May 2013 16:58:57 -0700
> Subject: Re: Piping processes works with 'shell = True' but not otherwise.
> From: luca....@gmail.com
> To: pytho...@python.org
[...]

> I tried to redirect stdin to /dev/null using the Popen argument:
> 'stdin = os.path.devnull' (having imported os of course)..
> But this seemed to cause even more trouble...
>
>> Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.
>>
>>
> Thanks, I didn't know this library, I'll give it a try.
> Though I forgot to mention that I was using the subprocess module because I want the code to be portable (even though for now it is OK if it only works on Unix platforms).
>
> Thanks a lot for your help,
> Cheers,
> Luca

> --
> http://mail.python.org/mailman/listinfo/python-list

Chris Angelico

May 27, 2013, 4:28:32 AM

to pytho...@python.org

On Mon, May 27, 2013 at 9:58 AM, Luca Cerone <luca....@gmail.com> wrote:
>> Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).
>>
> Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because they are tools that are not publicly available.
> I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..

Will it violate privacy / NDA to post the command line? Even if we
can't actually replicate your system, we may be able to see something
from the commands given.

ChrisA

Luca Cerone

May 27, 2013, 7:33:19 AM

> Will it violate privacy / NDA to post the command line? Even if we
> can't actually replicate your system, we may be able to see something
> from the commands given.

Unfortunately yes..

Chris Rebert

May 29, 2013, 1:17:37 PM

to Luca Cerone, pytho...@python.org

On Sun, May 26, 2013 at 4:58 PM, Luca Cerone <luca....@gmail.com> wrote:
<snip>

> Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because they are tools that are not publicly available.
> I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..
>
> Just to be clear I run the process like:
>
> p = subprocess.Popen(['program1','--opt1','val1',...'--optn','valn'], ... the rest)
>
> which I think is the right way to pass arguments (it works fine for other commands)..

<snip>

>> You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.
>
> I tried to redirect stdin to /dev/null using the Popen argument:
> 'stdin = os.path.devnull' (having imported os of course)..
> But this seemed to cause even more trouble...

That's because stdin/stdout/stderr take file descriptors or file
objects, not path strings.
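
A minimal sketch of what that looks like for stdin, assuming Python 3 (os.path.devnull and os.devnull are just the string naming the null device, so the file has to be opened first). The child command is a hypothetical stand-in that reports how many characters it could read from its stdin:

```python
import os
import sys
from subprocess import Popen, PIPE

# Hypothetical stand-in for program1.
CMD = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]

# Open the null device and hand the file object (not the path) to Popen.
with open(os.devnull, "rb") as devnull:
    p = Popen(CMD, stdin=devnull, stdout=PIPE)
    out, _ = p.communicate()

bytes_read = int(out.decode().strip())
```

The child sees end-of-file immediately, so it can never block waiting for input from the parent's terminal.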

Cheers,
Chris

Thomas Rachel

May 29, 2013, 1:39:40 PM

Am 27.05.2013 02:14 schrieb Carlos Nepomuceno:
> Pipes usually consume disk storage in '/tmp'.

Good that my pipes don't know about that.

Why should that happen?

Thomas

Carlos Nepomuceno

May 29, 2013, 3:31:05 PM

to pytho...@python.org

----------------------------------------
> From: nutznetz-0c1b6768-bfa9...@spamschutz.glglgl.de

> Subject: Re: Piping processes works with 'shell = True' but not otherwise.

> Date: Wed, 29 May 2013 19:39:40 +0200
> To: pytho...@python.org


Oops! My mistake! We've been using 'tee' when in debugging mode and I thought that would apply to this case. Never mind!

Cameron Simpson

May 29, 2013, 6:18:30 PM

to Thomas Rachel, pytho...@python.org

It probably doesn't on anything modern. On V7 UNIX at least there
was a kernel notion of the "pipe fs", where pipe storage existed;
usually /tmp; using small real (but unnamed) files is an easy way
to implement them, especially on systems where RAM is very small
and without a paging VM - for example, V7 UNIX ran on PDP-11s amongst
other things. And files need a filesystem.

But even then pipes are still small fixed length buffers; they don't
grow without bound as you might have inferred from the quoted
statement.
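
The fixed-size behaviour can be observed directly. The sketch below assumes Python 3.5+ on a POSIX system (os.set_blocking is not available elsewhere); it fills an empty pipe without reading from it and counts how many bytes fit before a write would block. The helper name pipe_capacity is made up for the example:

```python
import os


def pipe_capacity():
    """Count how many bytes fit in an empty pipe before a write would block."""
    r, w = os.pipe()
    os.set_blocking(w, False)  # make a full pipe raise instead of hanging
    total = 0
    chunk = b"x" * 1024
    try:
        while True:
            total += os.write(w, chunk)
    except BlockingIOError:
        pass  # the kernel buffer is full
    finally:
        os.close(r)
        os.close(w)
    return total


capacity = pipe_capacity()
```

The exact figure depends on the operating system and its configuration, but it is a bounded kernel buffer rather than storage that grows with the data.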

Cheers,
--
Cameron Simpson <c...@zip.com.au>

ERROR 155 - You can't do that. - Data General S200 Fortran error code list

Luca Cerone

May 31, 2013, 5:28:47 AM

> That's because stdin/stdout/stderr take file descriptors or file
> objects, not path strings.

Thanks Chris, how do I set the file descriptor to /dev/null then?

Peter Otten

May 31, 2013, 5:52:41 AM

to pytho...@python.org

For example:

with open(os.devnull, "wb") as stderr:
    p = subprocess.Popen(..., stderr=stderr)
    ...

In Python 3.3 and above:

p = subprocess.Popen(..., stderr=subprocess.DEVNULL)
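
The two forms can be combined into one sketch that is meant to work on both Python 2.7 and Python 3.3+: use subprocess.DEVNULL where it exists and fall back to opening os.devnull otherwise. The child command here is a hypothetical stand-in that writes to both streams, and _devnull_file is a name invented for the example:

```python
import os
import subprocess
import sys

try:
    DEVNULL = subprocess.DEVNULL            # available from Python 3.3
    _devnull_file = None
except AttributeError:
    _devnull_file = open(os.devnull, "wb")  # fallback, e.g. for Python 2.7
    DEVNULL = _devnull_file

# Hypothetical stand-in child.
CMD = [sys.executable, "-c",
       "import sys; sys.stderr.write('noise\\n'); sys.stdout.write('data\\n')"]

p = subprocess.Popen(CMD, stdout=subprocess.PIPE, stderr=DEVNULL)
out, err = p.communicate()

# Only the fallback opened a real file that needs closing.
if _devnull_file is not None:
    _devnull_file.close()
```

Only stdout is captured; the child's stderr output is discarded and err comes back as None because no pipe was requested for it.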

Luca Cerone

Aug 5, 2013, 11:33:55 AM

Thanks, and what about Python 2.7?

> In Python 3.3 and above:
>
> p = subprocess.Popen(..., stderr=subprocess.DEVNULL)

P.S. Sorry for the late reply; I discovered I don't receive notifications from Google Groups.

Tobiah

Aug 5, 2013, 1:54:06 PM

p1 = Popen(['nsa_snoop', 'terror_suspect', '--no-privacy', '--dispatch-squad'], ...