subprocess call is not waiting.

pauls...@gmail.com

Sep 13, 2012, 11:17:15 AM

I have a subprocess.call which tries to download data from a remote server using HTAR. I put the call in a while loop, which tests to see if the download was successful, and if not, loops back around up to five times, just in case my internet connection has a hiccup.

subprocess.call is supposed to wait.

But it doesn't work as intended. The loop quickly runs 5 times, starting a new htar command each time. After five times around, my program tells me my download failed, because the target file doesn't yet exist. But it turns out that the download is still happening---five times.

When I run htar from the shell, I don't get a shell prompt again until after the download is complete. How come control is returned to python before the htar command is through?

I've tried using Popen with wait and/or communicate, but no waiting ever happens. This is troublesome not only because I don't get to post-process my data, but because when I run this script for multiple datasets (checking to see whether I have local copies), I quickly get a "Too many open files" error. (I began working on that by trying to use Popen with close_fds, etc.)

Should I just go back to os.system?

Jean-Michel Pichavant

Sep 13, 2012, 11:34:29 AM

to pauls...@gmail.com, pytho...@python.org

> --
> http://mail.python.org/mailman/listinfo/python-list
>

A related subset of code would be useful.

You can use subprocess.PIPE to redirect stdout & stderr and get them with communicate, something like:

proc = subprocess.Popen(['whatever'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
print stdout
print stderr

Just by looking at stdout and stderr, you should be able to see why htar is returning so fast.

JM

PS : if you see nothing wrong, is it possible that htar runs asynchronously ?
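
For readers on Python 3, the capture-and-inspect idea can be sketched as below. The child command is a stand-in (the current interpreter printing to both streams), not htar, and `run_and_capture` is a hypothetical helper name chosen for illustration.

```python
import subprocess
import sys

def run_and_capture(argv):
    """Run argv, wait for it to exit, and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()  # blocks until the child exits and both pipes reach EOF
    return proc.returncode, out.decode(), err.decode()

# Stand-in child: writes to both streams and exits with status 3.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); sys.stderr.write('to stderr\\n'); sys.exit(3)"]
code, out, err = run_and_capture(child)
print(code, repr(out), repr(err))
```

Looking at the return code alongside the two streams is often the quickest way to tell "the command failed immediately" apart from "the command never waited".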

MRAB

Sep 13, 2012, 11:35:26 AM

to pytho...@python.org

Which OS? Is there some documentation somewhere?

MRAB

Sep 13, 2012, 11:46:38 AM

to pytho...@python.org

On 2012-09-13 16:34, Jean-Michel Pichavant wrote:

> A related subset of code would be useful.
>
> You can use subprocess.PIPE to redirect stdout & stderr and get them with communicate, something like:
>
> proc = subprocess.Popen(['whatever'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> stdout, stderr = proc.communicate()
> print stdout
> print stderr
>
> Just by looking at stdout and stderr, you should be able to see why htar is returning so fast.
>
> JM
>
> PS : if you see nothing wrong, is it possible that htar runs asynchronously ?
>

The OP says that it waits when run from the shell.

woo...@gmail.com

Sep 13, 2012, 1:24:46 PM

It possibly requires a "shell=True", but without any code or any way to test, we cannot say.

pauls...@gmail.com

Sep 13, 2012, 2:36:40 PM

to pytho...@python.org

Thanks, guys.
MRAB-RedHat 6 64-bit, Python 2.6.5
JM-Here's the relevant stuff from my last try. I've also tried with subprocess.call. Just now I tried shell=True, but it made no difference.

Sticking a print(out) in there just prints a blank line between each iteration. It's not until the 5 trials are finished that I am told: download failed, etc.

from os.path import exists
from subprocess import call
from subprocess import Popen, PIPE
from shlex import split
from time import sleep

while (exists(file)==0) and (nTries < 5):
    a = Popen(split('htar -xvf ' + htarArgs), stdout=PIPE, stderr=PIPE)
    (out,err) = a.communicate()
    if exists(file)==0:
        nTries += 1
        sleep(0.5)

if exists(file)==0: # now that the file should be moved
    print('download failed: ' + file)
    return 1

I've also tried using shell=True with popopen.
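
A retry loop of this kind can also key off the command's exit status instead of the existence of the target file. The following is only a sketch under an assumption that has not been verified for htar: that the command signals failure with a non-zero exit status. `run_with_retries` is a hypothetical helper, and the child commands are stand-ins.

```python
import subprocess
import sys
import time

def run_with_retries(argv, max_tries=5, delay=0.5):
    """Run argv up to max_tries times; return True as soon as it exits with status 0."""
    for attempt in range(1, max_tries + 1):
        status = subprocess.call(argv)  # blocks until this attempt has exited
        if status == 0:
            return True
        print("attempt %d failed with exit status %d" % (attempt, status))
        time.sleep(delay)
    return False

# Stand-in commands: one that succeeds at once, one that always fails.
ok = run_with_retries([sys.executable, "-c", "import sys; sys.exit(0)"])
failed = run_with_retries([sys.executable, "-c", "import sys; sys.exit(2)"],
                          max_tries=2, delay=0.1)
print(ok, failed)
```

If the loop still spins through all attempts instantly, the printed exit status shows whether each attempt really ran and returned, which narrows down where the waiting goes missing.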

Chris Rebert

Sep 14, 2012, 1:24:04 AM

to pauls...@gmail.com, pytho...@python.org

On Thu, Sep 13, 2012 at 11:36 AM, <pauls...@gmail.com> wrote:
> Thanks, guys.
> MRAB-RedHat 6 64-bit, Python 2.6.5

In your Unix shell, what does the command:
type htar
output?

> JM-Here's the relevant stuff from my last try.

If you could give a complete, self-contained example, it would assist
us in troubleshooting your problem.

> I've also tried with subprocess.call. Just now I tried shell=True, but it made no difference.

It's possible that htar uses some trickery to determine whether it's
being invoked from a terminal or by another program, and changes its
behavior accordingly, although I could not find any evidence of that
based on scanning its manpage.

> sticking a print(out) in there just prints a blank line in between each iteration. It's not until the 5 trials are finished that I am told: download failed, etc.
>
> from os.path import exists
> from subprocess import call
> from subprocess import Popen
> from shlex import split
> from time import sleep
>
> while (exists(file)==0) and (nTries < 5):

`file` is the name of a built-in type in Python; it should therefore
not be used as a variable name.
Also, one would normally write that as:
while not exists(file) and nTries < 5:

> a = Popen(split('htar -xvf ' + htarArgs), stdout=PIPE, stderr=PIPE)

What's the value of `htarArgs`? (with any sensitive parts anonymized)

Also, you really shouldn't use shlex.split() at run-time like that.
Unless `htarArgs` is already quoted/escaped, you'll get bad results
for many inputs. Use shlex.split() once at the interactive interpreter
to figure out the general form of the tokenization, then use the
static result in your program as a template.

> (out,err) = a.communicate()
> if exists(file)==0:
> nTries += 1
> sleep(0.5)
>
> if exists(file)==0: # now that the file should be moved
> print('download failed: ' + file)
> return 1
>
> I've also tried using shell=True with popopen.

I presume you meant Popen.

Cheers,
Chris

Chris Rebert

Sep 14, 2012, 1:27:23 AM

to pauls...@gmail.com, pytho...@python.org

On Thu, Sep 13, 2012 at 8:17 AM, <pauls...@gmail.com> wrote:
> I have a subprocess.call

<snip>

> But it doesn't work as intended.

<snip>

> Should I just go back to os.system?

Did the os.system() version work?

os.system() simply hands the command line to the shell and waits for
it to finish, which the `subprocess` module can do as well, so if it
does work, then it assuredly can be made to work using the
`subprocess` module instead.

Cheers,
Chris
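
As a rough illustration of that point, a command line that works with os.system() can usually be passed unchanged to subprocess.call() with shell=True. This sketch assumes a POSIX shell, and the command is a stand-in (the current interpreter exiting with status 0), not htar.

```python
import os
import subprocess
import sys

# Stand-in command line: the current interpreter exiting with status 0.
cmd = '"%s" -c "raise SystemExit(0)"' % sys.executable

status_system = os.system(cmd)                  # runs cmd through the shell and waits
status_call = subprocess.call(cmd, shell=True)  # also runs cmd through the shell and waits
print(status_system, status_call)
```

Once the shell=True form behaves the same as os.system(), the command can be converted to an argument list step by step.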

Hans Mulder

Sep 14, 2012, 4:52:47 AM

On 13/09/12 19:24:46, woo...@gmail.com wrote:
> It possibly requires a "shell=True",

That's almost always a bad idea, and wouldn't affect waiting anyway.

> but without any code or any way to test, we can not say.

That's very true.

-- HansM

pauls...@gmail.com

Sep 14, 2012, 8:22:44 AM

os.system worked fine, and I found something in another section of code that was causing the "Too many open files" errors. (I was fooled, because output from the subprocess call didn't seem to be coming out until the open files error.)

I'll go back and play with subprocess.call more, since os.system works. That's interesting about using shlex at run time. Is that just for the sake of computational cost?

Wanderer

Sep 14, 2012, 1:38:14 PM

On Friday, September 14, 2012 8:22:44 AM UTC-4, pauls...@gmail.com wrote:
> os.system worked fine, and I found something in another section of code that was causing the "Too many open files" errors. (I was fooled, because output from the subprocess call didn't seem to be coming out until the open files error.)
>
>
>
> I'll go back and play with subprocess.call more, since os.system works. That's interesting about using shlex at run time. Is that just for the sake of computational cost?

I never got the hang of subprocess, either. I ended up wrapping os.system in a python file and using subprocess to call that with:

subprocess.Popen([sys.executable, 'Wrapper.py'])

This works for me. I'm using Windows 7.

Chris Rebert

Sep 15, 2012, 12:02:22 AM

to pauls...@gmail.com, pytho...@python.org

On Fri, Sep 14, 2012 at 5:22 AM, <pauls...@gmail.com> wrote:
> os.system worked fine, and I found something in another section of code that was causing the "Too many open files" errors. (I was fooled, because output from the subprocess call didn't seem to be coming out until the open files error.)
>
> I'll go back and play with subprocess.call more, since os.system works. That's interesting about using shlex at run time. Is that just for the sake of computational cost?

No, like I said, you'll also get incorrect results. shlex isn't magic.
If the exact command line it's given wouldn't work in the shell, then
it won't magically fix things. Many (most?) dynamic invocations of
shlex.split() are naive and flawed:

>>> import shlex
>>> filename = "my summer vacation.txt"
>>> # the following error is less obvious when the command is more complex
>>> # (and when the filename isn't hardcoded)
>>> cmd = "cat " + filename
>>> shlex.split(cmd)
['cat', 'my', 'summer', 'vacation.txt']
>>> # that's wrong; the entire filename should be a single list element

Equivalent bash error:
chris@mbp ~ $ cat my summer vacation.txt
cat: my: No such file or directory
cat: summer: No such file or directory
cat: vacation.txt: No such file or directory

The right way, in bash:
chris@mbp ~ $ cat my\ summer\ vacation.txt
Last summer, I interned at a tech company and...
chris@mbp ~ $ cat 'my summer vacation.txt'
Last summer, I interned at a tech company and…

And indeed, shlex will get that right too:
>>> shlex.split("cat my\ summer\ vacation.txt")
['cat', 'my summer vacation.txt']
>>> shlex.split("cat 'my summer vacation.txt'")
['cat', 'my summer vacation.txt']

BUT that presumes that your filenames are already pre-quoted or have
had backslashes added, which very seldom is the case in reality. So,
you can either find an escaping function and hope you never forget to
invoke it (cf. SQL injection), or you can figure out the general
tokenization and let `subprocess` handle the rest (cf. prepared
statements):

>>> from shlex import split
>>> from subprocess import call
>>> split('cat examplesimplefilename')
['cat', 'examplesimplefilename']
>>> # Therefore…
>>> def do_cat(filename):
...     cmd = ['cat', filename] # less trivial cases would be more interesting
...     call(cmd)
...
>>> filename = "my summer vacation.txt"
>>> # remember that those quotes are Python literal syntax and aren't in the string itself
>>> print filename
my summer vacation.txt
>>> do_cat(filename)
Last summer, I interned at a tech company and…
>>>

Generally, use (a) deliberately simple test filename(s) with shlex,
then take the resulting list and replace the filename(s) with (a)
variable(s).

Or, just figure out the tokenization without recourse to shlex; it's
not difficult in most cases!
The Note in the Popen docs covers some common tokenization mistakes people make:
http://docs.python.org/library/subprocess.html#subprocess.Popen

Cheers,
Chris
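
The two options described above (escape first, or build the argument list directly) can be compared side by side. This sketch assumes Python 3.3 or later, where shlex.quote() is available in the standard library; the filename is the same example used in the post.

```python
import shlex

filename = "my summer vacation.txt"

# Naive concatenation: the spaces split the name into three tokens.
naive = shlex.split("cat " + filename)

# Escaping first keeps the name intact when the string is tokenized again.
quoted = shlex.split("cat " + shlex.quote(filename))

# Building the list directly needs no quoting at all.
direct = ["cat", filename]

print(naive)
print(quoted)
print(direct)
```

The last form is the one that cannot be forgotten: since the list is never parsed as shell text, there is nothing to escape.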

pauls...@gmail.com

Sep 15, 2012, 8:59:52 AM

That's a habit I'll make sure to avoid, then.
Thanks, Chris!

andrea crotti

Sep 18, 2012, 9:54:41 AM

to pauls...@gmail.com, pytho...@python.org

I have a similar problem, something which I've never quite understood
about subprocess...
Suppose I do this:

proc = subprocess.Popen(['ls', '-lR'], stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

Now I have created a process, which has a PID, but apparently it's not running...
It only seems to run when I actually do the wait.

I don't want to make it wait, so an easy solution is just to use a
thread, but is there a way with subprocess?

andrea crotti

Sep 19, 2012, 6:26:30 AM

to Dennis Lee Bieber, pytho...@python.org

2012/9/18 Dennis Lee Bieber <wlf...@ix.netcom.com>:
>
> Unless you have a really massive result set from that "ls", that
> command probably ran so fast that it is blocked waiting for someone to
> read the PIPE.

I also tried with "ls -lR /" and that definitely takes a while to run,
when I do this:

proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

nothing is running, only when I actually do
proc.communicate()

I see the process running in top..
Is it still an observation problem?

Anyway I also need to know when the process is over while waiting, so
probably a thread is the only way..

Hans Mulder

Sep 19, 2012, 9:23:49 AM

On 19/09/12 12:26:30, andrea crotti wrote:
> 2012/9/18 Dennis Lee Bieber <wlf...@ix.netcom.com>:
>>
>> Unless you have a really massive result set from that "ls", that
>> command probably ran so fast that it is blocked waiting for someone to
>> read the PIPE.
>
> I tried also with "ls -lR /" and that definitively takes a while to run,
> when I do this:
>
> proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
> stderr=subprocess.PIPE)
>
> nothing is running, only when I actually do
> proc.communicate()
>
> I see the process running in top..
> Is it still an observation problem?

Yes: using "top" is an observation problem.

"Top", as the name suggests, shows only the most active processes.

It's quite possible that your 'ls' process is not active, because
it's waiting for your Python process to read some data from the pipe.

Try using "ps" instead. Look in the man page for the correct
options (they differ between platforms). The default options do
not show all processes, so they may not show the process you're
looking for.

> Anyway I also need to know when the process is over while waiting, so
> probably a thread is the only way..

This sounds confused.

You don't need threads. When 'ls' finishes, you'll read end-of-file
on the proc.stdout pipe. You should then call proc.wait() to reap
its exit status (if you don't, you'll leave a zombie process).
Since the process has already finished, the proc.wait() call will
not actually do any waiting.

Hope this helps,

-- HansM
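
The read-until-EOF-then-wait pattern described above might look like the following on Python 3. The child is a stand-in that prints three lines; any command writing to stdout would do in its place.

```python
import subprocess
import sys

# Stand-in child that prints three lines.
child = [sys.executable, "-c", "for i in range(3): print('line', i)"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE)
lines = []
for raw in proc.stdout:   # iteration ends at end-of-file, when the child closes its stdout
    lines.append(raw.decode().rstrip())
proc.stdout.close()
status = proc.wait()      # reaps the exit status; normally returns almost immediately here
print(lines, status)
```

Reading the pipe is what keeps the child moving: a child that fills the pipe while nobody reads it simply blocks, which is why it can look idle in a process monitor.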

Gene Heskett

Sep 19, 2012, 11:57:47 AM

to pytho...@python.org

On Wednesday 19 September 2012 11:56:44 Hans Mulder did opine:

> On 19/09/12 12:26:30, andrea crotti wrote:
> > 2012/9/18 Dennis Lee Bieber <wlf...@ix.netcom.com>:
> >> Unless you have a really massive result set from that "ls",
> >> that
> >>
> >> command probably ran so fast that it is blocked waiting for someone
> >> to read the PIPE.
> >
> > I tried also with "ls -lR /" and that definitively takes a while to
> > run, when I do this:
> >
> > proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
> > stderr=subprocess.PIPE)
> >
> > nothing is running, only when I actually do
> > proc.communicate()
> >
> > I see the process running in top..
> > Is it still an observation problem?
>
> Yes: using "top" is an observation problem.
>
> "Top", as the name suggests, shows only the most active processes.
>

Which is why I run htop in a shell 100% of the time. With htop, you can
scroll down and see everything.

> It's quite possible that your 'ls' process is not active, because
> it's waiting for your Python process to read some data from the pipe.
>
> Try using "ps" instead. Look in the man page for the correct
> options (they differ between platforms). The default options do
> not show all processes, so they may not show the process you're
> looking for.
>
> > Anyway I also need to know when the process is over while waiting, so
> > probably a thread is the only way..
>
> This sounds confused.
>
> You don't need threads. When 'ls' finishes, you'll read end-of-file
> on the proc.stdout pipe. You should then call proc.wait() to reap
> its exit status (if you don't, you'll leave a zombie process).
> Since the process has already finished, the proc.wait() call will
> not actually do any waiting.
>
>
> Hope this helps,
>
> -- HansM

Cheers, Gene

andrea crotti

Sep 19, 2012, 12:34:58 PM

to Hans Mulder, pytho...@python.org

2012/9/19 Hans Mulder <han...@xs4all.nl>:

> Yes: using "top" is an observation problem.
>
> "Top", as the name suggests, shows only the most active processes.

Sure but "ls -lR /" is a very active process if you try to run it..
Anyway as written below I don't need this anymore.

>
> It's quite possible that your 'ls' process is not active, because
> it's waiting for your Python process to read some data from the pipe.
>
> Try using "ps" instead. Look in the man page for the correct
> options (they differ between platforms). The default options do
> not show all processes, so they may not show the process you're
> looking for.
>
>> Anyway I also need to know when the process is over while waiting, so
>> probably a thread is the only way..
>
> This sounds confused.
>
> You don't need threads. When 'ls' finishes, you'll read end-of-file
> on the proc.stdout pipe. You should then call proc.wait() to reap
> its exit status (if you don't, you'll leave a zombie process).
> Since the process has already finished, the proc.wait() call will
> not actually do any waiting.
>
>
> Hope this helps,
>

Well, there is a process which has to do two things: periodically
monitor some external conditions (filesystem / db), and launch a
process that can take a very long time.

So I can't put a wait anywhere, or I'll stop everything else. But at
the same time I need to know when the process is finished, which I
could do but without a wait might get hacky.

So I'm quite sure I just need to run the subprocess in a subthread
unless I'm missing something obvious..
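
One way to keep doing periodic work while a long-running child is alive, without a blocking wait and without a thread, is to check Popen.poll() inside the monitoring loop. This is a minimal sketch, not a drop-in solution: the child is a stand-in that sleeps for one second, and the counter is a placeholder for the real filesystem / db checks. The child's output is deliberately not piped here; if it were, something would have to keep reading the pipe.

```python
import subprocess
import sys
import time

# Stand-in for the long-running job: a child that sleeps for one second.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])

checks = 0
while proc.poll() is None:   # poll() returns None while the child is still running
    checks += 1              # placeholder for the periodic filesystem / database check
    time.sleep(0.2)

status = proc.returncode     # set once poll() has seen the child exit
print("child exited with status", status, "after", checks, "checks")
```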

Hans Mulder

Sep 19, 2012, 1:31:59 PM

On 19/09/12 18:34:58, andrea crotti wrote:
> 2012/9/19 Hans Mulder <han...@xs4all.nl>:
>> Yes: using "top" is an observation problem.
>>
>> "Top", as the name suggests, shows only the most active processes.
>
> Sure but "ls -lR /" is a very active process if you try to run it..

Not necessarily:

>> It's quite possible that your 'ls' process is not active because
>> it's waiting for your Python process to read some data from the pipe.

hbar...@ciphercloud.com

Sep 19, 2013, 7:42:09 AM

subprocess.call(tempFileName, shell=True).communicate()

this process is not blocking. I want to make a blocking call to it. please help

Terry Reedy

Sep 19, 2013, 3:58:44 PM

to pytho...@python.org

On 9/19/2013 7:42 AM, harish....@gmail.com wrote:
> subprocess.call(tempFileName, shell=True).communicate()

should raise an AttributeError as the int returned by subprocess.call
does not have a .communicate method.

> this process is not blocking.

Why do you think that? All function calls block until the function
returns, at which point blocking ceases. If you call
Popen(someprog).communicate() and someprog runs quickly, you will hardly
notice the blocking time.

--
Terry Jan Reedy
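
Both points can be checked with a short experiment: subprocess.call() returns a plain int, and it does not return until the child has exited. The child below is a stand-in that sleeps for half a second.

```python
import subprocess
import sys
import time

start = time.time()
# Stand-in child that takes about half a second to finish.
status = subprocess.call([sys.executable, "-c", "import time; time.sleep(0.5)"])
elapsed = time.time() - start

print(type(status).__name__, status)         # call() hands back the exit status as an int
print("blocked for %.2f seconds" % elapsed)
```

Since the return value is an int, chaining .communicate() onto it fails; communicate() belongs to the object returned by Popen.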