Python 3: Launch multiple commands (subprocesses) in parallel (but up to 4 at the same time) AND store each of their outputs into a variable


lax.c...@gmail.com

Aug 23, 2016, 11:15:43 PM
Hi,

I've been reading various forums and the Python documentation on subprocess, multithreading, PIPEs, etc., but I cannot seem to mash my requirements together into working code.

I am trying to:

1) Use Python 3+ (specifically 3.4 if it matters)
2) Launch N commands in the background (e.g., like subprocess.call would for individual commands)
3) But limit it so that only P commands run at the same time
4) Wait until all N commands are done
5) Have an array of N strings with the stdout+stderr of each command in it.

What is the best way to do this?
There are so many variations of this in the Python documentation and on Stack Overflow that I cannot see the forest for the trees (for my problem).

Thank you very much!

Dale Marvin

Aug 24, 2016, 12:48:45 AM
On 8/23/16 8:15 PM, lax.c...@gmail.com wrote:

> I am trying to:
>
> 1) Use Python 3+ (specifically 3.4 if it matters)
> 2) Launch N commands in the background (e.g., like subprocess.call would
> for individual commands)
> 3) But limit it so that only P commands run at the same time
> 4) Wait until all N commands are done
> 5) Have an array of N strings with the stdout+stderr of each command
> in it.
>
> What is the best way to do this?

The best way is a matter of opinion; I have had success using Celery
with Redis. <http://www.celeryproject.org/>

Dale

Paul Rubin

Aug 24, 2016, 1:25:41 AM
Dale Marvin <dm...@dop.com> writes:
> The best way is a matter of opinion, I have had success using Celery
> with Redis. <http://www.celeryproject.org/>

I generally use GNU Parallel for stuff like that. Celery looks
interesting, though much fancier.

Rob Gaddi

Aug 24, 2016, 12:57:27 PM
First off, I'm assuming that the stdout+stderr of these commands is of
reasonable size rather than hundreds of megabytes.

What you want is a finite pool of threads (or processes) that execute
the tasks. multiprocessing.pool.Pool will do it. So will
concurrent.futures, which is what I'd personally use, simply because
I'm more familiar with it.

In either case your task should wrap a call to subprocess.
subprocess.run is your easiest answer if you've got Python 3.5; the task
would call it with stdout and stderr=subprocess.PIPE, get the
CompletedProcess back, and then store the .stdout and .stderr string
results. For older Python, create a subprocess.Popen (again with stdout
and stderr=subprocess.PIPE) and call the communicate() method.

There's probably a dozen other ways. That one there, that's your
easiest.
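
A minimal sketch of that thread-pool-plus-Popen approach, assuming
Python 3.4 as in the original question; the helper name run_command,
the example commands, and the pool size of 4 are illustrative, and
stderr is merged into stdout so each result is a single string:

import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_command(cmd):
    # Run one command; merge stderr into stdout so the result is one string.
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out.decode()

# N example commands (placeholders).
commands = [["echo", "task %d" % i] for i in range(10)]

# At most 4 commands run at the same time; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_command, commands))

# 'results' is now a list of N strings, one per command.

On Python 3.5+ the body of run_command can be reduced to a single
subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
call followed by reading .stdout from the returned CompletedProcess.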

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com

Email address domain is currently out of order. See above to fix.