[Python-ideas] Composability and concurrent.futures


Adrian Sampson

May 16, 2012, 2:43:05 PM
to python...@python.org
The concurrent.futures module in the Python standard library has problems with composability. If I start a ThreadPoolExecutor to run some library functions that internally use ThreadPoolExecutor, I will end up with many more worker threads on my system than I expect. For example, if each parallel execution wants to take full advantage of an 8-core machine, I could end up with as many as 8*8=64 competing worker threads, which could significantly hurt performance.

This is because each instance of ThreadPoolExecutor (or ProcessPoolExecutor) maintains its own independent worker pool. Especially in situations where the goal is to exploit multiple CPUs, it's essential for any thread pool implementation to globally manage contention between multiple concurrent job schedulers.
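To make the multiplication concrete, here is a minimal sketch that nests one pool inside another and samples the number of live threads. The names `WORKERS`, `library_function`, and `peak_thread_count` are made up for illustration, and the pool size is 4 rather than 8 just to keep the demo small.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

WORKERS = 4  # assumed pool size for the demo; the example above uses 8


def library_function(_):
    # Stand-in for a library call that privately creates its own pool.
    with ThreadPoolExecutor(max_workers=WORKERS) as inner:
        list(inner.map(time.sleep, [0.2] * WORKERS))


def peak_thread_count():
    # Run WORKERS library calls in an outer pool and sample the number
    # of live threads in this process while they execute.
    peak = 0
    with ThreadPoolExecutor(max_workers=WORKERS) as outer:
        futures = [outer.submit(library_function, i) for i in range(WORKERS)]
        while not all(f.done() for f in futures):
            peak = max(peak, threading.active_count())
            time.sleep(0.01)
    return peak


peak = peak_thread_count()
print(f"pool size {WORKERS}, peak live threads {peak}")
```

On a typical run the peak should be well above the pool size, since every outer worker starts its own set of inner workers.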

I'm not sure about the best way to address this problem, but here's one proposal: Add additional executors to the futures library. ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each use a *shared* thread-pool model. When created, these composable executors will check to see if they are being created within a future worker thread/process initiated by another composable executor. If so, the "child" executor will forward all submitted jobs to the executor in the parent thread/process. Otherwise, it will behave normally, starting up its own worker pool.
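A rough sketch of how the thread-based variant might look, assuming a thread-local marker is enough to detect "created inside a composable worker". Nothing here exists in concurrent.futures; the class body, `_worker`, `_tag_worker`, and the `"composable"` thread name prefix are all hypothetical, and the process-based variant would need a different detection mechanism.

```python
import threading
from concurrent.futures import Executor, ThreadPoolExecutor

_worker = threading.local()  # only set inside composable worker threads


class ComposableThreadPoolExecutor(Executor):
    """Hypothetical executor that reuses the enclosing composable pool."""

    def __init__(self, max_workers=None):
        parent = getattr(_worker, "owner", None)
        if parent is None:
            # Top level: start a real pool and tag each worker thread.
            self._pool = ThreadPoolExecutor(
                max_workers=max_workers,
                thread_name_prefix="composable",
                initializer=self._tag_worker,
            )
            self._owns_pool = True
        else:
            # Created inside a composable worker: forward to the parent's pool.
            self._pool = parent._pool
            self._owns_pool = False

    def _tag_worker(self):
        _worker.owner = self

    def submit(self, fn, *args, **kwargs):
        return self._pool.submit(fn, *args, **kwargs)

    def shutdown(self, wait=True):
        # A child must not tear down the pool it borrowed.
        if self._owns_pool:
            self._pool.shutdown(wait=wait)


def library_function(n):
    # Looks like ordinary library code, but no new threads are started here.
    with ComposableThreadPoolExecutor() as child:
        names = list(child.map(lambda _: threading.current_thread().name, range(n)))
    return child._owns_pool, names


with ComposableThreadPoolExecutor(max_workers=4) as root:
    owns_pool, names = root.submit(library_function, 3).result()
```

Note that the child blocks one of the parent's workers while it waits for its own jobs, so this only works as long as some workers stay free.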

Has anyone else dealt with composition problems in parallel programs? What do you think of this solution -- is there a better way to tackle this deficiency?

Adrian

_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas

Matt Joiner

May 21, 2012, 12:17:06 PM
to Adrian Sampson, python...@python.org
It's my understanding this is a known flaw with concurrency *in general*. Currently most multi-{threaded,process} applications assume they're the only ones running on the system, and so would the likely implementation of the composable pools you've proposed. Ideally, a proper interprocess scheduler is required to handle this. (See GCD, and runtimes that provide at least some userspace scheduling, such as Go, however poor it may be.)

Secondly, composable pools don't handle recursive relationships well. If a task running in a pool has to wait for other tasks in that same pool before it can itself complete, and no worker is free to run them, you'll have deadlock.
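The smallest case of this is a pool with a single worker, sketched below. The timeout is only there so the demo terminates; without it the outer task would wait forever. The function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

pool = ThreadPoolExecutor(max_workers=1)


def inner():
    return "inner done"


def outer():
    # This task occupies the only worker, so inner() stays queued
    # behind it and cannot start until outer() returns.
    future = pool.submit(inner)
    try:
        return future.result(timeout=0.5)
    except FutureTimeout:
        return "would deadlock"


result = pool.submit(outer).result()
pool.shutdown()
print(result)
```

With more workers the same thing happens as soon as every worker is occupied by a task that is waiting on queued work.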

Personally if I implemented a composable thread pool I'd have it global, creation and submission of tasks would be proxied to it via some composable executor class.
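Something along these lines, as a sketch only: `shared_pool` and `SharedPoolExecutor` are hypothetical names, and sizing the pool by `os.cpu_count()` is an assumption.

```python
import os
import threading
from concurrent import futures

_lock = threading.Lock()
_shared_pool = None


def shared_pool():
    # Lazily create the single process-wide pool.
    global _shared_pool
    with _lock:
        if _shared_pool is None:
            _shared_pool = futures.ThreadPoolExecutor(
                max_workers=os.cpu_count() or 1
            )
        return _shared_pool


class SharedPoolExecutor(futures.Executor):
    """Hypothetical proxy: every instance submits to the same global pool."""

    def __init__(self):
        self._futures = []

    def submit(self, fn, *args, **kwargs):
        future = shared_pool().submit(fn, *args, **kwargs)
        self._futures.append(future)
        return future

    def shutdown(self, wait=True):
        # Wait only for this proxy's own jobs; the global pool keeps running.
        if wait:
            futures.wait(self._futures)


with SharedPoolExecutor() as a, SharedPoolExecutor() as b:
    total = a.submit(sum, [1, 2, 3]).result() + b.submit(pow, 2, 5).result()
```

The worker count is then bounded once for the whole process, however many proxies get created.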

As it stands, thread pools are best for task-oriented concurrency rather than parallelism anyway, especially in CPython.

In short, I think composable thread pools are a hack at best and won't gain you anything except a slightly reduced threading overhead. If you want optimal utilization, threading isn't the right place to be looking.

Adrian Sampson

May 21, 2012, 1:21:01 PM
to Matt Joiner, python...@python.org
On May 21, 2012, at 9:17 AM, Matt Joiner wrote:

> Personally if I implemented a composable thread pool I'd have it
> global, creation and submission of tasks would be proxied to it via
> some composable executor class.

I agree completely. Maybe the implementation I described was overly
hacky for the sake of transparent compatibility with the existing
(non-composable) executors in concurrent.futures. Ideally, the system
would have one global pool which many concurrency APIs -- not just
concurrent.futures -- could potentially share.

(In a *really* ideal world, the OS would provide thread pool management
-- like GCD, which you mentioned, or scheduler activations. But a
cross-platform library currently requires a less ambitious solution.)

> In short, I think composable thread pools are a hack at best and won't
> gain you anything except a slightly reduced threading overhead. If you
> want optimal utilization, threading isn't the right place to be
> looking.

To be clear, I meant to refer to processes *or* threads when discussing
the problem originally. The ProcessPoolExecutor is pretty useful (in my
experience) for easily getting speedup even on pure-Python CPU-bound
workloads.

Matt Joiner

May 24, 2012, 9:05:12 AM
to Adrian Sampson, python...@python.org
> To be clear, I meant to refer to processes *or* threads when discussing
> the problem originally. The ProcessPoolExecutor is pretty useful (in my
> experience) for easily getting speedup even on pure-Python CPU-bound
> workloads.

FWIW that wasn't the default "use processes" spiel. In my experience toying with concurrency in Python, trying to manage the load threads put on the system always ends badly. The two best-supported concurrency mechanisms, threads and processes, are constantly tête-à-tête; neither is adequate when you start to consider extreme concurrency scenarios. I suggest this because if you're considering composing executors, you're already trying to reduce the overhead (wastage) that processes and threads are incurring on your system for these purposes.

Nick Coghlan

May 24, 2012, 9:23:42 AM
to Matt Joiner, Adrian Sampson, python...@python.org
It's really up to individual libraries to make it possible for
applications to provide the executor explicitly, rather than the
library assuming it's OK to just create its own.
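As an illustration of that pattern, a library function can take an optional executor argument and only fall back to a private pool when none is given. `checksum` and `checksum_all` are made-up names standing in for real library code.

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(chunk):
    # Stand-in for the library's real per-item work.
    return sum(chunk) % 256


def checksum_all(chunks, executor=None):
    """Hypothetical library entry point that accepts the caller's executor."""
    if executor is not None:
        # The application decides how many workers exist in total.
        return list(executor.map(checksum, chunks))
    # Fallback for callers that don't care: a small private pool.
    with ThreadPoolExecutor(max_workers=2) as private:
        return list(private.map(checksum, chunks))


# The application owns one pool and hands it to every library call.
with ThreadPoolExecutor(max_workers=4) as app_pool:
    first = checksum_all([b"abc", b"de"], executor=app_pool)
    second = checksum_all([b"xyz"], executor=app_pool)
```

The application then controls the total worker count in one place, and the library never starts threads behind its back unless asked to.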

Cheers,
Nick.

--
Nick Coghlan   |   ncog...@gmail.com   |   Brisbane, Australia