Adding active/idle/total worker counts for both ThreadPoolExecutor and ProcessPoolExecutor is pretty straightforward; I threw a patch together for both in 30 minutes or so. However, I don't think it's possible to inspect the contents of a ProcessPoolExecutor's queue without actually consuming items from it. While it *is* possible with ThreadPoolExecutor, I don't think we should expose it: the queue.Queue() implementation ThreadPoolExecutor relies on doesn't have a public API for inspecting its contents, so ThreadPoolExecutor probably shouldn't expose one, either. Identifying which task each worker is processing is possible, but would perhaps require more work than it's worth, at least for ProcessPoolExecutor.

I do think adding worker count APIs is reasonable, and in line with a TODO item in the ThreadPoolExecutor source:

    # TODO(bquinlan): Should avoid creating new threads if there are more
    # idle threads than items in the work queue.

So, at the very least, there have been plans to keep track of active/idle thread counts internally. If others agree it's a good idea, I'll open an issue on the tracker for this and include my patch (which also addresses that TODO item).
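To give a rough feel for what an active-count API could look like, here is a minimal sketch. It is not the patch mentioned in this message: `CountingThreadPoolExecutor` and `active_count()` are hypothetical names, and the sketch counts running tasks by wrapping each submitted callable rather than by changing the executor's internals.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingThreadPoolExecutor(ThreadPoolExecutor):
    """Hypothetical subclass that counts tasks currently executing."""

    def __init__(self, max_workers):
        super().__init__(max_workers)
        self._count_lock = threading.Lock()  # guards self._active
        self._active = 0

    def submit(self, fn, *args, **kwargs):
        def counted(*a, **kw):
            # Runs in the worker thread: bump the counter around the call.
            with self._count_lock:
                self._active += 1
            try:
                return fn(*a, **kw)
            finally:
                with self._count_lock:
                    self._active -= 1

        return super().submit(counted, *args, **kwargs)

    def active_count(self):
        """Number of tasks running right now (may be stale once returned)."""
        with self._count_lock:
            return self._active
```

As with any such counter, the value can change as soon as it is returned, so it is only useful for monitoring and debugging.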
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
I agree that basic executor parameters could be reflected, and I also
agree that some other pieces of runtime state cannot be reliably
computed and therefore shouldn't be exposed.
Don't hesitate to open an issue with your patch.
Regards
Antoine.
--
--- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/pl3r5SsbLLU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I cannot say for sure without taking a more detailed look at
concurrent.futures :-) However, any runtime information such as "the
tasks currently being processed" (as opposed to, say, waiting) may not be
available to the calling thread or process, or may be unreliable once it
returns to the function's caller (since the actual state may have
changed in-between).
In the former case (information not available to the main process), we
can't expose the information at all; in the latter case, we may still
choose to expose it with the usual caveats in the documentation (exactly
like Queue.qsize()).
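As a small illustration of the Queue.qsize() caveat, the sketch below treats the reported size as advisory only and relies on get_nowait() plus queue.Empty to decide when the queue is really drained. The variable names are just for illustration.

```python
import queue

q = queue.Queue()
for i in range(3):
    q.put(i)

# Advisory only: with other threads running, this number may already be
# stale by the time the caller looks at it.
snapshot = q.qsize()

# Do not loop "qsize() times"; let the queue itself say when it is empty.
drained = []
while True:
    try:
        drained.append(q.get_nowait())
    except queue.Empty:
        break
```

An executor-level introspection API would presumably need the same kind of documentation caveat.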
Not if that would make the implementation much more complicated, or
significantly slower.
What if an implementation wants to use something other than a queue?
It seems you're breaking the abstraction here.
Regards
Antoine.
>The IntrospectableQueue idea seems reasonable to me. I think I would prefer passing an introspectable (or similar) keyword to the Executor rather than a queue class, though. Adding support for identifying which tasks are active introduces some extra overhead, which I think can reasonably be made optional. If we're going to use a different Queue class to enable introspection, we might as well disable the other stuff that we're doing to make introspection work. It also makes it easier to raise an exception if an API is called that won't work without IntrospectableQueue being used.
Even though this was my suggestion, let me play devil's advocate for a second…
The main reason to use this is for debugging or exploratory programming.
In the debugger, of course, it's not necessary, because you can just break and suspend all the threads while you do what you want. Would it be reasonable to do the same thing outside the debugger, by providing a threading.Thread.suspend API (and of course the pool and executor APIs have a suspend method that suspends all their threads) so you can safely access the queue's internals?
Obviously suspending threads in general is a bad thing to do unless you're a big fan of deadlocks, but for debugging and exploration it seems reasonable; if a program occasionally deadlocks or crashes while you're screwing with its threads to see what happens, well, you were screwing with its threads to see what happens…
That might be a horrible attractive nuisance, but if you required an extra flag to be passed in at construction time to make these methods available, and documented that it was unsafe and potentially inefficient, it might be acceptable.
On the other hand, it's hard to think of a case where this is a good answer but "just run it in the debugger" isn't a better answer…
>>> Does Jython have to use a mutex and a deque instead of a more efficient (and possibly lock-free) queue from the Java stdlib?
>
>For what it's worth, Jython just uses CPython's queue.Queue implementation, as far as I can tell.
Now that I think about it, that makes sense; if I really need a lock-free thread pool and queue in Jython, I'm probably going to use the native Java executors, not the Python ones, right?
>>> What does multiprocessing.Queue do on each implementation?
>
>In addition to a multiprocessing.Queue, the ProcessPoolExecutor maintains a dict of all submitted work items, so that can be used instead of trying to inspect the queue itself.
Interesting. This implies that supplying an inspectable queue class may not be the best answer here; instead, we could have an option for an inspectable work dict, which would just expose the existing one for ProcessPoolExecutor, while it would make ThreadPoolExecutor maintain an equivalent dict as a thread-local in the launching thread. (I'm assuming you only need to inspect the jobs from the launching process/thread here… I'm not sure if that's sufficient for the OP's intended use or not.)
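A rough sketch of the "work dict" idea for threads, under the assumption that it is enough to track work from the submitting side: `TrackingThreadPoolExecutor` and `unfinished()` are hypothetical names, and this version uses a lock-protected dict plus Future.add_done_callback() rather than a thread-local.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TrackingThreadPoolExecutor(ThreadPoolExecutor):
    """Hypothetical subclass that remembers submitted, unfinished work."""

    def __init__(self, max_workers):
        super().__init__(max_workers)
        self._track_lock = threading.Lock()
        self._unfinished = {}  # Future -> (fn, args, kwargs)

    def submit(self, fn, *args, **kwargs):
        future = super().submit(fn, *args, **kwargs)
        with self._track_lock:
            self._unfinished[future] = (fn, args, kwargs)
        # If the future is already done, the callback fires immediately.
        future.add_done_callback(self._forget)
        return future

    def _forget(self, future):
        with self._track_lock:
            self._unfinished.pop(future, None)

    def unfinished(self):
        """Snapshot of work that is queued or running (may be stale)."""
        with self._track_lock:
            return dict(self._unfinished)
```

This does not distinguish queued work from running work; doing that would need cooperation from the worker side.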
> > On 25/08/2014 23:02, Ethan Furman wrote:
>> On 08/25/2014 07:51 PM, Dan O'Reilly wrote:
>>>
>>> The IntrospectableQueue idea seems reasonable to me. I think I would
>>> prefer passing an introspectable (or similar)
>>> keyword to the Executor rather than a queue class, though.
>>
>> Passing the class is the better choice -- it means that future needs can
>> be more easily met by designing the queue variant needed and passing it
>> in -- having a keyword to select only one option is unnecessarily limiting.
>
> What if an implementation wants to use something other than a queue?
> It seems you're breaking the abstraction here.
A collection of threads and a shared queue is almost the definition of a thread pool. What else would you use?
Also, this could make it a lot easier to create variations on ThreadPoolExecutor without subclassing or forking it. For example, if you want your tasks to run in priority order, just give it a priority queue keyed on task.priority. If you want a scheduled executor, just give it a priority queue whose get method blocks until the first task's task.timestamp or a new task is added ahead of the first. And so on.
I'm not sure if that's a good idea or not, but it's an interesting possibility at least…
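To make the priority-order variation concrete without touching ThreadPoolExecutor (which does not currently accept a queue class), here is a toy sketch of one worker thread pulling from a shared queue.PriorityQueue. `run_in_priority_order` is a hypothetical helper, and a single worker is used only so the execution order is deterministic.

```python
import itertools
import queue
import threading


def run_in_priority_order(tasks):
    """Run (priority, callable) pairs on one worker thread, lowest
    priority value first; return the results in execution order."""
    work = queue.PriorityQueue()
    tie_breaker = itertools.count()  # so callables are never compared
    for priority, fn in tasks:
        work.put((priority, next(tie_breaker), fn))

    results = []

    def worker():
        while True:
            try:
                _, _, fn = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            results.append(fn())

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return results
```

A scheduled executor would need a queue whose get() blocks until the earliest timestamp, which this sketch does not attempt.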
Definitions don't necessarily have any relationship with the way a
feature is implemented. Perhaps some version of concurrent.futures would
like to use some advanced dispatch mechanism provided by the OS (or
shared memory, or whatever).
(I'll note that such "flexibility" has been chosen for the API of
threading.Condition and it is making it difficult to write an optimized
implementation that would use OS-native facilities, such as pthread
condition variables)
We have come from a simple proposal to introspect some runtime
properties of an executor to the idea of swapping out a building block
with another. That doesn't sound reasonable.
Regards
Antoine.
So, you didn't find the docs?
https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor
"""
class concurrent.futures.ThreadPoolExecutor(max_workers)
An Executor subclass that uses a pool of at most max_workers
threads to execute calls asynchronously.
"""
https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor
"""
class concurrent.futures.ProcessPoolExecutor(max_workers=None)
An Executor subclass that executes calls asynchronously using a
pool of at most max_workers processes. If max_workers is None or not
given, it will default to the number of processors on the machine.
"""
Regards
Antoine.
I did find the docs, and even with your plain text guide I almost didn't see them when I looked just now. Too much
fancy going on there, and all the green examples -- yeah, it's hard to read.
For comparison, here's what help(ThreadPoolExecutor) shows:
class ThreadPoolExecutor(concurrent.futures._base.Executor)
 |  Method resolution order:
 |      ThreadPoolExecutor
 |      concurrent.futures._base.Executor
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __init__(self, max_workers)
 |      Initializes a new ThreadPoolExecutor instance.
 |
 |      Args:
 |          max_workers: The maximum number of threads that can be used to
 |              execute the given calls.
 |
 |  shutdown(self, wait=True)
 |      Clean-up the resources associated with the Executor.
 |
 |      It is safe to call this method several times. Otherwise, no other
 |      methods can be called after this one.
 |
 |      Args:
 |          wait: If True then shutdown will not return until all running
 |              futures have finished executing and the resources used by the
 |              executor have been reclaimed.
 |
 |  submit(self, fn, *args, **kwargs)
 |      Submits a callable to be executed with the given arguments.
 |
 |      Schedules the callable to be executed as fn(*args, **kwargs) and returns
 |      a Future instance representing the execution of the callable.
 |
 |      Returns:
 |          A Future representing the given call.
Much easier to understand.
Looking at the docs again, I think the biggest hurdle to finding that line and recognizing it for what it is is the fact
that it comes /after/ all the examples. That's backwards. Why would you need examples for something you haven't read yet?
--
~Ethan~
On 26 Aug 2014 16:12, "Ethan Furman" <et...@stoneleaf.us> wrote:
> Looking at the docs again, I think the biggest hurdle to finding that line and recognizing it for what it is is the fact that it comes /after/ all the examples. That's backwards. Why would you need examples for something you haven't read yet?
Many of our module docs serve a dual purpose as a tutorial *and* as an API reference. That's actually a problem, and often a sign of a separate "HOWTO" guide trying to get out.
Actually doing the work to split them is rather tedious though, so it tends not to happen very often.
Cheers,
Nick.