[Python-ideas] pmap, preduce, pmapreduce?

Mike Meyer

May 25, 2012, 5:37:29 PM
to python...@python.org
Another crazy idea that may not be possible, based on my finally
getting around to watching Guy Steele's talks about what he's up to
these days (http://vimeo.com/6624203).

Given a function that takes a list (or a container class which len
doesn't consume) and a function, and then applies that function to the
list in some way: either element wise, or in pairs of elements/results,
but does it in parallel. It will hold the GIL, but run the function
calls in distinct threads, meaning two applications of the function
could interfere with each other.

Is it possible to place limitations on the function such that this
kind of controlled concurrent operation is safe?

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas

Joao S. O. Bueno

May 26, 2012, 12:17:29 AM
to Mike Meyer, python...@python.org
On 25 May 2012 18:37, Mike Meyer <m...@mired.org> wrote:
> Another crazy idea that may not be possible, based on my finally
> getting around to watching Guy Steele's talks about what he's up to
> these days (http://vimeo.com/6624203).
>
> Given a function that takes a list (or a container class which len
> doesn't consume) and a function, and then applies that function to the
> list in some way: either element wise, or in pairs of elements/results,
> but does it in parallel. It will hold the GIL, but run the function
> calls in distinct threads, meaning two applications of the function
> could interfere with each other.


Just like the already existing "map" method in concurrent.futures.Executor? *

js
-><-

* all praise the Python time machine
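
For readers who have not used it, here is a minimal sketch of the Executor.map call mentioned above, using the thread-based executor. The names square and parallel_squares are made up for illustration; only ThreadPoolExecutor and its map method come from the standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Illustrative worker function: depends only on its argument.
    return x * x


def parallel_squares(values, workers=4):
    # Executor.map schedules square() on the pool's threads and
    # yields the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(parallel_squares([1, 2, 3, 4]))
```

The calls still run under the GIL, so this gives concurrency rather than CPU parallelism, and nothing here stops the mapped function from touching shared state.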

>     <mike
> --
> Mike Meyer <m...@mired.org>              http://www.mired.org/

Devin Jeanpierre

May 26, 2012, 8:17:35 AM
to Mike Meyer, python...@python.org
On Fri, May 25, 2012 at 5:37 PM, Mike Meyer <m...@mired.org> wrote:
> Is it possible to place limitations on the function such that this
> kind of controlled concurrent operation is safe?

I'm not sure what you mean. Tentative answer: restrict it to pure functions.

-- Devin

Masklinn

May 26, 2012, 11:34:52 AM
to Mike Meyer, python...@python.org
On 2012-05-25, at 23:37 , Mike Meyer wrote:
>
> Is it possible to place limitations on the function such that this
> kind of controlled concurrent operation is safe?

This would mean ideally only having pure functions, and at the very
least having functions which can't share state (not easily anyway).

Python, as a language, has no such provision that I know of beyond "be
careful" and "you're on your own".
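
To make the distinction concrete, here is a rough sketch contrasting a pure function with one that shares state between calls. Everything in it (pure_scale, impure_scale, the call_log list) is hypothetical and only meant to illustrate the point; Python itself does nothing to tell the two apart.

```python
from concurrent.futures import ThreadPoolExecutor

call_log = []  # shared, mutable state (hypothetical example)


def pure_scale(x):
    # Pure: the result depends only on the argument, nothing else is touched.
    return 2 * x


def impure_scale(x):
    # Impure: every call also mutates call_log, so the order of entries
    # depends on how the threads happen to be scheduled.
    call_log.append(x)
    return 2 * x


def run_in_threads(func, values, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(func, values))


if __name__ == "__main__":
    values = list(range(20))
    print(run_in_threads(pure_scale, values))    # same list on every run
    print(run_in_threads(impure_scale, values))  # same return values...
    print(call_log)                              # ...but the log order may vary
```

The return values match in both cases because map preserves input order; only the side effect on call_log is at the mercy of the scheduler.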

A possible option, though, would be to use `multiprocessing` rather than
threads: multiprocessing.pool already provides a `map` operation, and
processes can't share state by default (doing so is quite an explicit
— and some would say involved — operation). Going through
multiprocessing puts other limitations/complexities on the function
implementations, but at the very least it wouldn't be possible to
*unknowingly* share state.
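
A minimal sketch of that approach, assuming the Pool class from the multiprocessing package; square and process_squares are illustrative names only. One of the extra constraints shows up right away: the mapped function has to be defined at module level so that worker processes can find it by name.

```python
from multiprocessing import Pool


def square(x):
    # Must live at module level so worker processes can import it by name.
    return x * x


def process_squares(values, workers=2):
    # Each worker is a separate process with its own memory, so arguments
    # and results are pickled and copied rather than shared.
    with Pool(processes=workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on
    # platforms that start workers by re-importing the main module.
    print(process_squares(range(5)))
```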

Mike Meyer

May 26, 2012, 6:04:18 PM
to python...@python.org
On Sat, 26 May 2012 17:34:52 +0200
Masklinn <mask...@masklinn.net> wrote:
> On 2012-05-25, at 23:37 , Mike Meyer wrote:
> > Is it possible to place limitations on the function such that this
> > kind of controlled concurrent operation is safe?
> This would mean ideally only having pure functions, and at the very
> least having functions which can't share state (not easily anyway).

I'm not sure pure functions are good enough for CPython. If the
function involves looking through a tree of state (shared via the
arguments, even), then the changing reference counts as the code goes
through the tree will hose you, unless the function evaluations are
serialized via the GIL.
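
A small sketch of the reference-count point, assuming CPython (sys.getrefcount is an implementation detail, and refcount_delta_from_reading is a made-up helper). It only shows that code which never modifies a node still changes that node's reference count by holding on to it; it does not demonstrate an actual race.

```python
import sys


def refcount_delta_from_reading(node):
    # Nothing below modifies the node's contents, yet keeping a
    # reference to it bumps its reference count.
    before = sys.getrefcount(node)
    visited = [node]  # "just looking": remember which node we saw
    after = sys.getrefcount(node)
    return after - before


tree = {"left": {}, "right": {}}
delta = refcount_delta_from_reading(tree["left"])
print(delta)
```

So a function that is pure at the Python level still writes to memory shared between threads, which is the update the GIL serializes.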

> Python, as a language, has no such provision that I know of beyond "be
> careful" and "you're on your own".

Generally true for your code, but I think it tries to keep the
interpreter from tripping over its own feet (via the GIL, etc.).

> A possible option, though, would be to use `multiprocessing` rather than
> threads: multiprocessing.pool already provides a `map` operation, and
> processes can't share state by default (doing so is quite an explicit
> — and some would say involved — operation). Going through
> multiprocessing puts other limitations/complexities on the function
> implementations, but at the very least it wouldn't be possible to
> *unknowingly* share state.

I'm familiar with that option, but was hoping to avoid it.

Though adding a reduce (and maybe a mapreduce?) method to something like
concurrent.futures might be nice.
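
As a rough sketch of what such a helper could look like: pmapreduce below is hypothetical and is not part of concurrent.futures. It assumes the input can be turned into a list and that the reducing function is associative, since each chunk is reduced independently before the partial results are combined.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def pmapreduce(map_func, reduce_func, items, workers=4):
    """Hypothetical helper: map and reduce chunks in threads, then combine.

    reduce_func is assumed to be associative; the chunks are reduced
    independently, so grouping differs from a plain left-to-right reduce.
    """
    items = list(items)
    if not items:
        raise ValueError("pmapreduce() of an empty sequence")
    chunk_size = max(1, -(-len(items) // workers))  # ceiling division
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

    def reduce_chunk(chunk):
        return reduce(reduce_func, map(map_func, chunk))

    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() keeps chunk order, so non-commutative reducers still work.
        partials = list(executor.map(reduce_chunk, chunks))
    return reduce(reduce_func, partials)


if __name__ == "__main__":
    # Sum of squares of 0..9, computed chunk by chunk.
    print(pmapreduce(lambda x: x * x, operator.add, range(10)))
```

A plain preduce would be the same thing with an identity map_func. Nothing here enforces that the functions are safe to run concurrently; that remains the caller's problem.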

Thanks


<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/
