On 30.04.2014 19:13, Vinzent Steinberg wrote:
> Python does not support running threads in parallel because of the GIL.
Not quite so - Python does in fact have a threading module.
The GIL does not prevent multithreading per se; it just prevents
parallel execution of Python bytecode. For programs like SymPy, which
spend most of their time in bytecode, this means that threads don't
gain you performance (though they can still improve program structure
for some kinds of tasks, such as overlapping I/O waits).
Nor is the GIL present in all Python implementations.
IPython and Parallel Python were mentioned elsewhere, and
implementations such as Jython and IronPython have no GIL at all
(Stackless Python, despite its concurrency focus, does keep one).
Also, Python does come with multithreading facilities in the standard
library, and I doubt they would be there if the GIL really were the
end of any hope of multithreading in Python.
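As a minimal sketch (assuming CPython and its standard threading
module), here is an I/O-bound workload where threads overlap their
waits despite the GIL - the fetch task and its timings are
hypothetical stand-ins:

```python
import threading
import time

def fetch(name, results):
    # Simulate an I/O-bound task; blocking calls like sleep()
    # release the GIL, so other threads can run meanwhile.
    time.sleep(0.1)
    results[name] = name.upper()

results = {}
threads = [threading.Thread(target=fetch, args=(n, results))
           for n in ("a", "b", "c")]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# The three 0.1 s waits overlap, so the total is roughly 0.1 s,
# not 0.3 s - concurrency without parallel bytecode execution.
print(len(results), elapsed < 0.25)
```

A CPU-bound loop in place of the sleep() would show no such speedup,
which is exactly the bytecode-heavy case described above.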
> It
> can only run processes in parallel, which adds a large overhead, because
> memory cannot be shared.
Read-only data should be shareable.
Though I guess that the extremely dynamic nature of Python makes it
hard for the interpreter to identify truly read-only data - even just
reading an object mutates its reference count - so everything shared
across processes would still undergo frequent locking and unlocking
operations.
I'm only guessing here, though.
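One way read-only sharing can work in practice is fork-based process
creation on POSIX systems, where children inherit the parent's data
copy-on-write. This is a sketch using the standard multiprocessing
module; the TABLE and lookup names are made up for illustration:

```python
import multiprocessing as mp

# A largish structure built once in the parent process. Under the
# POSIX "fork" start method, children inherit it copy-on-write and
# can read it without TABLE itself ever being pickled.
TABLE = {i: i * i for i in range(100_000)}

def lookup(key):
    # Reads the inherited parent data; only `key` and the return
    # value cross the process boundary (and are pickled).
    return TABLE[key]

if __name__ == "__main__":
    ctx = mp.get_context("fork")  # not available on Windows
    with ctx.Pool(2) as pool:
        print(pool.map(lookup, [3, 7]))  # -> [9, 49]
```

Note that copy-on-write pages still get copied as soon as reference
counts are touched, so the savings are smaller than they look.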
> This means that all data shared between processes
> has to be serialized (usually using pickle).
For really slow algorithms that can take hours to complete (Risch
etc.), even that may pay off.
SymPy's data structures aren't very large, so the pickling overhead
would be negligible for slow algorithms.
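To put a rough number on that, here is a sketch that pickles a
nested-tuple stand-in for an expression tree (a hypothetical
structure; SymPy itself is not used) and times the round trip:

```python
import pickle
import time

# Build a stand-in for an expression tree: nested tuples of the
# rough shape ("Add", subtree, ("Integer", i)).
def build(depth):
    node = ("x",)
    for i in range(depth):
        node = ("Add", node, ("Integer", i))
    return node

expr = build(500)
start = time.perf_counter()
data = pickle.dumps(expr)
roundtrip = pickle.loads(data)
elapsed = time.perf_counter() - start
# Serializing even a 500-node tree takes a tiny fraction of a second
# on typical hardware - noise next to hours of computation.
print(roundtrip == expr, elapsed < 1.0)
```

For an algorithm that runs for hours, paying milliseconds at process
boundaries is clearly acceptable.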
The larger problem would be finding out how reliable operation is
possible. I have little knowledge of how the various Python
implementations differ with respect to parallel execution; we might
find that we can't easily make SymPy work reliably on all relevant
platforms. We might also find that it's too much effort to get a
parallel and a sequential version of an algorithm to behave
equivalently, and to keep them that way while SymPy undergoes
improvements and changes.