I've just been told that the GIL doesn't make things slower, but since I didn't know such a thing even existed, I went looking for more information and found this document: http://www.dabeaz.com/python/UnderstandingGIL.pdf
Is it current? I didn't know Python threads aren't preemptive. That seems really dated considering the state of the art in parallel execution on multi-core machines.
What's the catch in making Python threads preemptive? Are there any ongoing projects to do that?
> One: The common Python implementation uses a global interpreter lock
> to prevent interpreted code from interfering with itself in multiple
> threads. So "number cruncher" applications don't gain any speed from
> being partitioned into threads -- even on a multicore processor, only one
> thread can have the GIL at a time. On top of that, you have the overhead
> of the interpreter switching between threads (GIL release on one thread,
> GIL acquire for the next thread).
>
> Python threads work fine if the threads either rely on intelligent
> DLLs for number crunching (instead of doing nested Python loops to
> process a numeric array you pass it to something like NumPy which
> releases the GIL while crunching a copy of the array) or they do lots of
> I/O and have to wait for I/O devices (while one thread is waiting for
> the write/read operation to complete, another thread can do some number
> crunching).
>
> If you really need to do this type of number crunching in Python
> level code, you'll want to look into the multiprocessing library
> instead. That will create actual OS processes (each with a copy of the
> interpreter, and not sharing memory) and each of those can run on a core
> without conflicting on the GIL.
Which library do you suggest?
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlf...@ix.netcom.com HTTP://wlfraed.home.netcom.com/
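To illustrate the point quoted above about threads working fine for I/O-bound work, here is a minimal sketch. It assumes `time.sleep` as a stand-in for a blocking read or write (it releases the GIL while waiting); the names `fake_io` and `run_in_threads` are made up for this example.

```python
import threading
import time


def fake_io(results, index, delay):
    # time.sleep stands in for a blocking I/O call; the GIL is released
    # while the thread waits, so other threads can run in the meantime.
    time.sleep(delay)
    results[index] = delay


def run_in_threads(delays):
    # Start one thread per simulated I/O operation and wait for all of them.
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_io, args=(results, i, d))
        for i, d in enumerate(delays)
    ]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.time() - start
```

With four waits of 0.2 seconds each, the elapsed time should be close to a single wait rather than the sum, because the waits overlap. A pure-Python number-crunching loop in place of the sleep would not overlap this way.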
On 18 May 2013 20:33, "Dennis Lee Bieber" <wlf...@ix.netcom.com> wrote:
> Python threads work fine if the threads either rely on intelligent
> DLLs for number crunching (instead of doing nested Python loops to
> process a numeric array you pass it to something like NumPy which
> releases the GIL while crunching a copy of the array) or they do lots of
> I/O and have to wait for I/O devices (while one thread is waiting for
> the write/read operation to complete, another thread can do some number
> crunching).
Has nobody thought of a context manager to allow a part of your code to free up the GIL? I think the GIL is not inherently bad, but if it poses a problem at times, there should be a way to get it out of your... Way.
I meant operating system preemptive. I've just checked and Python does not start Windows threads.
> The standard answer for using multiple cores is to either run
> multiple processes (either explicitly spawning other executables,
> or spawning child python processes using the multiprocessing module),
> or to use (as suggested) libraries that can do the compute intensive
> bits themselves, releasing the GIL while doing so, so that the Python
> interpreter can run other bits of your python code.
I've just discovered the multiprocessing module[1] and will make some tests with it later. Are there any other modules for that purpose?
I've found the following articles about Python threads. Any suggestions?
http://www.ibm.com/developerworks/aix/library/au-threadingpython/
http://pymotw.com/2/threading/index.html
http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/
[1] http://docs.python.org/2/library/multiprocessing.html
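For anyone testing the multiprocessing module mentioned above, this is a rough sketch of the usual pattern: a CPU-heavy pure-Python function mapped over a pool of worker processes. The function names `count_primes` and `parallel_count` are hypothetical, and the prime counting is only a placeholder for real number crunching.

```python
import multiprocessing


def count_primes(limit):
    # Deliberately CPU-heavy pure-Python loop: counts primes below `limit`
    # by trial division. It must live at module level so that worker
    # processes can find it.
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def parallel_count(limits, workers=None):
    # Each limit is handled by a separate OS process, each with its own
    # interpreter and its own GIL. workers=None lets Pool pick a size.
    pool = multiprocessing.Pool(processes=workers)
    try:
        return pool.map(count_primes, limits)
    finally:
        pool.close()
        pool.join()
```

Call `parallel_count(...)` from under an `if __name__ == "__main__":` guard in the main script, since on some platforms the worker processes re-import the main module. Arguments and results are pickled between processes, so this pays off mainly when each task does a fair amount of work.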
I just got my hands dirty trying to synchronize Python prints from many threads.
Sometimes they mess up when printing the newlines.
I tried several approaches using threading.Lock and Condition. None of them worked perfectly and all of them made the code sluggish.
Is there a 100% sure method to make print thread safe? Can it be fast???
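One approach sometimes suggested for this problem is to let a single thread do all the writing and have the workers hand complete lines to it through a queue. The sketch below is only an illustration under that assumption; `printer` and `run_workers` are made-up names, and `None` is used as an arbitrary stop marker.

```python
import sys
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def printer(q, stream):
    # The only thread that touches the stream: writes one whole line at a time.
    while True:
        line = q.get()
        if line is None:    # stop marker
            break
        stream.write(line + "\n")


def run_workers(n_workers, n_lines, stream=None):
    stream = sys.stdout if stream is None else stream
    q = queue.Queue()
    writer = threading.Thread(target=printer, args=(q, stream))
    writer.start()

    def work(worker_id):
        # Workers never print directly; they only enqueue finished lines.
        for i in range(n_lines):
            q.put("worker %d line %d" % (worker_id, i))

    workers = [threading.Thread(target=work, args=(w,)) for w in range(n_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    q.put(None)             # all lines are queued; tell the printer to finish
    writer.join()
```

Since only one thread writes, lines cannot interleave, and the workers only pay for a `q.put()` call instead of waiting on the output device.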
> It is easy for a C extension to release the GIL, and then to do
> meaningful work until it needs to return to python land. Most C
> extensions will do that around non-trivial sections, and anything
> that may stall in the OS.
>
> So your use case for the context manager doesn't fit well.
> --
> Cameron Simpson <c...@zip.com.au>
>
> Gentle suggestions being those which are written on rocks of less than 5lbs.
> - Tracy Nelson in comp.lang.c
My use case was a tight loop processing an image pixel by pixel, or crunching a CSV file. If it only uses local variables (and probably holds a lock before releasing the GIL) it should be safe, no?
My idea is that it's a little bad to have to write C or use multiprocessing just to do simultaneous calculations. I think an application using a reactor loop such as twisted would actually benefit from this. Sure, it will be slower than a C implementation of the same loop, but isn't fast prototyping a very important feature of the Python language?
On 20May2013 07:25, Fábio Santos <fabiosa...@gmail.com> wrote:
| On 18 May 2013 20:33, "Dennis Lee Bieber" <wlf...@ix.netcom.com> wrote:
| > Python threads work fine if the threads either rely on intelligent
| > DLLs for number crunching (instead of doing nested Python loops to
| > process a numeric array you pass it to something like NumPy which
| > releases the GIL while crunching a copy of the array) or they do lots of
| > I/O and have to wait for I/O devices (while one thread is waiting for
| > the write/read operation to complete, another thread can do some number
| > crunching).
|
| Has nobody thought of a context manager to allow a part of your code to
| free up the GIL? I think the GIL is not inherently bad, but if it poses a
| problem at times, there should be a way to get it out of your... Way.
The GIL makes individual python operations thread safe by never
running two at once. This makes the implementation of the operations
simpler, faster and safer. It is probably totally infeasible to
write meaningful python code inside your suggested context
manager that didn't rely on the GIL; if the GIL were not held the
code would be unsafe.
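A side note on the point above: the GIL protects individual interpreter operations, but a statement like `value += 1` is a read-modify-write made of several such operations, so application-level data shared between threads still needs its own lock. A small sketch, with `Counter` and `hammer` as hypothetical names:

```python
import threading


class Counter(object):
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, another thread could run between the read and
        # the write of self.value, and an update could be lost.
        with self._lock:
            self.value += 1


def hammer(counter, n_threads, n_increments):
    # Several threads increment the same counter concurrently.
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final count equals `n_threads * n_increments`; the GIL alone does not promise that.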
It is pretty cool although it looks like a recursive function at first ;)
It works! Think I was running the wrong script...
Anyway, the suggestion you've made is the third and latest attempt that I've tried to synchronize the print outputs from the threads.
I've also used:
### 1st approach ###
lock = threading.Lock()
[...]
try:
    lock.acquire()
    [thread protected code]
finally:
    lock.release()
### 2nd approach ###
cond = threading.Condition()
[...]
try:
    [thread protected code]
    with cond:
        print '[...]'
### 3rd approach ###
from __future__ import print_function
def safe_print(*args, **kwargs):
    global print_lock
    with print_lock:
        print(*args, **kwargs)
[...]
try:
    [thread protected code]
    safe_print('[...]')
Except for the first one, they all have roughly the same performance. The
problem with the first one was that I placed the acquire/release around the
whole code block instead of only around the print statements.
Thanks a lot! ;)
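For reference, a self-contained sketch of the idea of locking only around the print call. It is written for Python 3 (on Python 2 the `from __future__ import print_function` line from the third approach would be needed at the top of the file). The `worker` and `run` functions and the `i * i` computation are made up to stand in for the real thread code.

```python
import sys
import threading

print_lock = threading.Lock()


def safe_print(*args, **kwargs):
    # The lock is held only while one complete line is written.
    with print_lock:
        print(*args, **kwargs)


def worker(worker_id, n_items, stream):
    for i in range(n_items):
        result = i * i   # stand-in for the real work, done without the lock
        safe_print("worker", worker_id, "item", i, "->", result, file=stream)


def run(n_workers, n_items, stream=None):
    stream = sys.stdout if stream is None else stream
    threads = [
        threading.Thread(target=worker, args=(w, n_items, stream))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Keeping the computation outside the `with print_lock:` block is what lets the threads overlap; wrapping the whole loop body in the lock serializes them.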
I didn't know that.
On 20 May 2013 12:10, "Dave Angel" <da...@davea.name> wrote:
> Are you making function calls, using system libraries, or creating or deleting any objects? All of these use the GIL because they use common data structures shared among all threads. At the lowest level, creating an object requires locked access to the memory manager.
>
>
> Don't forget, the GIL gets used much more for Python internals than it does for the visible stuff.
I did not know that. It's both interesting and somehow obvious, although I didn't know it yet.
The only usage difference, AFAIK, is to add '\n' at the end of the string.
It's faster and thread safe (really?) by default.
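Assuming the function being discussed here is `sys.stdout.write` (the message does not name it), the usage difference looks roughly like the sketch below. `write_line` is a hypothetical helper. Whether a single `write()` call is truly thread safe depends on the stream implementation, so treat that part as unverified.

```python
import sys


def write_line(text, stream=None):
    # Build the complete line first, newline included, and hand it to the
    # stream in ONE write() call, so the text and its newline are not
    # emitted as two separate writes.
    stream = sys.stdout if stream is None else stream
    stream.write(text + "\n")
```

The Python 2 print statement writes the text and the trailing newline separately, which is one plausible reason the newlines ended up in the wrong place when several threads printed at once.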
BTW, why didn't I find the source code for the sys module in the 'Lib' directory?
----------------------------------------
> Date: Tue, 21 May 2013 11:50:17 +1000
> Subject: Re: Please help with Threading
> From: ros...@gmail.com
> To: pytho...@python.org
>
> On Tue, May 21, 2013 at 11:44 AM, 88888 Dihedral
> <dihedr...@googlemail.com> wrote:
>> OK, if the python interpreter has a global hidden print out
>> buffer of, say, 2 to 16 K bytes, and all string print functions
>> just construct the output string from the format to this string
>> in an efficient low level way, then the next question
>> would be whether the users can use functions in this
>> low level buffer for other string formatting jobs.
>
> You remind me of George.
> http://www.chroniclesofgeorge.com/
>
> Both make great reading when I'm at work and poking around with random
> stuff in our .SQL file of carefully constructed mayhem.
>
> ChrisA
lol I need more cowbell!!! Please!!! lol