> Inside the __call__ method, after some processing, we are in the
> find_controller method - and here we are *accessing and modifying* the
> dictionary self.controller_classes. But this dictionary is shared between
> many threads, so is it thread safe?
Access to Python objects is thread-safe in that Python itself has a
Global Interpreter Lock (GIL) that prevents, say, two threads from
updating the exact same key at the exact same time. The GIL is held
during the dict access/setting.
The GIL is generally released during many operations that occur at the C
layer, such as I/O and waiting on database access.
Searching for "Python GIL" should turn up quite a few threads about it.
For this reason, to use a multi-core processor to its full potential,
you should run a Pylons process for every core. This is what I do for
all the sites I run.
Cheers,
Ben
>
> I could not find an unambiguous answer anywhere on whether accessing
> Python primitives from many threads is safe or not - to me it looks like
> it might not be safe (because modifying/iterating/accessing e.g. a
> dictionary may result in context switches).
This is the most authoritative page:
http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm
It's still a bit ambiguous on what is really safe, but I can guarantee
the two basic dict operations in question (a getitem and a setitem on a
steady dict value) are in fact safe.
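A minimal sketch of the claim above, assuming CPython and written in Python 3 syntax. The names `shared` and `worker` are made up for illustration. Several threads each do plain setitem calls on one shared dict, and the script then checks that no write went missing. A clean run demonstrates the behaviour but does not prove it.

```python
import threading

shared = {}  # one dict shared by every thread

def worker(offset, count):
    # Each thread writes its own key range; every write is a single setitem.
    for i in range(offset, offset + count):
        shared[i] = i * 2

threads = [threading.Thread(target=worker, args=(n * 10000, 10000))
           for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Collect any key whose value is absent or wrong.
missing = [k for k in range(40000) if shared.get(k) != k * 2]
print(len(shared), len(missing))
```

On CPython this is expected to report 40000 entries and nothing missing. It says nothing about compound operations such as read-modify-write on a value.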
--
Philip Jenvey
> Although, if I read this correctly, you are counting on the fact the
> current implementation of the GIL will always remain exactly this way
> for the operations getitem and setitem. That seems to be a fairly
> safe bet though, right?
If it stops working in this way, there'll be significantly larger
problems in Python code than just Pylons. :)
- Ben
I know CPython has the GIL, but I am not sure if all Python
implementations have it. If I remember correctly, Stackless Python did
not suffer from that particular wart.
Wichert.
--
Wichert Akkerman <wic...@wiggy.net> It is simple to make things.
http://www.wiggy.net/ It is hard to make things simple.
Yeah, this site is a bit ambiguous, e.g.
"Operations that replace other objects may invoke those other objects’
__del__ method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to
dictionaries and lists. When in doubt, use a mutex!"
But before that statement the author said that the following operations
are atomic:
D[x] = y
D1.update(D2)
where x, y are objects and D1 and D2 are dicts. So how could this be true?
But even if we assume that getitem and setitem on a dict are atomic (on
a 'steady' value only? which means primitives?), how do I solve the
problem I am facing now:
I want to start another thread T1 in my Pylons app (manually, not a
Paste#http worker) and have a global dict which all HTTP workers
will read. T1 will periodically update this dictionary (in fact,
all it wants to do is swap this global dict with a local dict that
was prepared during its work). In this dict I will have Python
primitives (other dicts too) and simple classes which act only as
'structs'. Do I need a lock?
Cheers,
Kamil Gorlo
If unsure, I'd suggest using a lock. What are you saving by skipping the
lock? If the dict swap happens infrequently then the cost of the lock
will be insignificant.
You also need to ensure that modifications to any of the objects that
the dict points to are protected by locks.
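As a rough sketch of the lock-protected swap suggested here (Python 3 syntax; `_config`, `get_config` and `swap_config` are hypothetical names, not anything from Pylons):

```python
import threading

_lock = threading.Lock()
_config = {"version": 1}  # the shared dict that gets swapped

def get_config():
    # Readers take the lock only long enough to fetch the reference.
    with _lock:
        return _config

def swap_config(new_config):
    # The writer rebinds the global name while holding the same lock.
    global _config
    with _lock:
        _config = new_config

snapshot = get_config()        # a reader's view before the swap
swap_config({"version": 2})    # the refresh thread would do this
print(snapshot["version"], get_config()["version"])
```

The lock here covers only the reference itself. Objects reachable from the dict still need their own protection, or must be treated as read-only once published.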
Cheers,
Chris Miles
Stackless has the GIL, IronPython does not, Jython does not
<http://fisheye3.atlassian.com/browse/jython/trunk/jython/src/org/python/compiler/Future.java?r1=4203&r2=4202>.
--
Lawrence Oluyede
[eng] http://oluyede.org - http://twitter.com/lawrenceoluyede
[ita] http://neropercaso.it - http://twitter.com/rhymes
I will probably use a lock, but even if it is a home-made RWLock
(because there is no ReadWriteLock in the Python library), readers
also have to acquire it, not only the writer thread. I was asking because
I want to know what I can assume when using Python primitives in a
multithreaded environment, and it looks like nobody knows for sure, which
is quite strange.
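For reference, a home-made readers-writer lock along these lines could look roughly like the sketch below (Python 3 syntax). `ReadWriteLock`, `table`, `read_value` and `replace_table` are hypothetical names, and this version gives writers no priority, so a steady stream of readers could starve the writer.

```python
import threading

class ReadWriteLock:
    """Minimal readers-writer lock built on a Condition (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of readers currently inside
        self._writing = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake a waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()  # wake readers and writers

rw = ReadWriteLock()
table = {"bla": 42}

def read_value(key):
    rw.acquire_read()
    try:
        return table.get(key)
    finally:
        rw.release_read()

def replace_table(new_table):
    global table
    rw.acquire_write()
    try:
        table = new_table
    finally:
        rw.release_write()
```

Many readers can be inside `read_value` at once, while `replace_table` waits until they have all left.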
However, my first tests on my Ubuntu machine with Python 2.5 show that
using a lock is sometimes faster than not using it. Could anybody
explain this?
This is a simple example showing that x += 1 is not atomic:
>>> test1.py
import threading
gv = 0
class T(threading.Thread):
    def run(self):
        global gv
        for i in xrange(10000000):
            gv += 1
x = T()
y = T()
x.start()
y.start()
x.join()
y.join()
print gv
<<< test1.py
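One way to see why an increment like this is not a single step is to look at the bytecode. A small sketch in Python 3 syntax, with hypothetical names `counter` and `increment`; the exact opcode names vary between CPython versions:

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# List the bytecode instructions the increment compiles to.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)
```

The listing contains a separate load of the global, an add, and a store back to the global. If another thread runs between the load and the store, one of the updates is lost. Where exactly the interpreter may switch threads is an implementation detail that differs between versions.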
This is another example, but this time with a lock:
>>> test2.py
import threading
gv = 0
gl = threading.Lock()
class T(threading.Thread):
    def __init__(self, lock):
        threading.Thread.__init__(self)
        self.lock = lock
    def run(self):
        global gv
        self.lock.acquire()
        try:
            for i in xrange(10000000):
                gv += 1
        finally:
            self.lock.release()
x = T(gl)
y = T(gl)
x.start()
y.start()
x.join()
y.join()
print gv
<<< test2.py
And here are my results:
$ time python test1.py
14224533
real 0m6.630s
user 0m5.616s
sys 0m2.560s
$ time python test2.py
20000000
real 0m4.446s
user 0m4.380s
sys 0m0.036s
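One thing worth noting about the two scripts: test2.py acquires the lock once around the entire loop, so the two threads effectively run their loops one after the other instead of competing on every increment. That may account for part of the timing difference, though this is a guess and not a measured result. A variant that locks each increment separately, sketched in Python 3 syntax with a smaller loop count and made-up names (`counter`, `run`), would look like this:

```python
import threading

counter = 0
lock = threading.Lock()

def run(n):
    global counter
    for _ in range(n):
        # Hold the lock only for the read-modify-write itself.
        with lock:
            counter += 1

threads = [threading.Thread(target=run, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

This keeps the count correct while letting the threads interleave, at the cost of one acquire/release per increment.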
Cheers,
Kamil
I believe the adjective "atomic" applies only to the effect on D or D1,
in this case, and given the comment about __del__ I think it probably
doesn't apply even to the .update() case (unless D2 has length 1).
> But even if we assume that getitem and setitem on a dict are atomic (on
> a 'steady' value only? which means primitives?), how do I solve the
> problem I am facing now:
>
> I want to start another thread T1 in my Pylons app (manually, not a
> Paste#http worker) and have a global dict which all HTTP workers
> will read. T1 will periodically update this dictionary (in fact,
> all it wants to do is swap this global dict with a local dict that
> was prepared during its work). In this dict I will have Python
> primitives (other dicts too) and simple classes which act only as
> 'structs'. Do I need a lock?
If by "swap" you simply mean you will be rebinding the global name to a
new dictionary, then the rebinding itself will certainly be atomic. On
the other hand, if the various worker threads access this global name
more than once in a request (or session, or whatever) then you will
certainly have problems as they would use the old dictionary for part of
the request then use the new one.
If they simply grab a local reference to the global name (possibly
storing it in their request object, or session, depending on what you
need) when first needed, and thereafter access it only from there, I
can't see how you'd have any problems (ignoring issues involving changes
to various contained items, which might be shared if you haven't ensured
they are not).
I think if you aren't sure yet what will work, post some pseudo-code
showing what you would do if you were to ignore thread-safety issues.
Also note that use of the Queue module is a very common step to dealing
with thread-safety issues, generally resulting in most or all potential
problems just going away. If that could work for you it's probably the
best bet.
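A minimal sketch of the Queue approach, in Python 3 syntax, where the module is named `queue` (it is `Queue` in Python 2). The names `updates`, `results` and `consumer`, and the use of `None` as a stop marker, are choices made for this illustration:

```python
import queue
import threading

updates = queue.Queue()  # thread-safe channel between producer and consumer
results = []             # touched only by the consumer thread

def consumer():
    while True:
        item = updates.get()   # blocks until something is available
        if item is None:       # stop marker chosen for this sketch
            break
        results.append(item * 2)

t = threading.Thread(target=consumer)
t.start()
for n in range(5):
    updates.put(n)
updates.put(None)
t.join()
print(results)
```

Because only one thread ever touches `results`, no explicit lock is needed; the queue does the synchronization.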
--
Peter Hansen, P.Eng.
Engenuity Corporation
416-617-1499
Yes, of course - we should consider "atomic" separately for each line.
>> But even if we assume that getitem and setitem on a dict are atomic (on
>> a 'steady' value only? which means primitives?), how do I solve the
>> problem I am facing now:
>>
>> I want to start another thread T1 in my Pylons app (manually, not a
>> Paste#http worker) and have a global dict which all HTTP workers
>> will read. T1 will periodically update this dictionary (in fact,
>> all it wants to do is swap this global dict with a local dict that
>> was prepared during its work). In this dict I will have Python
>> primitives (other dicts too) and simple classes which act only as
>> 'structs'. Do I need a lock?
>
> If by "swap" you simply mean you will be rebinding the global name to a
> new dictionary, then the rebinding itself will certainly be atomic. On
> the other hand, if the various worker threads access this global name
> more than once in a request (or session, or whatever) then you will
> certainly have problems as they would use the old dictionary for part of
> the request then use the new one.
>
> If they simply grab a local reference to the global name (possibly
> storing it in their request object, or session, depending on what you
> need) when first needed, and thereafter access it only from there, I
> can't see how you'd have any problems (ignoring issues involving changes
> to various contained items, which might be shared if you haven't ensured
> they are not).
Yes, that is the case. I have a global variable and 'swap' means
'rebinding' it to a new dictionary.
So there is a global dict G and two types of threads:
1. HTTP Worker (many instances), which does the following in each request:
- local_G = G
- from this moment on, operates only on 'local_G'
2. Refresh Worker (only one instance), which from time to time does:
- tmp_G = {'bla' : 42, .... }
- G = tmp_G
So everything will probably be OK and this solution is thread-safe in
Python. But... I will probably use locks even if they *might* not be
necessary - this code is also for people other than me, and for them it
will probably be easier to read and to see that everything works
OK. Especially since there is probably no performance trade-off.
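Spelled out as runnable code, the scheme described here might look like the following sketch (Python 3 syntax). `handle_request` and `refresh` are hypothetical function names, and the 'extra' key is invented for the demonstration:

```python
import threading

G = {"bla": 42}  # the global dict all HTTP workers read

def handle_request():
    local_G = G  # read the global name exactly once per request
    # From here on, only local_G is used, so a swap cannot be seen mid-request.
    return local_G["bla"], local_G.get("extra")

def refresh():
    global G
    tmp_G = {"bla": 43, "extra": "new"}  # build the new dict privately
    G = tmp_G                            # publish it with a single rebinding

before = handle_request()
t = threading.Thread(target=refresh)
t.start()
t.join()
after = handle_request()
print(before, after)
```

A request that started before the swap keeps working on the old dict until it finishes. The old dict stays alive as long as some request still holds a reference to it.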
> I think if you aren't sure yet what will work, post some pseudo-code
> showing what you would do if you were to ignore thread-safety issues.
>
> Also note that use of the Queue module is a very common step to dealing
> with thread-safety issues, generally resulting in most or all potential
> problems just going away. If that could work for you it's probably the
> best bet.
I know this Queue pattern and I also know that it is probably the best
solution for synchronization problems, not only in Python :) I was only
wondering how much we can get from Python primitives :)
Anyway, big thanks to all of you guys for the posts, I've learned
something new (as I do every day :)).
Cheers,
Kamil
>
> On Thu, Mar 5, 2009 at 8:07 AM, Wichert Akkerman <wic...@wiggy.net>
> wrote:
>> I know CPython has the GIL, but I am not sure if all Python
>> implementations have it. If I remember correctly, Stackless Python did
>> not suffer from that particular wart.
>
> Stackless has the GIL, IronPython does not, Jython does not
> <http://fisheye3.atlassian.com/browse/jython/trunk/jython/src/org/python/compiler/Future.java?r1=4203&r2=4202>.
Though Jython and IronPython lack a GIL, they ensure the methods we
expect to be thread safe on the core data structures are in fact
thread safe, for compatibility with CPython.
--
Philip Jenvey
So, is there any place where I can read what is thread safe in
IronPython or Jython (which means: what must be done to be compatible
with CPython, which in turn means: what CPython can guarantee in terms
of thread safety)?
Cheers,
Kamil
Unfortunately, no, not one that I know of.
Which is a little unfortunate for Jython/IronPython -- their
collection classes would be a little faster if they didn't do thread
safety. Even so, you should be able to utilize multiple threads
better on these implementations. They're essentially doing
fine-grained locking, instead of one giant lock.
--
Philip Jenvey