
Critical sections and mutexes


David Brady

Oct 23, 2001, 3:06:59 PM
First off, shouldn't the plural of "mutex" be
"mutices"? :-)

I am trying to write a multithreaded app in Python on
Win32. These threads need to communicate with one
another, so they can collaborate. Right now, there is
a common resource in the main thread that each child
needs to interact with, and there are resources in
each child thread that the main thread will interact
with.

I've read through the thread and threading
documentation as well as the sample code in Appendix D
of Python Programming on Win32. I can make threads
start, run and stop. But the "lock object" code is
baffling me outright. Do I create a lock inside the
child thread, or do I create it wherever and attach it
to a thread? What I'm wondering is:

- How can I lock a resource owned by a child thread
from the main application, diddle with it, and unlock
it?
- How can I lock a resource owned by the main thread
from a child thread, diddle with it, and unlock it?
- Each lock/diddle/unlock operation is designed to
keep the resource locked for as little time as
possible, but I can still imagine the thread being
preempted in the middle of an update. Do I need a
critical section? If so, how do I implement one? The
only case I can see this being a problem is when, say,
Thread A locks something in the main thread, starts to
interact with it, is preempted, and then the main
thread gets a timeslice and interacts with the locked
resource--which it views as local and doesn't need to
check for a lock.

Can that last situation even exist? Do I need to
create a third, "go-between" object so that both
threads have to go through the lock/unlock process to
get at the resource?

Here's a rough cut at what I'm sort of thinking:

main_Q = PriorityQueue()  # queueing class

class ServerThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.Q = PriorityQueue()

    def run(self):
        # if messages waiting on socket:
        #     get messages into self.Q
        # if self.Q has messages:
        #     if can acquire lock on main_Q:
        #         move msgs from self.Q to main_Q
        #         release lock on main_Q
        pass

# main thread:
while 1:
    # if main_Q has messages:
    #     dispatch them
    pass


Thank you,

-dB

=====
--
David Brady
daves_spam_do...@yahoo.com
I'm feeling very surreal today... or *AM* I?


Cliff Wells

Oct 23, 2001, 3:39:10 PM
On Tuesday 23 October 2001 12:06, David Brady wrote:
>
> main_Q = PriorityQueue()  # queueing class
>
> class ServerThread(threading.Thread):
>     def __init__(self):
>         threading.Thread.__init__(self)
>         self.Q = PriorityQueue()
>
>     def run(self):
>         # if messages waiting on socket:
>         #     get messages into self.Q
>         # if self.Q has messages:
>         #     if can acquire lock on main_Q:
>         #         move msgs from self.Q to main_Q
>         #         release lock on main_Q
>         pass
>
> # main thread:
> while 1:
>     # if main_Q has messages:
>     #     dispatch them
>     pass
>

I would put a threading.Lock() in PriorityQueue that controls access to it,
something like this:

class ThreadingQueue:
    def __init__(self, args):
        self.lock = threading.Lock()
        ....

    def put(self, something):
        self.lock.acquire()
        try:
            self._put(something)
        finally:
            self.lock.release()


--
Cliff Wells
Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308
(800) 735-0555 x308

Skip Montanaro

Oct 23, 2001, 4:04:11 PM
>> self.Q = PriorityQueue()
..
Cliff> I would put a threading.Lock() in PriorityQueue that controls
Cliff> access to it, something like this:
...

Sorry to drop in late on this thread, but why not simply make PriorityQueue
a subclass of Queue.Queue (which already does all the locking for you)?

--
Skip Montanaro (sk...@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/

Ignacio Vazquez-Abrams

Oct 23, 2001, 4:25:19 PM
On Tue, 23 Oct 2001, David Brady wrote:

> Firstoff, Shouldn't the plural of "mutex" be
> "mutices"? :-)

No, because "mutex" isn't a word; it's an acronym for "mutual exclusion". As
such it probably shouldn't have a plural form at all, but who's counting ;)

--
Ignacio Vazquez-Abrams <ign...@openservices.net>

"As far as I can tell / It doesn't matter who you are /
If you can believe there's something worth fighting for."
- "Parade", Garbage


David Brady

Oct 23, 2001, 4:54:24 PM
> -----Original Message-----
> From: Skip Montanaro [mailto:sk...@pobox.com]
> Sent: Tuesday, October 23, 2001 2:04 PM
>
> Sorry to drop in late on this thread, but why not
> simply make
> PriorityQueue
> a subclass of Queue.Queue (which already does all
> the locking for you)?

Ooooh. Had I known this existed... *sigh*

Well, now that I've written a priority queue anyway,
I'm thinking it does what I need moderately better.
My queue class has a method Put(self, object,
Priority=0) that puts the object into one of two
internal queues, a priority queue or a normal queue.
The reason for this is that it could conceivably take
a while (possibly 100ms or more) to process some data
packets, and during high traffic times, packets come
in pretty thick bursts. During this time, our server
may send out a keepalive (ping) packet that must be
replied to within 1-5 seconds or so to prevent the
connection from being closed by the server.

The idea was to have ping messages go into the
priority queue, so that even if there are 10s worth of
"hard-to-process" packets waiting, the keepalive
message is the next one to be processed.

However, I have implemented PriorityQueue using a pair
of lists internally, and perhaps it would be
faster/better to implement them as Queue.Queue
objects.

One question that still has not been answered:
critical sections. Can they be implemented, or can I
only lock other threads out of a given resource?

P.S. I figured out how to lock a resource in the main
thread:

main_mutex = threading.Lock()

creates a mutex in the main thread, and now anyone can
acquire()/release() it as needed.

Thanks again,

-dB


Cliff Wells

Oct 23, 2001, 5:11:08 PM
On Tuesday 23 October 2001 13:54, David Brady wrote:
> One question that still has not been answered:
> critical sections. Can they be implemented, or can I
> only lock other threads out of a given resource?

Reading your original post, I believe you are thinking about this wrong
(please, no offense).

From original post:


"only case I can see this being a problem is when, say,
Thread A locks something in the main thread, starts to
interact with it, is preempted, and then the main
thread gets a timeslice and interacts with the locked
resource--which it views as local and doesn't need to
check for a lock."

If you are going to share a resource among threads, then _every_ access to
that resource _must_ be enclosed in locks (even if you are only reading from
the resource). You suggested that you would need a "critical section" so
that the main thread would not access the resource if another thread is. If
you make your main thread check the lock, this won't happen, end of story.
It doesn't matter if the resource is local to the main thread.
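
Concretely, something like this (a sketch; the names are made up):

import threading

resource = []                      # created by the main thread
resource_lock = threading.Lock()   # what's shared is the lock, not "ownership"

def child():
    resource_lock.acquire()
    try:
        resource.append('from child thread')
    finally:
        resource_lock.release()

# The main thread takes the same lock, even though the list "belongs" to it:
resource_lock.acquire()
try:
    if resource:
        print resource.pop(0)
finally:
    resource_lock.release()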

Regards,

Cliff Wells

Oct 23, 2001, 5:01:43 PM
On Tuesday 23 October 2001 13:04, Skip Montanaro wrote:
> Sorry to drop in late on this thread, but why not simply make PriorityQueue
> a subclass of Queue.Queue (which already does all the locking for you)?

Good suggestion. However, IMHO he really should learn how locks work before
using them.

bru...@tbye.com

Oct 23, 2001, 9:33:23 PM
On Tue, 23 Oct 2001, Cliff Wells wrote:

> From original post:
> "only case I can see this being a problem is when, say,
> Thread A locks something in the main thread, starts to
> interact with it, is preempted, and then the main
> thread gets a timeslice and interacts with the locked
> resource--which it views as local and doesn't need to
> check for a lock."
>
> If you are going to share a resource among threads, then _every_ access to
> that resource _must_ be enclosed in locks (even if you are only reading from
> the resource).

[snip]

No, "normal" operations on Python objects are atomic as far as threads are
concerned. There are some very good reasons for using locking/signaling
(to sequentialize access to a function, to keep a worker thread asleep
until you use a semaphore to signal it to awake, etc), but it's not always
a requirement. Consider a simple producer/consumer situation:

import threading, time, random

foo = []

def producer():
    for i in range(10):
        print 'Producing', i
        foo.append(i)
        time.sleep(random.random() * 1.0)
    print 'Producer done'

def consumer():
    count = 0
    while 1:
        try:
            num = foo.pop(0)
            print 'Consuming', num
            count += 1
            if count >= 10:
                break
        except IndexError:
            pass

        time.sleep(random.random() * 1.0)
    print 'Consumer done'

threading.Thread(target=consumer).start()
threading.Thread(target=producer).start()

Output:

Producing 0
Producing 1
Consuming 0
Producing 2
Producing 3
Consuming 1
Producing 4
Consuming 2
Producing 5
Consuming 3
Consuming 4
Producing 6
Consuming 5
Producing 7
Consuming 6
Producing 8
Producing 9
Producer done
Consuming 7
Consuming 8
Consuming 9
Consumer done

No crashes, stomped memory, or any other problems you'd expect.

-Dave


David Brady

Oct 24, 2001, 1:56:04 PM
> -----Original Message-----
> From: bru...@tbye.com [mailto:bru...@tbye.com]
> Sent: Tuesday, October 23, 2001 7:33 PM
> Subject: Re: Critical sections and mutexes
>
> No, "normal" operations on Python objects
> are atomic as far as threads are concerned.
[snip]

> Consider a simple producer/consumer
> situation:

Yes, but.

Consider this minor rewriting of your code. Three
changes have been made. First, I made it a stack
instead of a list, so insertions and removals are
happening at the same point. Second, I do a decidedly
non-atomic set of operations: I peek at the stack,
print a message, then pop the stack. Finally, I
tightened the timing down to 0.1 seconds max to
increase the likelihood of a collision.

It took about 6 runs, but I finally got one:

#----------------------------------------
# Code
import threading, time, random

foo = []

def producer():
    for i in range(10):
        print 'Producing', i
        foo.insert(0, i)
        time.sleep(random.random() * 0.1)
    print 'Producer done'

def consumer():
    count = 0
    while 1:
        try:
            peeknum = foo[0]
            print 'Consumer: about to consume %d.' % peeknum
            num = foo.pop(0)
            print 'Consuming', num
            count += 1
            if count >= 10:
                break
        except IndexError:
            pass

        time.sleep(random.random() * 0.1)
    print 'Consumer done'

threading.Thread(target=consumer).start()
threading.Thread(target=producer).start()


"""
Output: Notice last 2 lines
...
Producing 3
Consumer: about to consume 3.
Consuming 3
Producing 4
Producing 5
Consumer: about to consume 4.
Consuming 5
...
"""

Now, this is arguably a malicious rewriting of your
code, and it certainly flies in the face of the notion
that multithreaded code should at least be careful
about how it interacts with its neighbors.

Anyway, this is the kind of "critical section" idea
I'm talking about. If I do a series of operations
expecting a shared resource to remain unchanged, I
either need to have uninterruptible code (a critical
section) or the resource needs to be mutexed.
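
For the example above, that would mean something like this (a sketch;
foo_lock is an assumed threading.Lock shared by both threads, and the
producer's foo.insert(0, i) would take the same lock):

foo_lock.acquire()
try:
    # The peek and the pop now act as one operation:
    peeknum = foo[0]          # IndexError still handled by the caller
    num = foo.pop(0)          # guaranteed to still be peeknum
finally:
    foo_lock.release()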

Interestingly, in several sets of tests, I discovered
that print statements aren't entirely atomic; tightly
threaded code can interrupt a print statement after
the text, but before the newline has been printed:

def fn1():
    for i in range(10):
        print "Hello Bob!"
        time.sleep(random.random() * 0.1)

def fn2():
    for i in range(10):
        print "This is Fred!"
        time.sleep(random.random() * 0.1)

...can produce output like this:
Hello Bob!
This is Fred!
Hello Bob!This is Fred!

Hello Bob!
This is Fred!

Anyway. The upshot is, I need to think about Python
threading in terms of mutices [1] instead of critical
sections; groups of resources that need to be accessed
atomically will need a mutex to control them.

For the nonce, I have rewritten my PriorityQueue class
to use a pair of Queue.Queues, and included a mutex
above them so that two threads cannot simultaneously
access the separate queues. I still need to profile
it, but I'm assuming that Queue.Queue will be more
efficient over the long haul than a list. My
assumption is that Queues are fairly smart about
managing memory creep, but the queue's mutex code may
be an unnecessary speed hit. I'll know for sure once
I profile them; it may be that lists are better. (The
reality is, both are probably sufficient.)
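
In outline, the rewrite looks something like this (a sketch of what's
described above; the Get() method and the Queue.Empty handling are my
guesses, while Put() follows the signature given earlier):

import Queue, threading

class PriorityQueue:
    def __init__(self):
        self.mutex = threading.Lock()    # guards the pair of queues
        self.priority_q = Queue.Queue()
        self.normal_q = Queue.Queue()

    def Put(self, obj, Priority=0):
        self.mutex.acquire()
        try:
            if Priority:
                self.priority_q.put(obj)
            else:
                self.normal_q.put(obj)
        finally:
            self.mutex.release()

    def Get(self):
        self.mutex.acquire()
        try:
            try:
                return self.priority_q.get_nowait()
            except Queue.Empty:
                return self.normal_q.get_nowait()  # Queue.Empty if both empty
        finally:
            self.mutex.release()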

Thank you all for your responses.

-dB
[1] Yeah, yeah. I know it's not a word... but neither
is "Kleenices", and I say that, too. :-)

dharland

Oct 24, 2001, 3:15:20 PM

"David Brady" <daves_spam_do...@yahoo.com> wrote in message
news:mailman.1003946245...@python.org...

> > -----Original Message-----
> > From: bru...@tbye.com [mailto:bru...@tbye.com]
> > No, "normal" operations on Python objects
> > are atomic as far as threads are concerned.
> [snip]
> Interestingly, in several sets of tests, I discovered
> that print statements aren't entirely atomic; tightly
> threaded code can interrupt a print statement after
> the text, but before the newline has been printed:

This is because what is really atomic in Python is not "operations" but
individual bytecodes. And print "Boo!", for example, is implemented in at
least three bytecodes (LOAD_CONST, PRINT_ITEM, PRINT_NEWLINE), hence the
behaviour you see with the newline.
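
You can see this for yourself with the dis module (the exact listing
varies by CPython version):

import dis

def f():
    print "Boo!"

dis.dis(f)
# Typical CPython 2.x output:
#   0 LOAD_CONST     1 ('Boo!')
#   3 PRINT_ITEM
#   4 PRINT_NEWLINE
#   5 LOAD_CONST     0 (None)
#   8 RETURN_VALUE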

Skip Montanaro

Oct 24, 2001, 3:37:40 PM

dave> P.S. - Print statements are not atomic and do not fall in the same
dave> realm as "normal" Python operations (e.g. "i = 5") ...

Assignment is not atomic either. Your example would byte compile to

LOAD_CONST 5
STORE_FAST i

As others have mentioned, at the level of individual opcodes Python is
atomic. The statement

i = j

would have to be protected if j could be rebound or the object j referenced
could be modified by another thread.

bru...@tbye.com

Oct 24, 2001, 2:59:37 PM
On Wed, 24 Oct 2001, David Brady wrote:

> > No, "normal" operations on Python objects
> > are atomic as far as threads are concerned.
> [snip]

> > Consider a simple producer/consumer
> > situation:
>
> Yes, but.
>
> Consider this minor rewriting of your code. Three

[ snipped code rewrite that misses the point ;-) ]

Uh, yeah. That's why in my post I pointed out that there *are* times when
you need some sort of locking/signaling mechanism. In fact, I think I
specifically mentioned something along the lines of enforcing "sequential
access to a function" (i.e. - a mutex or critical section).

I was merely responding to somebody's not-entirely-true assertion that
"_every_ access to that [shared] resource _must_ be enclosed in locks".

Thanks,
Dave

P.S. - Print statements are not atomic and do not fall in the same realm
as "normal" Python operations (e.g. "i = 5") because 'print' involves
a potentially blocking access to an external resource (stdout). To avoid
blocking the interpreter releases its global lock before writing to a file
descriptor and reacquires it when done.


Cliff Wells

Oct 24, 2001, 3:54:23 PM
On Tuesday 23 October 2001 18:33, bru...@tbye.com wrote:

> No, "normal" operations on Python objects are atomic as far as threads are

> concerned. There are some very good reasons for using locking/signaling
> (to sequentialize access to a function, to keep a worker thread asleep
> until you use a semaphore to signal it to awake, etc), but it's not always

> a requirement. Consider a simple producer/consumer situation:

True, I may have overstated it a bit; however, I would expect that the
typical multi-threaded application is going to do things that require more
than a single line of Python to manipulate data, e.g. len() prior to append()
or pop() or whatever. If another thread alters the list (or other mutable
object) between the calls to len() and append() (or whatever), then the value
returned from len() may be invalid. Unless you are a very careful
programmer, you are asking for trouble that can be difficult to track down.
In this case, the person asking for help clearly has little experience with
threads, and wrapping all shared objects in locks would be a safer choice for
him.
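
For example (a sketch; shared, MAXLEN, item and lock are illustrative names):

# Unlocked check-then-act: another thread can run between the two lines.
if len(shared) < MAXLEN:       # a second producer may append right here...
    shared.append(item)        # ...so the list can end up over MAXLEN

# Locked version: the test and the append become one atomic step.
lock.acquire()
try:
    if len(shared) < MAXLEN:
        shared.append(item)
finally:
    lock.release()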

bru...@tbye.com

Oct 24, 2001, 3:43:03 PM
On Wed, 24 Oct 2001, Skip Montanaro wrote:

>
> dave> P.S. - Print statements are not atomic and do not fall in the same
> dave> realm as "normal" Python operations (e.g. "i = 5") ...
>
> Assignment is not atomic either. Your example would byte compile to
>
> LOAD_CONST 5
> STORE_FAST i

Ooh, good point, bad example from me. As far as atomicity, I was thinking
in terms of the C functions that implement the handling of the bytecodes
themselves (e.g. PyDict_SetItem - no two threads will call PyDict_SetItem
on the same dictionary at the same time).

> As others have mentioned, at the level of individual opcodes Python is
> atomic. The statement
>
> i = j
>
> would have to be protected if j could be rebound or the object j referenced
> could be modified by another thread.

Not necessarily. My point was simply that locking or no was mostly an
application-level requirement rather than a Python interpreter one. For
example (back to the producer/consumer example), assume that you have two
producers and a consumer that toggles between the producer lists called
'j' and 'k'. If 'i' is the consumer's reference to whatever list it's
currently working on, you don't need any locking around 'i = j' even if
a producer happens to be modifying j at the "same time". Data won't be
lost, the interpreter won't die, etc, etc. That's all I was pointing out.

Thanks,
Dave


bru...@tbye.com

Oct 24, 2001, 4:06:50 PM
On Wed, 24 Oct 2001, Cliff Wells wrote:

> True, I may have overstated it a bit, however, I would expect that the
> typical multi-threaded application is going to do things that require more
> than a single-line of python to manipulate data

You might *expect* that, but it often turns out not to be the case. :) It
all depends on the type of program you're writing, but I've found that
very often (maybe even 50% of the time or more) multithreaded apps get by
just fine without additional work. Even a few really large programs boiled
down to unidirectional work queues. In those cases I often use a semaphore
as a "work ready" signal, but that has nothing to do with maintaing data
integrity and everything to do with performance (avoiding busy waits).
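
That pattern looks roughly like this (a sketch with made-up names; append
and pop(0) are each a single C-level operation under the GIL, and the
semaphore count guarantees pop(0) always finds an item):

import threading

work = []
ready = threading.Semaphore(0)   # counts items available

def put_work(item):
    work.append(item)            # producer side
    ready.release()              # bump the count: one more item

def get_work():
    ready.acquire()              # sleeps until an item exists: no busy wait
    return work.pop(0)           # consumer side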

>, i.e. len() prior to append()
> or pop() or whatever.

Hence the try..pop..except construct. As for checking the length before
appending, it's often the case that there is no firm requirement on max
list size, only that you don't want it to be too big (i.e. it's a fuzzy
requirement). That being true, the worst case for a non-locking
implementation would be a list that has (n-1) too many objects in it, where
n is the number of producers.

For example, if you have a work queue that you don't want to grow to some
extreme length you may decide to cap its size at 1000 elements. With 10
threads adding work to the list, occasionally they'll drop or hold a
packet of work too long or occasionally your list will grow to 1009
elements. No big deal: the drop/hold problem exists regardless of any sort
of locking, and you don't care that your list is 9 elements too big
because your requirement was simply "don't let it grow without bounds".

-Dave


Cliff Wells

Oct 24, 2001, 5:23:37 PM
On Wednesday 24 October 2001 11:59, bru...@tbye.com wrote:

> I was merely responding to somebody's not-entirely-true assertion that
> "_every_ access to that [shared] resource _must_ be enclosed in locks".

My assertion may not have been entirely true, but it was more useful to
someone trying to learn threading than a discussion on the esoterics of
which Python statements may or may not be atomic =)

Chris Tavares

Oct 24, 2001, 6:43:39 PM
<bru...@tbye.com> wrote in message
news:mailman.1003954284...@python.org...
[... snip ...]

> Not necessarily. My point was simply that locking or no was mostly an
> application-level requirement rather than a Python interpreter one. For
> example (back to the producer/consumer example), assume that you have two
> producers and a consumer that toggles between the producer lists called
> 'j' and 'k'. If 'i' is the consumer's reference to whatever list it's
> currently working on, you don't need any locking around 'i = j' even if
> a producer happens to be modifying j at the "same time". Data won't be
> lost, the interpreter won't die, etc, etc. That's all I was pointing out.
>
[ ...snip... ]

Isn't this relying rather heavily on the implementation of a specific
version of the python interpreter, though? Removing the global interpreter
lock (and these semantics that you're leaning on) has been the focus of
some serious effort in the past. Does jython share this same behavior? What
about the version with the "free threading patches" from a couple of years
ago?

You're asking for trouble in the long run if you rely on this, I think.

-Chris

David Bolen

Oct 24, 2001, 6:49:37 PM
<bru...@tbye.com> writes:

> I was merely responding to somebody's not-entirely-true assertion that
> "_every_ access to that [shared] resource _must_ be enclosed in locks".

But that statement is more true than not (as other respondents have
shown, even a simple assignment may be an issue), and certainly it's
much smarter when writing threaded code to behave as if it were always
true, since the potential consequences when it isn't can be very
subtle.

There are some cases (e.g., when performing read-only access to a
value where it isn't critical if the value changes and you just get
one of the two potential values, such as when watching a sentinel that
you'll just pick up on the next loop), but you really do need to
consider all possible users of every shared resource.

> P.S. - Print statements are not atomic and do not fall in the same realm
> as "normal" Python operations (e.g. "i = 5") because 'print' involves
> a potentially blocking access to an external resource (stdout). To avoid
> blocking the interpreter releases its global lock before writing to a file
> descriptor and reaquires it when done.

While the GIL is released around external I/O, even if that weren't
the case, a 'print' could be interrupted simply because it requires
multiple bytecodes, and the regular interruption at every 'n'
bytecodes could occur in the middle of the overall print processing.
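
(For reference, that interval is tunable; this is a real CPython 2.x call,
and 10 was the default in this era:)

import sys
sys.setcheckinterval(10)   # bytecodes between thread-switch checks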

--
-- David
--
/-----------------------------------------------------------------------\
\ David Bolen \ E-mail: db...@fitlinxx.com /
| FitLinxx, Inc. \ Phone: (203) 708-5192 |
/ 860 Canal Street, Stamford, CT 06902 \ Fax: (203) 316-5150 \
\-----------------------------------------------------------------------/

David Bolen

Oct 24, 2001, 6:51:58 PM
<bru...@tbye.com> writes:

> For example, if you have a work queue that you don't want to grow to some
> extreme length you may decide to cap its size at 1000 elements. With 10
> threads adding work to the list, occasionally they'll drop or hold a
> packet of work too long or occasionally your list will grow to 1009
> elements. No big deal: the drop/hold problem exists regardless of any sort
> of locking, and you don't care that your list is 9 elements too big
> because your requirement was simply "don't let it grow without bounds".

Just be careful to fully document that fact in your code and avoid
using this sort of example for newbie questions about threaded code.
Otherwise what tends to happen is that someone believes that the code
in question is really establishing a cap of 1000 (since they won't
necessarily perform the same threaded analysis of the code you have)
and this can lead to maintenance problems down the road.

bru...@tbye.com

Oct 24, 2001, 10:29:51 PM
On Wed, 24 Oct 2001, Chris Tavares wrote:

> Isn't this relying rather heavily on the implementation of a specific
> version of the python interpreter, though?

I'd rephrase that as relying on the well-documented behavior of many
versions over several years of the same implementation. This isn't exactly
some odd quirk in a dark corner of the interpreter...

> Removing the global interpreter lock (and these semantics that you're
> leaning on) has been the focus of some serious effort in the past.

Yes... and after that effort the GIL is still there. If it helps you sleep
better, always do your own locking and assume the GIL doesn't exist. :)
I'm just saying that a lot of times you're solving problems you don't
have, that's all.

> Does jython share this same behavior?

Don't know, don't care.

> What about the version with the "free threading patches" from a couple
> of years ago?

What about it? Instead of building my app based on the behavior of the
standard Python interpreter, I should make sure it works on some patched
version? Naah...

> You're asking for trouble in the long run if you rely on this, I think.

You're overestimating the cost of changing a Python program, I think.

Should the GIL be removed someday down the road, *all* multithreaded apps
would become suspect and overdue for a thorough review.

FWIW, I don't think the GIL will ever go away, or if it does, we'll end up
with something that gives the same behavior. Think about it: no GIL means
that a normal Python program (ie no extension modules in use) could crash
the interpreter. Maybe it's just me, but Guido et al have gone to great
lengths to make it nearly impossible to do that (and when it happens, it's
considered a bug and is fixed ASAP). The GIL is what lets multi-threaded
Python programs still behave like Python programs should.

-Dave


bru...@tbye.com

Oct 24, 2001, 10:45:44 PM
On 24 Oct 2001, David Bolen wrote:

> > For example, if you have a work queue that you don't want to grow to some
> > extreme length you may decide to cap its size at 1000 elements. With 10
> > threads adding work to the list, occasionally they'll drop or hold a
> > packet of work too long or occasionally your list will grow to 1009
> > elements. No big deal: the drop/hold problem exists regardless of any sort
> > of locking, and you don't care that your list is 9 elements too big
> > because your requirement was simply "don't let it grow without bounds".
>
> Just be careful to fully document that fact in your code and avoid
> using this sort of example for newbie questions about threaded code.
> Otherwise what tends to happen is that someone believes that the code
> in question is really establishing a cap of 1000 (since they won't
> necessarily perform the same threaded analysis of the code you have)
> and this can lead to maintenance problems down the road.

No, no, no. That just means the code has really crummy comments. Why in
the world would "1000" show up in the comment? Comments, especially in
Python, should be level-of-intent, so the comment for this case would be
nearly exactly what I wrote in my post, i.e. "make sure the list doesn't
get too big".

-Dave


bru...@tbye.com

Oct 24, 2001, 10:42:01 PM
On 24 Oct 2001, David Bolen wrote:

> > I was merely responding to somebody's not-entirely-true assertion that
> > "_every_ access to that [shared] resource _must_ be enclosed in locks".
>
> But that statement is more true than not (as other respondents have
> shown, even a simple assignment may be an issue)

Maybe, maybe not. Like I said before, it depends entirely on your
application, and is not a hard and fast rule by any means. A ton of the
locking work I have to do in other languages revolves around preventing
the type of corruption that leads to crashes, and this is what the
interpreter gives me for free.

> There are some cases (e.g., when performing read-only access to a
> value where it isn't critical if the value changes and you just get
> one of the two potential values, such as when watching a sentinel that
> you'll just pick up on the next loop), but you really do need to
> consider all possible users of every shared resource.

That's true either way. With a very few multithreaded Python apps under
your belt it gets so that it doesn't take too many brain cycles to
recognize the GIL-does-this-for-free scenarios. Yes, you can always
overengineer a solution if you want, but that's not very Pythonic.

Think of it this way: the way the GIL works can impose a slight
performance hit on your program (a cost). With that cost comes an associated
benefit. You can choose whether or not to enjoy that benefit, but you've
already paid the cost, so you might as well.

-Dave


bru...@tbye.com

Oct 24, 2001, 10:56:34 PM
On Wed, 24 Oct 2001, Cliff Wells wrote:

> > I was merely responding to somebody's not-entirely-true assertion that
> > "_every_ access to that [shared] resource _must_ be enclosed in locks".
>

> My assertion may not have been entirely true, but more useful to someone
> trying to learn threading than a discussion on the esoterics of which Python
> statements may or may not be atomic =)

I do see your point, but I think of it from the opposite point of view: a
newbie learning about threads would be thrilled to get a simple
producer/consumer program working to play around with. Once they see how
cool it is, then you move on to teaching them about locking problems,
mutexes, etc. And hopefully, they played with it enough already to get
bitten by a problem so that they really understand the lesson.

I think it discourages newbies to bombard them with all the locking issues
up front, and it's a real strength of Python (especially for programming
newbies) that you can often "get away" with no explicit thread-safety
measures. Were your first multithreaded programs threadsafe? Mine sure
weren't. But it was okay because at that point it was much more important
to get something working. This is no different than any of the other
newbie-friendly aspects of Python (no type decls, no memory management,
etc.) - people can get up to speed quickly because they don't have to get
overwhelmed with the details all at the start.

-Dave


David Bolen

Oct 25, 2001, 12:22:58 AM
<bru...@tbye.com> writes:

> Should the GIL be removed someday down the road, *all* multithreaded apps
> would become suspect and overdue for a thorough review.

Perhaps multithreaded programs that have made assumptions, but to be
honest, anything I've written is always protecting itself from its use
of any shared resources (whether internal or system) and I wouldn't
expect any problems should the GIL be removed.

> FWIW, I don't think the GIL will ever go away, or if it does, we'll end up
> with something that gives the same behavior. Think about it: no GIL means
> that a normal Python program (ie no extension modules in use) could crash
> the interpreter.

No it doesn't. Obviously removing the GIL requires making the C core
of Python threadsafe, thus it will internally use whatever locks or
synchronization mechanisms are necessary around shared structures or
processes. You don't just stop using the GIL without any other
changes. There just won't be a single global lock around the whole interpreter. Python
scripts shouldn't have any risk at all - other than the same risk they
have today through misuse of shared resources without locking, and yes
those scripts that today make assumptions may run into trouble. But I
just see that as the definition of a well-written threadsafe program.

Who will undoubtedly be affected are extension writers, and the core
would probably have to automatically allocate an extension-global lock
per extension (or one for all extensions) to ensure only one Python
thread can be in the extension(s) at a time, at least for legacy
support.

> the interpreter. Maybe it's just me, but Guido et al have gone to great
> lengths to make it nearly impossible to do that (and when it happens, it's
> considered a bug and is fixed ASAP). The GIL is what lets multi-threaded
> Python programs still behave like Python programs should.

Well, to some extent that's true, but it's hardly the only way to do
it. It's just a poor man's global lock around an entire interpreter.
Safe, but also inefficient. Whether resources will become available
to work on more fine-grained locking for better efficiency is
questionable, but I'd certainly be in favor of any work in that
direction.

David Bolen

Oct 25, 2001, 12:28:15 AM
<bru...@tbye.com> writes:

> That's true either way. With a very few multithreaded Python apps under
> your belt it gets so that it doesn't take too many brain cycles to
> recognize the GIL-does-this-for-free scenarios. Yes, you can always
> overengineer a solution if you want, but that's not very Pythonic.

The GIL doesn't do much for free for me in terms of controlling access
to my shared resources within my application, given that my code can
be interrupted between any two bytecodes and control handed to a separate
thread. Sure, there are some pure producer/consumer scenarios that are
safe, but they're safe in any language, with or without a GIL, at some
level of granularity. (E.g., even in C, a producer/consumer of a
shared integer can be considered to have the same level of safety we're
talking about here.)

If you're able to make assumptions in applications and get away with it, I
certainly won't stop you from writing that way, but with more than a
few multithreaded applications in more than a few languages/systems
under my belt, I've learned it's always better to be safe than sorry
in multithreaded programs.

> Think of it this way: the way the GIL works can impose a slight
> performance hit on your program (a cost). With that cost is an associated
> benefit. You can choose whether or not to enjoy that benefit, but you've
> already paid the cost so you might as well.

But the GIL does nothing for application-specific shared resources.
It only protects the interpreter, so that, as you mentioned in another
response, a normal Python script can't crash the interpreter. It does
nothing to prevent application-specific data structures from becoming
corrupt and affecting the behavior of the application.

At the very least, for anyone just getting into multithreaded
programming, it can be dangerous to think it guarantees anything else.

David Bolen

Oct 25, 2001, 12:32:46 AM
<bru...@tbye.com> writes:

> On 24 Oct 2001, David Bolen wrote:

(...)


> > Just be careful to fully document that fact in your code and avoid
> > using this sort of example for newbie questions about threaded code.
> > Otherwise what tends to happen is that someone believes that the code
> > in question is really establishing a cap of 1000 (since they won't
> > necessarily perform the same threaded analysis of the code you have)
> > and this can lead to maintenance problems down the road.
>
> No, no, no. That just means the code has really crummy comments. Why in
> the world would "1000" show up in the comment? Comments, especially in
> Python, should be level-of-intent, so the comment for this case would be
> nearly exactly what I wrote in my post, i.e. "make sure the list doesn't
> get too big".

Read my comments again. Nowhere do I say that 1000 shows up in the
comment. What I said was to fully document the very "impreciseness"
that you are assuming is ok.

Let's say I'm maintaining your code. I see a comment that says "make
sure the list doesn't get too big". Then I read the code and I see
your comparison to 1000 (or to a constant representing 1000). The
logical conclusion - particularly for one who hasn't been knee deep in
the threaded development - is going to be that it shouldn't exceed
that constant. If you've written the code knowing that there's some
error margin around that constant because you didn't protect against
simultaneous access, if I were reviewing the code I'd sure want a
comment to that affect in that section of code. That's a critical
behavior characteristic of how you are performing the check.

Someone someday later may want to change the constant (that for all
intents and purposes looks like - and probably even behaves like in
most cases - a true maximum), and expects it to really be a maximum.

Or when you write the code you know there are about 10 threads vying
for contention and you figure that overflowing by 10 is no big deal.
But then someone changes the environment and you have hundreds of
threads. And your class got used multiple times for multiple
purposes. All of a sudden you're significantly over memory budget
with no clear reason why.

David Bolen

Oct 25, 2001, 12:36:05 AM
<bru...@tbye.com> writes:

> I do see your point, but I think of it from the opposite point of view: a
> newbie learning about threads would be thrilled to get a simple
> producer/consumer program working to play around with. Once they see how
> cool it is, then you move on to teaching them about locking problems,
> mutexes, etc. And hopefully, they played with it enough already to get
> bitten by a problem so that they really understand the lesson.

If that's what the newbie wants, they should be pointed to an
appropriate class (like the Queue) that specifically provides simple
producer/consumer behavior with all the appropriate locking hidden
within the class. I don't think suggesting that they just write their
own without such locking (which can lead to believing it isn't
necessary) is to their benefit, even if it makes the problem seem
simpler.

To my mind, the benefit Python has for newbies is that the typical
housekeeping required of multithreading applications _and_ the types
of synchronization that is a fact of life for such applications are
very simple to use. And a number of useful classes are provided for
common scenarios within multithreaded applications.
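
E.g., the whole producer/consumer exercise collapses to something like this
(a sketch; using None as a shutdown sentinel is my convention, not Queue's):

import threading, Queue

q = Queue.Queue()

def consumer():
    while 1:
        item = q.get()       # blocks until an item arrives; Queue locks for us
        if item is None:     # sentinel: time to quit
            break
        print 'Consuming', item

threading.Thread(target=consumer).start()
for i in range(10):
    q.put(i)
q.put(None)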

BTW, one of the reasons I feel strongly about this is that a very
large percentage of my applications over the years have been
multithreaded (and I'm a big fan of having the kind of support in the
language that Python has) and there are just some things you don't skimp
on, if you want to do it right, maintainably and with assured behavior
over time.

Donn Cave

Oct 25, 2001, 3:15:57 AM
Quoth David Bolen <db...@fitlinxx.com>:

| If that's what the newbie wants, they should be pointed to an
| appropriate class (like the Queue) that specifically provides simple
| producer/consumer behavior with all the appropriate locking hidden
| within the class. I don't think suggesting that they just write their
| own without such locking (which can lead to believing it isn't
| necessary) is to their benefit, even if it makes the problem seem
| simpler.
|
| To my mind, the benefit Python has for newbies is that the typical
| housekeeping required of multithreading applications _and_ the types
| of synchronization that is a fact of life for such applications are
| very simple to use. And a number of useful classes are provided for
| common scenarios within multithreaded applications.
|
| BTW, one of the reasons I feel strongly about this is that a very
| large percentage of my applications over the years have been
| multithreaded (and I'm a big fan of having the support in the language
| such as Python does) and there are just some things you don't skimp
| on, if you want to do it right, maintainably and with assured behavior
| over time.

An anecdote of sorts on that. The BeOS user interface graphics API
is inherently multi-threaded. Every window in an application runs in
its own thread.

BeOS is pretty much dead, sold to Palm if the stockholders ratify it.
In a kind of post-mortem mood one of the complaints that has come up,
even from ex-Be engineers, is that applications are full of bugs because
it's too hard to deal with all these threads safely. No one seems to
be willing to say that threads aren't at all useful, though. Anecdote
over. The moral of the story seems to be that in C++ (it's a C++ API),
you can't easily work safely with threads, so the solution is to limit
your use of threads to the strictly essential. OK, that's probably
unfair; I imagine the choice of language isn't as important as the
commitment to learning how to program with threads, instead of just
hoping that you can put threads in your program and it will all work
out. But decent language support for that commitment sure helps.

Incidentally, ever seen how a multi-threaded asynchronous API looks
when you use the Stackless Python continuations feature to turn
message queueing inside out? May not apply to all systems, but
normally my thread interactions are mediated by message data, used
to dispatch functions. After posting a message, the thread handler
returns to the dispatcher. With continuations, the thread handler
can later return to the point where it posted the message, and
"receive" the "function return" message, and the code looks just
like calling any ordinary function. Brr, I am normally repelled
by magic like that, but it sure makes some multithreaded problems
look easy.

Donn Cave, do...@drizzle.com

bru...@tbye.com

Oct 25, 2001, 9:30:19 AM
On 25 Oct 2001, David Bolen wrote:

> BTW, one of the reasons I feel strongly about this is that a very
> large percentage of my applications over the years have been
> multithreaded (and I'm a big fan of having the support in the language
> such as Python does) and there are just some things you don't skimp
> on, if you want to do it write, maintainably and with assured behavior
> over time.

BTW, one of the reasons *I* feel strongly about this is that a very large
percentage of *my* applications over the years have been multithreaded
yadda yadda yadda. I'm simply relating what I've experienced:
resolving thread-safety issues has gone from lots of work in other
languages to almost none in Python. People have expressed lots of concern
over problems that could happen theoretically, and I understand that, and
I'm reporting that those theoretical problems happen far less often than
people are implying (this whole silly thread tangent was started by me
saying that someone spoke too strongly about thread-safety issues in
Python).

-Dave


bru...@tbye.com

Oct 25, 2001, 9:39:54 AM
On 25 Oct 2001, David Bolen wrote:

> > FWIW, I don't think the GIL will ever go away, or if it does, we'll end up
> > with something that gives the same behavior. Think about it: no GIL means
> > that a normal Python program (ie no extension modules in use) could crash
> > the interpreter.
>
> No it doesn't. Obviously removing the GIL requires making the C core
> of Python threadsafe, thus it will internally use whatever locks or
> synchronization mechanisms are necessary around shared structures or
> processes. You don't just stop using the GIL without any other
> changes. The interpreter just won't be a single global lock. Python
> scripts shouldn't have any risk at all

That's exactly my point: if the GIL goes away, its replacement will have
the same behavior. For example, today I can safely have two threads append
items to the same list "simultaneously". In your threadsafe Python core
implementation, the same will be true. In neither case do I need to
perform explicit locking on the list, while in C, for example, I'd have
to. This is precisely the type of "free thread safety" that I've been
talking about, and I'm in no way saying that this fulfills all your needs
all the time.

> have today through misuse of shared resources without locking, and
> yes those scripts that today make assumptions may run into trouble.
> But I just see that as the definition of a well-written threadsafe
> program.

So assuming the GIL is replaced by a threadsafe Python core, what will
break in my program with two worker threads appending to the same list?

-Dave


bru...@tbye.com

Oct 25, 2001, 9:54:30 AM
On 25 Oct 2001, David Bolen wrote:

> thread. Sure there are some pure producer/consumer scenarios that are
> safe, but they're safe in any language, with or without a GIL, to some
> level of granularity. (E.g., even in C, a producer consumer of a
> shared integer can be considered the same level of safety we're
> talking about here).

Exactly! All languages have a safe level of granularity (certainly not
limited to simple producer/consumer cases); I'm observing that Python has
an expanded level that you can very safely rely on. In a C class that
handles log messages, I wouldn't bother locking access to the integer flag
that tells the current log level, but I certainly would use a lock to
synchronize appending a new log message to the output queue. In Python,
there's no compelling reason to lock in either case. Are there still
cases where I do need locking in Python? Absolutely!

> If you're able to make assumptions in applications and get away with it, I
> certainly won't stop you from writing that way

Thank goodness! I was worried. :-)

> But the GIL does nothing for application-specific shared resources.
> It only protects the interpreter, so that, as you mentioned in another
> response, a normal Python script can't crash the interpreter. It does
> nothing to prevent application-specific data structures from becoming
> corrupt and affecting the behavior of the application.

Hey, did you cut and paste that from one of my previous posts? It sounds
an awful lot like what I was saying. :-)

Have fun,
Dave


Cliff Wells

Oct 25, 2001, 1:21:05 PM
On Wednesday 24 October 2001 19:56, bru...@tbye.com wrote:
>
> I do see your point, but I think of it from the opposite point of view: a
> newbie learning about threads would be thrilled to get a simple
> producer/consumer program working to play around with. Once they see how
> cool it is, then you move on to teaching them about locking problems,
> mutexes, etc. And hopefully, they played with it enough already to get
> bitten by a problem so that they really understand the lesson.

Yes, I can see your point here (when I took CS classes we were often taught
that way: do it wrong, then learn language features that help do it right).
If this were a classroom, then I would probably concede at this point.
However, without the luxury of being able to hold someone's hand and lead
them step-by-step, I think it's best to just give them the most appropriate
information for solving their immediate problem.

I've found that it's far more frustrating to have what appears to be a
correct program fail for some obscure reason (as your example would if one
started modifying it to any significant degree - and a user trying to
understand the mechanics of the code would undoubtedly do so). Any CS book
that discusses threading (in the general sense) will emphasize the need for
locking shared resources.

Not only that, but while the code you supplied may have worked, it also drives
CPU utilization through the roof (as you mentioned yourself, locks can be
used to avoid busy loops), so a newbie testing such code might be immediately
put off from further investigation into threads simply due to apparent
performance reasons. Another good reason to supply correct code, not just
the simplest code.

> I think it discourages newbies to bombard them with all the locking issues
> up front, and it's a real strength of Python (especially for programming
> newbies) that you can often "get away" with no explicit thread-safety
> measures. Were your first multithreaded programs threadsafe? Mine sure
> weren't. But it was okay because at that point it was much more important

Actually, the reason I started using Python was because I needed to write a
multi-threaded program on Windows and Cygwin gcc didn't support threads at
the time (and I didn't want to use fork). So my very first Python program
was also my very first threaded program _and_ my first network program. It
was, in fact, thread-safe and only took around a week to get working (if you
don't count feature-creep). I don't think locking is such a difficult
concept to grasp (at least no more so than threading itself), and it is
fundamental to threading.

> to get something working. This is no different than any of the other
> newbie-friendly aspects of Python (no type decls, no memory management,
> etc.) - people can get up to speed quickly because they don't have to get
> overwhelmed with the details all at the start.

Yes, one of the appealing things about Python is that it tends to make
concepts that are difficult to master in other languages appear simple.
However, it doesn't (and shouldn't) encourage you to adopt bad programming
practices just because you can (I _could_ use the variable x to hold an int,
a string and a list at different times in the same function, but it would be
bad).

I don't feel locking is a detail so much as a fundamental aspect of
threading. The fact that you can get away with it (in special cases) under
current implementations of Python is not really a good reason to do it. It's
my feeling that making high-level code rely on low-level implementation
details of an interpreter or compiler is always a bad idea (in any language -
look at all the non-portable C programs that exist [and the OS's that have
hacks to support them] due to programmers using implementation-specific
features of a particular platform or compiler).

David Brady

Oct 25, 2001, 2:45:14 PM
> -----Original Message-----
> From: Cliff Wells
> [mailto:logiplex...@earthlink.net]
> Subject: Re: Critical sections and mutexes
>
> If this were a classroom, then I would probably
> concede at this point. However, without the luxury
> of being able to hold someone's hand and lead
> them step-by-step, I think it's best to just give
> them the most appropriate information for solving
> their immediate problem.

As the Newbie In Question, I'd like to chime in here
and say that (a) the help that has already been
provided has been most useful, and (b) watching this
discussion has really helped me work out some of the
finer points of the issue.

> > I think it discourages newbies to bombard them
> > with all the locking issues up front, and it's
> > a real strength of Python (especially for
> > programming newbies) that you can often "get
> > away" with no explicit thread-safety measures.

I both agree and disagree. To a newbie unfamiliar
with threading in the first place, yes, I can see that
bombarding them with the locking issues can be
discouraging. In my case, however, I'm familiar with
threading, and I have found Python's locking mechanism
to be absolutely brilliant. I knew what I wanted to
do; I just needed someone to explain the Python way of
doing it. Once I learned that I couldn't do critical
sections, I redesigned those parts of my code that
wanted them to use shared resources instead, and I was
in business.

My take on this point is that one should always err on
the side of too much information rather than too
little. :-) Both sides of this discussion should
take that as a compliment, btw. It turns out I'm
going to need a more complete locking scheme than the
GIL provides, but now I can say that because I have a
better feel for where the GIL stops and manual locking
has to start.

For example, I can predict a common type of race
condition in my class without any extra locking. My
class looks something like this:

class PriorityQueue:
    def __init__(self):
        self.NormalQueue = Queue.Queue()
        self.PriorityQueue = Queue.Queue()
        self.IsWaiting = 0

Now, the Queue.Queue()'s are themselves threadsafe, so
I shouldn't have any collisions accessing either queue
directly. However, I have a Push() and Pop() method
as well. Push() puts a message on one of the queues
(depending on whether or not the priority flag is
set), and then sets the IsWaiting flag. Pop() checks
the priority queue first, then the normal queue, and
returns one message. If both queues are empty after
the message is popped, it sets the IsWaiting flag to
0.

If the class itself doesn't support locking, it would
be possible for one thread to drain all the queues,
and be just about to clear the IsWaiting flag, then
have another thread push a message and set IsWaiting,
and then return to the first thread who clears
IsWaiting. We are left with a message in the queue
but nobody retrieves it until more messages arrive and
set IsWaiting again.
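
A sketch of the locked version of Pop() (assuming a threading.Lock created
in __init__ as self.lock, not shown above, and that Push() takes the same
lock around its put-and-set-flag pair):

def Pop(self):
    self.lock.acquire()
    try:
        try:
            msg = self.PriorityQueue.get_nowait()
        except Queue.Empty:
            msg = self.NormalQueue.get_nowait()
        # Still under the lock, so no Push() can sneak in between
        # draining the queues and clearing the flag:
        if self.PriorityQueue.empty() and self.NormalQueue.empty():
            self.IsWaiting = 0
        return msg
    finally:
        self.lock.release()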

Again, thank you all for your comments.

-dB

bru...@tbye.com

Oct 25, 2001, 2:57:40 PM
On Thu, 25 Oct 2001, Cliff Wells wrote:

> Any CS book that discusses threading (in the general sense) will
> emphasize the need for locking shared resources.

But only because, in the general sense, you have to do that work yourself.
It's not that locking isn't happening, it's that someone is doing some of
it for you. A CS book that discusses memory management will cover
everything from heap strategies to allocation and deallocation - that does
not imply that you, the developer, will always need to do it though.

> However, it doesn't (and shouldn't) encourage you to adopt bad programming
> practices just because you can

Hold on... the definition of a "bad" programming practice is certainly not
universal. Not deleting memory you allocated is bad in C; that doesn't
make it bad in Python. Yes, there are principles that apply to many
languages, but after lots of multithreaded Python programs I've come to find
that "always use explicit locking" isn't one of them, that's all.

> (I _could_ use the variable x to hold an int, a string and a list at
> different times in the same function, but it would be bad).

This is baggage you're bringing with you from previous languages - we all
do this to one degree or another. Once again, though, what may be an
"always" rule in C is a good rule of thumb but not as strict in Python:

for line in open('ages.txt').readlines():
    name, age = line.strip().split()
    age = int(age)

"Always bad in language X" does not imply "Always bad in Python".

> I don't feel locking is a detail so much as a fundamental aspect of
> threading.

I'm not disagreeing with you, Cliff. At issue is the fact that it is so
fundamental that the interpreter already does some of it for you.

> The fact that you can get away with it (in special cases) under
> current implementations of Python is not really a good reason to do it.

Do you explicitly close every file object you open? I sure don't. And when
the time comes that I do have to close the file myself, I certainly don't
feel that all those other times I'm getting away with something. It's just
how things work and it's very convenient.

> It's my feeling that making high-level code rely on low-level
> implementation details of an interpreter or compiler is always a bad
> idea

I wholeheartedly agree that reliance on low-level implementation details
can be a Bad Thing. I don't, however, see this as all that low-level (in
fact, always requiring explicit locking even on simple resource access
smacks of just the type of annoying task that Python helps you avoid).

The fact that the Python core is inherently thread-safe is a very
high-level concept, and is a fundamental part of threading support in
Python. It is completely in line with other housekeeping/safety features
of Python. That type of basic thread safety will not go away. The GIL
might, but its replacement would still be a threadsafe core, and so you'd
end up with the same thing.

-Dave


David Bolen

Oct 25, 2001, 5:33:20 PM
<bru...@tbye.com> writes:

> So assuming the GIL is replaced by a threadsafe Python core, what will
> break in my program with two worker threads appending to the same list?

If you're sure it's a built-in list and that the append operation is
being serviced by C core code (the built-in part could change today,
and with 2.2 even the built-in might be subclassed) then probably
nothing, assuming that the list was the only world within which you
needed the safety (e.g., there was no correlation between a newly
appended entry in the list and any other data structure that needed to
be kept in sync).

But if either of those conditions might not always be true, then you
could potentially run into problems.

As I noted earlier in the thread, there are certainly specific cases
that can be proposed where there shouldn't be a problem, and I agree
that Python pushes the granularity of such cases higher than a
language such as C.

David Bolen

Oct 25, 2001, 5:34:31 PM
<bru...@tbye.com> writes:

> > If you're able to make assumptions in applications and get away with it, I
> > certainly won't stop you from writing that way
>
> Thank goodness! I was worried. :-)

You jest! :-)

> > But the GIL does nothing for application-specific shared resources.
> > It only protects the interpreter, so that, as you mentioned in another
> > response, a normal Python script can't crash the interpreter. It does
> > nothing to prevent application-specific data structures from becoming
> > corrupt and affecting the behavior of the application.
>
> Hey, did you cut and paste that from one of my previous posts? It sounds
> an awful lot like what I was saying. :-)

Drat, and we were doing so well politely disagreeing... :-)

David Bolen

unread,
Oct 25, 2001, 5:39:11 PM10/25/01
to
<bru...@tbye.com> writes:

> The fact that the Python core is inherently thread-safe is a very
> high-level concept, and is a fundamental part of threading support in
> Python. It is completely in line with other housekeeping/safety features
> of Python. That type of basic thread safety will not go away. The GIL
> might, but its replacement would still be a threadsafe core, and so you'd
> end up with the same thing.

Ah, but Python's dynamism can make it hard to predict when you're
sure that you are executing within the core (say your code is handed a
Python-implemented object that just happens to look like a list), or
especially when you can subclass the built-in types in 2.2. I'd
concede that if done right such other objects or subclasses ought
to maintain the same sort of threadsafe contract as the built-in
objects do, but that's harder to guarantee and I'm not sure I'd want
to risk it when writing code for later maintenance.
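
For instance, here's a hypothetical 2.2-style subclass (the name
CountingList is invented for illustration) that looks like a list but
is no longer atomic where the built-in is:

class CountingList(list):
    # Looks like a list, but append is now two steps of Python code;
    # the interpreter can switch threads between them, so self.count
    # can drift from len(self) under concurrent appends.
    def __init__(self):
        list.__init__(self)
        self.count = 0

    def append(self, item):
        list.append(self, item)
        self.count = self.count + 1  # not atomic with the append above

Code that is handed such an object can't tell it from a built-in list
without inspecting it, so a defensive library would lock anyway.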

Not to mention that later updates or maintenance of the code may
introduce other uses that change which underlying objects are in use as
shared resources, and it would be nice to already have the locking
framework in place to secure such resources.

bru...@tbye.com

unread,
Oct 25, 2001, 10:58:41 PM10/25/01
to
On 25 Oct 2001, David Bolen wrote:

> > The fact that the Python core is inherently thread-safe is a very
> > high-level concept, and is a fundamental part of threading support in
> > Python. It is completely in line with other housekeeping/safety features
> > of Python. That type of basic thread safety will not go away. The GIL
> > might, but its replacement would still be a threadsafe core, and so you'd
> > end up with the same thing.
>
> Ah, but Python's dynamism can make it hard to predict when you're
> sure that you are executing within the core (say your code is handed a
> Python-implemented object that just happens to look like a list), or
> especially when you can subclass the built-in types in 2.2. I'd
> concede that if done right such other objects or subclasses ought
> to maintain the same sort of threadsafe contract as the built-in
> objects do, but that's harder to guarantee and I'm not sure I'd want
> to risk it when writing code for later maintenance.

Good point - every time I've relied on this behavior it has been in a
self-contained program rather than a library that I was making available
to someone else. In the library case I don't know how common the
"simultaneous access of a user-provided object" is, but in that case I'd
definitely agree that strict locking would be high on the list of things
to do to make sure the library is sufficiently robust.

-Dave


Cliff Wells

unread,
Oct 26, 2001, 1:04:33 PM10/26/01
to
On Thursday 25 October 2001 11:57, bru...@tbye.com wrote:

> Hold on... the definition of a "bad" programming practice is certainly not
> universal. Not deleting memory you allocated is bad in C; that doesn't
> make it bad in Python. Yes, there are principles that apply to many
> languages, but after writing lots of multithreaded Python programs I've
> come to find that "always use explicit locking" isn't one of them, that's
> all.

True, but it's almost always safe not to deallocate in Python - it's not
always safe to forgo explicit locks (and we may disagree, but it's at least
arguable that threaded code not requiring locks is atypical).
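
The classic counterexample (names invented for illustration) is a bare
counter: the increment compiles to a load, an add, and a store, and a
thread switch between the load and the store silently loses an update:

import threading

counter = 0
counter_lock = threading.Lock()

def unsafe_increment():
    global counter
    counter = counter + 1       # load/add/store: not atomic

def safe_increment():
    global counter
    counter_lock.acquire()
    try:
        counter = counter + 1   # the read-modify-write is now protected
    finally:
        counter_lock.release()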

> This is baggage you're bringing with you from previous languages - we all
> do this to one degree or another. Once again, though, what may be an
> "always" rule in C is a good rule of thumb but not as strict in Python:
>
> for line in open('ages.txt').readlines():
>     name, age = line.strip().split()
>     age = int(age)
>
> "Always bad in language X" does not imply "Always bad in Python".

Granted. Once again, I may be guilty of overstatement (I do the above on a
regular basis - casting to int() that is, not overstatement. Well, maybe
both =). I was thinking more along the lines of using it for different types
in unrelated contexts where x means one thing at one time and then something
else later on.

> > I don't feel locking is a detail so much as a fundamental aspect of
> > threading.
>
> I'm not disagreeing with you, Cliff. At issue is the fact that it is so
> fundamental that the interpreter already does some of it for you.
>
> > The fact that you can get away with it (in special cases) under
> > current implementations of Python is not really a good reason to do it.
>
> Do you explicitly close every file object you open? I sure don't. And when
> the time comes that I do have to close the file myself, I certainly don't
> feel that all those other times I was getting away with something. It's just
> how things work and it's very convenient.

Actually I do explicitly close every file - I realize it's not really
necessary, but I feel it's a good habit, and there are times (be they few and
far between) when leaving a file open can cause problems (for long-running
tasks where the handle never goes out of scope).
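
The habit might look like this - a sketch using try/finally (the
'ages.txt' file is just the example from earlier in the thread):

f = open('ages.txt')
try:
    for line in f.readlines():
        name, age = line.strip().split()
        # ... use name and int(age) here ...
finally:
    f.close()   # released promptly, even if split() raises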

I do recognize that Python delivers us from many of the programming tasks
that other languages require (which is why we're both here =), but I don't
feel that the thread-safety provided by the interpreter is enough (at least
at the moment) to recommend ever not using locks. Besides, even in the cases
where you can avoid them, I can see no real benefit to doing so. Even if
your data is safe, other issues such as performance or synchronization will
usually mandate their use anyway.
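
For example, even if appends to a bare list are safe, a consumer thread
still needs a threading.Condition (a lock plus wait/notify) just to
sleep efficiently until data arrives - a sketch with invented names:

import threading

items = []
cond = threading.Condition()   # wraps a lock and adds wait/notify

def produce(item):
    cond.acquire()
    try:
        items.append(item)     # the append alone may be GIL-safe...
        cond.notify()          # ...but waking the consumer is not
    finally:
        cond.release()

def consume():
    cond.acquire()
    try:
        while not items:
            cond.wait()        # releases the lock while blocked
        return items.pop(0)
    finally:
        cond.release()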

> > It's my feeling that making high-level code rely on low-level
> > implementation details of an interpreter or compiler is always a bad
> > idea
>
> I wholeheartedly agree that reliance on low-level implementation details
> can be a Bad Thing. I don't, however, see this as all that low-level (in
> fact, always requiring explicit locking even on simple resource access
> smacks of just the type of annoying task that Python helps you avoid).
>

> The fact that the Python core is inherently thread-safe is a very
> high-level concept, and is a fundamental part of threading support in
> Python. It is completely in line with other housekeeping/safety features
> of Python. That type of basic thread safety will not go away. The GIL
> might, but its replacement would still be a threadsafe core, and so you'd
> end up with the same thing.

You are probably correct on this point; I expect the thread-safety will only
improve, but at the moment it's still not enough.

BTW, Dave, I would like to say that I have learned a few things on this topic
that I wasn't aware of - thanks for the interesting discussion.

bru...@tbye.com

unread,
Oct 26, 2001, 2:50:01 PM10/26/01
to
On Fri, 26 Oct 2001, Cliff Wells wrote:

> BTW, Dave, I would like to say that I have learned a few things on this topic
> that I wasn't aware of - thanks for the interesting discussion.

Me too, and thank *you*! I know c.l.py isn't the only one like this, but
there aren't too many other newsgroups where people can take strong
positions and have a detailed discussion without it always turning into a
childish flamewar. Very refreshing and educational for me!

Thanks,
Dave


Terry Reedy

unread,
Oct 26, 2001, 10:35:15 PM10/26/01
to

<bru...@tbye.com> wrote in message
news:mailman.100412384...@python.org...

This lurker (w/r/t this thread) also learned a few things.

Terry

Graham Ashton

unread,
Oct 27, 2001, 8:17:42 AM10/27/01
to
In article <mailman.1003979425...@python.org>, "brueckd"
<bru...@tbye.com> wrote:

> With a very few multithreaded Python apps under your belt it gets so
> that it doesn't take too many brain cycles to recognize the
> GIL-does-this-for-free scenarios.

I take it that others (possibly significantly less experienced than you)
don't ever need to modify your software then? Adhering to basic principles
like using mutexes where people unfamiliar with your code would expect to
find them helps you to write self-documenting code. Not everybody has
your experience of knowing when you can get away without locks in
Python.
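
A sketch of the idea (names invented): the lock itself tells the next
maintainer which state is shared, whether or not the GIL happens to
cover today's operations on it:

import threading

stats = {}
stats_lock = threading.Lock()   # flags stats as shared between threads

def record(key):
    stats_lock.acquire()
    try:
        # read-modify-write: two operations, so not atomic anyway
        stats[key] = stats.get(key, 0) + 1
    finally:
        stats_lock.release()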

> Yes, you can always overengineer a solution if you want, but that's not
> very Pythonic.

Eh? Where is that written down, then? Surely the amount of "engineering"
that goes into your software is defined by its intended purpose? Perhaps
I'd better go and learn C.

> Think of it this way: the way the GIL works can impose a slight
> performance hit on your program (a cost). With that cost is an
> associated benefit. You can choose whether or not to enjoy that benefit,
> but you've already paid the cost so you might as well.

That's fine, so long as you bear in mind that it all depends on the domain
in which your software is intended to run. It's that kind of attitude that
led to the Y2K problem.

--
Graham
