Thread safetyness in Python

John Goerzen

unread,

Jul 3, 2002, 8:52:32 AM7/3/02

to

Hi,

I have some questions about what is thread-safe in Python. Can someone tell
me whether each of the following are thread-safe:

1. a = a + 1
2. a = a + 2
3. list.append(x)
4. list.remove(x)
5. del(list[0])

Here's why I'm asking:

1. In C, a++ would be atomic because CPUs have an atomic increment
operation, in most cases. In Python, it might be the case that if two
threads hit this code at the same time, a would only be incremented once
because both threads would read a, add 1, and store the same result back.
So one might need a lock here.

2. This operation is not possible with just the increment operator.

3. If this is not thread-safe and two threads try to do it simultaneously,
it might be the case that each thread adds a new list[5] instead of one
adding list[5] and the other list[6].

4. If this is not thread-safe, both threads might remove the same item,
rather than one thread removing it and the other raising an exception.

5. If not thread-safe, both threads might remove the same item, rather than
one thread removing the first item and the next, the second.

Thanks,
John

--
John Goerzen <jgoe...@complete.org> GPG: 0x8A1D9A1F www.complete.org

Michael Hudson

unread,

Jul 3, 2002, 10:02:39 AM7/3/02

to

John Goerzen <jgoe...@complete.org> writes:

> Hi,
>
> I have some questions about what is thread-safe in Python. Can someone tell
> me whether each of the following are thread-safe:

I'm not a thread expert, but these are my answers:

> 1. a = a + 1

No. a += 1 would be, though (I'm fairly sure). Assuming a.__add__
isn't written in Python...

> 2. a = a + 2

No. a += 2, again.

> 3. list.append(x)

Yes.

> 4. list.remove(x)

Yes.

> 5. del(list[0])

Yes.

Cheers,
M.

--
Strangely enough I saw just such a beast at the grocery store
last night. Starbucks sells Javachip. (It's ice cream, but that
shouldn't be an obstacle for the Java marketing people.)
-- Jeremy Hylton, 29 Apr 1997

Gerhard Haering

unread,

Jul 3, 2002, 10:12:39 AM7/3/02

to

John Goerzen wrote:
> 1. In C, a++ would be atomic because CPUs have an atomic increment
> operation, in most cases.

What about SMP systems?

Gerhard
--
mail: gerhard <at> bigfoot <dot> de registered Linux user #64239
web: http://www.cs.fhm.edu/~ifw00065/ OpenPGP public key id 86AB43C0
public key fingerprint: DEC1 1D02 5743 1159 CD20 A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))

Oren Tirosh

unread,

Jul 3, 2002, 9:58:51 AM7/3/02

to

On Wed, Jul 03, 2002 at 12:52:32PM +0000, John Goerzen wrote:
> Hi,
>
> I have some questions about what is thread-safe in Python. Can someone tell
> me whether each of the following are thread-safe:
>
> 1. a = a + 1
> 2. a = a + 2
> 3. list.append(x)
> 4. list.remove(x)
> 5. del(list[0])

From www.python.org/doc/current/api/threads.html :

Therefore, the rule exists that only the thread that has acquired the
global interpreter lock may operate on Python objects or call Python/C API
functions. In order to support multi-threaded Python programs, the
interpreter regularly releases and reacquires the lock -- by default, every
ten bytecode instructions (this can be changed with sys.setcheckinterval()).
The lock is also released and reacquired around potentially blocking I/O
operations like reading or writing a file, so that other threads can run
while the thread that requests the I/O is waiting for the I/O operation to
complete.

The resolution is one bytecode operation so a+=1 is atomic but not a=a+1.
The list operations above are atomic.

Oren

Jeff Epler

unread,

Jul 3, 2002, 10:45:00 AM7/3/02

to

On Wed, Jul 03, 2002 at 02:12:39PM +0000, Gerhard Haering wrote:
> John Goerzen wrote:
> > 1. In C, a++ would be atomic because CPUs have an atomic increment
> > operation, in most cases.
>
> What about SMP systems?

Being one instruction is not enough to be atomic in SMP systems. For
instance, quoting from "AMD x86-64 Architecture", Volume 1, page 90:
Lock Prefix. The LOCK prefix causes certain read-modify-write
instructions that access memory to occur atomically. The mechanism
for doing so is implementation-dependent (for example, the mechanism
may involve locking of data-cache lines that contain copies of the
referenced memory operands, and/or bus signaling or packet-messaging
on the bus). The prefix is intended to give the processor exclusive
use of shared memory operands in a multiprocessor system.

The prefix can only be used with forms of the following instructions
that write a memory operand: ADC, ADD, AND, BTC, BTR, BTS, CMPXCHG,
CMPXCH8B, DEC, INC, NEG, NOT, OR, SBB, XADD, XCHG, and XOR. An
invalid-opcode exception occurs if LOCK is used with any other
instruction.
(this is largely the same for plain 'ol x86, but that's from the manual
I have at hand)

This means that if two processors are executing 'INC' at about the same
time, the R-M-W steps could take place like
CPU0 CPU1
Read 0
Read 0
Modify Modify
Write 1
Write 1

Only 'LOCK INC' will ensure that the operation takes place in an atomic
fashion.

Of course, 'LOCK INC' is bad for performance, and a compiler won't
generate it for every increment instruction. Some standard has defined
"atomic_t" and "sigatomic_t" integral data types which have the desired
behavior. I don't know offhand if this is ISO C, POSIX, or something
else, though.

Jeff

Jeff Epler

unread,

Jul 3, 2002, 10:34:14 AM7/3/02

to

On Wed, Jul 03, 2002 at 09:58:51AM -0400, Oren Tirosh wrote:
> The resolution is one bytecode operation so a+=1 is atomic but not a=a+1.
> The list operations above are atomic.

a+=1 is multiple bytecodes.

Jeff

Duncan Grisby

unread,

Jul 3, 2002, 12:02:18 PM7/3/02

to

In article <mailman.102570757...@python.org>,
Jeff Epler <jep...@unpythonic.net> wrote:

>On Wed, Jul 03, 2002 at 02:12:39PM +0000, Gerhard Haering wrote:
>> John Goerzen wrote:
>> > 1. In C, a++ would be atomic because CPUs have an atomic increment
>> > operation, in most cases.
>>
>> What about SMP systems?
>
>Being one instruction is not enough to be atomic in SMP systems. For

Even if the architecture had a reliable atomic increment instruction,
there's no guarantee that the compiler would use it. It's entirely
likely that the optimiser would generate code to do the increment in a
register, and write it back to memory much later.

None of this is relevant to Python, of course...

Cheers,

Duncan.

--
-- Duncan Grisby --
-- dun...@grisby.org --
-- http://www.grisby.org --

Aahz

unread,

Jul 3, 2002, 12:48:08 PM7/3/02

to

In article <slrnai5ssg....@christoph.complete.org>,

John Goerzen <jgoe...@complete.org> wrote:
>
>I have some questions about what is thread-safe in Python. Can someone tell
>me whether each of the following are thread-safe:
>
>1. a = a + 1
>2. a = a + 2

Depends on what "a" is. If it's a function/method local, it's
guaranteed thread-safe (assuming no pathological poking into code
objects).

Generally speaking, worrying about thread-safety of objects is the wrong
tack to take with Python. What you want to do is make sure that only
one thread has access to any given object, usually by passing references
to the object around with Queue.Queue.

>3. If this is not thread-safe and two threads try to do it simultaneously,
>it might be the case that each thread adds a new list[5] instead of one
>adding list[5] and the other list[6].

This is thread-safe. Anything with only one bytecode is thread-safe.
You can use the dis module if you want to check it out. The problem
comes when you use list.append(f())....
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

Tim Peters

unread,

Jul 3, 2002, 11:22:33 AM7/3/02

to

[John Goerzen]

> I have some questions about what is thread-safe in Python. Can
> someone tell me whether each of the following are thread-safe:

Yes <wink>. Assuming you mean thread-safe in the sense of sequential
consistency:

> 1. a = a + 1
> 2. a = a + 2

No and no. Others have suggested Python

a += 1

is atomic, but it isn't.

> 3. list.append(x)
> 4. list.remove(x)
> 5. del(list[0])

Yes, yes, and yes. Note that del isn't a function, and

del list[0]

is more idiomatic. Also,

x = list.pop(0)

is an atomic way to remove the first item and retrieve its value in one
gulp.

> Here's why I'm asking:
>
> 1. In C, a++ would be atomic because CPUs have an atomic increment
> operation, in most cases.

Do a reality check before relying on this. I don't know of any platform
where C compiles a++ to atomic code, even assuming a is of integral type.
On platforms that have some sort of atomic increment instruction, you
typically need to use inline assembly, or call a vendor-supplied builtin
function, to get at it. Else you're going to get code like this (from a
listing of generated code on an Intel box, where 'a' is of the native int
type):

; 6 : a++;

00022 8b 55 fc mov edx, DWORD PTR _a$[ebp]
00025 83 c2 01 add edx, 1
00028 89 55 fc mov DWORD PTR _a$[ebp], edx

> In Python, it might be the case that if two threads hit this code at the
> same time, a would only be incremented once because both threads would
> read a, add 1, and store the same result back. So one might need a lock
> here.

Yes, although the same is equally true of C a++.

Skip Montanaro

unread,

Jul 3, 2002, 11:34:28 AM7/3/02

to

Oren> So a+=1 isn't atomic, but l+=[1] is. Interesting.

How so?

>>> def f(l):
... l += [1]
...
>>> dis.dis(f)
0 LOAD_FAST 0 (l)
3 LOAD_CONST 1 (1)
6 BUILD_LIST 1
9 INPLACE_ADD
10 STORE_FAST 0 (l)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE

Looks to me like there's the opportunity for another thread to sneak in
there and modify l between the LOAD_FAST and STORE_FAST instructions...

--
Skip Montanaro
sk...@pobox.com
consulting: http://manatee.mojam.com/~skip/resume.html

Tim Peters

unread,

Jul 3, 2002, 12:15:02 PM7/3/02

to

[Oren Tirosh]

> So a+=1 isn't atomic, but l+=[1] is. Interesting.

[Skip Montanaro]

> How so?
>
> >>> def f(l):
> ... l += [1]
> ...
> >>> dis.dis(f)
> 0 LOAD_FAST 0 (l)
> 3 LOAD_CONST 1 (1)
> 6 BUILD_LIST 1
> 9 INPLACE_ADD
> 10 STORE_FAST 0 (l)
> 13 LOAD_CONST 0 (None)
> 16 RETURN_VALUE
>
> Looks to me like there's the opportunity for another thread to sneak in
> there and modify l between the LOAD_FAST and STORE_FAST instructions...

The deal is that the STORE_FAST is really a nop in this case:
list.__iadd__(obj) returns obj, and rebinding obj (the STORE_FAST) to the
same value is a semantic nop.

Another way to look at it is that

list += iterable

is equivalent to

list.extend(iterable)

except that under the covers the implementation makes the former act like

list.extend(iterable)
list = list

The bytecode for the "list = list" part has no effect in the end, so doesn't
matter.

Oren Tirosh

unread,

Jul 3, 2002, 11:26:34 AM7/3/02

to

6 LOAD_FAST 0 (a)
9 LOAD_CONST 1 (1)
12 INPLACE_ADD
13 STORE_FAST 0 (a)

You're right. I forgot that the inplace operations also need to store the
new object afterwards since immutable objects such as ints cannot really be
modified "in place".

So a+=1 isn't atomic, but l+=[1] is. Interesting.

Oren

Oren Tirosh

unread,

Jul 3, 2002, 2:18:14 PM7/3/02

to

On Wed, Jul 03, 2002 at 10:34:28AM -0500, Skip Montanaro wrote:
>
> Oren> So a+=1 isn't atomic, but l+=[1] is. Interesting.

>
> How so?
>
> >>> def f(l):
> ... l += [1]
> ...
> >>> dis.dis(f)
> 0 LOAD_FAST 0 (l)
> 3 LOAD_CONST 1 (1)
> 6 BUILD_LIST 1
> 9 INPLACE_ADD
> 10 STORE_FAST 0 (l)
> 13 LOAD_CONST 0 (None)
> 16 RETURN_VALUE
>
> Looks to me like there's the opportunity for another thread to sneak in
> there and modify l between the LOAD_FAST and STORE_FAST instructions...

That is also true of l.append(1) - if another thread modifies the binding
of the name l it doesn't matter so much if the append operation on the
object it previously pointed to was atomic or not.

Assuming the two threads both use l+=[x] or otherwise modify the object but
avoid rebinding the reference they should be ok.

Oren

John Baxter

unread,

Jul 3, 2002, 6:42:38 PM7/3/02

to

In article <mailman.1025709845...@python.org>,
"Tim Peters" <t...@zope.com> wrote:

> Do a reality check before relying on this. I don't know of any platform
> where C compiles a++ to atomic code, even assuming a is of integral type.

And even if one finds such a platform, beware of builds with
optimization turned off. (With optimization turned on, of course, the
a++ may vanish entirely if the result is used not used and a is not used
later, which can be "disconcerting".)

(Although it's not possible to turn all code mashing off in many cases:
it's common to compile as if the machine had an infinite supply of
registers and then deal with the reality in a fixup operation.)

--John--whose boss once wondered why his compiled code was putting 0
into a memory cell, then reading out the cell and checking whether it
was 0...a clear result of
if (a = 0) ...

jep...@unpythonic.net

unread,

Jul 3, 2002, 6:24:09 PM7/3/02

to

On Wed, Jul 03, 2002 at 12:15:02PM -0400, Tim Peters wrote:
> Another way to look at it is that
>
> list += iterable
>
> is equivalent to
>
> list.extend(iterable)
>
> except that under the covers the implementation makes the former act like
>
> list.extend(iterable)
> list = list
>
> The bytecode for the "list = list" part has no effect in the end, so doesn't
> matter.

However, you could notice something awry in the following case:
Thread 1 Thread 2
_temp = l
_temp.extend(iterable)
l = 37
l = _temp

if the += was atomic, then the order would have to be

Thread 1 Thread 2
_temp = l
_temp.extend(iterable)
l = _temp
l = 37
or
Thread 1 Thread 2
l = 37
_temp = l
_temp.extend(iterable)
[exception raised, no 'extend' attribute]

But since the operation is not actually atomic, you can end up with 37 not
actually stored in 'l'.

Jeff