Thread-safe Python Tips


Greg

Jun 29, 2011, 1:20:05 AM
to Google App Engine
Hi -

Could anyone familiar with threads explain the basic principles of
Python thread-safety?

Cheers!
Greg.

Brandon Wirtz

Jun 29, 2011, 3:25:06 AM
to google-a...@googlegroups.com
Thread safe means that two things can be done without the order mattering,
and without either thread depending on the other.

Example:
Find the sum of a column holding the values 1 to 1000

Single Thread:
1+2+3+4....+998+999+1000

4 Threads:
A= 1+2+3+4...248+249+250
B = 251+252+253+254...498+499+500
C = 501+502+503+504...748+749+750
D = 751+752+753+754...998+999+1000

Return A+B+C+D
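
The four-way split above could be sketched roughly like this in Python. The names partial_sum and threaded_sum are made up for illustration, and this assumes a current Python 3 rather than the runtime App Engine shipped in 2011. Each thread writes only to its own slot in the results list, so the order in which the threads finish does not matter.

```python
import threading

def partial_sum(start, stop, results, index):
    # Each thread writes only to its own slot, so no two threads touch the same data.
    results[index] = sum(range(start, stop + 1))

def threaded_sum(n=1000, workers=4):
    chunk = n // workers
    results = [0] * workers
    threads = []
    for i in range(workers):
        start = i * chunk + 1
        # The last worker also takes any remainder when n is not divisible by workers.
        stop = n if i == workers - 1 else (i + 1) * chunk
        t = threading.Thread(target=partial_sum, args=(start, stop, results, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()  # wait for A, B, C and D to finish
    return sum(results)  # "Return A+B+C+D"

total = threaded_sum()
```

Note that on CPython this kind of CPU-bound split illustrates the independence of the pieces more than it buys any speed.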

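Not thread safe (the sum is only correct once the replacement has finished):

Replace all entries with value -1 with 0
Sum the column


Not thread safe (the second step needs the result of the first):

New A = Old A + 3

New B = New A + Old B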

Can be multithreaded for part of the operation:
Fetch 4 URLs, store the returned data, apply a formula, display the result.

You can multithread the fetch and the store, but the formula and the display
have to happen after all of the fetches are done.
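
A rough sketch of that "fetch in parallel, then compute" shape, assuming Python 3's concurrent.futures. fake_fetch and the example.com-style URLs are hypothetical stand-ins so the snippet runs without a network; a real app would call its own fetch function there.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(url):
    # Hypothetical stand-in for a real HTTP fetch; it just returns the URL length.
    return len(url)

urls = [
    "http://a.example/",
    "http://bb.example/",
    "http://ccc.example/",
    "http://dddd.example/",
]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() runs the fetches concurrently and yields results in input order.
    stored = list(pool.map(fake_fetch, urls))

# Only after every fetch has been collected do we apply the formula.
result = sum(stored)
```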


--
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to
google-appengi...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.


Joshua Smith

Jun 29, 2011, 7:43:34 AM
to google-a...@googlegroups.com
I would assume they are the same as the basic principles of thread safety in any language:

- Don't rely on global state, because several of your functions might be running simultaneously

This usually isn't very hard to achieve - just pass parameters instead of modifying globals. The places where it can get tricky are where you really *want* to use global state, such as for an in-memory cache. Usually the language provides some primitives to ensure that only one thread at a time is updating the cache. It appears that Python gives you thread-safety in a lot of cases, and synchronization primitives for the rest:

http://effbot.org/zone/thread-synchronization.htm

jay

Jun 29, 2011, 4:57:47 PM
to Google App Engine
We are not going to have to go into that much detail for our apps
though, are we?

Will we be able to take advantage of multi-threading simply by
creating multiple request handlers? I imagined I would be changing a
few lines of code in my main function and that's about it.

Ikai Lan (Google)

Jun 30, 2011, 1:47:21 AM
to google-a...@googlegroups.com
I don't believe you'll need to do anything when we ship concurrent requests for Python, as long as you're not using global mutable state anywhere. It's unlikely your threads will ever need to talk to each other.

Ikai Lan 
Developer Programs Engineer, Google App Engine

Joshua Smith

Jun 29, 2011, 1:40:31 PM
to google-a...@googlegroups.com
I have this code in one of my apps:

townCache = {}  # module-level cache shared by all requests

def getTown(id):
    # Fetch from the datastore only on a cache miss.
    if id not in townCache:
        townCache[id] = TownModel.get_by_id(id)
    return townCache[id]

Is this thread safe? I think it is, because the worst that happens is the assignment happens redundantly with the same data.
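
To make that worst case concrete, here is a small sketch that forces two threads to miss the cache at the same moment. FakeTownModel is a hypothetical stand-in for the datastore model, and the Barrier exists only to make the overlap happen every time; the names are in snake_case purely for the example.

```python
import threading

fetch_log = []
both_inside = threading.Barrier(2)

class FakeTownModel:
    # Hypothetical stand-in for the datastore model.
    @staticmethod
    def get_by_id(town_id):
        fetch_log.append(town_id)      # record each simulated datastore hit
        both_inside.wait(timeout=5)    # hold until both threads have missed the cache
        return {"id": town_id, "name": "Town %d" % town_id}

town_cache = {}

def get_town(town_id):
    if town_id not in town_cache:
        town_cache[town_id] = FakeTownModel.get_by_id(town_id)
    return town_cache[town_id]

results = []
threads = [threading.Thread(target=lambda: results.append(get_town(7))) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads end up fetching, and the second assignment simply overwrites the first with equal data, which matches the "redundant assignment" reading above.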

Random other question: Why don't I have to say "global townCache" at the top of that function?

Ikai Lan (Google)

Jul 1, 2011, 1:10:22 PM
to google-a...@googlegroups.com
You don't have to use the "global" keyword because this is Python, not PHP. If you wanted to use that variable in another module, you would have to import it like this:

from module_name import townCache

It's possible for two operations to update townCache concurrently, but in your case it looks like it doesn't really matter. If TownModel is somehow updated between reads, it's theoretically possible for you to have an older TownModel in the local cache, but if you're going to store something in the cache with no expiration, it sounds like you don't care about this case anyway.

Two code tips:

- A general Python convention is to use underscore_case for variables and functions, not camelCase. CamelCase is reserved for class names. I'd rename townCache to town_cache and getTown to get_town.

- You can probably get away with calling TownModel just Town.

Ikai Lan 
Developer Programs Engineer, Google App Engine


Geoffrey Spear

Jul 1, 2011, 6:12:16 PM
to Google App Engine


On Jun 29, 1:40 pm, Joshua Smith <JoshuaESm...@charter.net> wrote:
> I have this code in one of my apps:
>
> townCache = {}
> def getTown(id):
>  if not id in townCache:
>    townCache[id] = TownModel.get_by_id(id)
>  return townCache[id]
>
> Is this thread safe?  I think it is, because the worst that happens is the assignment happens redundantly with the same data.
>
> Random other question: Why don't I have to say "global townCache" at the top of that function?

You can't *assign* to a global variable in another scope without the
global keyword; however, townCache is the global name here, not
townCache[id].
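
A short sketch of that distinction; remember, count_hit and hits are names made up for the example. Item assignment mutates the dict that the global name already points at, while "hits += 1" rebinds the name itself and so needs the declaration.

```python
town_cache = {}
hits = 0

def remember(town_id, town):
    # Mutates the object the global name refers to: no "global" needed.
    town_cache[town_id] = town

def count_hit():
    global hits  # rebinding the name itself requires the declaration
    hits += 1

remember(1, "Springfield")
count_hit()
```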

Joshua Smith

Jul 4, 2011, 6:47:42 PM
to google-a...@googlegroups.com
Thanks for that clarification. I'm sure there's a reason for this asymmetry (must declare globals to write them, but not to read them), but it's really weird.

Nick Johnson (Google)

Jul 5, 2011, 11:19:19 PM
to google-a...@googlegroups.com
On Tue, Jul 5, 2011 at 8:47 AM, Joshua Smith <Joshua...@charter.net> wrote:
> Thanks for that clarification. I'm sure there's a reason for this asymmetry (must declare globals to write them, but not to read them), but it's really weird.

Since Python doesn't require variable declaration at all, there's no way for it to know if "foo = 3" is intended to create a local or a global without an explicit declaration. If it assumed a local unless a global already existed, creating a global in your module could unexpectedly change the behavior of a function that happened to use the same name for a local variable.
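
A small sketch of the rule in action (the function names are hypothetical). Any assignment inside a function makes that name local for the whole function unless it is declared global, which is also why reading the name before the assignment fails.

```python
foo = 1  # module-level (global) name

def assign_without_global():
    foo = 3  # no declaration, so this binds a new local named foo
    return foo

def read_then_assign():
    try:
        value = foo      # foo is local here because of the assignment below
        foo = value + 1
        return foo
    except UnboundLocalError:
        return "UnboundLocalError"

local_value = assign_without_global()
outcome = read_then_assign()
```

After both calls the module-level foo is still 1.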

-Nick Johnson
 





--
Nick Johnson, Developer Programs Engineer, App Engine


Tobias

Jul 6, 2011, 3:53:02 AM
to google-a...@googlegroups.com
Hi,

Class variables can introduce pitfalls within a threaded environment. See http://stackoverflow.com/questions/1072821/is-modifying-a-class-variable-in-python-threadsafe
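
A minimal sketch of the usual remedy, assuming a hypothetical Counter class. A bare "cls.total += 1" is a read, an add and a write, so two threads can interleave and lose updates; taking a lock around it makes the whole step exclusive.

```python
import threading

class Counter:
    total = 0                   # class variable, shared by every thread
    _lock = threading.Lock()

    @classmethod
    def increment(cls):
        # Without the lock, the read-add-write below could interleave across threads.
        with cls._lock:
            cls.total += 1

def worker(n):
    for _ in range(n):
        Counter.increment()

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```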

Regards,
Tobias

Pol

Jul 6, 2011, 1:00:22 AM
to Google App Engine
On Jul 1, 10:10 am, "Ikai Lan (Google)" <ika...@google.com> wrote:
> It's possible for two operations to update townCache concurrently, but in
> your case it looks like it doesn't really matter. If TownModel is somehow
> updated between reads, it's theoretically possible for you to have an older
> TownModel in the local cache, but if you're going to store something in the
> cache with no expiration, it sounds like you don't care about this case
> anyway.

Are collections thread-safe in Python? Otherwise, "townCache[id]
= ..." called at the same time on multiple threads would likely
corrupt something.
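
For what it's worth, here is a small experiment along those lines, assuming CPython, where a single dict item assignment with built-in keys is generally treated as atomic. The names shared and writer are made up. It shows concurrent writers leaving the dict intact, which is evidence rather than proof, and it says nothing about multi-step operations such as check-then-set.

```python
import threading

shared = {}

def writer(offset, count):
    for i in range(count):
        shared[offset + i] = offset  # one item assignment per step

threads = [threading.Thread(target=writer, args=(k * 1000, 1000)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```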

P. Petrov

Jul 7, 2011, 4:02:00 PM
to google-a...@googlegroups.com