Re: [gevent] Gevent (greenlet/threaded) synchronization question


Markus Thurlin

May 14, 2013, 6:16:35 PM
to gev...@googlegroups.com
Even though only one greenlet is running at the same time, you can still have synchronization issues.

A simple example:

if key in myDict:
    gevent.sleep(1)
    del myDict[key]

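Any call that causes a greenlet switch may hand control to another greenlet whose code modifies your state. Here, another greenlet could delete the key during the sleep, so the del raises a KeyError even though the check passed.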

In practice, I find this is relatively uncommon, but I guess that depends on your application.
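
To make the example above concrete, here is a minimal sketch (not from the original post) of two greenlets running the same check-then-delete code, first without and then with a lock. The names run_two_removers, check_then_delete and remover are made up for illustration, and the 0.01 second sleep is only an assumption standing in for any call that yields.

```python
import gevent
from gevent.lock import BoundedSemaphore


def run_two_removers(use_lock):
    """Spawn two greenlets that delete the same key; return the KeyError count."""
    my_dict = {"session": "abc"}
    lock = BoundedSemaphore(1)
    key_errors = []

    def check_then_delete(key):
        if key in my_dict:
            gevent.sleep(0.01)  # switch point: the other greenlet runs its check here
            try:
                del my_dict[key]
            except KeyError:
                key_errors.append(key)  # the key vanished between check and delete

    def remover(key):
        if use_lock:
            with lock:  # the second greenlet waits here until the first has finished
                check_then_delete(key)
        else:
            check_then_delete(key)

    gevent.joinall([gevent.spawn(remover, "session") for _ in range(2)])
    return len(key_errors)


print("KeyErrors without lock:", run_two_removers(use_lock=False))
print("KeyErrors with lock:", run_two_removers(use_lock=True))
```

Without the lock, both greenlets pass the check before either deletes, so the second delete fails. With the lock, the check and the delete behave as one unit, at the cost of making the second greenlet wait through the first one's sleep.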




On Tue, May 14, 2013 at 8:02 PM, Ralph Caraveo <deck...@gmail.com> wrote:
Hello gevent group,

I've been reading up on gevent for a while now and have started using it on side hobby projects. I have a question about synchronization, which I know can be a difficult topic.

My question is this: since gevent uses "greenlets" instead of true threads, I understand that only one green thread can ever be running at a time, and apparently you don't have to worry about locking/synchronizing shared state.

It even mentions this on the gevent documentation site in the following section:


Synchronizing access to the objects shared across the greenlets is unnecessary in most cases, thus Lock and Semaphore classes although present aren’t used as often. Other abstractions from threading and multiprocessing remain useful in the cooperative world:

What concerns me however is the text:  "in most cases".

It sounds to me like, for the majority of cases, I can happily spin up any number of green threads, access global shared state in my Python application, and simply not worry about synchronizing or locking that state. But I can't be sure that I don't have to worry about data corruption, because gevent works "in most cases" yet still provides tools in its library to do locking.

So my question really is: when do I have to concern myself with locking/synchronizing mutable structures from multiple green threads?

To give some background information of what I'm doing:

1. I have a very small Flask application that is served through gevent's pywsgi server
2. I have a handful of REST methods that will modify the state of my Flask app
3. A few of these REST methods will read/write simple Python data structures such as standard Python lists or dictionaries.
4. Multiple users will be hitting this site (served in a single process), calling the REST methods that will be mutating these lists/dictionaries
5. As an example: a user may log in, so I simply add that user's cookie-id to a dictionary as a key/value pair
6. Obviously, another user may be trying to log in at the same time, but again only one green thread should ever be running at a time, so I'm safe, right?  (I mean, I think I am...but how can I know for sure?)

Please keep in mind, my web app is just a small-time hobby project.  It's meant to be used by a tiny group of, say, 20 people.  I'm serving the site from a single process, I'm not using a database (on purpose)...and I'm only using gevent for the sake of learning, because it's probably not even needed in my use case.

One more thing, I understand that perhaps modifying global state isn't always a good thing in any kind of application and that perhaps you should keep your global state to a minimum and favor passing mutable data around.  In other words there are probably better ways of doing this.  But still, I just want to make sure that in the scenarios like above that I can know I am still safe.

Anyhow, much help is appreciated if anybody would like to take a stab.

Thanks so much for your time,

-Ralph



Matt Billenstein

May 16, 2013, 5:44:10 AM
to gev...@googlegroups.com
On Wed, May 15, 2013 at 12:16:35AM +0200, Markus Thurlin wrote:
> Even though only one greenlet is running at the same time, you can still
> have synchronization issues.
> A simple example:
>
> if key in myDict:
>     gevent.sleep(1)
>     del myDict[key]

Or, more typically I think:

for k, v in mydict.iteritems():
    r = requests.get(...) # <--- switch
    ...

I've typically just taken a copy of the dict here to get around this rather
than reaching for a lock, but it depends on the use case.
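
As a rough sketch of the loop above and the copy workaround (not code from the original post): gevent.sleep stands in for requests.get(...) so the example needs no network, items() is used in place of Python 2's iteritems(), and the names iterate_while_writing, reader and writer are invented for illustration.

```python
import gevent


def iterate_while_writing(snapshot):
    """Iterate a shared dict while another greenlet inserts into it.

    Returns (keys_seen, name_of_error_or_None).
    """
    shared = {"a": 1, "b": 2, "c": 3}

    def writer():
        shared["d"] = 4  # runs while the reader is parked in its sleep

    def reader():
        # Either iterate the live dict view, or a copy taken up front.
        items = list(shared.items()) if snapshot else shared.items()
        seen = []
        try:
            for key, value in items:
                gevent.sleep(0.01)  # stand-in for a blocking call that switches
                seen.append(key)
        except RuntimeError as exc:
            # "dictionary changed size during iteration"
            return seen, type(exc).__name__
        return seen, None

    reader_job = gevent.spawn(reader)
    writer_job = gevent.spawn(writer)
    gevent.joinall([reader_job, writer_job])
    return reader_job.value


print(iterate_while_writing(snapshot=False))
print(iterate_while_writing(snapshot=True))
```

The copy keeps the loop from failing, but the loop then works on a snapshot and will not see keys added while it runs, which is the "depends on the use case" part.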

m

--
Matt Billenstein
ma...@vazor.com
http://www.vazor.com/

Ralph Caraveo

May 16, 2013, 1:15:17 PM
to gev...@googlegroups.com
Thanks guys, these are good pitfalls to avoid and I can see how they can potentially be a source of problems.  

One additional observation I take from this is that if you've got some really abstracted code where a switch potentially occurs deep in your code, it almost seems like these issues can't be fully avoided, because what looks like a harmless loop may cause problems.  I guess this goes back to minimizing the sharing of state as much as possible, or taking a copy of the sequence before looping, as Matt suggested.

In both of your examples, it's pretty obvious that a switch will occur on the sleep() method and also on the .get() from the requests library...but it may not be so obvious in other cases.
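
One possible way to find the less obvious switches, sketched here as an assumption rather than anything suggested in this thread, is the trace hook in the greenlet package that gevent is built on. The helper count_switches and the two sample functions are hypothetical names; only greenlet.settrace and gevent.sleep are real library calls.

```python
import contextlib

import gevent
import greenlet


@contextlib.contextmanager
def count_switches():
    """Collect the greenlet switches that happen inside the with-block."""
    switches = []

    def tracer(event, args):
        # greenlet reports "switch" and "throw" events with (origin, target)
        if event in ("switch", "throw"):
            origin, target = args
            switches.append((origin, target))

    previous = greenlet.settrace(tracer)
    try:
        yield switches
    finally:
        greenlet.settrace(previous)  # restore whatever tracer was set before


def looks_harmless():
    gevent.sleep(0)  # a yield hidden inside otherwise innocent-looking code


def truly_local(sessions):
    sessions["user"] = "cookie-id"  # plain dict write, no yield to the hub


with count_switches() as hidden_switches:
    looks_harmless()

with count_switches() as no_switches:
    truly_local({})

print("switches in looks_harmless:", len(hidden_switches))
print("switches in truly_local:", len(no_switches))
```

A block that records no switches ran without any other greenlet getting a turn, which is the situation in the login example: a single dictionary write with no blocking call around it. This only checks the code paths actually exercised, so it is a debugging aid rather than a guarantee.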

As I continue to think about these kinds of issues, any other examples or pitfalls to avoid are appreciated.  I almost feel like these are worth documenting or writing about.

Thanks again!



