Python 2.6.7, gevent 1.0b2, haigha v0.4.1, uwsgi server v1.2.5.
nginx routing to uwsgi server via uwsgi protocol.
During handling of a request, the python handler (invoked from a uwsgi worker subprocess) uses the haigha AMQP client library with its GeventTransport to send messages and wait for responses via the RabbitMQ broker. Each transaction entails connecting to the RabbitMQ broker, sending a single request message via Haigha, waiting for a single response message, then tearing down the RabbitMQ connection.
Usually one or two such transactions succeed, but then all subsequent requests time out while waiting for the response message. The handler uses "with gevent.Timeout(seconds=timeoutSec):" to implement the timeout.
The exact same python code works perfectly when executed outside of the uwsgi server, even when many such transactions are executed concurrently using gevent Greenlets. I've done enough testing, without a single failure, to be pretty certain that this haigha-based code works reliably *outside* the uwsgi environment. The problems occur only when executing the exact same code in the uwsgi environment.
When Haigha receives a message, a Haigha callback (running in another Greenlet) deposits the message in a gevent Queue. The handler waits on this gevent Queue. When gevent.Timeout() expires, the traceback looks like this:
  File "/Users/current/lib/python2.6/site-packages/gevent/queue.py", line 189, in get
    result = waiter.get()
  File "/Users/current/lib/python2.6/site-packages/gevent/hub.py", line 616, in get
    return self.hub.switch()
  File "/Users/current/lib/python2.6/site-packages/gevent/hub.py", line 373, in switch
    return greenlet.switch(self)
Timeout: 30 seconds
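For readers who want to see the shape of the wait described above, here is a minimal sketch of the same request/response rendezvous. It is not the poster's code: it swaps in the standard library's threading and queue modules (Python 3 names) for gevent Greenlets and the gevent Queue so that it runs without third-party packages, and the names wait_for_response, fake_broker and ResponseTimeout are hypothetical.

```python
import queue
import threading


class ResponseTimeout(Exception):
    """Raised when no response arrives in time (stands in for gevent.Timeout)."""


def wait_for_response(send_request, timeout_sec):
    """Send one request, then block until its response arrives or time runs out."""
    responses = queue.Queue()
    # The callback runs in another thread here (another Greenlet in the
    # original setup) and does nothing except deposit the message.
    send_request(on_message=responses.put)
    try:
        return responses.get(timeout=timeout_sec)
    except queue.Empty:
        raise ResponseTimeout("%s seconds" % timeout_sec)


def fake_broker(delay_sec):
    """Hypothetical stand-in for the broker round trip: replies after delay_sec."""
    def send_request(on_message):
        timer = threading.Timer(delay_sec, on_message, args=("pong",))
        timer.daemon = True  # do not keep the process alive for a late reply
        timer.start()
    return send_request
```

If the callback never fires (for example because the code that reads the socket is not being scheduled), the waiter sees only the timeout, which would match the symptom in the traceback above.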
I have never used haigha, but the only thing that comes to mind is that
the code may not be taking uWSGI's fork() usage into account. If you create
a Queue in the myapp.webappi module (so in the master), it will not be
usable by the workers. Try adding --lazy; if that works, I suggest you
upgrade to uWSGI 1.3 and use --lazy-apps (it loads apps as in lazy mode but
maintains the non-lazy behaviour for all of the other uWSGI parts).
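The general point about fork() can be illustrated with a small standard-library sketch (Python 3 names, POSIX only). It does not involve uWSGI, gevent or haigha, and the function name is hypothetical; it only shows that an in-process queue created before fork() is copied into the child rather than shared with it.

```python
import os
import queue


def queue_copied_on_fork():
    """Return (child_saw_item, parent_saw_item) for a queue created before fork()."""
    q = queue.Queue()          # created in the parent, like a module-level Queue
    pid = os.fork()
    if pid == 0:
        # Child process: this put() lands in the child's private copy of q.
        q.put("response")
        os._exit(0 if q.qsize() == 1 else 1)
    # Parent process: wait for the child, then inspect the parent's own copy.
    _, status = os.waitpid(pid, 0)
    child_saw_item = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    parent_saw_item = not q.empty()
    return child_saw_item, parent_saw_item
```

Under these assumptions the child sees the item it queued while the parent's queue stays empty, which is why state built in the master before the workers are forked cannot be used to pass messages between them.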
On Saturday, August 18, 2012 12:33:01 AM UTC-7, Roberto De Ioris wrote:
> I have never used haigha, but the only thing that comes to mind is that
> the code may not be taking uWSGI's fork() usage into account. If you create
> a Queue in the myapp.webappi module (so in the master), it will not be
> usable by the workers. Try adding --lazy; if that works, I suggest you
> upgrade to uWSGI 1.3 and use --lazy-apps (it loads apps as in lazy mode but
> maintains the non-lazy behaviour for all of the other uWSGI parts).
Thank you for the quick follow-up. I forgot to mention that we already tried the --lazy option, but still had the same problem. Based on your description on http://projects.unbit.it/uwsgi/wiki/ThingsToKnow, I expected that --lazy would solve this problem, but unfortunately it did not. It's a complete mystery -- the code runs fine outside of uwsgi, but fails very easily in the uwsgi environment. Does the uwsgi server patch any python built-in APIs (socket, etc.)?
Thank you,
Vitaly
Can you write a tiny test script that lets me reproduce the problem?
Have you tried uWSGI 1.3? (There are a bunch of gevent-related
optimizations; maybe it contains some fix I forgot to backport.)
Hi Roberto, I haven't tried uWSGI 1.3. I could write a small program to reproduce this, but it would involve calls into the Haigha library. Is that okay? I will take a fresh look at this when I return from vacation a week from now.
I did more debugging yesterday and finally nailed it down -- it's nothing to do with uWSGI; it was an internal error. Sorry for the trouble and thank you for all the help.