Hello,
We have an application that is writing messages out to RabbitMQ by invoking Channel.basic_publish on a channel created off of a SelectConnection.
We're noticing delays (~2-4 seconds) between calling basic_publish and when the message is actually written to the socket. After some digging, we found that in using a SelectConnection, Pika chooses a SelectPoller for polling. From SelectPoller.poll():
    if (self._fd_events[READ] or self._fd_events[WRITE] or
            self._fd_events[ERROR]):
        read, write, error = select.select(
            self._fd_events[READ],
            self._fd_events[WRITE],
            self._fd_events[ERROR],
            self._get_next_deadline())
Based on the implementation of _get_next_deadline(), it appears to return _MAX_POLL_TIMEOUT (i.e. 5 seconds) when no timeouts are scheduled. That matches what we're seeing, since we invoke basic_publish directly rather than from a timeout callback.
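To illustrate the suspected mechanism, here is a minimal stdlib-only sketch. It is not Pika code: measure_publish_latency, the pending list, and the publisher thread are hypothetical stand-ins, and it assumes the publish is requested from outside the loop (for example from another thread) while the loop is already blocked in select(). Under that assumption the request is only noticed once select() returns, so the latency is bounded by the poll timeout.

```python
import select
import socket
import threading
import time


def measure_publish_latency(poll_timeout, publish_after):
    """Return seconds between a 'publish' request and the loop noticing it.

    Hypothetical helper: models a poll loop that blocks in select() on an
    idle socket while another thread queues outbound data.
    """
    idle_a, idle_b = socket.socketpair()  # nothing is ever sent, so never readable
    pending = []        # stand-in for the connection's outbound buffer
    requested_at = []   # timestamp of the publish request

    def publisher():
        time.sleep(publish_after)
        requested_at.append(time.monotonic())
        pending.append(b"message")  # queued, but the blocked select() is not woken

    worker = threading.Thread(target=publisher)
    worker.start()
    try:
        while not pending:
            # Blocks for up to poll_timeout because no descriptor becomes ready.
            select.select([idle_a], [], [], poll_timeout)
        noticed_at = time.monotonic()
    finally:
        worker.join()
        idle_a.close()
        idle_b.close()
    return noticed_at - requested_at[0]
```

Calling measure_publish_latency(poll_timeout=0.5, publish_after=0.1) should report a latency close to the remaining poll interval, while a much smaller poll_timeout shrinks it accordingly, which is consistent with the effect of lowering _MAX_POLL_TIMEOUT.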
Things we've tried with some success:
Wrapping our call to basic_publish inside of a timeout callback with a low timeout:
connection.add_timeout(0.25, lambda: self.channel.basic_publish(...))
Although this primes the return value of self._get_next_deadline(), the first call to basic_publish still seems to be delayed. I suspect this is because the loop may already be waiting in select() with a timeout of _MAX_POLL_TIMEOUT when the timeout is added. Subsequent publishes sped up as expected. This approach seems more in line with the example code, which also uses add_timeout to publish messages at a regular interval. Our use case is different: we publish in response to irregular events, such as button presses or the consumption of another message.
We've also tried tweaking _MAX_POLL_TIMEOUT:
SelectPoller._MAX_POLL_TIMEOUT = 0.25
This sped up all of our publishes, including the first. However, it doesn't appear to be the "pythonic" way of doing things, since as I understand it the leading underscore marks the variable as "protected". We're also not sure whether lowering the default poll timeout has other consequences. All we've seen is CPU usage jumping when it is set to 0, and nothing notable for values greater than 0.
Is SelectConnection expected to be used with add_timeout rather than by calling basic_publish directly? What is the recommended way of handling this delay in SelectPoller: should we be using add_timeout, overriding _MAX_POLL_TIMEOUT, defining a custom IOLoop/Poller, or perhaps something else entirely?
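For reference, one generic answer to "something else entirely" is the wake-up socket (self-pipe) technique: the loop also selects on one end of a socket pair, and any thread that queues work writes a byte to the other end so select() returns immediately. The sketch below is stdlib-only and purely illustrative. WakeableLoop, call_soon_threadsafe, and run_once are hypothetical names, not part of Pika's API, and whether Pika 0.11.2 exposes an equivalent hook is not something this sketch assumes.

```python
import select
import socket
import threading


class WakeableLoop:
    """Hypothetical poll loop that another thread can wake immediately."""

    def __init__(self, poll_timeout=5.0):
        self._poll_timeout = poll_timeout
        # Writing to _wake_w makes _wake_r readable, which unblocks select().
        self._wake_r, self._wake_w = socket.socketpair()
        self._lock = threading.Lock()
        self._callbacks = []

    def call_soon_threadsafe(self, callback):
        """Queue a callback from any thread and wake the loop."""
        with self._lock:
            self._callbacks.append(callback)
        self._wake_w.send(b"\0")

    def run_once(self):
        """Wait for a wake-up or the timeout, then run queued callbacks.

        Returns the number of callbacks that were run.
        """
        readable, _, _ = select.select(
            [self._wake_r], [], [], self._poll_timeout)
        if readable:
            self._wake_r.recv(4096)  # drain pending wake-up bytes
        with self._lock:
            callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback()
        return len(callbacks)

    def close(self):
        self._wake_r.close()
        self._wake_w.close()
```

With this design the poll timeout can stay at 5 seconds without delaying work queued from other threads, because the wake-up byte ends the select() call early. The publish itself would run inside the queued callback, on the loop's own thread.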
Other info:
OS: Windows 10
Python: 3.5.4
Pika: 0.11.2
Thanks