I spent a few days stuck on a similar problem. It helps to understand
what EventMachine is doing under the covers.

EventMachine has a single thread of execution, and it simply runs a
loop. On each pass through that loop it checks whether it should send
any data out, whether it should process any incoming data, and whether
any periodic timers are due (there may be another step or two; it has
been a while since I looked).
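
To make the loop concrete, here is a toy model in plain Ruby. It is
only a sketch of the three checks described above, not EventMachine's
actual implementation; the class name ToyReactor, its methods, and the
tick-based timers are all made up for illustration.

```ruby
# Toy model of a single-threaded reactor loop. This is NOT how
# EventMachine is really implemented; every name here is hypothetical.
class ToyReactor
  attr_reader :log

  def initialize
    @outgoing = []   # data waiting to be written to the socket
    @incoming = []   # data already sitting in the socket buffer
    @timers   = []   # [interval_in_ticks, callback] pairs
    @tick     = 0
    @log      = []
  end

  def queue_outgoing(data)
    @outgoing << data
  end

  def receive(data)
    @incoming << data
  end

  def add_periodic_timer(every_ticks, &block)
    @timers << [every_ticks, block]
  end

  # One pass through the loop: write, then read, then timers.
  # Nothing else runs until the current step has finished.
  def run_once
    @tick += 1
    @log << "send #{@outgoing.shift}" until @outgoing.empty?
    @log << "recv #{@incoming.shift}" until @incoming.empty?
    @timers.each { |every, callback| callback.call if (@tick % every).zero? }
  end
end
```

The point of the model is that the read step drains everything that is
waiting before the loop moves on, so a large incoming backlog delays
both the timers and the next round of writes.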

The problem is that RabbitMQ will basically try to keep your socket
full of data at all times, so EventMachine gets stalled in the loop
pulling data off the incoming socket. It won't do this forever, but
for us RabbitMQ was sending about 500 messages at a time, so
EventMachine would process all 500 incoming messages before it had a
chance to continue the loop. By then RabbitMQ had already filled the
socket back up, so the next time through the loop it would get stalled
for just as long.

Depending on how long it takes to process each of your messages, this
may not be a problem, but if your EventMachine loop is getting stalled
for more than a couple of seconds it probably will be. In Adam's case
I would expect that by the time it finished processing the incoming
socket, it was time to run the periodic timer that told RabbitMQ to do
the Recover, and only THEN did it actually get a chance to start the
loop over again and probably send all the ACKs out before getting
stalled on the incoming socket again.

Setting the prefetch to 1 solves this problem because RabbitMQ will
only keep a single message in your incoming socket, so your loop
doesn't get stalled and it can send the outgoing ACKs quickly.

The trade-off is that you may spend time waiting for RabbitMQ to send
data, because your incoming socket is not always full. You can
probably tune the prefetch value to something more optimal: low enough
that it won't stall the loop, but high enough to keep a healthy number
of messages queued up in the socket.
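
A rough way to see the trade-off is to simulate it. The sketch below
is a toy model in plain Ruby, not the AMQP client or RabbitMQ itself.
The function name and its parameters are hypothetical, and it assumes
that the broker refills the socket up to the prefetch limit once per
loop pass and that acks go out only after the whole batch has been
processed.

```ruby
# Toy simulation of prefetch vs. ack delay. Not a real AMQP client;
# all names are hypothetical. Times are in integer milliseconds.
#
# Returns the longest time any ack sat waiting to be flushed, and the
# number of loop passes (each pass implies one wait on the broker).
def simulate_consumer(prefetch:, total:, ms_per_message:)
  remaining   = total   # messages the broker has not delivered yet
  clock       = 0
  max_ack_lag = 0
  passes      = 0

  while remaining > 0
    passes += 1
    batch = [prefetch, remaining].min   # socket is filled up to prefetch
    remaining -= batch

    # The loop drains the whole batch before it can write anything.
    first_done = clock + ms_per_message
    clock += batch * ms_per_message

    # Acks are flushed only now; the first message's ack waited longest.
    max_ack_lag = [max_ack_lag, clock - first_done].max
  end

  { max_ack_lag: max_ack_lag, passes: passes }
end

[1, 50, 500].each do |prefetch|
  result = simulate_consumer(prefetch: prefetch, total: 1000, ms_per_message: 10)
  puts "prefetch=#{prefetch} max_ack_lag=#{result[:max_ack_lag]}ms passes=#{result[:passes]}"
end
```

Under these assumptions, 1000 messages at 10 ms each give a worst-case
ack delay of 4990 ms over 2 passes with a prefetch of 500, 490 ms over
20 passes with a prefetch of 50, and no ack delay but 1000 passes with
a prefetch of 1. Real numbers will depend on your network latency and
on how long your handlers take, so treat this only as a way to reason
about which direction to tune.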

On Nov 6, 7:15 am, Daniel DeLeo <d...@kallistec.com> wrote:
> Hi Adam,
> I'm not exactly an EventMachine expert, but this is what's happening as best
> I can understand and explain:
>
> Without the prefetch option, RabbitMQ is sending messages to subscribers as
> quickly as the subscribers can fetch them from the buffer. Under lighter
> loads, this is fine, because you'll clear all the messages from the queue
> before any trouble strikes. When the load increases, however, you reach a
> state where EventMachine is constantly reading messages from the buffer, and
> the event loop never gets the opportunity to fire the callbacks that send
> acks back to RabbitMQ.
>
> So, to answer your questions: the messages were re-queued because they were
> never acked from the broker's point of view, prefetch solves this because it
> prevents the situation I described above from occurring, thus your acks get
> sent and RabbitMQ knows not to re-queue the messages. And yes, this does
> solve the problem.
>
> HTH, and anyone feel free to jump in and correct me on any details I
> might've butchered.
>
> Dan DeLeo
>