After that, publishing and consuming messages works as expected, but it breaks permanently (for the lifetime of the VM) once the first error happens on that channel (e.g., when calling channel.direct with passive: true for a non-existent exchange, which results in the error '#<AMQ::Protocol::Channel::Close:0x00000008338800 @method_id=10, @reply_code=404, @class_id=40, @reply_text="NOT_FOUND - no exchange 'test' in vhost '/'">').
Channel.on_error is invoked correctly and we can see that Channel.auto_recovering? also returns true at that point. Still, none of the messages that we publish from that point on reach any subscriber -- according to `rabbitmqctl list_queues` they do not even reach the respective queue.
Any ideas how we could isolate what is causing this issue? Is my assumption correct that auto_recovery on channel level is supposed to work correctly in amqp 1.0.1?
2013/4/24 Thilo-Alexander Ginkel <th...@ginkel.com>
> After that publishing and consuming messages works as expected, but breaks forever (for the life-time of the VM) once the first error happens on that channel (e.g., by calling channel.direct w/ passive: true for a non-existing exchange resulting in an error '#<AMQ::Protocol::Channel::Close:0x00000008338800 @method_id=10, @reply_code=404, @class_id=40, @reply_text="NOT_FOUND - no exchange 'test' in vhost '/'">').
Channels that had an exception on them cannot be used again. That's how the protocol works.
> Channel.on_error is invoked correctly and we can see that Channel.auto_recovering? also returns true at that point. Still, none of the messages that we publish from that point on reach any subscriber -- according to `rabbitmqctl list_queues` they do not even reach the respective queue.
> Any ideas how we could isolate what is causing this issue? Is my assumption correct that auto_recovery on channel level is supposed to work correctly in amqp 1.0.1?
Automatic recovery reopens channels when there is a network connection failure, not when there is a channel-level exception. The amqp gem absolutely must not try to be that intelligent: channel error recovery is application-specific.
There is a method, Channel#reuse, that lets you manually "reuse" a channel object (its id will change, but otherwise it is as if it was "reopened").
Would it be viable to call Channel.reuse from within the Channel.on_error callback? I already tried doing so, but message delivery via that channel still seems to be negatively impacted, i.e., published messages vanish into thin air.
2013/4/24 Thilo-Alexander Ginkel <th...@ginkel.com>
> Would it be viable to call Channel.reuse from within the Channel.on_error callback? I already tried doing so, but message delivery via that channel still seems to be negatively impacted, i.e., published messages vanish into thin air.
You can call it from Channel#on_error. "vanish into thin air" is not very descriptive. Consult RabbitMQ log and management UI to see what's going on.
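As a rough illustration of calling Channel#reuse from Channel#on_error, here is a minimal sketch. The handler name `reopen_after_error` is hypothetical, and `FakeChannel` and `FakeClose` are stand-ins (not part of the amqp gem) that exist only so the snippet runs without a broker. It assumes the callback receives the channel and a close frame exposing `reply_code` and `reply_text`, as in the error dump quoted earlier in the thread.

```ruby
# Handler with the arguments the on_error callback is assumed to receive:
# the channel itself and the channel.close frame describing the failure.
def reopen_after_error(channel, channel_close)
  warn "Channel-level exception #{channel_close.reply_code}: #{channel_close.reply_text}"
  channel.reuse # same Ruby object, new channel id
end

# With the real gem this would be registered roughly as:
#   channel.on_error { |ch, channel_close| reopen_after_error(ch, channel_close) }

# Stand-ins below are hypothetical and only mimic the calls used above.
FakeClose = Struct.new(:reply_code, :reply_text)

class FakeChannel
  attr_reader :id

  def initialize
    @id = 1
    @open = true
  end

  def open?
    @open
  end

  def on_error(&block)
    @on_error = block
  end

  # Simulates the broker closing the channel, e.g. after a failed passive declare.
  def fail_with(channel_close)
    @open = false
    @on_error.call(self, channel_close) if @on_error
  end

  # Mimics "reopening" under a new id.
  def reuse
    @id += 1
    @open = true
  end
end

channel = FakeChannel.new
channel.on_error { |ch, channel_close| reopen_after_error(ch, channel_close) }
channel.fail_with(FakeClose.new(404, "NOT_FOUND - no exchange 'test' in vhost '/'"))
```

Whether reopening is the right reaction depends on the application; the sketch only shows where the call would go.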
If that helps, I can try to reproduce the issue in a minimal example.
We have an example that reopens a channel, I believe I see an issue there. Investigating.
2013/4/25 Michael Klishin <michael....@gmail.com>
> amq-client 1.0.1 [1
Forgot the link: http://rubygems.org/gems/amq-client/versions/1.0.1
Excellent. With amqp 1.0.1 and amq-client 1.0.2 things have changed a bit: Channel.reuse now takes effect, i.e., it reopens the channel, but it then gets stuck in an endless loop. It seems to also try to redeclare the exchange that originally caused the channel to go down with the 404 response, which leads to yet another channel failure, and so on:
I don't have a solution in mind about how to tell which entities need to be filtered out.
There is one option for you: ch.exchanges is a hash that maps exchange names to AMQP::Exchange instances. You probably can just delete an exchange that fails declaration from it.
I'll try to come up with a way to deregister entities that fail declaration but it is a really counterintuitive feature, I'm afraid. Adding it may do more harm than good.
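The workaround of deleting a failed exchange from ch.exchanges could look roughly like the sketch below. The helper name `forget_exchange` is hypothetical, and `FakeRegistryChannel` is a stand-in (not part of the amqp gem) that only exposes an `exchanges` hash keyed by exchange name, as described above, so the snippet runs without a broker.

```ruby
# Removes an exchange from the channel's name => exchange hash so that a
# later reopen of the channel does not try to declare it again.
# Returns the removed entry, or nil if the name was not registered.
def forget_exchange(channel, name)
  channel.exchanges.delete(name)
end

# Hypothetical stand-in exposing only the `exchanges` hash.
class FakeRegistryChannel
  attr_reader :exchanges

  def initialize(exchanges)
    @exchanges = exchanges
  end
end

ch = FakeRegistryChannel.new("test" => :missing_exchange, "events" => :events_exchange)
forget_exchange(ch, "test") # "events" stays registered
```

With the real gem, the application would have to know which exchange name failed (for example, the one it just declared with `passive: true`) and call something like this before reopening the channel.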
IMHO that makes sense if an exchange fails declaration due to different parameters. If, however, it fails because it has not been created yet and `passive: true` is supplied, I would not expect any negative consequences (apart from an AMQP::Error being raised), as the documentation suggests that `passive: true` may be used as a means to check whether an exchange actually exists.
2013/4/26 Thilo-Alexander Ginkel <th...@ginkel.com>
> I would not expect any negative consequences (apart from an AMQP::Error being raised) as the documentation suggests that `passive: true` may be used as a means to check whether an exchange actually exists
AMQP::Error hasn't been in use for a few years.
This is how the protocol works: passive declarations that fail close the channel.
Maybe it would be a good idea to drop these lines if they no longer reflect the current situation.