MassTransit: Reconnecting to Rabbit in TopShelf Windows Service After Rabbit Restarts


jon333

Feb 6, 2014, 1:09:21 PM
to masstrans...@googlegroups.com
We are seeing an issue when RabbitMQ restarts while it has long-running consumer connections from TopShelf-based Windows Services. When RabbitMQ is restarted while a consumer in a Windows Service is connected to a queue, we get the error below. How can we tell the app to reconnect to RabbitMQ when this happens, without restarting the Windows Service?


    catch (EndOfStreamException ex)
    {
        throw new InvalidConnectionException(_address.Uri, "Connection was closed", ex);
    }
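
For a service that manages its own connection, one general way to recover from a dropped connection is to dispose of the old connection and create a new one after a delay. The sketch below is illustrative only: `ReconnectLoop`, `connect`, and `receive` are hypothetical names that belong to neither MassTransit nor TopShelf, and this is not how MassTransit itself handles reconnection.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper; not part of MassTransit or TopShelf.
public static class ReconnectLoop
{
    // Opens a connection with `connect` and hands it to `receive`.
    // If `receive` (or `connect`) throws EndOfStreamException, the old
    // connection is disposed, the loop waits for `delay`, and a fresh
    // connection is created. After `maxAttempts` failed attempts the
    // exception is rethrown to the caller.
    public static void Run<T>(Func<T> connect, Action<T> receive,
                              int maxAttempts, TimeSpan delay)
        where T : IDisposable
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (T connection = connect())
                {
                    receive(connection);
                    return; // receive finished without a connection error
                }
            }
            catch (EndOfStreamException)
            {
                if (attempt >= maxAttempts)
                    throw;
                Thread.Sleep(delay);
            }
        }
    }
}
```

Note that `maxAttempts` bounds the total number of attempts for one call to `Run`. A real service would probably also need to treat the client library's own connect failures as retryable.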

jon333

Feb 6, 2014, 4:45:31 PM
to masstrans...@googlegroups.com

This is the stack, but what we're seeing is that MassTransit does NOT reconnect to RabbitMQ after RabbitMQ is restarted and comes back online.  Our consumer is in a TopShelf Windows Service and our bus is configured to be a single instance through Autofac.  It would be great to know what we could do to reconnect after this exception.

ERROR MassTransit.Context.ServiceBusReceiveContext Consumer Exception Exposed
MassTransit.Transports.InvalidConnectionException: rabbitmq://[server]:[port]/QueueName_control?ha=true => Connection was closed ---> System.IO.EndOfStreamException: SharedQueue closed
   at RabbitMQ.Util.SharedQueue.Dequeue(Int32 millisecondsTimeout, Object& result)
   at MassTransit.Transports.RabbitMq.RabbitMqConsumer.Get(TimeSpan timeout) in c:\Projects\MassTransit\MassTransit\src\Transports\MassTransit.Transports.RabbitMq\RabbitMqConsumer.cs:line 127
   at MassTransit.Transports.RabbitMq.InboundRabbitMqTransport.<>c__DisplayClass1.<Receive>b__0(RabbitMqConnection connection) in c:\Projects\MassTransit\MassTransit\src\Transports\MassTransit.Transports.RabbitMq\InboundRabbitMqTransport.cs:line 68
   --- End of inner exception stack trace ---
   at MassTransit.Transports.RabbitMq.InboundRabbitMqTransport.<>c__DisplayClass1.<Receive>b__0(RabbitMqConnection connection) in c:\Projects\MassTransit\MassTransit\src\Transports\MassTransit.Transports.RabbitMq\InboundRabbitMqTransport.cs:line 121
   at MassTransit.Transports.DefaultConnectionPolicy.Execute(Action callback) in d:\BuildAgent-03\work\aa063b4295dfc097\src\MassTransit\Transports\DefaultConnectionPolicy.cs:line 64
   at MassTransit.Transports.ConnectionPolicyChainImpl.Next(Action callback) in d:\BuildAgent-03\work\aa063b4295dfc097\src\MassTransit\Transports\ConnectionPolicyChainImpl.cs:line 49
   at MassTransit.Transports.ConnectionHandlerImpl`1.Use(Action`1 callback) in d:\BuildAgent-03\work\aa063b4295dfc097\src\MassTransit\Transports\ConnectionHandlerImpl.cs:line 86
   at MassTransit.Transports.RabbitMq.InboundRabbitMqTransport.Receive(Func`2 lookupSinkChain, TimeSpan timeout) in c:\Projects\MassTransit\MassTransit\src\Transports\MassTransit.Transports.RabbitMq\InboundRabbitMqTransport.cs:line 63
   at MassTransit.Transports.Endpoint.Receive(Func`2 receiver, TimeSpan timeout) in d:\BuildAgent-03\work\aa063b4295dfc097\src\MassTransit\Transports\Endpoint.cs:line 360
   at MassTransit.Context.ServiceBusReceiveContext.ReceiveFromEndpoint() in d:\BuildAgent-03\work\aa063b4295dfc097\src\MassTransit\Context\ServiceBusReceiveContext.cs:line 91

Chris Patterson

Feb 6, 2014, 5:27:16 PM
to masstrans...@googlegroups.com
What version of MT are you using?

We have had the latest version in production for weeks, and we can restart RabbitMQ with no reconnection issues; in fact, it reconnects immediately once the server is available.

As a test, you can download the MassTransit stress tool (https://github.com/phatboyg/MassTransit-Stress), run it against your RabbitMQ server, and restart the RabbitMQ server in the middle of the test run. Not only should it reconnect, but there should be no loss of messages. I've done this many times to verify our RabbitMQ cluster in production under load.



--
You received this message because you are subscribed to the Google Groups "masstransit-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to masstransit-dis...@googlegroups.com.
To post to this group, send email to masstrans...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/masstransit-discuss/b84e9383-90cd-4139-abb3-417a947a9c1b%40googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.

jon333

Feb 6, 2014, 5:34:20 PM
to masstrans...@googlegroups.com
Thanks for replying Chris.  We are using MassTransit 2.8.0.

jon333

Feb 13, 2014, 11:56:20 AM
to masstrans...@googlegroups.com
We used the MassTransit stress tool (https://github.com/phatboyg/MassTransit-Stress) and confirmed that the issue with consumers not reconnecting to RabbitMQ after a RabbitMQ restart or temporary downtime has been fixed in MassTransit 2.9.5.

Ash

Jul 28, 2015, 11:48:07 AM
to masstransit-discuss, ch...@phatboyg.com
Any chance this has regressed in MT 2.10.0? We upgraded all our services to MT 2.10.0 last week and just noticed the same issue on a TopShelf service that failed to restart after last night's patch cycle by our IT admins. I have run the stress test tool and can confirm the issue.

Attached is the log-file from the stress tool, if you like.

-Ash
mt-rabbit-stress.log

Chris Patterson

Jul 28, 2015, 1:09:05 PM
to MassTransit Mailing List
What version were you running before? Just need to understand what the delta is between what you were using before and what you have running now. 

I'm thinking there may have been a change in the RabbitMQ client since it was upgraded with 2.10 as well, and a new exception is causing it to be handled differently. 

--
Chris Patterson

Ash

Jul 28, 2015, 2:07:18 PM
to masstransit-discuss, ch...@phatboyg.com
We were using MT v2.9.9 with RabbitMQ.Client v3.4.0. We upgraded to MT v2.10 with RabbitMQ.Client v3.4.3, which MassTransit.RabbitMQ 2.10 requires; that is the same combination the stress tool uses.

Tim Gebhardt

Jul 29, 2015, 1:04:14 AM
to masstransit-discuss, ch...@phatboyg.com, a.sha...@gmail.com
I'm going to pile on and say that we've been seeing the same issue as well. These out-of-band zero-day patches for Windows have been... rough... on our system, with our Windows services (TopShelf) not reconnecting to RabbitMQ. We are on MT 2.9.8 and RabbitMQ.Client 3.3.5.

Sadprofessor

Jul 29, 2015, 5:32:06 AM
to masstransit-discuss, ch...@phatboyg.com, a.sha...@gmail.com, t...@gebhardtcomputing.com
Yes, we saw this issue this morning as well, after a service restart.

Sadprofessor

Jul 30, 2015, 8:58:29 AM
to masstransit-discuss, ch...@phatboyg.com
I tried running the stress tool against lots of different versions of MassTransit.RabbitMQ last night and couldn't get any of them to work properly. On disconnection from RabbitMQ, the console stress app logs lots of errors (expected) but then completes the test early and closes. Can someone else verify that they can actually get it to work and, if so, with which versions of MassTransit and RabbitMQ?

Attached are the logs from stress running against 2.9.8 (using rabbitmq client 3.3.5)
masstransit-stressconsole-20150730-13.0.log

Sadprofessor

Jul 30, 2015, 10:59:02 AM
to masstransit-discuss, ch...@phatboyg.com, thomas...@gmail.com
OK, I've gone further back with the stress tool, and MT 2.9.0 is the last version that works properly (surviving a RabbitMQ restart).

Sadprofessor

Jul 30, 2015, 12:22:22 PM
to masstransit-discuss, ch...@phatboyg.com
Apologies for adding to this thread seemingly forever, but I keep finding issues (or it could be the same issue; it's very hard to tell).

I have a bug where a service won't reconnect to RabbitMQ (again), but the errors look a little different.

When RabbitMQ restarts, the service hosting the bus seems to get into an infinite loop of reconnecting and disconnecting.

7,235 16:57:10 - Warn - Invalid Connection when executing callback RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=320, text="CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", classId=0, methodId=0, cause=
7,236 at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
7,237 at MassTransit.Transports.RabbitMq.RabbitMqProducer.Bind(RabbitMqConnection connection) in z:\Builds\work\4ed32a1c3fc3f594\src\Transports\MassTransit.Transports.RabbitMq\RabbitMqProducer.cs:line 58
at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
7,238 at MassTransit.Transports.RabbitMq.RabbitMqProducer.Bind(RabbitMqConnection connection) in z:\Builds\work\4ed32a1c3fc3f594\src\Transports\MassTransit.Transports.RabbitMq\RabbitMqProducer.cs:line 58

and then


  7,220 16:57:10 - Error - Failed to consume message from endpoint MassTransit.Transports.InvalidConnectionException: rabbitmq://localhost/blah?prefetch=10&autodelete=False => Invalid connection to host ---> RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=320, text="CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", classId=0, methodId=0, cause=
7,221 at RabbitMQ.Client.Framing.Impl.Connection.CreateModel()
7,222 at MassTransit.Transports.RabbitMq.RabbitMqProducer.Bind(RabbitMqConnection connection) in z:\Builds\work\4ed32a1c3fc3f594\src\Transports\MassTransit.Transports.RabbitMq\RabbitMqProducer.cs:line 58

It does this forever. Looking at the RabbitMQ logs, you can see it accepting a connection every time, but the connection then closes immediately. RabbitMQ doesn't indicate any issues, only that it's closing the AMQP connection.

Other services starting from scratch can connect to this restarted rabbit ok, so I don't think there is anything weird going on with the broker.

Cheers

Chris Patterson

Jul 30, 2015, 12:46:15 PM
to masstrans...@googlegroups.com
I think that's where the issue is, and am working on it right now. I have the reconnection under control now on my branch of develop, so I think there will be a 2.11 release pretty quickly.

Sadprofessor

Jul 30, 2015, 12:51:36 PM
to masstransit-discuss, ch...@phatboyg.com
sweeeeet, thanks

Chris Patterson

Jul 30, 2015, 1:39:12 PM
to masstrans...@googlegroups.com
I pushed the code changes to /develop on MassTransit, if anyone wants to build their own version and verify that it works.

Chris Patterson

Jul 30, 2015, 5:40:59 PM
to masstrans...@googlegroups.com
If anyone is able to try out the packages before I update the rest of the dependency chain, that would really help a lot!



Sadprofessor

Jul 31, 2015, 3:42:39 AM
to masstransit-discuss, ch...@phatboyg.com
Hi Chris, I'll give them a try this morning

Sadprofessor

Jul 31, 2015, 4:59:55 AM
to masstransit-discuss, ch...@phatboyg.com
Hi Chris,

We tried 2.10.1 in our system and it reconnects great. We're still having issues with the stress tool, though: on a broker restart it seems to end early without completing the configured number of iterations (and loses messages anyway). Still, I'm much happier with the reconnection logic, which was the big issue for us.

Let me know if you need any more info from me

Cheers



Chris Patterson

Jul 31, 2015, 10:20:59 AM
to masstrans...@googlegroups.com
Yeah, the stress tool has a bug where it exits on the first exception (the publish throws an exception when sending the request). I have fixed that, but haven't pushed it to GitHub yet.
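
One way to keep a stress run going past a failed publish is to catch the exception inside each iteration, count it, and move on. The following is a hypothetical sketch, not the MassTransit-Stress source and not the fix described above; `StressLoop` and `sendRequest` are illustrative names.

```csharp
using System;

// Hypothetical sketch; this is not the MassTransit-Stress source.
public static class StressLoop
{
    // Calls sendRequest once per iteration. A send that throws is
    // counted and logged, and the loop continues with the next
    // iteration instead of aborting the whole run.
    // Returns the number of iterations whose send threw.
    public static int Run(int iterations, Action<int> sendRequest)
    {
        int failures = 0;
        for (int i = 0; i < iterations; i++)
        {
            try
            {
                sendRequest(i);
            }
            catch (Exception ex)
            {
                failures++;
                Console.Error.WriteLine("Iteration {0} failed: {1}", i, ex.Message);
            }
        }
        return failures;
    }
}
```

Returning the failure count lets the caller report how many requests were affected by a broker restart while still completing every iteration.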

