Some background on our environment: we have a WCF-based Windows service that receives on the order of 10-100 calls per second in production. The application holds a single global IConnection. When a message is to be sent, a task is started with Task.Factory.StartNew and a new channel is created for that send (some channels use publisher confirms, some don't). This service only publishes and does not consume any messages. The RabbitMQ broker is running on localhost.
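To make the setup concrete, here is a simplified sketch of the sending pattern described above. It is not our production code: the class and method names, the fanout exchange type, the durable flag, and the 5-second confirm timeout are all assumptions made for illustration.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using RabbitMQ.Client;

// Hypothetical sketch of the sending pattern; all names are illustrative.
public static class BroadcastSender
{
    // Single global connection shared by every send (broker on localhost).
    private static readonly IConnection Connection =
        new ConnectionFactory { HostName = "localhost" }.CreateConnection();

    public static Task SendAsync(string exchange, string routingKey, string payload, bool useConfirms)
    {
        return Task.Factory.StartNew(() =>
        {
            // A new channel is created for each send and disposed afterwards.
            using (IModel channel = Connection.CreateModel())
            {
                if (useConfirms)
                {
                    // Put the channel into publisher-confirm mode.
                    channel.ConfirmSelect();
                }

                // Synchronous RPC to the broker; this is the kind of call
                // that appears in the TimeoutException stack traces.
                // Exchange type and durability are assumed values.
                channel.ExchangeDeclare(exchange, ExchangeType.Fanout,
                    durable: true, autoDelete: false, arguments: null);

                byte[] body = Encoding.UTF8.GetBytes(payload);
                channel.BasicPublish(exchange: exchange, routingKey: routingKey,
                    basicProperties: null, body: body);

                if (useConfirms)
                {
                    // Block until the broker confirms, or throw on timeout/nack.
                    channel.WaitForConfirmsOrDie(TimeSpan.FromSeconds(5));
                }
            }
        });
    }
}
```

The point of the sketch is that every send performs at least one blocking RPC (ExchangeDeclare) on a freshly created channel, with many such tasks running concurrently against the one shared connection.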
This has been running fine for years using RabbitMQ.Client 5.1. This week we deployed an update that brought RabbitMQ.Client up to the latest 6.4 version, and the system now sporadically gets a large number of failed sends at the same time, although some sends still succeed. The same deploy ran fine for a month in our staging environment; we have only experienced these issues on the production system.
```
2023-02-09 16:46:18.020
Level: Error
Description: Failed to send BroadcastQueryEvent
Exception: System.TimeoutException: The operation has timed out.
at RabbitMQ.Util.BlockingCell`1.WaitForValue(TimeSpan timeout)
at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)
at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)
at RabbitMQ.Client.Framing.Impl.Model._Private_ExchangeDeclare(String exchange, String type, Boolean passive, Boolean durable, Boolean autoDelete, Boolean internal, Boolean nowait, IDictionary`2 arguments)
at RabbitMQ.Client.Impl.AutorecoveringModel.ExchangeDeclare(String exchange, String type, Boolean durable, Boolean autoDelete, IDictionary`2 arguments)
at ....BroadcastEventMessenger.SendQueryEvent(BroadcastQueryEvent queryEvent)
2023-02-09 16:46:18.020
Level: Error
Description: Failed to send BroadcastQueryEvent
Exception: System.TimeoutException: The operation has timed out.
at RabbitMQ.Util.BlockingCell`1.WaitForValue(TimeSpan timeout)
at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)
at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)
at RabbitMQ.Client.Framing.Impl.Model._Private_ExchangeDeclare(String exchange, String type, Boolean passive, Boolean durable, Boolean autoDelete, Boolean internal, Boolean nowait, IDictionary`2 arguments)
at RabbitMQ.Client.Impl.AutorecoveringModel.ExchangeDeclare(String exchange, String type, Boolean durable, Boolean autoDelete, IDictionary`2 arguments)
at ....SendQueryEvent(BroadcastQueryEvent queryEvent)
2023-02-09 16:46:18.006
Level: Error
Description: Failed to send BroadcastQueryEvent
Exception: System.TimeoutException: The operation has timed out.
at RabbitMQ.Util.BlockingCell`1.WaitForValue(TimeSpan timeout)
at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)
at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)
at RabbitMQ.Client.Framing.Impl.Model._Private_ExchangeDeclare(String exchange, String type, Boolean passive, Boolean durable, Boolean autoDelete, Boolean internal, Boolean nowait, IDictionary`2 arguments)
at RabbitMQ.Client.Impl.AutorecoveringModel.ExchangeDeclare(String exchange, String type, Boolean durable, Boolean autoDelete, IDictionary`2 arguments)
at ....SendQueryEvent(BroadcastQueryEvent queryEvent)
2023-02-09 16:46:18.006
Level: Error
Description: Failed to send BroadcastQueryEvent
Exception: System.TimeoutException: The operation has timed out.
at RabbitMQ.Util.BlockingCell`1.WaitForValue(TimeSpan timeout)
at RabbitMQ.Client.Impl.SimpleBlockingRpcContinuation.GetReply(TimeSpan timeout)
at RabbitMQ.Client.Impl.ModelBase.ModelRpc(MethodBase method, ContentHeaderBase header, Byte[] body)
at RabbitMQ.Client.Framing.Impl.Model._Private_ExchangeDeclare(String exchange, String type, Boolean passive, Boolean durable, Boolean autoDelete, Boolean internal, Boolean nowait, IDictionary`2 arguments)
at RabbitMQ.Client.Impl.AutorecoveringModel.ExchangeDeclare(String exchange, String type, Boolean durable, Boolean autoDelete, IDictionary`2 arguments)
at ....SendQueryEvent(BroadcastQueryEvent queryEvent)
```
My first guess is that there could be some kind of race condition inside the RabbitMQ client library: the production system never logged these errors in years of running with the 5.1 client, and deploying an update that only changed the RabbitMQ.Client NuGet version to 6.4 resulted in these sporadic errors.
There are no log entries in RabbitMQ (which was set to log.file.level info) from the time this happened.
This previous post seems similar (timeouts on 6.x):
https://groups.google.com/g/rabbitmq-users/c/3TbIhO9e9fA/m/alvehhUSBQAJ
Testing on the production system is of course problematic, as messages failing to send impacts our business operations, so we have to roll back the deploy at the first sign of these errors.
I am still trying to replicate the issue locally but haven't been able to so far. What steps would you recommend to troubleshoot this?