Azure Service Bus Queue/ICommand : FIFO or not FIFO order

378 views
Skip to first unread message

Bob Petitcolin

unread,
Apr 13, 2015, 7:38:37 PM4/13/15
to particula...@googlegroups.com
Version: 6.2.0
Stacktrace: 
Sender Endpoint trace with IMutateOutgoingTransportMessages and TimeSent Header of transportMessage :
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:52:08:037673 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:52:45:024754 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:52:55:277772 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:00:925781 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:04:225786 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:19:666810 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:29:822838 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:39:114962 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:48:183193 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:53:57:431217 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:54:06:531230 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:54:15:794245 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:54:25:802262 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:54:44:591297 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:54:54:168315 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:03:564331 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:13:430355 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:23:226376 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:33:128391 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:42:370562 Z
Comand ProceedTNReportsV2 Outgoing with TimeSent Header at 2015-04-13 12:55:51:653093 Z
... 

 Receiver endpoint trace with IMutateIncomingTransportMessages and TimeSent Header of transportMessage :
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:52:08:037673 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:52:45:024754 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:52:55:277772 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:00:925781 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:04:225786 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:19:666810 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:29:822838 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:48:183193 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:54:06:531230 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:54:15:794245 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:54:25:802262 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:54:44:591297 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:03:564331 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:13:430355 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:33:128391 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:42:370562 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:56:00:375254 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:56:09:643340 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:56:56:548547 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:57:16:542582 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:57:21:714591 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:57:43:714626 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:57:53:822643 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:58:13:953673 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:58:33:843722 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:51:653093 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:56:28:942482 Z

Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:59:53:533869 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 13:00:13:093900 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 13:00:32:432962 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 13:00:51:008991 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:55:23:226376 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:58:43:621737 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:59:03:181770 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:56:19:250459 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:59:13:331795 Z
Comand ProceedTNReportsV2 Incoming with TimeSent Header at 2015-04-13 12:53:57:431217 Z
... 
 
Other Products versions :
NServiceBus : 5.2.0
NServiceBus.Azure : 6.2.1
NServiceBus.Hosting.Azure : 6.2.1
SDK Azure : 2.5
WindowsAzure.ServiceBus : 6.2.4
WindowsAzure.Storage : 4.3.0 
Microsoft.WindowsAzure.ConfigurationManager : 2.0.3 

Description: 

Hello, 

I'm working with NServiceBus on Azure and Azure Service Bus Transport. We have an endpoint that sends commands ProceedTNReportsV2 (inheriting IComand). There is a second endpoint on another worker role, which processes the commands. There are also other commands and events emitted via both worker role and another Web role. Commands are not processed by the consumer in the FIFO order as seems to show the traces that I inserted earlier in this issue. The traces were realized with two mutators, IMutateOutgoingTransportMessages for the producer endpoint and IMutateIncomingTransportMessages for the consumer endpoint. Trace contain also TimeSent Header of transportMessage.

The azure service bus queues are expected to respect the FIFO order as stated in MSDN or site NServiceBus. Did I miss a configuration setting or something else?

Thanks

Vincent Petitcolin

Sean Feldman

unread,
Apr 14, 2015, 9:52:06 AM4/14/15
to particula...@googlegroups.com
Hi Bob,

Event though Azure Service Bus documentation says that ASB should be used when "Your solution requires the queue to provide a guaranteed first-in-first-out (FIFO) ordered delivery.", you can get messages out of order in scenarios where there was a delivery issue and message had to be resent, or you have more than one receiving thread. 
For ASB the best thing is to build order into your application logic. 

Bob Petitcolin

unread,
Apr 14, 2015, 11:47:06 AM4/14/15
to particula...@googlegroups.com
Hi sean,

Thank you for your answer and your explanation of the scenarios that may cause that the messages are processed out of order. It turns out that there is only one receiving process on the worker role/endpoint that processes commands. Having also believed resent message I put up a push-based notifications error described in the article: Push-based error notifications with MessageSentToErrorQueue, MessageHasFailedAFirstLevelRetryAttempt and MessageHasBeenSentToSecondLevelRetries events. I have not received any notification of errors for this command having first verify proper operation of these notifications. Are notifications to these three types of "errors" allows me to say with certainty that there is no resend message? If so, I do not think finding me in both scenarios you mentioned.

If there is one feature, it is more on the side of the worker role/endpoint that produces the messages. This is classic AsA_Worker type, but the party producing this command happens to be a TCP socket server asynhrone multithreaded connected with multiple mobile devices. Does this could rise as header TimeSent do not correspond to the reality of the time the message was sent in the queue? However, there is such a discrepancy between the order of commands sent and the order in which they are processed (X message sent with a TimeSent with 10 minutes after a Y message TimeSent can be processed first) that I not think it came from there. Moreover, this occurs for other commands that are emitted by handlers who do not have this feature.

What I also noticed is that when the queue is empty or has few messages, orders are processed in the order. Which is not the case when there is a increasement in the number of commands and the queue is filled with hundreds of messages expectations.

I agree with you about managing the order of messages in the application logic, it's just that so far this phenomenon did not happen, and I really do not understand where it comes from, as this does not seem normal. And I do not like not to understand.

Thanks

Sean Feldman

unread,
Apr 14, 2015, 12:07:48 PM4/14/15
to particula...@googlegroups.com
Hi Bob,
Your assumption about push notifications is correct.
Are your queues have partitioning enabled by any chance?

Yves Goeleven

unread,
Apr 14, 2015, 12:54:23 PM4/14/15
to particula...@googlegroups.com
Hi Bob,

I think you can only guarantee FIFO with ASB under following conditions, which definitly does not correspond to the default usage patterns

- Sessions must be set to required (eventhough you can enable this, it is not really supported by nservicebus as you cannot control the sessionid of the underlying brokeredmessage, so it would equal the messageid making sessions moot)
- Partitioning must be disabled on the queue (not sure about this one, could be that it would work if partitionkey== sessionid , but you cannot control this through nsb today as said in the previous remark)
- Message ordering must be enforced on the queue
- Must use receive & delete semantics (corresponds to disabled transactions in nservicebus and has severe implications on your architecture)

In theory we could add some hooks for you to control the sessionid (and maybe partitionkey), but you would still have to program without transactional behavior for it to really work...

Bob Petitcolin

unread,
Apr 14, 2015, 2:15:16 PM4/14/15
to particula...@googlegroups.com
Sean and Yves,

First of all thank you for your help and clarification. The queue is not partitioned and sessions are not enabled. Ordering is activated. Below the queue properties.

My English is not very good and I do not know well the mechanisms and semantics associated with ASB (sessions, partitionning, brokered message ...). So I have a little trouble to understanding the overall solution design for a FIFO guarantee. Moreover, we agree that the deactivation of transactions is not a good thing. So I do not think we should go that way.

However, what I do not understand is why, if I did not error in the processing of commands and therefore no resent messages, do I have a timing delay so important? Is this a normal behavior? I think I must have a problem somewhere in my solution.

Also, I noticed today when reproducing the problem in debug on my computer some messages going in the "deadlettermessages." But I have not received any notification on the event MessageSentToErrorQueue or notifications for the other two évennements (MessageHasFailedAFirstLevelRetryAttempt and MessageHasBeenSentToSecondLevelRetries). Finally, I note that there have been several MessageLockLostException (envrion ten during the debug phase) and the number of messages in the "deadlettermessages" over a hundred, which shall show that it is not related.
Would i not have received a notification for MessageLockLostException and for messages moved into the deadlettermessages?

Thanks

Yves Goeleven

unread,
Apr 14, 2015, 2:21:41 PM4/14/15
to particula...@googlegroups.com
I don't think you did anything wrong, ASB just isn't guaranteeing FIFO by default. 

The reason for this is that, as far as I can tell, ASB internally has been constructed for failover & scaleout, meaning that it runs multiple front-end in parallel, which probably prefetch some of the messages when the connection is active and load goes up. Running machines in parallel, will increase the chances of messages getting out of order internally in the ASB servers...


Sean Feldman

unread,
Apr 14, 2015, 3:10:30 PM4/14/15
to particula...@googlegroups.com
Hi Bob,

Quick note on deadletterqueue filling up - you've got a setting, MaxDeliveryCount. If what happens is that your endpoint is grabbing a batch of messages for processing (1K messages by default) and some don't get processed within LockDuration time (3 min), they will be "unlocked" and available for processing again. If that happens MaxDeliveryCount, then those messages will be send to dead letter queue and no push notifications will happen. Push notifications indicate when an error with message happens during NSB processing. Since no error took place (we just never got to processing of the message), no push notification is issued.

You could reduce your BatchSize so that no messages go into dead letter queue. Optimal batch size would be equivalent to number of threads which should be CPU count x 12.
Reply all
Reply to author
Forward
0 new messages